OpenAdapt
AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
Stars: 851
OpenAdapt is an open-source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs). It aims to automate repetitive GUI workflows by leveraging the power of LMMs. OpenAdapt records user input and screenshots, converts them into tokenized format, and generates synthetic input via transformer model completions. It also analyzes recordings to generate task trees and replay synthetic input to complete tasks. OpenAdapt is model agnostic and generates prompts automatically by learning from human demonstration, ensuring that agents are grounded in existing processes and mitigating hallucinations. It works with all types of desktop GUIs, including virtualized and web, and is open source under the MIT license.
README:
Read our Architecture document
Join the Discussion on the Request for Comments
See also:
- https://github.com/OpenAdaptAI/SoM
- https://github.com/OpenAdaptAI/pynput
- https://github.com/OpenAdaptAI/atomacos
OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs).
Early demos (more coming soon!):
- https://twitter.com/abrichr/status/1784307190062342237
- https://www.loom.com/share/9d77eb7028f34f7f87c6661fb758d1c0
Welcome to OpenAdapt! This Python library implements AI-First Process Automation with the power of Large Multimodal Models (LMMs) by:
- Recording screenshots and associated user input
- Aggregating and visualizing user input and recordings for development
- Converting screenshots and user input into tokenized format
- Generating synthetic input via transformer model completions
- Generating task trees by analyzing recordings (work-in-progress)
- Replaying synthetic input to complete tasks (work-in-progress)
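As a rough illustration of the "tokenized format" step in the list above, the sketch below serializes a few recorded input events into a compact text prompt that a transformer model could be asked to complete. The `InputEvent` class, its fields, and the `events_to_prompt` helper are hypothetical names chosen for this example; they are not OpenAdapt's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InputEvent:
    """Hypothetical, simplified stand-in for a recorded user input event."""

    name: str                   # e.g. "click" or "type"
    x: Optional[int] = None     # mouse position, if the event has one
    y: Optional[int] = None
    text: Optional[str] = None  # typed text, if the event has any


def event_to_text(event: InputEvent) -> str:
    """Serialize one event as a short, model-readable line."""
    parts = [event.name]
    if event.x is not None and event.y is not None:
        parts.append(f"x={event.x} y={event.y}")
    if event.text is not None:
        parts.append(f"text={event.text!r}")
    return " ".join(parts)


def events_to_prompt(events: List[InputEvent]) -> str:
    """Join the serialized events and leave the next action open for completion."""
    lines = [event_to_text(event) for event in events]
    return "\n".join(lines + ["<next action>"])


demo = [InputEvent("click", x=100, y=200), InputEvent("type", text="hello")]
print(events_to_prompt(demo))
```

A real pipeline would also need to encode screenshots and window state; this sketch covers only the input events.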
The goal is similar to that of Robotic Process Automation, except that we use Large Multimodal Models instead of conventional RPA tools.
The direction is adjacent to Adept.ai, with some key differences:
- OpenAdapt is model agnostic.
- OpenAdapt generates prompts automatically by learning from human demonstration (auto-prompted, not user-prompted). This means that agents are grounded in existing processes, which mitigates hallucinations and ensures successful task completion.
- OpenAdapt works with all types of desktop GUIs, including virtualized (e.g. Citrix) and web.
- OpenAdapt is open source (MIT license).
| Installation Method | Recommended for | Ease of Use |
|---|---|---|
| Scripted | Non-technical users | Streamlines the installation process for users unfamiliar with setup steps |
| Manual | Technical users | Allows for more control and customization during the installation process |
Windows:
- Press Windows Key, type "powershell", and press Enter
- Copy and paste the following command into the terminal, and press Enter (if prompted for User Account Control, click 'Yes'):
Start-Process powershell -Verb RunAs -ArgumentList '-NoExit', '-ExecutionPolicy', 'Bypass', '-Command', "iwr -UseBasicParsing -Uri 'https://raw.githubusercontent.com/OpenAdaptAI/OpenAdapt/main/install/install_openadapt.ps1' | Invoke-Expression"
macOS:
- Download and install Git and Python 3.10
- Press Command+Space, type "terminal", and press Enter
- Copy and paste the following command into the terminal, and press Enter:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/OpenAdaptAI/OpenAdapt/HEAD/install/install_openadapt.sh)"
Prerequisites:
- Python 3.10
- Git
- Tesseract (for OCR)
- nvm (node version manager)
For the setup of any/all of the above dependencies, follow the steps in SETUP.md.
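Before installing, it can help to confirm that the prerequisites are visible from your shell. The following is a minimal sketch using only the Python standard library; the helper names `check_python` and `check_tools` are made up for this example and are not part of OpenAdapt. It assumes that Git and Tesseract expose `git` and `tesseract` executables on your PATH. nvm is left out because it is usually loaded as a shell function rather than installed as an executable.

```python
import shutil
import sys
from typing import Dict, Sequence, Tuple


def check_python(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True when the given (major, minor) version is Python 3.10."""
    return tuple(version) == (3, 10)


def check_tools(tools: Sequence[str] = ("git", "tesseract")) -> Dict[str, bool]:
    """Map each command name to whether an executable with that name is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    print("Python 3.10:", check_python())
    for tool, found in check_tools().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```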
Install with Poetry:
git clone https://github.com/OpenAdaptAI/OpenAdapt.git
cd OpenAdapt
pip3 install poetry
poetry install
poetry shell
poetry run postinstall
cd openadapt && alembic upgrade head && cd ..
pytest
See how to set up system permissions on macOS here.
Run this in every new terminal window once (while inside the OpenAdapt root
directory) before running any openadapt commands below:
poetry shell
You should see something like this:
% poetry shell
Using python3.10 (3.10.13)
...
(openadapt-py3.10) %
Notice the environment prefix (openadapt-py3.10).
Run the following command to start the system tray icon and launch the web dashboard:
python -m openadapt.entrypoint
This command will print the config, update the database to the latest migration, start the system tray icon and launch the web dashboard.
Create a new recording by running the following command:
python -m openadapt.record "testing out openadapt"
Wait until all three event writers have started:
| INFO | __mp_main__:write_events:230 - event_type='screen' starting
| INFO | __mp_main__:write_events:230 - event_type='action' starting
| INFO | __mp_main__:write_events:230 - event_type='window' starting
Type a few words into the terminal and move your mouse around the screen to generate some events, then stop the recording by pressing CTRL+C.
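If you capture the recorder's console output (for example by redirecting it to a file), the "wait until all three event writers have started" check can be automated. The sketch below is one possible way to do that; it relies only on the `event_type='...' starting` fragment visible in the log lines above, and the function names are hypothetical.

```python
import re
from typing import Iterable, Set

# The three writers named in the log output above.
EXPECTED_WRITERS = {"screen", "action", "window"}

# Matches the "event_type='<name>' starting" fragment of a log line.
STARTING_PATTERN = re.compile(r"event_type='(\w+)' starting")


def started_writers(log_lines: Iterable[str]) -> Set[str]:
    """Collect the event types whose writer has logged a 'starting' message."""
    found = set()
    for line in log_lines:
        match = STARTING_PATTERN.search(line)
        if match:
            found.add(match.group(1))
    return found


def all_writers_started(log_lines: Iterable[str]) -> bool:
    """Return True once every expected writer has reported that it started."""
    return EXPECTED_WRITERS <= started_writers(log_lines)


sample_log = [
    "| INFO | __mp_main__:write_events:230 - event_type='screen' starting",
    "| INFO | __mp_main__:write_events:230 - event_type='action' starting",
    "| INFO | __mp_main__:write_events:230 - event_type='window' starting",
]
print(all_writers_started(sample_log))
```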
Current limitations:
- recordings should be short (i.e. under a minute), as they are somewhat memory intensive, and there is currently an open issue describing a possible memory leak
- the only touchpad and trackpad gestures currently supported are pointing the cursor and left or right clicking, as described in this open issue
To capture (record) browser events in Chrome, follow these steps:
- Go to: Chrome Extension Page
- Enable Developer mode (located at the top right).
- Click Load unpacked (located at the top left).
- Select the chrome_extension directory.
- You should see a confirmation indicating that the extension is loaded.
- Set the flag to true if it is currently false.
- Start recording. Once recording begins, navigate to the Chrome browser, browse some pages, and perform a few clicks. Then, stop the recording and let it complete successfully.
- After recording, check the openadapt.db table browser_event. It should contain all your browser activity logs. You can verify the data's correctness using the sqlite3 CLI or an extension like SQLite Viewer in VS Code to open data/openadapt.db.
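As an alternative to the sqlite3 CLI, the same check can be scripted with Python's built-in `sqlite3` module. This is a minimal sketch: the `count_rows` helper is a hypothetical name, and the only details taken from the steps above are the database path `data/openadapt.db` and the table name `browser_event`. It makes no assumptions about the table's columns.

```python
import sqlite3


def count_rows(db_path: str, table: str) -> int:
    """Return the number of rows in `table`, raising ValueError if it is absent."""
    connection = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as SQL parameters, so first confirm
        # that the name refers to a real table in this database.
        exists = connection.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if exists is None:
            raise ValueError(f"table {table!r} not found in {db_path}")
        (count,) = connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        return count
    finally:
        connection.close()
```

After a recording, calling `count_rows("data/openadapt.db", "browser_event")` from the OpenAdapt root directory should return a number greater than zero if browser events were captured. Note that `sqlite3.connect` creates an empty database file when the path does not exist, so double-check the path first.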
Quickly visualize the latest recording you created by running the following command:
python -m openadapt.visualize
This will generate an HTML file and open a tab in your browser that looks something like this:
For a more powerful dashboard, run:
python -m openadapt.app.dashboard.run
This will start a web server locally, and then open a tab in your browser that looks something like this:
For a desktop app-based visualization, run:
python -m openadapt.app.visualize
This will open a scrollable window that looks something like this:
You can play back the recording using the following command:
python -m openadapt.replay NaiveReplayStrategy
Other replay strategies include:
- StatefulReplayStrategy: Early proof-of-concept which uses the OpenAI GPT-4 API with prompts constructed via OS-level window data.
- (*) VisualReplayStrategy: Uses Fast Segment Anything Model (FastSAM) to segment active window.
- (*) VanillaReplayStrategy: Assumes the model is capable of directly reasoning on states and actions accurately. With future frontier models, we hope that this script will suddenly work a lot better.
The (*) prefix indicates strategies which accept an "instructions" parameter that is used to modify the recording, e.g.:
python -m openadapt.replay VanillaReplayStrategy --instructions "calculate 9-8"
See https://github.com/OpenAdaptAI/OpenAdapt/tree/main/openadapt/strategies for a complete list. More ReplayStrategies coming soon! (see Contributing).
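If you want to launch replays from a script, the command lines shown above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of OpenAdapt; it only rebuilds the `python -m openadapt.replay <strategy> [--instructions "..."]` invocation documented in this section.

```python
import shlex
import sys
from typing import List, Optional


def build_replay_command(strategy: str, instructions: Optional[str] = None) -> List[str]:
    """Build the argument list for `python -m openadapt.replay`."""
    command = [sys.executable, "-m", "openadapt.replay", strategy]
    if instructions is not None:
        # Only strategies marked with (*) accept this parameter.
        command += ["--instructions", instructions]
    return command


command = build_replay_command("VanillaReplayStrategy", "calculate 9-8")
# Show the shell-quoted form without the interpreter path.
print(shlex.join(command[1:]))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from inside the Poetry environment.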
State-of-the-art GUI understanding via Segment Anything in High Quality:
Industry leading privacy (PII/PHI scrubbing) via AWS Comprehend, Microsoft Presidio and Private AI:
Decentralized and secure data distribution via Magic Wormhole:
Detailed performance monitoring via pympler and tracemalloc:
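For readers unfamiliar with `tracemalloc`, the sketch below shows the general idea of measuring peak memory around a piece of work using the standard library. It is a generic illustration, not OpenAdapt's monitoring code; `measure_peak_bytes` is a hypothetical helper name.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak_bytes(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, int]:
    """Run `func` and return its result with the peak traced memory in bytes."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        # Stop tracing even if `func` raises.
        tracemalloc.stop()
    return result, peak


# Allocate roughly 1 MB of buffers as a stand-in for captured data.
buffers, peak = measure_peak_bytes(lambda: [bytes(1024) for _ in range(1000)])
print(f"peak traced memory: {peak} bytes")
```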
We are thrilled to open new contract positions for developers passionate about pushing boundaries in technology. If you're ready to make a significant impact, consider the following roles:
- Responsibilities: Develop and test key features such as process visualization, demo booking, app store, and blog integration.
- Skills: Proficiency in modern frontend technologies and a knack for UI/UX design.
- Role: Implement and refine process replay strategies using state-of-the-art LLMs/LMMs. Extract dynamic process descriptions from extensive process recordings.
- Skills: Strong background in machine learning, experience with LLMs/LMMs, and problem-solving aptitude.
- Focus: Enhance memory optimization techniques during process recording and replay. Develop sophisticated tools for process observation and productivity measurement.
- Skills: Expertise in software optimization, memory management, and analytics.
- Focus: Maintaining OpenAdapt repositories
- Skills: Passion for writing and/or documentation
- Step 1: Submit an empty Pull Request to OpenAdapt or OpenAdapt.web. Format your PR title as [Proposal] <your title here>
- Step 2: Include a brief, informal outline of your approach in the PR description. Feel free to add any questions you might have.
- Need Clarifications? Reach out to us on Discord.
We're looking forward to your contributions. Let's build the future 🚀
Notable Works-in-progress (incomplete, see https://github.com/OpenAdaptAI/OpenAdapt/pulls and https://github.com/OpenAdaptAI/OpenAdapt/issues/ for more)
- Video Recording Hardware Acceleration (help wanted)
- Audio Narration (help wanted)
- Chrome Extension (help wanted)
- Gemini Vision (help wanted)
Our goal is to automate the task described and demonstrated in a Recording. That is, given a new Screenshot, we want to generate the appropriate ActionEvent(s) based on the previously recorded ActionEvents in order to accomplish the task specified in the Recording.task_description and narrated by the user in AudioInfo.words_with_timestamps, while accounting for differences in screen resolution, window size, application behavior, etc.
If it's not clear what ActionEvent is appropriate for the given Screenshot,
(e.g. if the GUI application is behaving in a way we haven't seen before),
we can ask the user to take over temporarily to demonstrate the appropriate
course of action.
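To make "accounting for differences in screen resolution" concrete, here is a deliberately naive sketch that rescales a recorded click position to a screen of a different size. Proportional scaling is only one simple approach and is an assumption of this example: real applications may reflow their layout, which is why a model-based strategy is needed. The function name `scale_point` is hypothetical.

```python
from typing import Tuple


def scale_point(
    point: Tuple[int, int],
    recorded_size: Tuple[int, int],
    current_size: Tuple[int, int],
) -> Tuple[int, int]:
    """Map an (x, y) position from the recorded screen size to the current one."""
    x, y = point
    recorded_width, recorded_height = recorded_size
    current_width, current_height = current_size
    if recorded_width <= 0 or recorded_height <= 0:
        raise ValueError("recorded screen dimensions must be positive")
    # Scale each axis independently and round to the nearest pixel.
    return (
        round(x * current_width / recorded_width),
        round(y * current_height / recorded_height),
    )


# A click at the center of a 1920x1080 recording, replayed on a 1280x720 screen.
print(scale_point((960, 540), (1920, 1080), (1280, 720)))  # prints (640, 360)
```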
The data model consists of the following entities:
- Recording: Contains information about the screen dimensions, platform, and other metadata.
- ActionEvent: Represents a user action event such as a mouse click or key press. Each ActionEvent has an associated Screenshot taken immediately before the event occurred. ActionEvents are aggregated to remove unnecessary events (see visualize).
- Screenshot: Contains the PNG data of a screenshot taken during the recording.
- WindowEvent: Represents a window event such as a change in window title, position, or size.
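The relationships between these entities can be sketched with plain dataclasses. This is an illustrative approximation only: apart from the entity names and `task_description`, every field name below is an assumption made for this example, and the actual models in the repository may be defined quite differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Screenshot:
    png_data: bytes  # raw PNG bytes captured during the recording


@dataclass
class ActionEvent:
    name: str  # e.g. "click" or "press" (hypothetical values)
    # Screenshot taken immediately before the event occurred.
    screenshot: Optional[Screenshot] = None


@dataclass
class WindowEvent:
    title: str
    position: Tuple[int, int]  # (left, top), hypothetical layout
    size: Tuple[int, int]      # (width, height), hypothetical layout


@dataclass
class Recording:
    task_description: str
    platform: str
    screen_size: Tuple[int, int]  # (width, height)
    action_events: List[ActionEvent] = field(default_factory=list)
    window_events: List[WindowEvent] = field(default_factory=list)


example = Recording("doing taxes", platform="darwin", screen_size=(1920, 1080))
example.action_events.append(ActionEvent("click", Screenshot(png_data=b"")))
example.window_events.append(WindowEvent("Calculator", (0, 0), (400, 600)))
print(len(example.action_events), example.window_events[0].title)
```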
You can assume that you have access to the following functions:
- create_recording("doing taxes"): Creates a recording.
- get_latest_recording(): Gets the latest recording.
- get_events(recording): Returns a list of ActionEvent objects for the given recording.
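To show how these three functions are meant to fit together, the sketch below defines in-memory stand-ins with the same names and then calls them in sequence. These stand-ins are assumptions written purely for illustration (recordings are plain dictionaries here); they are not the real implementations, which are backed by the database.

```python
from typing import Dict, List, Optional

# In-memory stand-in for the recordings store (illustration only).
_RECORDINGS: List[Dict] = []


def create_recording(task_description: str) -> Dict:
    """Stand-in: create a recording and remember it as the latest one."""
    recording = {"task_description": task_description, "events": []}
    _RECORDINGS.append(recording)
    return recording


def get_latest_recording() -> Optional[Dict]:
    """Stand-in: return the most recently created recording, if any."""
    return _RECORDINGS[-1] if _RECORDINGS else None


def get_events(recording: Dict) -> List[Dict]:
    """Stand-in: return a copy of the recording's list of events."""
    return list(recording["events"])


recording = create_recording("doing taxes")
recording["events"].append({"name": "click", "x": 10, "y": 20})

latest = get_latest_recording()
for event in get_events(latest):
    print(latest["task_description"], event["name"])
```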
See GitBook Documentation for more.
Join us on Discord. Then:
- Fork this repository and clone it to your local machine.
- Get OpenAdapt up and running by following the instructions under Setup.
- Look through the list of open issues at https://github.com/OpenAdaptAI/OpenAdapt/issues and once you find one you would like to address, indicate your interest with a comment.
- Implement a solution to the issue you selected. Write unit tests for your implementation.
- Submit a Pull Request (PR) to this repository. Note: submitting a PR before your implementation is complete (e.g. with high level documentation and/or implementation stubs) is encouraged, as it provides us with the opportunity to provide early feedback and iterate on the approach.
Your submission will be evaluated based on the following criteria:
- Functionality: Your implementation should correctly generate the new ActionEvent objects that can be replayed in order to accomplish the task in the original recording.
- Code Quality: Your code should be well-structured, clean, and easy to understand.
- Scalability: Your solution should be efficient and scale well with large datasets.
- Testing: Your tests should cover various edge cases and scenarios to ensure the correctness of your implementation.
To submit:
- Commit your changes to your forked repository.
- Create a pull request to the original repository with your changes.
- In your pull request, include a brief summary of your approach, any assumptions you made, and how you integrated external libraries.
- Bonus: interacting with ChatGPT and/or other language transformer models in order to generate code and/or evaluate design decisions is encouraged. If you choose to do so, please include the full transcript.
macOS: if you encounter system alert messages or find issues when making and replaying recordings, make sure to set up permissions accordingly.
In summary (from https://stackoverflow.com/a/69673312):
- Settings -> Security & Privacy
- Click on the Privacy tab
- Scroll and click on the Accessibility Row
- Click +
- Navigate to /System/Applications/Utilities/ (or wherever Terminal.app is installed)
- Click okay.
From inside the openadapt directory (containing alembic.ini):
alembic revision --autogenerate -m "<msg>"
To ensure code quality and consistency, OpenAdapt uses pre-commit hooks. These hooks will be executed automatically before each commit to perform various checks and validations on your codebase.
The following pre-commit hooks are used in OpenAdapt:
- check-yaml: Validates the syntax and structure of YAML files.
- end-of-file-fixer: Ensures that files end with a newline character.
- trailing-whitespace: Detects and removes trailing whitespace at the end of lines.
- black: Formats Python code to adhere to the Black code style. Notably, the --preview feature is used.
- isort: Sorts Python import statements in a consistent and standardized manner.
To set up the pre-commit hooks, follow these steps:
- Navigate to the root directory of your OpenAdapt repository.
- Run the following command to install the hooks:
pre-commit install
Now, the pre-commit hooks are installed and will run automatically before each commit. They will enforce code quality standards and prevent committing code that doesn't pass the defined checks.
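To give a feel for what the two whitespace hooks enforce, here is a small Python approximation of their effect on a file's text. This is a simplified sketch for illustration and not the hooks' actual implementation; the function names are made up for this example.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of every line (like trailing-whitespace)."""
    return "\n".join(line.rstrip() for line in text.split("\n"))


def ensure_final_newline(text: str) -> str:
    """Make non-empty text end with exactly one newline (like end-of-file-fixer)."""
    if not text:
        return text
    return text.rstrip("\n") + "\n"


source = "import os  \nprint(os.name)\t\n\n\n"
cleaned = ensure_final_newline(strip_trailing_whitespace(source))
print(repr(cleaned))
```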
When you submit a PR, the "Python CI" workflow is triggered for code consistency. It follows organized steps to review your code:
- Python Black Check: This step verifies code formatting using the Black style, with the --preview flag.
- Flake8 Review: Next, the Flake8 tool checks code structure, including flake8-annotations and flake8-docstrings. Although GitHub Actions automates these checks, it is wise to run flake8 . locally before finalizing changes so that you can spot and resolve issues sooner.
Please submit any issues to https://github.com/OpenAdaptAI/OpenAdapt/issues with the following information:
- Problem description (please include any relevant console output and/or screenshots)
- Steps to reproduce (please help others to help you!)
Alternative AI tools for OpenAdapt
Similar Open Source Tools
testzeus-hercules
Hercules is the world’s first open-source testing agent designed to handle the toughest testing tasks for modern web applications. It turns simple Gherkin steps into fully automated end-to-end tests, making testing simple, reliable, and efficient. Hercules adapts to various platforms like Salesforce and is suitable for CI/CD pipelines. It aims to democratize and disrupt test automation, making top-tier testing accessible to everyone. The tool is transparent, reliable, and community-driven, empowering teams to deliver better software. Hercules offers multiple ways to get started, including using PyPI package, Docker, or building and running from source code. It supports various AI models, provides detailed installation and usage instructions, and integrates with Nuclei for security testing and WCAG for accessibility testing. The tool is production-ready, open core, and open source, with plans for enhanced LLM support, advanced tooling, improved DOM distillation, community contributions, extensive documentation, and a bounty program.
agentok
Agentok Studio is a tool built upon AG2, a powerful agent framework from Microsoft, offering intuitive visual tools to streamline the creation and management of complex agent-based workflows. It simplifies the process for creators and developers by generating native Python code with minimal dependencies, enabling users to create self-contained code that can be executed anywhere. The tool is currently under development and not recommended for production use, but contributions are welcome from the community to enhance its capabilities and functionalities.
supallm
Supallm is a Python library for super resolution of images using deep learning techniques. It provides pre-trained models for enhancing image quality by increasing resolution. The library is easy to use and allows users to upscale images with high fidelity and detail. Supallm is suitable for tasks such as enhancing image quality, improving visual appearance, and increasing the resolution of low-quality images. It is a valuable tool for researchers, photographers, graphic designers, and anyone looking to enhance image quality using AI technology.
langmanus
LangManus is a community-driven AI automation framework that combines language models with specialized tools for tasks like web search, crawling, and Python code execution. It implements a hierarchical multi-agent system with agents like Coordinator, Planner, Supervisor, Researcher, Coder, Browser, and Reporter. The framework supports LLM integration, search and retrieval tools, Python integration, workflow management, and visualization. LangManus aims to give back to the open-source community and welcomes contributions in various forms.
TaskWeaver
TaskWeaver is a code-first agent framework designed for planning and executing data analytics tasks. It interprets user requests through code snippets, coordinates various plugins to execute tasks in a stateful manner, and preserves both chat history and code execution history. It supports rich data structures, customized algorithms, domain-specific knowledge incorporation, stateful execution, code verification, easy debugging, security considerations, and easy extension. TaskWeaver is easy to use with CLI and WebUI support, and it can be integrated as a library. It offers detailed documentation, demo examples, and citation guidelines.
Memori
Memori is a memory fabric designed for enterprise AI that seamlessly integrates into existing software and infrastructure. It is agnostic to LLM, datastore, and framework, providing support for major foundational models and databases. With features like vectorized memories, in-memory semantic search, and a knowledge graph, Memori simplifies the process of attributing LLM interactions and managing sessions. It offers Advanced Augmentation for enhancing memories at different levels and supports various platforms, frameworks, database integrations, and datastores. Memori is designed to reduce development overhead and provide efficient memory management for AI applications.
giskard-oss
Giskard-oss is an Evaluation & Testing framework for AI systems that aims to control risks of performance, bias, and security issues. It focuses on LLM systems, with plans for a new scan and a rewrite of RAGET for version 3. The repository is structured as a Python workspace with three packages: giskard-core, giskard-checks, and giskard-agents. Developers can use the Makefile for common tasks, and contributions from the AI community are welcome. The project encourages stars for visibility and offers sponsorship options for support.
depthai
This repository contains a demo application for DepthAI, a tool that can load different networks, create pipelines, record video, and more. It provides documentation for installation and usage, including running programs through Docker. Users can explore DepthAI features via command line arguments or a clickable QT interface. Supported models include various AI models for tasks like face detection, human pose estimation, and object detection. The tool collects anonymous usage statistics by default, which can be disabled. Users can report issues to the development team for support and troubleshooting.
kollektiv
Kollektiv is a Retrieval-Augmented Generation (RAG) system designed to enable users to chat with their favorite documentation easily. It aims to provide LLMs with access to the most up-to-date knowledge, reducing inaccuracies and improving productivity. The system utilizes intelligent web crawling, advanced document processing, vector search, multi-query expansion, smart re-ranking, AI-powered responses, and dynamic system prompts. The technical stack includes Python/FastAPI for backend, Supabase, ChromaDB, and Redis for storage, OpenAI and Anthropic Claude 3.5 Sonnet for AI/ML, and Chainlit for UI. Kollektiv is licensed under a modified version of the Apache License 2.0, allowing free use for non-commercial purposes.
codellm-devkit
Codellm-devkit (CLDK) is a Python library that serves as a multilingual program analysis framework bridging traditional static analysis tools and Large Language Models (LLMs) specialized for code (CodeLLMs). It simplifies the process of analyzing codebases across multiple programming languages, enabling the extraction of meaningful insights and facilitating LLM-based code analysis. The library provides a unified interface for integrating outputs from various analysis tools and preparing them for effective use by CodeLLMs. Codellm-devkit aims to enable the development and experimentation of robust analysis pipelines that combine traditional program analysis tools and CodeLLMs, reducing friction in multi-language code analysis and ensuring compatibility across different tools and LLM platforms. It is designed to seamlessly integrate with popular analysis tools like WALA, Tree-sitter, LLVM, and CodeQL, acting as a crucial intermediary layer for efficient communication between these tools and CodeLLMs. The project is continuously evolving to include new tools and frameworks, maintaining its versatility for code analysis and LLM integration.
DemoGPT
DemoGPT is an all-in-one agent library that provides tools, prompts, frameworks, and LLM models for streamlined agent development. It leverages GPT-3.5-turbo to generate LangChain code, creating interactive Streamlit applications. The tool is designed for creating intelligent, interactive, and inclusive solutions in LLM-based application development. It offers model flexibility, iterative development, and a commitment to user engagement. Future enhancements include integrating Gorilla for autonomous API usage and adding a publicly available database for refining the generation process.
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
DevDocs
DevDocs is a platform designed to simplify the process of digesting technical documentation for software engineers and developers. It automates the extraction and conversion of web content into markdown format, making it easier for users to access and understand the information. By crawling through child pages of a given URL, DevDocs provides a streamlined approach to gathering relevant data and integrating it into various tools for software development. The tool aims to save time and effort by eliminating the need for manual research and content extraction, ultimately enhancing productivity and efficiency in the development process.
KlicStudio
Klic Studio is a versatile audio and video localization and enhancement solution developed by Krillin AI. This minimalist yet powerful tool integrates video translation, dubbing, and voice cloning, supporting both landscape and portrait formats. With an end-to-end workflow, users can transform raw materials into beautifully ready-to-use cross-platform content with just a few clicks. The tool offers features like video acquisition, accurate speech recognition, intelligent segmentation, terminology replacement, professional translation, voice cloning, video composition, and cross-platform support. It also supports various speech recognition services, large language models, and TTS text-to-speech services. Users can easily deploy the tool using Docker and configure it for different tasks like subtitle translation, large model translation, and optional voice services.
giskard
Giskard is an open-source Python library that automatically detects performance, bias & security issues in AI applications. The library covers LLM-based applications such as RAG agents, all the way to traditional ML models for tabular data.
For similar jobs
lollms-webui
LoLLMs WebUI (Lord of Large Language Multimodal Systems: One tool to rule them all) is a user-friendly interface to access and utilize various LLM (Large Language Models) and other AI models for a wide range of tasks. With over 500 AI expert conditionings across diverse domains and more than 2500 fine tuned models over multiple domains, LoLLMs WebUI provides an immediate resource for any problem, from car repair to coding assistance, legal matters, medical diagnosis, entertainment, and more. The easy-to-use UI with light and dark mode options, integration with GitHub repository, support for different personalities, and features like thumb up/down rating, copy, edit, and remove messages, local database storage, search, export, and delete multiple discussions, make LoLLMs WebUI a powerful and versatile tool.
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.
minio
MinIO is a High Performance Object Storage released under GNU Affero General Public License v3.0. It is API compatible with Amazon S3 cloud storage service. Use MinIO to build high performance infrastructure for machine learning, analytics and application data workloads.
mage-ai
Mage is an open-source data pipeline tool for transforming and integrating data. It offers an easy developer experience, engineering best practices built-in, and data as a first-class citizen. Mage makes it easy to build, preview, and launch data pipelines, and provides observability and scaling capabilities. It supports data integrations, streaming pipelines, and dbt integration.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.
airbyte
Airbyte is an open-source data integration platform that makes it easy to move data from any source to any destination. With Airbyte, you can build and manage data pipelines without writing any code. Airbyte provides a library of pre-built connectors that make it easy to connect to popular data sources and destinations. You can also create your own connectors using Airbyte's no-code Connector Builder or low-code CDK. Airbyte is used by data engineers and analysts at companies of all sizes to build and manage their data pipelines.
labelbox-python
Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.