
AutoAgent
"AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework"
Stars: 1939

AutoAgent is a fully-automated and zero-code framework that enables users to create and deploy LLM agents through natural language alone. It is a top performer on the GAIA Benchmark, equipped with a native self-managing vector database, and allows for easy creation of tools, agents, and workflows without any coding. AutoAgent seamlessly integrates with a wide range of LLMs and supports both function-calling and ReAct interaction modes. It is designed to be dynamic, extensible, customized, and lightweight, serving as a personal AI assistant.
README:
Welcome to AutoAgent! AutoAgent is a Fully-Automated and highly Self-Developing framework that enables users to create and deploy LLM agents through Natural Language Alone.
- Top Performer on the GAIA Benchmark: AutoAgent ranks #1 among open-source methods, delivering performance comparable to OpenAI's Deep Research.
- Agentic-RAG with a Native Self-Managing Vector Database: AutoAgent ships with a native self-managing vector database that outperforms industry-leading solutions like LangChain.
- Agents and Workflows Created with Ease: AutoAgent leverages natural language to effortlessly build ready-to-use tools, agents, and workflows - no coding required.
- Universal LLM Support: AutoAgent seamlessly integrates with a wide range of LLMs (e.g., OpenAI, Anthropic, DeepSeek, vLLM, Grok, Hugging Face ...).
- Flexible Interaction: Benefit from support for both function-calling and ReAct interaction modes.
- Dynamic, Extensible, Lightweight: AutoAgent is your personal AI assistant, designed to be dynamic, extensible, customizable, and lightweight.
Unlock the future of LLM agents. Try AutoAgent now!
- [2025, Feb 17]: We've updated and released AutoAgent v0.2.0 (formerly known as MetaChain). Detailed changes include: 1) fixed bugs with different LLM providers reported in issues; 2) added automatic installation of AutoAgent in the container environment, as requested in issues; 3) added easier-to-use commands for the CLI mode; 4) renamed the project to AutoAgent for better understanding.
- [2025, Feb 10]: We've released MetaChain, including the framework, evaluation code, and CLI mode! Check our paper for more details.
- Features
- News
- How to Use AutoAgent
- Quick Start
- Todo List
- How To Reproduce the Results in the Paper
- Documentation
- Join the Community
- Acknowledgements
- Cite
AutoAgent has an out-of-the-box multi-agent system, which you can use by choosing user mode on the start page. This multi-agent system is a general AI assistant, offering the same functionality as OpenAI's Deep Research and comparable performance on the GAIA benchmark.
- High Performance: Matches Deep Research while using Claude 3.5 rather than OpenAI's o3 model.
- Model Flexibility: Compatible with any LLM (including DeepSeek-R1, Grok, Gemini, etc.).
- Cost-Effective: An open-source alternative to Deep Research's $200/month subscription.
- User-Friendly: Easy-to-deploy CLI interface for seamless interaction.
- File Support: Handles file uploads for enhanced data interaction.
Deep Research (aka User Mode)
The most distinctive feature of AutoAgent is its natural language customization capability. Unlike other agent frameworks, AutoAgent allows you to create tools, agents, and workflows using natural language alone. Simply choose agent editor or workflow editor mode to start your journey of building agents through conversations.
You can use the agent editor mode as shown in the following steps.
1. Input what kind of agent you want to create.
2. Automated agent profiling.
3. Output the agent profiles.
4. Create the desired tools.
5. Input what you want to complete with the agent. (Optional)
6. Create the desired agent(s) and go to the next step.
You can also create agent workflows using a natural language description with the workflow editor mode, as shown in the following steps. (Tip: this mode does not support tool creation for now.)
1. Input what kind of workflow you want to create.
2. Automated workflow profiling.
3. Output the workflow profiles.
4. Input what you want to complete with the workflow. (Optional)
5. Create the desired workflow(s) and go to the next step.
git clone https://github.com/HKUDS/AutoAgent.git
cd AutoAgent
pip install -e .
We use Docker to containerize the agent-interactive environment, so please install Docker first. You don't need to pull the pre-built image manually, because Auto-Deep-Research automatically pulls the pre-built image matching your machine's architecture.
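If you are unsure whether Docker is ready, a quick sanity check from the shell (assuming a standard Docker installation) is:
# verify the Docker client is installed and the daemon is reachable
docker --version
docker info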
Create an environment variable file, just like .env.template, and set the API keys for the LLMs you want to use. Not every LLM API key is required; use what you need.
# Required: your own GitHub token
GITHUB_AI_TOKEN=
# Optional API Keys
OPENAI_API_KEY=
DEEPSEEK_API_KEY=
ANTHROPIC_API_KEY=
GEMINI_API_KEY=
HUGGINGFACE_API_KEY=
GROQ_API_KEY=
XAI_API_KEY=
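For example, one minimal way to set this file up is to copy the template and fill in only the keys you need (assuming .env.template sits in the repository root, as referenced above):
cp .env.template .env
# then edit .env and fill in the keys you plan to use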
[News] We have added an easier-to-use command to start the CLI mode and fixed the bugs with different LLM providers reported in issues. Follow the steps below to start the CLI mode with different LLM providers with much less configuration.
You can run auto main to start the full version of AutoAgent, including user mode, agent editor, and workflow editor. You can also run auto deep-research to start the more lightweight user mode, just like the Auto-Deep-Research project. The configuration options for these commands are shown below.
- --container_name: Name of the Docker container (default: 'deepresearch').
- --port: Port for the container (default: 12346).
- COMPLETION_MODEL: The LLM model to use; follow LiteLLM's naming convention for the model name (default: claude-3-5-sonnet-20241022).
- DEBUG: Enable debug mode for detailed logs (default: False).
- API_BASE_URL: The base URL for the LLM provider (default: None).
- FN_CALL: Enable function calling (default: None). Most of the time you can ignore this option, because the default value is already set based on the model name.
- git_clone: Clone the AutoAgent repository into the local environment (only supported with the auto main command, default: True).
- test_pull_name: The name of the branch to pull for testing (only supported with the auto main command, default: 'autoagent_mirror').
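Combining these options, a typical invocation might look like the following (a hypothetical example: the values are arbitrary, and it assumes the upper-case options such as COMPLETION_MODEL and DEBUG are passed as environment variables while the dashed options are CLI flags, mirroring the provider examples below):
# hypothetical combined invocation; adjust values to your setup
COMPLETION_MODEL=claude-3-5-sonnet-20241022 DEBUG=True auto main --container_name deepresearch --port 12346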
In the agent editor and workflow editor modes, we clone a mirror of the AutoAgent repository into the local agent-interactive environment and let AutoAgent automatically update itself, such as by creating new tools, agents, and workflows. So if you want to use the agent editor and workflow editor modes, you should set git_clone to True and test_pull_name to 'autoagent_mirror' or another branch.
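Since True and 'autoagent_mirror' are already the defaults, a plain auto main is usually sufficient. If you need to set these options explicitly, an invocation along the following lines should work (note the flag spelling here is an assumption based on the option names above, not confirmed syntax):
auto main --git_clone True --test_pull_name autoagent_mirror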
The following shows how to use the full version of AutoAgent with the auto main command and different LLM providers. If you want to use the auto deep-research command, refer to the Auto-Deep-Research project for more details.
Anthropic:
- Set the ANTHROPIC_API_KEY in the .env file.
ANTHROPIC_API_KEY=your_anthropic_api_key
- Run the following command to start AutoAgent.
auto main # default model is claude-3-5-sonnet-20241022
OpenAI:
- Set the OPENAI_API_KEY in the .env file.
OPENAI_API_KEY=your_openai_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=gpt-4o auto main
Mistral:
- Set the MISTRAL_API_KEY in the .env file.
MISTRAL_API_KEY=your_mistral_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=mistral/mistral-large-2407 auto main
Gemini:
- Set the GEMINI_API_KEY in the .env file.
GEMINI_API_KEY=your_gemini_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=gemini/gemini-2.0-flash auto main
Hugging Face:
- Set the HUGGINGFACE_API_KEY in the .env file.
HUGGINGFACE_API_KEY=your_huggingface_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=huggingface/meta-llama/Llama-3.3-70B-Instruct auto main
Groq:
- Set the GROQ_API_KEY in the .env file.
GROQ_API_KEY=your_groq_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=groq/deepseek-r1-distill-llama-70b auto main
Grok (xAI, via an OpenAI-compatible endpoint):
- Set the OPENAI_API_KEY in the .env file.
OPENAI_API_KEY=your_api_key_for_openai_compatible_endpoints
- Run the following command to start AutoAgent.
COMPLETION_MODEL=openai/grok-2-latest API_BASE_URL=https://api.x.ai/v1 auto main
DeepSeek-R1 (via OpenRouter):
For now, we recommend using OpenRouter as the LLM provider for DeepSeek-R1, because the official DeepSeek-R1 API cannot be used efficiently.
- Set the OPENROUTER_API_KEY in the .env file.
OPENROUTER_API_KEY=your_openrouter_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=openrouter/deepseek/deepseek-r1 auto main
DeepSeek:
- Set the DEEPSEEK_API_KEY in the .env file.
DEEPSEEK_API_KEY=your_deepseek_api_key
- Run the following command to start AutoAgent.
COMPLETION_MODEL=deepseek/deepseek-chat auto main
After the CLI mode starts, you will see the start page of AutoAgent.
You can import your browser cookies into the browser environment so the agent can better access certain websites. For more details, please refer to the cookies folder.
If you want to create tools from third-party tool platforms such as RapidAPI, you should subscribe to tools on the platform and add your own API keys by running process_tool_docs.py.
python process_tool_docs.py
More features coming soon! A web GUI interface is under development.
AutoAgent is continuously evolving! Here's what's coming:
- More Benchmarks: Expanding evaluations to SWE-bench, WebArena, and more
- GUI Agent: Supporting Computer-Use agents with GUI interaction
- Tool Platforms: Integration with more platforms like Composio
- Code Sandboxes: Supporting additional environments like E2B
- Web Interface: Developing a comprehensive GUI for better user experience
Have ideas or suggestions? Feel free to open an issue! Stay tuned for more exciting updates!
For the GAIA benchmark, you can run the following command to perform inference.
cd path/to/AutoAgent && sh evaluation/gaia/scripts/run_infer.sh
For the evaluation, you can run the following command.
cd path/to/AutoAgent && python evaluation/gaia/get_score.py
For the Agentic-RAG task, follow these steps to run the inference.
Step 1. Go to this page and download the dataset, then save it to your data path.
Step 2. Run the following command to run the inference.
cd path/to/AutoAgent && sh evaluation/multihoprag/scripts/run_rag.sh
Step 3. The result will be saved in evaluation/multihoprag/result.json.
More detailed documentation is coming soon, and we will publish updates on the Documentation page.
We want to build a community for AutoAgent, and we welcome everyone to join us. You can join our community by:
- Join our Slack workspace - Here we talk about research, architecture, and future development.
- Join our Discord server - This is a community-run server for general discussion, questions, and feedback.
- Read or post GitHub Issues - Check out the issues we're working on, or add your own ideas.
Rome wasn't built in a day. AutoAgent stands on the shoulders of giants, and we are deeply grateful for the outstanding work that came before us. Our framework architecture draws inspiration from OpenAI Swarm, while our user mode's three-agent design benefits from Magentic-One's insights. We've also learned from OpenHands for documentation structure and from many other excellent projects for agent-environment interaction design. We express our sincere gratitude and respect to all these pioneering works that have been instrumental in shaping AutoAgent.
@misc{AutoAgent,
title={{AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents}},
author={Jiabin Tang and Tianyu Fan and Chao Huang},
year={2025},
eprint={2502.05957},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2502.05957},
}
Similar Open Source Tools


pear-landing-page
PearAI Landing Page is an open-source AI-powered code editor managed by Nang and Pan. It is built with Next.js, Vercel, Tailwind CSS, and TypeScript. The project requires setting up environment variables for proper configuration. Users can run the project locally by starting the development server and visiting the specified URL in the browser. Recommended extensions include Prettier, ESLint, and JavaScript and TypeScript Nightly. Contributions to the project are welcomed and appreciated.

Auto-Deep-Research
Auto-Deep-Research is an open-source and cost-efficient alternative to OpenAI's Deep Research, based on the AutoAgent framework. It offers high performance, universal LLM support, flexible interaction, cost-efficiency, file support, and one-click launch. Users can seamlessly integrate with various LLMs, handle file uploads, and start instantly with a simple command. The tool aims to provide a fully-automated and personalized AI assistant at a lower cost, catering to community needs and showcasing the potential of AutoAgent for practical AI applications.

RainbowGPT
RainbowGPT is a versatile tool that offers a range of functionalities, including Stock Analysis for financial decision-making, MySQL Management for database navigation, and integration of AI technologies like GPT-4 and ChatGlm3. It provides a user-friendly interface suitable for all skill levels, ensuring seamless information flow and continuous expansion of emerging technologies. The tool enhances adaptability, creativity, and insight, making it a valuable asset for various projects and tasks.

trieve
Trieve is an advanced relevance API for hybrid search, recommendations, and RAG. It offers a range of features including self-hosting, semantic dense vector search, typo tolerant full-text/neural search, sub-sentence highlighting, recommendations, convenient RAG API routes, the ability to bring your own models, hybrid search with cross-encoder re-ranking, recency biasing, tunable popularity-based ranking, filtering, duplicate detection, and grouping. Trieve is designed to be flexible and customizable, allowing users to tailor it to their specific needs. It is also easy to use, with a simple API and well-documented features.

openai-kotlin
OpenAI Kotlin API client is a Kotlin client for OpenAI's API with multiplatform and coroutines capabilities. It allows users to interact with OpenAI's API using Kotlin programming language. The client supports various features such as models, chat, images, embeddings, files, fine-tuning, moderations, audio, assistants, threads, messages, and runs. It also provides guides on getting started, chat & function call, file source guide, and assistants. Sample apps are available for reference, and troubleshooting guides are provided for common issues. The project is open-source and licensed under the MIT license, allowing contributions from the community.

Devon
Devon is an open-source pair programmer tool designed to facilitate collaborative coding sessions. It provides features such as multi-file editing, codebase exploration, test writing, bug fixing, and architecture exploration. The tool supports Anthropic, OpenAI, and Groq APIs, with plans to add more models in the future. Devon is community-driven, with ongoing development goals including multi-model support, plugin system for tool builders, self-hostable Electron app, and setting SOTA on SWE-bench Lite. Users can contribute to the project by developing core functionality, conducting research on agent performance, providing feedback, and testing the tool.

extension-gen-ai
The Looker GenAI Extension provides code examples and resources for building a Looker Extension that integrates with Vertex AI Large Language Models (LLMs). Users can leverage the power of LLMs to enhance data exploration and analysis within Looker. The extension offers generative explore functionality to ask natural language questions about data and generative insights on dashboards to analyze data by asking questions. It leverages components like BQML Remote Models, BQML Remote UDF with Vertex AI, and Custom Fine Tune Model for different integration options. Deployment involves setting up infrastructure with Terraform and deploying the Looker Extension by creating a Looker project, copying extension files, configuring BigQuery connection, connecting to Git, and testing the extension. Users can save example prompts and configure user settings for the extension. Development of the Looker Extension environment includes installing dependencies, starting the development server, and building for production.

phospho
Phospho is a text analytics platform for LLM apps. It helps you detect issues and extract insights from text messages of your users or your app. You can gather user feedback, measure success, and iterate on your app to create the best conversational experience for your users.

openmeter
OpenMeter is a real-time and scalable usage metering tool for AI, usage-based billing, infrastructure, and IoT use cases. It provides a REST API for integrations and offers client SDKs in Node.js, Python, Go, and Web. OpenMeter is licensed under the Apache 2.0 License.

BentoML
BentoML is an open-source model serving library for building performant and scalable AI applications with Python. It comes with everything you need for serving optimization, model packaging, and production deployment.

preswald
Preswald is a full-stack platform for building, deploying, and managing interactive data applications in Python. It simplifies the process by combining ingestion, storage, transformation, and visualization into one lightweight SDK. With Preswald, users can connect to various data sources, customize app themes, and easily deploy apps locally. The platform focuses on code-first simplicity, end-to-end coverage, and efficiency by design, making it suitable for prototyping internal tools or deploying production-grade apps with reduced complexity and cost.

agentica
Agentica is a specialized Agentic AI library focused on LLM Function Calling. Users can provide Swagger/OpenAPI documents or TypeScript class types to Agentica for seamless functionality. The library simplifies AI development by handling various tasks effortlessly.

codepair
CodePair is an open-source real-time collaborative markdown editor with AI intelligence, allowing users to collaboratively edit documents, share documents with external parties, and utilize AI intelligence within the editor. It is built using React, NestJS, and LangChain. The repository contains frontend and backend code, with detailed instructions for setting up and running each part. Users can choose between Frontend Development Only Mode or Full Stack Development Mode based on their needs. CodePair also integrates GitHub OAuth for Social Login feature. Contributors are welcome to submit patches and follow the contribution workflow.

nodejs-todo-api-boilerplate
An LLM-powered code generation tool that relies on the built-in Node.js API Typescript Template Project to easily generate clean, well-structured CRUD module code from text description. It orchestrates 3 LLM micro-agents (`Developer`, `Troubleshooter` and `TestsFixer`) to generate code, fix compilation errors, and ensure passing E2E tests. The process includes module code generation, DB migration creation, seeding data, and running tests to validate output. By cycling through these steps, it guarantees consistent and production-ready CRUD code aligned with vertical slicing architecture.

ChatIDE
ChatIDE is an AI assistant that integrates with your IDE, allowing you to converse with OpenAI's ChatGPT or Anthropic's Claude within your development environment. It provides a seamless way to access AI-powered assistance while coding, enabling you to get real-time help, generate code snippets, debug errors, and brainstorm ideas without leaving your IDE.
For similar tasks

activepieces
Activepieces is an open source replacement for Zapier, designed to be extensible through a type-safe pieces framework written in Typescript. It features a user-friendly Workflow Builder with support for Branches, Loops, and Drag and Drop. Activepieces integrates with Google Sheets, OpenAI, Discord, and RSS, along with 80+ other integrations. The list of supported integrations continues to grow rapidly, thanks to valuable contributions from the community. Activepieces is an open ecosystem; all piece source code is available in the repository, and they are versioned and published directly to npmjs.com upon contributions. If you cannot find a specific piece on the pieces roadmap, please submit a request by visiting the following link: Request Piece Alternatively, if you are a developer, you can quickly build your own piece using our TypeScript framework. For guidance, please refer to the following guide: Contributor's Guide

bee-agent-framework
The Bee Agent Framework is an open-source tool for building, deploying, and serving powerful agentic workflows at scale. It provides AI agents, tools for creating workflows in Javascript/Python, a code interpreter, memory optimization strategies, serialization for pausing/resuming workflows, traceability features, production-level control, and upcoming features like model-agnostic support and a chat UI. The framework offers various modules for agents, llms, memory, tools, caching, errors, adapters, logging, serialization, and more, with a roadmap including MLFlow integration, JSON support, structured outputs, chat client, base agent improvements, guardrails, and evaluation.

mastra
Mastra is an opinionated Typescript framework designed to help users quickly build AI applications and features. It provides primitives such as workflows, agents, RAG, integrations, syncs, and evals. Users can run Mastra locally or deploy it to a serverless cloud. The framework supports various LLM providers, offers tools for building language models, workflows, and accessing knowledge bases. It includes features like durable graph-based state machines, retrieval-augmented generation, integrations, syncs, and automated tests for evaluating LLM outputs.

otto-m8
otto-m8 is a flowchart based automation platform designed to run deep learning workloads with minimal to no code. It provides a user-friendly interface to spin up a wide range of AI models, including traditional deep learning models and large language models. The tool deploys Docker containers of workflows as APIs for integration with existing workflows, building AI chatbots, or standalone applications. Otto-m8 operates on an Input, Process, Output paradigm, simplifying the process of running AI models into a flowchart-like UI.

flows-ai
Flows AI is a lightweight, type-safe AI workflow orchestrator inspired by Anthropic's agent patterns and built on top of Vercel AI SDK. It provides a simple and deterministic way to build AI workflows by connecting different input/outputs together, either explicitly defining workflows or dynamically breaking down complex tasks using an orchestrator agent. The library is designed without classes or state, focusing on flexible input/output contracts for nodes.

LangGraph-learn
LangGraph-learn is a community-driven project focused on mastering LangGraph and other AI-related topics. It provides hands-on examples and resources to help users learn how to create and manage language model workflows using LangGraph and related tools. The project aims to foster a collaborative learning environment for individuals interested in AI and machine learning by offering practical examples and tutorials on building efficient and reusable workflows involving language models.

xorq
Xorq (formerly LETSQL) is a data processing library built on top of Ibis and DataFusion to write multi-engine data workflows. It provides a flexible and powerful tool for processing and analyzing data from various sources, enabling users to create complex data pipelines and perform advanced data transformations.

beeai-framework
BeeAI Framework is a versatile tool for building production-ready multi-agent systems. It offers flexibility in orchestrating agents, seamless integration with various models and tools, and production-grade controls for scaling. The framework supports Python and TypeScript libraries, enabling users to implement simple to complex multi-agent patterns, connect with AI services, and optimize token usage and resource management.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.