
cagent
Agent Builder and Runtime by Docker Engineering
Stars: 243

cagent is a powerful and easy-to-use multi-agent runtime that orchestrates AI agents with specialized capabilities and tools, allowing users to quickly build, share, and run a team of virtual experts to solve complex problems. It supports creating agents with YAML configuration, extending agents with MCP servers, and delegating tasks to specialists. Key features include a multi-agent architecture, a rich tool ecosystem, smart delegation, YAML configuration, advanced reasoning tools, and support for multiple AI providers such as OpenAI, Anthropic, Gemini, and Docker Model Runner.
README:
A powerful, easy-to-use, customizable multi-agent runtime that orchestrates AI agents with specialized capabilities and tools, and the interactions between agents.

cagent lets you create and run intelligent AI agents, where each agent has specialized knowledge, tools, and capabilities. Think of it as allowing you to quickly build, share, and run a team of virtual experts that collaborate to solve complex problems for you. And it's dead easy to use!

Note: cagent is in active development; breaking changes are to be expected.
Creating agents with cagent is very simple. They are described in a short YAML file, like this example (basic_agent.yaml):
```yaml
agents:
  root:
    model: openai/gpt-5-mini
    description: A helpful AI assistant
    instruction: |
      You are a knowledgeable assistant that helps users with various tasks.
      Be helpful, accurate, and concise in your responses.
```
Run it in a terminal with `cagent run basic_agent.yaml`.

Many more examples can be found in the /examples/ directory!
cagent supports MCP servers, enabling agents to use a wide variety of external tools and services. It supports three transport types: stdio, http, and sse.

Giving an agent access to tools via MCP is a quick way to greatly improve its capabilities, the quality of its results, and its general usefulness. Get started quickly with the Docker MCP Toolkit and catalog.
Here, we're giving the same basic agent from the example above access to a containerized duckduckgo MCP server and its tools by using Docker's MCP Gateway:
```yaml
agents:
  root:
    model: openai/gpt-5-mini
    description: A helpful AI assistant
    instruction: |
      You are a knowledgeable assistant that helps users with various tasks.
      Be helpful, accurate, and concise in your responses.
    toolset:
      - type: mcp
        ref: docker:duckduckgo # stdio transport
```
When using a containerized server via the Docker MCP Gateway, you can configure any required settings/secrets/authentication using the Docker MCP Toolkit in Docker Desktop.

Aside from the containerized MCP servers the Docker MCP Gateway provides, any standard MCP server can be used with cagent!
Here's an example similar to the one above, but adding `read_file` and `write_file` tools from the rust-mcp-filesystem MCP server:
```yaml
agents:
  root:
    model: openai/gpt-5-mini
    description: A helpful AI assistant
    instruction: |
      You are a knowledgeable assistant that helps users with various tasks.
      Be helpful, accurate, and concise in your responses. Write your search results to disk.
    toolset:
      - type: mcp
        ref: docker:duckduckgo
      - type: mcp
        command: rust-mcp-filesystem # installed with `cargo install rust-mcp-filesystem`
        args: ["--allow-write", "."]
        tools: ["read_file", "write_file"] # Optional: specific tools only
        env:
          - "RUST_LOG=debug"
```
See the USAGE docs for more detailed information and examples.

Key features:
- 🏗️ Multi-agent architecture - Create specialized agents for different domains.
- 🔧 Rich tool ecosystem - Agents can use external tools and APIs via the MCP protocol.
- 🔄 Smart delegation - Agents can automatically route tasks to the most suitable specialist.
- 📝 YAML configuration - Declarative model and agent configuration.
- 💭 Advanced reasoning - Built-in "think", "todo" and "memory" tools for complex problem-solving (see the sketch after this list).
- 🌐 Multiple AI providers - Support for OpenAI, Anthropic, Gemini and Docker Model Runner.
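To illustrate the advanced reasoning feature above, here is a minimal sketch of how the built-in tools might be enabled as toolset entries. The `think`, `todo`, and `memory` type names are assumptions inferred from the feature list, not verified syntax; consult the USAGE docs for the real schema:

```yaml
agents:
  root:
    model: openai/gpt-5-mini
    description: An assistant that plans before acting
    instruction: |
      Break problems into steps, track them as todos, and remember key facts.
    toolset:
      - type: think  # assumed name for the built-in "think" tool
      - type: todo   # assumed name for the built-in "todo" tool
      - type: memory # assumed name for the built-in "memory" tool
```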
Prebuilt binaries for Windows, macOS, and Linux can be found on the releases page of the project's GitHub repository.

Once you've downloaded the appropriate binary for your platform, you may need to give it executable permissions. On macOS and Linux, this is done with the following command:

```shell
# linux amd64 build example
chmod +x /path/to/downloads/cagent-linux-amd64
```

You can then rename the binary to cagent and configure your PATH to be able to find it (configuration varies by platform).
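For example, on Linux or macOS, a common approach is to move the renamed binary into a directory that is already on your PATH; the destination below is just one conventional choice:

```shell
# rename the binary, then move it onto the PATH (example paths)
mv /path/to/downloads/cagent-linux-amd64 cagent
sudo mv cagent /usr/local/bin/
```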
Depending on the models you configure your agents to use, you will need to set the corresponding provider API key. All of these keys are optional, but you will likely need at least one of them:
```shell
# For OpenAI models
export OPENAI_API_KEY=your_api_key_here

# For Anthropic models
export ANTHROPIC_API_KEY=your_api_key_here

# For Gemini models
export GOOGLE_API_KEY=your_api_key_here
```
```shell
# Run an agent!
cagent run ./examples/pirate.yaml

# or specify a different starting agent from the config, useful for agent teams
cagent run ./examples/pirate.yaml -a root

# or run directly from an image reference; here we're pulling the pirate agent from the creek repository
cagent run creek/pirate
```
Here's a more advanced multi-agent configuration, where a root coordinator delegates to a helper sub-agent:

```yaml
agents:
  root:
    model: claude
    description: "Main coordinator agent that delegates tasks and manages workflow"
    instruction: |
      You are the root coordinator agent. Your job is to:
      1. Understand user requests and break them down into manageable tasks
      2. Delegate appropriate tasks to your helper agent
      3. Coordinate responses and ensure tasks are completed properly
      4. Provide final responses to the user

      When you receive a request, analyze what needs to be done and decide whether to:
      - Handle it yourself if it's simple
      - Delegate to the helper agent if it requires specific assistance
      - Break complex requests into multiple sub-tasks
    sub_agents: ["helper"]

  helper:
    model: claude
    description: "Assistant agent that helps with various tasks as directed by the root agent"
    instruction: |
      You are a helpful assistant agent. Your role is to:
      1. Complete specific tasks assigned by the root agent
      2. Provide detailed and accurate responses
      3. Ask for clarification if tasks are unclear
      4. Report back to the root agent with your results

      Focus on being thorough and helpful in whatever task you're given.

models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-0
    max_tokens: 64000
```
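Assuming the configuration above is saved as multi_agent.yaml (a filename chosen here for illustration), you can start the team at the coordinator with:

```shell
# -a selects the starting agent, as shown in the run examples above
cagent run ./multi_agent.yaml -a root
```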
You'll find a curated list of agent examples, organized into three categories (Basic, Advanced, and Multi-agent), in the /examples/ directory.
Using the command `cagent new`, you can quickly generate agents or multi-agent teams from a single prompt! cagent has a built-in agent dedicated to this task.

To use this feature, you must have an Anthropic, OpenAI, or Google API key available in your environment, or specify a local model to run with DMR (Docker Model Runner).

You can choose which provider and model get used by passing the `--model provider/modelname` flag to `cagent new`. If `--model` is unspecified, `cagent new` will automatically choose between these three providers, in order, based on the first API key it finds in your environment:
```shell
export ANTHROPIC_API_KEY=your_api_key_here # first choice. default model: claude-sonnet-4-0
export OPENAI_API_KEY=your_api_key_here    # if anthropic key not set. default model: gpt-5-mini
export GOOGLE_API_KEY=your_api_key_here    # if anthropic and openai keys are not set. default model: gemini-2.5-flash
```
Example of provider and model overriding:
```shell
# Use GPT-5 via OpenAI
cagent new --model openai/gpt-5

# Use a local model (ai/gemma3-qat:12B) via DMR
cagent new --model dmr/ai/gemma3-qat:12B
```
```
$ cagent new

------- Welcome to cagent! -------
(Ctrl+C to stop the agent or exit)

What should your agent/agent team do? (describe its purpose):
> I need an agent team that connects to <some-service> and does...
```
Agent configurations can be packaged and shared to Docker Hub using the `cagent push` command:

```shell
cagent push ./<agent-file>.yaml namespace/reponame
```

cagent will automatically build an OCI image and push it to the desired repository using your Docker credentials.

Pulling agents from Docker Hub is also just one `cagent pull` command away:

```shell
cagent pull creek/pirate
```

cagent will pull the image, extract the YAML file, and place it in your working directory for ease of use. `cagent run creek.yaml` will then run your newly pulled agent.
More details on the usage and configuration of cagent can be found in USAGE.md.
We track anonymous usage data to improve the tool. See TELEMETRY.md for details.
Want to hack on cagent, or help us fix bugs and build out some features? 🔧 Read the information on how to build from source and contribute to the project in CONTRIBUTING.md.

A smart way to improve cagent's codebase and feature set is to do it with the help of a cagent agent! We have one that we use and that you should use too:

```shell
cd cagent
cagent run ./golang_developer.yaml
```

This agent is an expert Golang developer specializing in the cagent multi-agent AI system architecture. Ask it anything about cagent: questions about the current code, ideas for improvements, and more. It can also fix issues and implement new features!
We’d love to hear your thoughts on this project. You can find us on Slack.
Similar Open Source Tools

genai-toolbox
Gen AI Toolbox for Databases is an open source server that simplifies building Gen AI tools for interacting with databases. It handles complexities like connection pooling, authentication, and more, enabling easier, faster, and more secure tool development. The toolbox sits between the application's orchestration framework and the database, providing a control plane to modify, distribute, or invoke tools. It offers simplified development, better performance, enhanced security, and end-to-end observability. Users can install the toolbox as a binary, container image, or compile from source. Configuration is done through a 'tools.yaml' file, defining sources, tools, and toolsets. The project follows semantic versioning and welcomes contributions.

TypeGPT
TypeGPT is a Python application that enables users to interact with ChatGPT or Google Gemini from any text field in their operating system using keyboard shortcuts. It provides global accessibility, keyboard shortcuts for communication, and clipboard integration for larger text inputs. Users need to have Python 3.x installed along with specific packages and API keys from OpenAI for ChatGPT access. The tool allows users to run the program normally or in the background, manage processes, and stop the program. Users can use keyboard shortcuts like `/ask`, `/see`, `/stop`, `/chatgpt`, `/gemini`, `/check`, and `Shift + Cmd + Enter` to interact with the application in any text field. Customization options are available by modifying files like `keys.txt` and `system_prompt.txt`. Contributions are welcome, and future plans include adding support for other APIs and a user-friendly GUI.

seer
Seer is a service that provides AI capabilities to Sentry by running inference on Sentry issues and providing user insights. It is currently in early development and not yet compatible with self-hosted Sentry instances. The tool requires access to internal Sentry resources and is intended for internal Sentry employees. Users can set up the environment, download model artifacts, integrate with local Sentry, run evaluations for Autofix AI agent, and deploy to a sandbox staging environment. Development commands include applying database migrations, creating new migrations, running tests, and more. The tool also supports VCRs for recording and replaying HTTP requests.

bolt-python-ai-chatbot
The 'bolt-python-ai-chatbot' is a Slack chatbot app template that allows users to integrate AI-powered conversations into their Slack workspace. Users can interact with the bot in conversations and threads, send direct messages for private interactions, use commands to communicate with the bot, customize bot responses, and store user preferences. The app supports integration with Workflow Builder, custom language models, and different AI providers like OpenAI, Anthropic, and Google Cloud Vertex AI. Users can create user objects, manage user states, and select from various AI models for communication.

ai-town
AI Town is a virtual town where AI characters live, chat, and socialize. This project provides a deployable starter kit for building and customizing your own version of AI Town. It features a game engine, database, vector search, auth, text model, deployment, pixel art generation, background music generation, and local inference. You can customize your own simulation by creating characters and stories, updating spritesheets, changing the background, and modifying the background music.

fabrice-ai
A lightweight, functional, and composable framework for building AI agents that work together to solve complex tasks. Built with TypeScript and designed to be serverless-ready. Fabrice embraces functional programming principles, remains stateless, and stays focused on composability. It provides core concepts like easy teamwork creation, infrastructure-agnosticism, statelessness, and includes all tools and features needed to build AI teams. Agents are specialized workers with specific roles and capabilities, able to call tools and complete tasks. Workflows define how agents collaborate to achieve a goal, with workflow states representing the current state of the workflow. Providers handle requests to the LLM and responses. Tools extend agent capabilities by providing concrete actions they can perform. Execution involves running the workflow to completion, with options for custom execution and BDD testing.

unstructured
The `unstructured` library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of `unstructured` revolve around streamlining and optimizing the data processing workflow for LLMs. `unstructured` modular functions and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient in transforming unstructured data into structured outputs.

leptonai
A Pythonic framework to simplify AI service building. The LeptonAI Python library allows you to build an AI service from Python code with ease. Key features include a Pythonic abstraction Photon, simple abstractions to launch models like those on HuggingFace, prebuilt examples for common models, AI tailored batteries, a client to automatically call your service like native Python functions, and Pythonic configuration specs to be readily shipped in a cloud environment.

dravid
Dravid (DRD) is an advanced, AI-powered CLI coding framework designed to follow user instructions until the job is completed, including fixing errors. It can generate code, fix errors, handle image queries, manage file operations, integrate with external APIs, and provide a development server with error handling. Dravid is extensible and requires Python 3.7+ and CLAUDE_API_KEY. Users can interact with Dravid through CLI commands for various tasks like creating projects, asking questions, generating content, handling metadata, and file-specific queries. It supports use cases like Next.js project development, working with existing projects, exploring new languages, Ruby on Rails project development, and Python project development. Dravid's project structure includes directories for source code, CLI modules, API interaction, utility functions, AI prompt templates, metadata management, and tests. Contributions are welcome, and development setup involves cloning the repository, installing dependencies with Poetry, setting up environment variables, and using Dravid for project enhancements.

aides-jeunes
The user interface (and main server) of a simulator of aid and social benefits for young people. It is based on Openfisca, the free socio-fiscal simulator.

feeds.fun
Feeds Fun is a self-hosted news reader tool that automatically assigns tags to news entries. Users can create rules to score news based on tags, filter and sort news as needed, and track read news. The tool offers multi/single-user support, feeds management, and various features for personalized news consumption. Users can access the tool's backend as the ffun package on PyPI and the frontend as the feeds-fun package on NPM. Feeds Fun requires setting up OpenAI or Gemini API keys for full tag generation capabilities. The tool uses tag processors to detect tags for news entries, with options for simple and complex processors. Feeds Fun primarily relies on LLM tag processors from OpenAI and Google for tag generation.

aider-composer
Aider Composer is a VSCode extension that integrates Aider into your development workflow. It allows users to easily add and remove files, toggle between read-only and editable modes, review code changes, use different chat modes, and reference files in the chat. The extension supports multiple models, code generation, code snippets, and settings customization. It has limitations such as lack of support for multiple workspaces, Git repository features, linting, testing, voice features, in-chat commands, and configuration options.

warc-gpt
WARC-GPT is an experimental retrieval augmented generation pipeline for web archive collections. It allows users to interact with WARC files, extract text, generate text embeddings, visualize embeddings, and interact with a web UI and API. The tool is highly customizable, supporting various LLMs, providers, and embedding models. Users can configure the application using environment variables, ingest WARC files, start the server, and interact with the web UI and API to search for content and generate text completions. WARC-GPT is designed for exploration and experimentation in exploring web archives using AI.

opencommit
OpenCommit is a tool that auto-generates meaningful commits using AI, allowing users to quickly create commit messages for their staged changes. It provides a CLI interface for easy usage and supports customization of commit descriptions, emojis, and AI models. Users can configure local and global settings, switch between different AI providers, and set up Git hooks for integration with IDE Source Control. Additionally, OpenCommit can be used as a GitHub Action to automatically improve commit messages on push events, ensuring all commits are meaningful and not generic. Payments for OpenAI API requests are handled by the user, with the tool storing API keys locally.
For similar tasks

Forza-Mods-AIO
Forza Mods AIO is a free and open-source tool that enhances the gaming experience in Forza Horizon 4 and 5. It offers a range of time-saving and quality-of-life features, making gameplay more enjoyable and efficient. The tool is designed to streamline various aspects of the game, improving user satisfaction and overall enjoyment.

hass-ollama-conversation
The Ollama Conversation integration adds a conversation agent powered by Ollama in Home Assistant. This agent can be used in automations to query information provided by Home Assistant about your house, including areas, devices, and their states. Users can install the integration via HACS and configure settings such as API timeout, model selection, context size, maximum tokens, and other parameters to fine-tune the responses generated by the AI language model. Contributions to the project are welcome, and discussions can be held on the Home Assistant Community platform.

crawl4ai
Crawl4AI is a powerful and free web crawling service that extracts valuable data from websites and provides LLM-friendly output formats. It supports crawling multiple URLs simultaneously, replaces media tags with ALT, and is completely free to use and open-source. Users can integrate Crawl4AI into Python projects as a library or run it as a standalone local server. The tool allows users to crawl and extract data from specified URLs using different providers and models, with options to include raw HTML content, force fresh crawls, and extract meaningful text blocks. Configuration settings can be adjusted in the `crawler/config.py` file to customize providers, API keys, chunk processing, and word thresholds. Contributions to Crawl4AI are welcome from the open-source community to enhance its value for AI enthusiasts and developers.

MaterialSearch
MaterialSearch is a tool for searching local images and videos using natural language. It provides functionalities such as text search for images, image search for images, text search for videos (providing matching video clips), image search for videos (searching for the segment in a video through a screenshot), image-text similarity calculation, and Pexels video search. The tool can be deployed through the source code or Docker image, and it supports GPU acceleration. Users can configure the tool through environment variables or a .env file. The tool is still under development, and configurations may change frequently. Users can report issues or suggest improvements through issues or pull requests.

tenere
Tenere is a TUI interface for Language Model Libraries (LLMs) written in Rust. It provides syntax highlighting, chat history, saving chats to files, Vim keybindings, copying text from/to clipboard, and supports multiple backends. Users can configure Tenere using a TOML configuration file, set key bindings, and use different LLMs such as ChatGPT, llama.cpp, and ollama. Tenere offers default key bindings for global and prompt modes, with features like starting a new chat, saving chats, scrolling, showing chat history, and quitting the app. Users can interact with the prompt in different modes like Normal, Visual, and Insert, with various key bindings for navigation, editing, and text manipulation.

openkore
OpenKore is a custom client and intelligent automated assistant for Ragnarok Online. It is a free, open source, and cross-platform program (Linux, Windows, and MacOS are supported). To run OpenKore, you need to download and extract it or clone the repository using Git. Configure OpenKore according to the documentation and run openkore.pl to start. The tool provides a FAQ section for troubleshooting, guidelines for reporting issues, and information about botting status on official servers. OpenKore is developed by a global team, and contributions are welcome through pull requests. Various community resources are available for support and communication. Users are advised to comply with the GNU General Public License when using and distributing the software.

QA-Pilot
QA-Pilot is an interactive chat project that leverages online/local LLM for rapid understanding and navigation of GitHub code repository. It allows users to chat with GitHub public repositories using a git clone approach, store chat history, configure settings easily, manage multiple chat sessions, and quickly locate sessions with a search function. The tool integrates with `codegraph` to view Python files and supports various LLM models such as ollama, openai, mistralai, and localai. The project is continuously updated with new features and improvements, such as converting from `flask` to `fastapi`, adding `localai` API support, and upgrading dependencies like `langchain` and `Streamlit` to enhance performance.

extension-gen-ai
The Looker GenAI Extension provides code examples and resources for building a Looker Extension that integrates with Vertex AI Large Language Models (LLMs). Users can leverage the power of LLMs to enhance data exploration and analysis within Looker. The extension offers generative explore functionality to ask natural language questions about data and generative insights on dashboards to analyze data by asking questions. It leverages components like BQML Remote Models, BQML Remote UDF with Vertex AI, and Custom Fine Tune Model for different integration options. Deployment involves setting up infrastructure with Terraform and deploying the Looker Extension by creating a Looker project, copying extension files, configuring BigQuery connection, connecting to Git, and testing the extension. Users can save example prompts and configure user settings for the extension. Development of the Looker Extension environment includes installing dependencies, starting the development server, and building for production.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.