atomic-agents
Building AI agents, atomically
Stars: 1808
The Atomic Agents framework is a modular and extensible framework for building AI agent applications. It leverages Pydantic for data validation and serialization. The framework follows the principles of Atomic Design, providing small, single-purpose components that can be combined. It builds on Instructor for its AI agent architecture and supports various provider APIs such as OpenAI, Cohere, Anthropic, and Gemini. The project includes documentation, examples, and tests to ensure smooth development and usage.
README:
The Atomic Agents framework is designed around the concept of atomicity to be an extremely lightweight and modular framework for building Agentic AI pipelines and applications without sacrificing developer experience and maintainability. The framework provides a set of tools and agents that can be combined to create powerful applications. It is built on top of Instructor and leverages the power of Pydantic for data and schema validation and serialization. All logic and control flows are written in Python, enabling developers to apply familiar best practices and workflows from traditional software development without compromising flexibility or clarity.
If you want to learn more about the motivation and philosophy behind Atomic Agents, I suggest checking out the overview video below:
If you just want to dive into the code straight away, I suggest checking out the quickstart video below:
While existing frameworks for agentic AI focus on building autonomous multi-agent systems, they often lack the control and predictability required for real-world applications. Businesses need AI systems that produce consistent, reliable outputs aligned with their brand and objectives.
Atomic Agents addresses this need by providing:
- Modularity: Build AI applications by combining small, reusable components.
- Predictability: Define clear input and output schemas to ensure consistent behavior (see the schema sketch after this list).
- Extensibility: Easily swap out components or integrate new ones without disrupting the entire system.
- Control: Fine-tune each part of the system individually, from system prompts to tool integrations.
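As a quick illustration of the predictability point, schemas are ordinary Pydantic models, so malformed data is rejected before it reaches your application logic. A minimal sketch (GreetingOutput is a hypothetical schema invented for this example; BaseIOSchema is the framework's schema base class used throughout this README):
from pydantic import Field, ValidationError
from atomic_agents.agents.base_agent import BaseIOSchema

class GreetingOutput(BaseIOSchema):
    """A structured greeting produced by an agent."""

    message: str = Field(..., description="The greeting text.")

GreetingOutput(message="Hello!")  # conforms to the schema

try:
    GreetingOutput(message=None)  # wrong type: rejected by Pydantic before downstream code sees it
except ValidationError as exc:
    print(exc)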
In Atomic Agents, an agent is composed of several key components:
- System Prompt: Defines the agent's behavior and purpose.
- Input Schema: Specifies the structure and validation rules for the agent's input.
- Output Schema: Specifies the structure and validation rules for the agent's output.
- Memory: Stores conversation history or other relevant data.
- Context Providers: Inject dynamic context into the agent's system prompt at runtime.
Here's a high-level architecture diagram:
To install Atomic Agents, you can use pip:
pip install atomic-agents
Make sure you also install the provider you want to use. For example, to use OpenAI and Groq, install the openai and groq packages:
pip install openai groq
This also installs the Atomic Assembler CLI, which can be used to download Tools (and soon also Agents and Pipelines).
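Once a provider package is installed, its SDK client is wrapped with Instructor before being handed to an agent. A minimal sketch, mirroring the client construction used in the chaining example later in this README (the OpenAI client reads OPENAI_API_KEY from the environment):
import instructor
import openai

# Wrap the provider SDK with Instructor so agent outputs are schema-validated.
client = instructor.from_openai(openai.OpenAI())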
For local development, you can install from the repository:
git clone https://github.com/BrainBlend-AI/atomic-agents.git
cd atomic-agents
poetry install
Atomic Agents uses a monorepo structure with the following main components:
- atomic-agents/: The core Atomic Agents library
- atomic-assembler/: The CLI tool for managing Atomic Agents components
- atomic-examples/: Example projects showcasing Atomic Agents usage
- atomic-forge/: A collection of tools that can be used with Atomic Agents
A complete list of examples can be found in the examples directory.
We strive to thoroughly document each example, but if something is unclear, please don't hesitate to open an issue or pull request to improve the documentation.
Here's a quick snippet demonstrating how easy it is to create a powerful agent with Atomic Agents:
import instructor
import openai
from typing import List
from pydantic import Field
from atomic_agents.agents.base_agent import BaseAgent, BaseAgentConfig, BaseIOSchema
from atomic_agents.lib.components.agent_memory import AgentMemory
from atomic_agents.lib.components.system_prompt_generator import SystemPromptGenerator

# Define a custom output schema
class CustomOutputSchema(BaseIOSchema):
    """Chat response from the agent plus suggested follow-up questions."""

    chat_message: str = Field(..., description="The chat message from the agent.")
    suggested_questions: List[str] = Field(..., description="Suggested follow-up questions.")

# Set up the system prompt
system_prompt_generator = SystemPromptGenerator(
    background=["This assistant is knowledgeable, helpful, and suggests follow-up questions."],
    steps=[
        "Analyze the user's input to understand the context and intent.",
        "Formulate a relevant and informative response.",
        "Generate 3 suggested follow-up questions for the user."
    ],
    output_instructions=[
        "Provide clear and concise information in response to user queries.",
        "Conclude each response with 3 relevant suggested questions for the user."
    ]
)

# Initialize the agent
agent = BaseAgent(
    config=BaseAgentConfig(
        client=instructor.from_openai(openai.OpenAI()),  # Replace with your actual (Instructor-wrapped) client
        model="gpt-4o-mini",
        system_prompt_generator=system_prompt_generator,
        memory=AgentMemory(),
        output_schema=CustomOutputSchema
    )
)

# Use the agent (user_input is assumed to be an instance of the agent's input schema)
response = agent.run(user_input)
print(f"Agent: {response.chat_message}")
print("Suggested questions:")
for question in response.suggested_questions:
    print(f"- {question}")
This snippet showcases how to create a customizable agent that responds to user queries and suggests follow-up questions. For full, runnable examples, please refer to the following files in the atomic-examples/quickstart/quickstart/ directory:
- Basic Chatbot: A minimal chatbot example to get you started.
- Custom Chatbot: A more advanced example with a custom system prompt.
- Custom Chatbot with Schema: An advanced example featuring a custom output schema.
- Multi-Provider Chatbot: Demonstrates how to use different providers such as Ollama or Groq.
In addition to the quickstart examples, we have more complex examples demonstrating the power of Atomic Agents:
- Web Search Agent: An intelligent agent that performs web searches and answers questions based on the results.
- YouTube Summarizer: An agent that extracts and summarizes knowledge from YouTube videos.
For a complete list of examples, see the examples directory.
These examples provide a great starting point for understanding and using Atomic Agents.
Atomic Agents allows you to enhance your agents with dynamic context using Context Providers. Context Providers enable you to inject additional information into the agent's system prompt at runtime, making your agents more flexible and context-aware.
To use a Context Provider, create a class that inherits from SystemPromptContextProviderBase and implements the get_info() method, which returns the context string to be added to the system prompt.
Here's a simple example:
from typing import List

from atomic_agents.lib.components.system_prompt_generator import SystemPromptContextProviderBase

class SearchResultsProvider(SystemPromptContextProviderBase):
    def __init__(self, title: str, search_results: List[str]):
        super().__init__(title=title)
        self.search_results = search_results

    def get_info(self) -> str:
        # The joined results are injected into the system prompt at generation time
        return "\n".join(self.search_results)
You can then register your Context Provider with the agent:
# Initialize your context provider with dynamic data
search_results_provider = SearchResultsProvider(
    title="Search Results",
    search_results=["Result 1", "Result 2", "Result 3"]
)
# Register the context provider with the agent
agent.register_context_provider("search_results", search_results_provider)
This allows your agent to include the search results (or any other context) in its system prompt, enhancing its responses based on the latest information.
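Because get_info() is consulted each time the system prompt is generated, you can also refresh a provider's data between runs. A minimal sketch using the class defined above:
# Update the provider in place; the agent's next run will see the new context.
search_results_provider.search_results = ["Fresh result 1", "Fresh result 2"]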
Atomic Agents makes it easy to chain agents and tools together by aligning their input and output schemas. This design allows you to swap out components effortlessly, promoting modularity and reusability in your AI applications.
Suppose you have an agent that generates search queries and you want to use these queries with different search tools. By aligning the agent's output schema with the input schema of the search tool, you can easily chain them together or switch between different search providers.
Here's how you can achieve this:
import instructor
import openai
from pydantic import Field
from atomic_agents.agents.base_agent import BaseIOSchema, BaseAgent, BaseAgentConfig
from atomic_agents.lib.components.system_prompt_generator import SystemPromptGenerator

# Import the search tool you want to use
from web_search_agent.tools.searxng_search import SearxNGSearchTool

# Define the input schema for the query agent
class QueryAgentInputSchema(BaseIOSchema):
    """Input schema for the QueryAgent."""

    instruction: str = Field(..., description="Instruction to generate search queries for.")
    num_queries: int = Field(..., description="Number of queries to generate.")

# Initialize the query agent
query_agent = BaseAgent(
    BaseAgentConfig(
        client=instructor.from_openai(openai.OpenAI()),
        model="gpt-4o-mini",
        system_prompt_generator=SystemPromptGenerator(
            background=[
                "You are an intelligent query generation expert.",
                "Your task is to generate a specified number of diverse and highly relevant queries based on a given instruction."
            ],
            steps=[
                "Receive the instruction and the number of queries to generate.",
                "Generate the queries in JSON format."
            ],
            output_instructions=[
                "Ensure each query is unique and relevant.",
                "Provide the queries in the expected schema."
            ],
        ),
        input_schema=QueryAgentInputSchema,
        output_schema=SearxNGSearchTool.input_schema,  # Align output schema
    )
)
In this example:
- Modularity: By setting the output_schema of the query_agent to match the input_schema of SearxNGSearchTool, you can directly use the output of the agent as input to the tool.
- Swappability: If you decide to switch to a different search provider, you can import a different search tool and update the output_schema accordingly.
For instance, to switch to another search service:
# Import a different search tool
from web_search_agent.tools.another_search import AnotherSearchTool
# Update the output schema
query_agent.config.output_schema = AnotherSearchTool.input_schema
This design pattern simplifies the process of chaining agents and tools, making your AI applications more adaptable and easier to maintain.
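Putting it together, the agent's output can be fed straight into the tool. A hedged sketch, assuming the tool follows the common Atomic Forge pattern of a config object plus a run() method (SearxNGSearchToolConfig and its base_url parameter are assumptions here; check the tool's own README for the exact names):
from web_search_agent.tools.searxng_search import SearxNGSearchToolConfig  # assumed config class

# Because the schemas are aligned, no glue code is needed between agent and tool.
queries = query_agent.run(
    QueryAgentInputSchema(instruction="Find articles on atomic design.", num_queries=3)
)
search_tool = SearxNGSearchTool(SearxNGSearchToolConfig(base_url="http://localhost:8080"))  # assumed parameter
results = search_tool.run(queries)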
To run the CLI, simply run the following command:
atomic
Or, if you installed Atomic Agents with Poetry:
poetry run atomic
Or if you installed Atomic Agents with uv:
uv run atomic
After running this command, you will be presented with a menu allowing you to download tools.
Each tool has its own:
- Input schema
- Output schema
- Usage example
- Dependencies
- Installation instructions
The atomic-assembler CLI gives you complete control over your tools, avoiding the clutter of unnecessary dependencies. It makes modifying tools straightforward; additionally, each tool comes with its own set of tests for reliability.
But you're not limited to the CLI! If you prefer, you can directly access the tool folders and manage them manually by simply copying and pasting as needed.
Atomic Agents depends on the Instructor package. This means that in all examples where OpenAI is used, any other API supported by Instructor can also be used—such as Ollama, Groq, Mistral, Cohere, Anthropic, Gemini, and more. For a complete list, please refer to the Instructor documentation on its GitHub page.
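For example, swapping OpenAI for Groq only changes how the client is constructed; everything else in BaseAgentConfig stays the same. A minimal sketch (instructor.from_groq is Instructor's Groq helper; the client reads GROQ_API_KEY from the environment):
import instructor
import groq

# Drop-in replacement for the OpenAI client in BaseAgentConfig.
client = instructor.from_groq(groq.Groq())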
API documentation can be found here.
Atomic Forge is a collection of tools that can be used with Atomic Agents to extend its functionality. Current tools include:
- Calculator
- SearxNG Search
- YouTube Transcript Scraper
For more information on using and creating tools, see the Atomic Forge README.
We welcome contributions! Please see the Developer Guide for detailed information on how to contribute to Atomic Agents. Here are some quick steps:
- Fork the repository
- Create a new branch (git checkout -b feature-branch)
- Make your changes
- Run tests (poetry run pytest --cov=atomic_agents atomic-agents)
- Format your code (poetry run black atomic-agents atomic-assembler atomic-examples atomic-forge)
- Lint your code (poetry run flake8 --extend-exclude=.venv atomic-agents atomic-assembler atomic-examples atomic-forge)
- Commit your changes (git commit -m 'Add some feature')
- Push to the branch (git push origin feature-branch)
- Open a pull request
For full development setup and guidelines, please refer to the Developer Guide.
This project is licensed under the MIT License—see the LICENSE file for details.