
MassGen
MassGen: An Open-source Multi-Agent Scaling System Inspired by Grok Heavy and Gemini Deep Think. Join the Discord channel: https://discord.com/invite/VVrT2rQaz5
Multi-agent scaling through intelligent collaboration in Grok Heavy style
MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents that work in parallel, observe each other's progress, and refine their approaches, converging on the best solution and delivering a comprehensive, high-quality result. The power of this "parallel study group" approach is exemplified by advanced systems like xAI's Grok Heavy and Google DeepMind's Gemini Deep Think.
This project started with the "threads of thought" and "iterative refinement" ideas presented in The Myth of Reasoning, and extends the classic "multi-agent conversation" idea in AG2. Here is a video recording of the background context introduction presented at the Berkeley Agentic AI Summit 2025.
- Recent Achievements
- Key Future Enhancements
  - Advanced Agent Collaboration
  - Expanded Model, Tool & Agent Integrations
  - Improved Performance & Scalability
  - Enhanced Developer Experience
  - Web Interface
- v0.0.15 Roadmap
Feature | Description |
---|---|
Cross-Model/Agent Synergy | Harness strengths from diverse frontier model-powered agents |
Parallel Processing | Multiple agents tackle problems simultaneously |
Intelligence Sharing | Agents share and learn from each other's work |
Consensus Building | Natural convergence through collaborative refinement |
Live Visualization | See agents' working processes in real time |
MassGen operates through an architecture designed for seamless multi-agent collaboration:
```mermaid
graph TB
    O[MassGen Orchestrator<br/>Task Distribution & Coordination]
    subgraph Collaborative Agents
        A1[Agent 1<br/>Anthropic/Claude + Tools]
        A2[Agent 2<br/>Google/Gemini + Tools]
        A3[Agent 3<br/>OpenAI/GPT + Tools]
        A4[Agent 4<br/>xAI/Grok + Tools]
    end
    H[Shared Collaboration Hub<br/>Real-time Notification & Consensus]
    O --> A1 & A2 & A3 & A4
    A1 & A2 & A3 & A4 <--> H
    classDef orchestrator fill:#e1f5fe,stroke:#0288d1,stroke-width:3px
    classDef agent fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef hub fill:#e8f5e8,stroke:#388e3c,stroke-width:2px
    class O orchestrator
    class A1,A2,A3,A4 agent
    class H hub
```
The system's workflow is defined by the following key principles:
- Parallel Processing: Multiple agents tackle the same task simultaneously, each leveraging their unique capabilities (different models, tools, and specialized approaches).
- Real-time Collaboration: Agents continuously share their working summaries and insights through a notification system, allowing them to learn from each other's approaches and build on collective knowledge.
- Convergence Detection: The system monitors when agents' solutions have stabilized and consensus has been reached through natural collaboration rather than forced agreement.
- Adaptive Coordination: Agents can restart and refine their work when they receive new insights from others, creating a dynamic and responsive problem-solving environment.
This collaborative approach ensures that the final output leverages collective intelligence from multiple AI systems, leading to more robust and well-rounded results than any single agent could achieve alone.
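To make these principles concrete, here is a minimal, self-contained Python sketch of the loop they describe. The names (`Agent`, `run_round`, `has_converged`) are hypothetical illustrations, not MassGen's actual API:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Hypothetical stand-in for a model-backed MassGen agent."""
    name: str
    notes: list = field(default_factory=list)  # insights shared by peers

    async def run_round(self, task: str) -> str:
        # A real agent would call its backing model and tools here.
        peer_context = " | ".join(self.notes)
        return f"{self.name} draft for {task!r} (seen: {peer_context or 'nothing yet'})"

def has_converged(answers: list, previous: list) -> bool:
    # Stand-in convergence check: stop once no agent changed its answer.
    return answers == previous

async def orchestrate(task: str, agents: list, max_rounds: int = 5) -> str:
    previous: list = []
    for _ in range(max_rounds):
        # 1. Parallel processing: all agents work on the task simultaneously.
        answers = await asyncio.gather(*(a.run_round(task) for a in agents))
        # 2. Intelligence sharing: each agent sees the others' latest drafts.
        for agent, answer in zip(agents, answers):
            for other in agents:
                if other is not agent:
                    other.notes.append(answer)
        # 3. Convergence detection: stop once answers are stable.
        if has_converged(list(answers), previous):
            break
        previous = list(answers)
    return answers[0]  # a real system would select or vote on the final answer

if __name__ == "__main__":
    team = [Agent("claude"), Agent("gemini"), Agent("gpt")]
    print(asyncio.run(orchestrate("demo task", team)))
```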
Core Installation:
git clone https://github.com/Leezekun/MassGen.git
cd MassGen
pip install uv
uv venv
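The commands above create the virtual environment but do not activate it or install the package. A typical uv follow-up looks like the sketch below; the editable-install step is an assumption based on a standard uv workflow, so check the repo's own instructions for the authoritative command:

```bash
# Assumed follow-up steps for a standard uv workflow; verify against the repo.
source .venv/bin/activate   # Windows: .venv\Scripts\activate
uv pip install -e .         # install MassGen and its dependencies
```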
Optional CLI Tools (for enhanced capabilities):
# Claude Code CLI - Advanced coding assistant
npm install -g @anthropic-ai/claude-code
# LM Studio - Local model inference
# For macOS/Linux
sudo ~/.lmstudio/bin/lms bootstrap
# For Windows
cmd /c %USERPROFILE%/.lmstudio/bin/lms.exe bootstrap
Use the template file `.env.example` to create a `.env` file in the `massgen` directory with your API keys. Note that only the API keys for the models used by your MassGen agent team are needed.
# Copy example configuration
cp .env.example .env
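As a sketch, a filled-in `.env` might look like the following. The variable names are assumptions based on each provider's conventional environment keys, so defer to whatever `.env.example` actually lists:

```bash
# Hypothetical .env contents — only include keys for the backends you use
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
XAI_API_KEY=...
```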
Useful links to get API keys:
The system currently supports multiple model providers with advanced capabilities:
API-based Models:
- Azure OpenAI (NEW in v0.0.10): GPT-4, GPT-4o, GPT-3.5-turbo, GPT-4.1, GPT-5-chat
- Cerebras AI: GPT-OSS-120B...
- Claude: Claude Haiku 3.5, Claude Sonnet 4, Claude Opus 4...
- Claude Code: Native Claude Code SDK with comprehensive dev tools
- Gemini: Gemini 2.5 Flash, Gemini 2.5 Pro...
- Grok: Grok-4, Grok-3, Grok-3-mini...
- OpenAI: GPT-5 series (GPT-5, GPT-5-mini, GPT-5-nano)...
- Together AI, Fireworks AI, Groq, Nebius AI Studio, OpenRouter: LLaMA, Mistral, Qwen...
- Z AI: GLM-4.5
Local Model Support (NEW in v0.0.7):
- LM Studio: Run open-weight models locally with automatic server management
  - Automatic LM Studio CLI installation
  - Auto-download and loading of models
  - Zero-cost usage reporting
  - Support for LLaMA, Mistral, Qwen, and other open-weight models
Contributions adding more providers and local inference engines (vLLM, SGLang) are welcome.
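Because the `chatcompletion` backend accepts any OpenAI-compatible endpoint via `--base-url`, a locally served vLLM model should plug in the same way. The port and model name below are assumptions for illustration:

```bash
# Hypothetical: vLLM serving an OpenAI-compatible API on localhost:8000
# (start it with `vllm serve <model>`; the /v1 endpoint path is vLLM's default)
uv run python -m massgen.cli --backend chatcompletion \
  --base-url http://localhost:8000/v1 \
  --model Qwen/Qwen2.5-7B-Instruct "When is your knowledge up to"
```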
MassGen agents can leverage various tools to enhance their problem-solving capabilities. Both API-based and CLI-based backends support different tool capabilities.
Supported Built-in Tools by Backend:
Backend | Live Search | Code Execution | File Operations | MCP Support | Advanced Features |
---|---|---|---|---|---|
Azure OpenAI (NEW in v0.0.10) | ❌ | ✅ | ❌ | ❌ | Code interpreter, Azure deployment management |
Claude API | ✅ | ✅ | ❌ | ❌ | Web search, code interpreter |
Claude Code | ✅ | ✅ | ✅ | ✅ | Native Claude Code SDK, comprehensive dev tools, MCP integration |
Gemini API | ✅ | ✅ | ❌ | ❌ | Web search, code execution |
Grok API | ✅ | ❌ | ❌ | ❌ | Web search only |
OpenAI API | ✅ | ✅ | ❌ | ❌ | Web search, code interpreter |
ZAI API | ❌ | ❌ | ❌ | ❌ | - |
API-based backends:
uv run python -m massgen.cli --model claude-3-5-sonnet-latest "When is your knowledge up to"
uv run python -m massgen.cli --model gemini-2.5-flash "When is your knowledge up to"
uv run python -m massgen.cli --model grok-3-mini "When is your knowledge up to"
uv run python -m massgen.cli --model gpt-5-nano "When is your knowledge up to"
uv run python -m massgen.cli --backend chatcompletion --base-url https://api.cerebras.ai/v1 --model gpt-oss-120b "When is your knowledge up to"
# Azure OpenAI (NEW in v0.0.10, requires environment variables)
uv run python -m massgen.cli --backend azure_openai --model gpt-4.1 "When is your knowledge up to"
All the models with a default backend can be found here.
Local models (NEW in v0.0.7):
# Use LM Studio with automatic model management
uv run python -m massgen.cli --config lmstudio.yaml "Explain quantum computing"
CLI-based backends:
# Claude Code - Native Claude Code SDK with comprehensive dev tools
uv run python -m massgen.cli --backend claude_code --model sonnet "Can I use claude-3-5-haiku for claude code?"
uv run python -m massgen.cli --backend claude_code --model sonnet "Debug this Python script"
`--backend` is required for CLI-based backends. Note: the `--model` parameter is required but ignored for the Claude Code backend (it automatically uses Claude's latest model).
# Use configuration file
uv run python -m massgen.cli --config three_agents_default.yaml "Summarize latest news of github.com/Leezekun/MassGen"
# Mixed API and CLI backends
uv run python -m massgen.cli --config claude_code_flash2.5.yaml "find 5 papers which are related to multi-agent scaling system Massgen, download them and list their title in markdown"
uv run python -m massgen.cli --config claude_code_gpt5nano.yaml "find 5 papers which are related to multi-agent scaling system Massgen, download them and list their title in markdown"
# Azure OpenAI configurations (NEW in v0.0.10)
uv run python -m massgen.cli --config azure_openai_single.yaml "What is machine learning?"
uv run python -m massgen.cli --config azure_openai_multi.yaml "Compare different approaches to renewable energy"
# MCP-enabled configurations (NEW in v0.0.9)
uv run python -m massgen.cli --config multi_agent_playwright_automation.yaml "Browse https://github.com/Leezekun/MassGen and generate reports with screenshots"
uv run python -m massgen.cli --config claude_code_discord_mcp_example.yaml "Extract 3 latest discord messages"
uv run python -m massgen.cli --config claude_code_twitter_mcp_example.yaml "Search for the 3 latest tweets from @massgen_ai"
# Hybrid local and API-based models (NEW in v0.0.7)
uv run python -m massgen.cli --config two_agents_opensource_lmstudio.yaml "Analyze this algorithm's complexity"
uv run python -m massgen.cli --config gpt5nano_glm_qwen.yaml "Design a distributed system architecture"
# Debug mode for troubleshooting (NEW in v0.0.13)
uv run python -m massgen.cli --model claude-3-5-sonnet-latest --debug "What is machine learning?"
uv run python -m massgen.cli --config three_agents_default.yaml --debug "Debug multi-agent coordination"
All available quick configuration files can be found here.
See the MCP server setup guides: Discord MCP | Twitter MCP | Playwright MCP
Parameter | Description |
---|---|
`--config` | Path to YAML configuration file with agent definitions, model parameters, backend parameters, and UI settings |
`--backend` | Backend type for quick setup without a config file (`claude`, `claude_code`, `gemini`, `grok`, `openai`, `azure_openai`, `zai`). Optional for models with default backends. |
`--model` | Model name for quick setup (e.g., `gemini-2.5-flash`, `gpt-5-nano`, ...). `--config` and `--model` are mutually exclusive - use one or the other. |
`--system-message` | System prompt for the agent in quick setup mode. Ignored if `--config` is provided. |
`--no-display` | Disable the real-time streaming coordination display (falls back to simple text output). |
`--no-logs` | Disable real-time logging. |
`--debug` | Enable debug mode with verbose logging (NEW in v0.0.13). Shows detailed orchestrator activities, agent messages, backend operations, and tool calls. Debug logs are saved to `agent_outputs/log_{time}/massgen_debug.log`. |
`"<your question>"` | Optional single-question input; if omitted, MassGen enters interactive chat mode. |
MassGen supports YAML/JSON configuration files with the following structure (All available quick configuration files can be found here):
Single Agent Configuration:
Use the `agent` field to define a single agent with its backend and settings:
agent:
id: "<agent_name>"
backend:
type: "azure_openai" | "chatcompletion" | "claude" | "claude_code" | "gemini" | "grok" | "openai" | "zai" | "lmstudio" #Type of backend
model: "<model_name>" # Model name
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
system_message: "..." # System Message for Single Agent
Multi-Agent Configuration:
Use the `agents` field to define multiple agents, each with its own backend and config:
agents: # Multiple agents (alternative to 'agent')
- id: "<agent1 name>"
backend:
type: "azure_openai" | "chatcompletion" | "claude" | "claude_code" | "gemini" | "grok" | "openai" | "zai" | "lmstudio" #Type of backend
model: "<model_name>" # Model name
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
system_message: "..." # System message for this agent
- id: "..."
backend:
type: "..."
model: "..."
...
system_message: "..."
Backend Configuration:
Detailed parameters for each agent's backend can be specified using the following configuration formats:
backend:
type: "chatcompletion"
model: "gpt-oss-120b" # Model name
base_url: "https://api.cerebras.ai/v1" # Base URL for API endpoint
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
max_tokens: 2500 # Maximum response length
backend:
type: "claude"
model: "claude-sonnet-4-20250514" # Model name
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
max_tokens: 2500 # Maximum response length
enable_web_search: true # Web search capability
enable_code_execution: true # Code execution capability
backend:
type: "gemini"
model: "gemini-2.5-flash" # Model name
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
max_tokens: 2500 # Maximum response length
enable_web_search: true # Web search capability
enable_code_execution: true # Code execution capability
backend:
type: "grok"
model: "grok-3-mini" # Model name
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
max_tokens: 2500 # Maximum response length
enable_web_search: true # Web search capability (uses default: mode="auto", return_citations=true)
# OR manually specify search parameters via extra_body (conflicts with enable_web_search):
# extra_body:
# search_parameters:
# mode: "auto" # Search strategy (see Grok API docs for valid values)
# return_citations: true # Include search result citations
backend:
type: "azure_openai"
model: "gpt-4.1" # Azure OpenAI deployment name
base_url: "https://your-resource.openai.azure.com/" # Azure OpenAI endpoint
api_key: "<optional_key>" # API key for backend. Uses AZURE_OPENAI_API_KEY env var by default.
api_version: "2024-02-15-preview" # Azure OpenAI API version
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
max_tokens: 2500 # Maximum response length
enable_code_interpreter: true # Code interpreter capability
backend:
type: "openai"
model: "gpt-5-mini" # Model name
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
temperature: 0.7 # Creativity vs consistency (0.0-1.0, GPT-5 series models and GPT o-series models don't support this)
max_tokens: 2500 # Maximum response length (GPT-5 series models and GPT o-series models don't support this)
text:
verbosity: "medium" # Response detail level (low/medium/high, only supported in GPT-5 series models)
reasoning:
effort: "medium" # Reasoning depth (low/medium/high, only supported in GPT-5 series models and GPT o-series models)
summary: "auto" # Automatic reasoning summaries (optional)
enable_web_search: true # Web search capability - can be used with reasoning
enable_code_interpreter: true # Code interpreter capability - can be used with reasoning
backend:
type: "claude_code"
cwd: "claude_code_workspace" # Working directory for file operations
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
# Claude Code specific options
system_prompt: "" # Custom system prompt to replace default
append_system_prompt: "" # Custom system prompt to append
max_thinking_tokens: 4096 # Maximum thinking tokens
# MCP (Model Context Protocol) servers configuration
mcp_servers:
# Discord integration server
discord:
type: "stdio" # Communication type: stdio (standard input/output)
command: "npx" # Command to execute: Node Package Execute
args: ["-y", "mcp-discord", "--config", "YOUR_DISCORD_TOKEN"] # Arguments: -y (auto-confirm), mcp-discord package, config with Discord bot token
# Playwright web automation server
playwright:
type: "stdio" # Communication type: stdio (standard input/output)
command: "npx" # Command to execute: Node Package Execute
args: [
"@playwright/mcp@latest",
"--browser=chrome", # Use Chrome browser
"--caps=vision,pdf", # Enable vision and PDF capabilities
"--user-data-dir=/tmp/playwright-profile", # Persistent browser profile
"--save-trace" # Save Playwright traces for debugging
]
# Tool configuration (Claude Code's native tools)
allowed_tools:
- "Read" # Read files from filesystem
- "Write" # Write files to filesystem
- "Edit" # Edit existing files
- "MultiEdit" # Multiple edits in one operation
- "Bash" # Execute shell commands
- "Grep" # Search within files
- "Glob" # Find files by pattern
- "LS" # List directory contents
- "WebSearch" # Search the web
- "WebFetch" # Fetch web content
- "TodoWrite" # Task management
- "NotebookEdit" # Jupyter notebook editing
# MCP tools (if available), MCP tools will be auto-discovered from the server
- "mcp__discord__discord_login"
- "mcp__discord__discord_readmessages"
- "mcp__playwright"
backend:
type: "zai"
model: "glm-4.5" # Model name
base_url: "https://api.z.ai/api/paas/v4/" # Base URL for API endpoint
api_key: "<optional_key>" # API key for backend. Uses env vars by default.
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
top_p: 0.7 # Nucleus sampling cutoff; keeps smallest set of tokens with cumulative probability β₯ top_p
backend:
type: "lmstudio"
model: "qwen2.5-7b-instruct" # Model to load in LM Studio
temperature: 0.7 # Creativity vs consistency (0.0-1.0)
max_tokens: 2000 # Maximum response length
UI Configuration:
Configure how MassGen displays information and handles logging during execution:
ui:
display_type: "rich_terminal" | "terminal" | "simple" # Display format for agent interactions
logging_enabled: true | false # Enable/disable real-time logging
- `display_type`: Controls the visual presentation of agent interactions
  - `"rich_terminal"`: Full-featured display with multi-region layout, live status updates, and colored output
  - `"terminal"`: Standard terminal display with basic formatting and sequential output
  - `"simple"`: Plain text output without any formatting or special display features
- `logging_enabled`: When `true`, saves detailed timestamps, agent outputs, and system status
Time Control Configuration:
Configure timeout settings to control how long MassGen's orchestrator can run:
timeout_settings:
orchestrator_timeout_seconds: 30 # Maximum time for orchestration
- `orchestrator_timeout_seconds`: Sets the maximum time allowed for the orchestration phase
Orchestrator Configuration:
Configure the orchestrator settings for managing agent workspace snapshots and temporary workspaces:
orchestrator:
snapshot_storage: "claude_code_snapshots" # Directory to store workspace snapshots
agent_temporary_workspace: "claude_code_temp_workspaces" # Directory for temporary agent workspaces
- `snapshot_storage`: Directory where MassGen saves workspace snapshots for Claude Code agents to share context
- `agent_temporary_workspace`: Directory where temporary agent workspaces are created and managed during collaboration
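Putting these sections together, a complete configuration might look like the sketch below. The agent ids, model choices, timeout value, and directory names are illustrative, not canonical:

```yaml
# Hypothetical end-to-end configuration combining the documented fields
agents:
  - id: "researcher"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      enable_web_search: true
  - id: "writer"
    backend:
      type: "openai"
      model: "gpt-5-nano"
      enable_code_interpreter: true
ui:
  display_type: "rich_terminal"
  logging_enabled: true
timeout_settings:
  orchestrator_timeout_seconds: 900
orchestrator:
  snapshot_storage: "snapshots"
  agent_temporary_workspace: "temp_workspaces"
```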
MassGen supports an interactive mode where you can have ongoing conversations with the system:
# Start interactive mode with a single agent (no tools enabled by default)
uv run python -m massgen.cli --model gpt-5-mini
# Start interactive mode with configuration file
uv run python -m massgen.cli --config three_agents_default.yaml
Interactive Mode Features:
- Multi-turn conversations: Multiple agents collaborate with you in an ongoing conversation
- Real-time feedback: Displays real-time agent and system status
- Clear conversation history: Type `/clear` to reset the conversation and start fresh
- Easy exit: Type `/quit`, `/exit`, `/q`, or press `Ctrl+C` to stop
Watch the recorded demo:
The system provides multiple ways to view and analyze results:
- Live Collaboration View: See agents working in parallel through a multi-region terminal display
- Status Updates: Real-time phase transitions, voting progress, and consensus building
- Streaming Output: Watch agents' reasoning and responses as they develop
Watch an example here:
All sessions are automatically logged with detailed information; the log files can also be viewed through the UI. Logs are organized in the following directory hierarchy:
```
massgen_logs/
└── log_{timestamp}/
    ├── agent_outputs/
    │   ├── agent_id.txt
    │   ├── final_presentation_agent_id.txt
    │   └── system_status.txt
    ├── agent_id/
    │   └── {answer_generation_timestamp}/
    │       └── files_included_in_generated_answer
    ├── final_workspace/
    │   └── agent_id/
    │       └── {answer_generation_timestamp}/
    │           └── files_included_in_generated_answer
    └── massgen.log / massgen_debug.log
```
- `log_{timestamp}`: Main log directory, identified by session timestamp
- `agent_outputs/`: Text outputs from each agent
  - `agent_id.txt`: Raw output from each agent
  - `final_presentation_agent_id.txt`: Final presentation from the selected agent
  - `system_status.txt`: System status information
- `agent_id/`: Directory for each agent containing answer versions
  - `{answer_generation_timestamp}/`: Timestamp directory for each answer version
    - `files_included_in_generated_answer`: All relevant files in that version
- `final_workspace/`: Final presentation for the selected agents
  - `agent_id/`: Selected agent's id
    - `{answer_generation_timestamp}/`: Timestamp directory for the final presentation
      - `files_included_in_generated_answer`: All relevant files in the final presentation
- `massgen.log` / `massgen_debug.log`: Main log file; `massgen.log` for general logging, `massgen_debug.log` for verbose debugging information
The final presentation continues to be stored in each Claude Code agent's workspace as before. After the final presentation is generated, the relevant files are copied to the `final_workspace/` directory.
Here are a few examples of how you can use MassGen for different tasks:
To see how MassGen works in practice, check out these detailed case studies based on real session logs:
# Ask a question about a complex topic
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "what's best to do in Stockholm in October 2025"
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "give me all the talks on agent frameworks in Berkeley Agentic AI Summit 2025, note, the sources must include the word Berkeley, don't include talks from any other agentic AI summits"
# Generate a short story
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "Write a short story about a robot who discovers music."
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "How much does it cost to run HLE benchmark with Grok-4"
# Single agent with comprehensive development tools
uv run python -m massgen.cli --config massgen/configs/claude_code_single.yaml "Create a Flask web app with user authentication and database integration"
# Multi-agent development team collaboration
uv run python -m massgen.cli --config massgen/configs/claude_code_flash2.5_gptoss.yaml "Debug and optimize this React application, then write comprehensive tests"
# Quick coding task with claude_code backend
uv run python -m massgen.cli --backend claude_code "Refactor this Python code to use async/await and add error handling"
# Multi-agent web automation with Playwright MCP
uv run python -m massgen.cli --config massgen/configs/multi_agent_playwright_automation.yaml "browse https://github.com/Leezekun/MassGen and suggest improvement. Include screenshots and suggestions in a PDF."
# Web scraping and analysis
uv run python -m massgen.cli --config massgen/configs/multi_agent_playwright_automation.yaml "Navigate to https://news.ycombinator.com, extract the top 10 stories, and create a summary report"
# E-commerce testing automation
uv run python -m massgen.cli --config massgen/configs/multi_agent_playwright_automation.yaml "Test the checkout flow on an e-commerce site and generate a detailed test report"
MassGen is currently in its foundational stage, with a focus on parallel, asynchronous multi-agent collaboration and orchestration. Our roadmap is centered on transforming this foundation into a highly robust, intelligent, and user-friendly system, while enabling frontier research and exploration. An earlier version of MassGen can be found here.
Enhanced Logging
- Improved Logging System: Enhanced logging for easier debugging of agents' answers
- Final Answer Directory: New structure in Claude Code and logs for storing final results
- Documentation Updates: Detailed architecture documentation and future development plans for permission-based context sharing
✅ Unified Logging System (v0.0.13): Centralized logging infrastructure with debug mode and enhanced terminal display formatting
✅ Windows Platform Support (v0.0.13): Windows platform compatibility with improved path handling and process management
✅ Enhanced Claude Code Agent Context Sharing (v0.0.12): Claude Code agents now share workspace context by maintaining snapshots and temporary workspaces on the orchestrator's side
✅ Documentation Improvement (v0.0.12): Updated README with current features and improved setup instructions
✅ Custom System Messages (v0.0.11): Enhanced system message configuration and preservation with backend-specific system prompt customization
✅ Claude Code Backend Enhancements (v0.0.11): Improved integration with better system message handling, JSON response parsing, and coordination action descriptions
✅ Azure OpenAI Support (v0.0.10): Integration with Azure OpenAI services, including GPT-4.1 and GPT-5-chat models with async streaming
✅ MCP (Model Context Protocol) Support (v0.0.9): Integration with MCP for advanced tool capabilities in the Claude Code agent, including Discord and Twitter integration
✅ Timeout Management System (v0.0.8): Orchestrator-level timeout with graceful fallback and enhanced error messages
✅ Local Model Support (v0.0.7): Complete LM Studio integration for running open-weight models locally with automatic server management
✅ GPT-5 Series Integration (v0.0.6): Support for OpenAI's GPT-5, GPT-5-mini, and GPT-5-nano with advanced reasoning parameters
✅ Claude Code Integration (v0.0.5): Native Claude Code backend with streaming capabilities and tool support
✅ GLM-4.5 Model Support (v0.0.4): Integration with ZhipuAI's GLM-4.5 model family
✅ Foundation Architecture (v0.0.3): Complete multi-agent orchestration system with async streaming, built-in tools, and multi-backend support
✅ Extended Provider Ecosystem: Support for 15+ providers, including Cerebras AI, Together AI, Fireworks AI, Groq, Nebius AI Studio, and OpenRouter
- Advanced Agent Collaboration: Exploring improved communication patterns and consensus-building protocols to improve agent synergy
- Expanded Model, Tool & Agent Integration: Adding & enhancing support for more models/tools/agents, including a wider range of tools like MCP Servers, and coding agents
- Improved Performance & Scalability: Optimizing the streaming and logging mechanisms for better performance and resource management
- Enhanced Developer Experience: Introducing a more modular agent design and a comprehensive benchmarking framework for easier extension and evaluation
- Web Interface: Developing a web-based UI for better visualization and interaction with the agent ecosystem
- Claude Code Context Sharing: Enabling seamless context sharing between Claude Code agents and other models (planned for v0.0.12)
We welcome community contributions to achieve these goals.
Version 0.0.15 will focus on Gemini MCP Implementation, bringing Model Context Protocol support to Google's Gemini models for the first time. Key planned enhancements include:
- Gemini MCP Integration: Native MCP support for the Gemini backend with tool ecosystem access
- Cross-Provider MCP: Unified MCP interface across Claude Code and Gemini backends
- Enhanced Tool Discovery: Improved tool discovery and execution management for Gemini agents
- Performance Optimization: Optimized MCP server communication and resource management
- Enhanced Context Sharing: Improved multi-agent context preservation and sharing mechanisms
For detailed milestones and technical specifications, see the full v0.0.15 roadmap.
We welcome contributions! Please see our Contributing Guidelines for details.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
⭐ Star this repo if you find it useful! ⭐
Made with ❤️ by the MassGen team