
MassGen
MassGen: An Open-source Multi-Agent Scaling System Inspired by Grok Heavy and Gemini Deep Think. Join the Discord channel: https://discord.com/invite/VVrT2rQaz5
Stars: 454

MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents who work in parallel, observe each other's progress, and refine their approaches to converge on the best solution to deliver a comprehensive and high-quality result. The system operates through an architecture designed for seamless multi-agent collaboration, with key features including cross-model/agent synergy, parallel processing, intelligence sharing, consensus building, and live visualization. Users can install the system, configure API settings, and run MassGen for various tasks such as question answering, creative writing, research, development & coding tasks, and web automation & browser tasks. The roadmap includes plans for advanced agent collaboration, expanded model, tool & agent integration, improved performance & scalability, enhanced developer experience, and a web interface.
README:
Multi-agent scaling through intelligent collaboration in Grok Heavy style
MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents who work in parallel, observe each other's progress, and refine their approaches to converge on the best solution to deliver a comprehensive and high-quality result. The power of this "parallel study group" approach is exemplified by advanced systems like xAI's Grok Heavy and Google DeepMind's Gemini Deep Think.
This project started with the "threads of thought" and "iterative refinement" ideas presented in The Myth of Reasoning, and extends the classic "multi-agent conversation" idea in AG2. Here is a video recording of the background context introduction presented at the Berkeley Agentic AI Summit 2025.
- Recent Achievements
- Key Future Enhancements
  - Advanced Agent Collaboration
  - Expanded Model, Tool & Agent Integrations
  - Improved Performance & Scalability
  - Enhanced Developer Experience
  - Web Interface
- v0.0.25 Roadmap
Feature | Description |
---|---|
Cross-Model/Agent Synergy | Harness strengths from diverse frontier model-powered agents |
Parallel Processing | Multiple agents tackle problems simultaneously |
Intelligence Sharing | Agents share and learn from each other's work |
Consensus Building | Natural convergence through collaborative refinement |
Live Visualization | See agents' working processes in real time |
What's New in v0.0.24:
- vLLM Backend Support - Complete integration with vLLM for high-performance local model serving with OpenAI-compatible API
- POE Provider Support - Extended ChatCompletions backend to support POE (Platform for Open Exploration) for accessing multiple AI models
- GPT-5-Codex Model Recognition - Added gpt-5-codex to model registry for code generation tasks
- Backend Utility Modules - Major refactoring with new api_params_handler, formatter, and token_manager modules (1,400+ lines of new utility code)
- Bug Fixes - Fixed streaming chunk processing and Gemini backend session management
Try v0.0.24 Features Now:
# Try vLLM backend with local models (requires vLLM server running)
# First start vLLM server: python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen3-0.6B --host 0.0.0.0 --port 8000
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/two_qwen_vllm.yaml \
"What is machine learning?"
MassGen operates through an architecture designed for seamless multi-agent collaboration:
graph TB
O[MassGen Orchestrator<br/>Task Distribution & Coordination]
subgraph Collaborative Agents
A1[Agent 1<br/>Anthropic/Claude + Tools]
A2[Agent 2<br/>Google/Gemini + Tools]
A3[Agent 3<br/>OpenAI/GPT + Tools]
A4[Agent 4<br/>xAI/Grok + Tools]
end
H[Shared Collaboration Hub<br/>Real-time Notification & Consensus]
O --> A1 & A2 & A3 & A4
A1 & A2 & A3 & A4 <--> H
classDef orchestrator fill:#e1f5fe,stroke:#0288d1,stroke-width:3px
classDef agent fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef hub fill:#e8f5e8,stroke:#388e3c,stroke-width:2px
class O orchestrator
class A1,A2,A3,A4 agent
class H hub
The system's workflow is defined by the following key principles:
Parallel Processing - Multiple agents tackle the same task simultaneously, each leveraging their unique capabilities (different models, tools, and specialized approaches).
Real-time Collaboration - Agents continuously share their working summaries and insights through a notification system, allowing them to learn from each other's approaches and build upon collective knowledge.
Convergence Detection - The system intelligently monitors when agents have reached stability in their solutions and achieved consensus through natural collaboration rather than forced agreement.
Adaptive Coordination - Agents can restart and refine their work when they receive new insights from others, creating a dynamic and responsive problem-solving environment.
This collaborative approach ensures that the final output leverages collective intelligence from multiple AI systems, leading to more robust and well-rounded results than any single agent could achieve alone.
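As a conceptual sketch of this loop (illustrative Python only; the class and method names below are hypothetical and are not MassGen's actual API):
import asyncio
import random

class Hub:
    """Shared collaboration hub: stores each agent's latest answer."""
    def __init__(self):
        self.answers = {}

    def publish(self, agent_id, answer):
        self.answers[agent_id] = answer

    def peers(self, agent_id):
        return {k: v for k, v in self.answers.items() if k != agent_id}

async def run_agent(agent_id, task, hub, rounds=3):
    answer = f"{agent_id} draft for {task!r}"       # stand-in for a model call
    hub.publish(agent_id, answer)                   # intelligence sharing
    for _ in range(rounds):                         # iterative refinement
        await asyncio.sleep(random.random() * 0.1)  # simulated model latency
        insights = hub.peers(agent_id)              # observe other agents' work
        answer = f"{agent_id} refinement using {len(insights)} peer answers"
        hub.publish(agent_id, answer)
    return answer

async def orchestrate(task, agent_ids):
    hub = Hub()
    answers = await asyncio.gather(*(run_agent(a, task, hub) for a in agent_ids))
    return max(answers, key=len)                    # stand-in for consensus voting

print(asyncio.run(orchestrate("compare renewables", ["claude", "gemini", "gpt", "grok"])))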
Core Installation (requires Python 3.11+):
git clone https://github.com/Leezekun/MassGen.git
cd MassGen
pip install uv
uv venv
Optional CLI Tools (for enhanced capabilities):
# Claude Code CLI - Advanced coding assistant
npm install -g @anthropic-ai/claude-code
# LM Studio - Local model inference
# For macOS/Linux
sudo ~/.lmstudio/bin/lms bootstrap
# For Windows
cmd /c %USERPROFILE%/.lmstudio/bin/lms.exe bootstrap
Use the template file .env.example to create a .env file in the massgen directory with your API keys. Note that only the API keys for the models used by your MassGen agent team are needed.
# Copy example configuration
cp .env.example .env
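For example, a .env file might contain entries like the following (variable names are illustrative; check .env.example for the exact keys each backend expects):
# .env (illustrative; include only the providers your agents actually use)
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=...
XAI_API_KEY=...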
Useful links to get API keys:
The system currently supports multiple model providers with advanced capabilities:
API-based Models:
- Azure OpenAI (NEW in v0.0.10): GPT-4, GPT-4o, GPT-3.5-turbo, GPT-4.1, GPT-5-chat
- Cerebras AI: GPT-OSS-120B...
- Claude: Claude Haiku 3.5, Claude Sonnet 4, Claude Opus 4...
- Claude Code: Native Claude Code SDK with comprehensive dev tools
- Gemini: Gemini 2.5 Flash, Gemini 2.5 Pro...
- Grok: Grok-4, Grok-3, Grok-3-mini...
- OpenAI: GPT-5 series (GPT-5, GPT-5-mini, GPT-5-nano)...
- Together AI, Fireworks AI, Groq, Kimi/Moonshot, Nebius AI Studio, OpenRouter, POE: LLaMA, Mistral, Qwen...
- Z AI: GLM-4.5
Local Model Support:
- vLLM (NEW in v0.0.24): High-performance local model serving with OpenAI-compatible API
  - Support for vLLM-specific parameters (top_k, repetition_penalty, guided_json)
  - Optimized for large-scale model inference
  - Configuration examples: three_agents_vllm.yaml, two_qwen_vllm.yaml (see the sketch below)
- LM Studio (v0.0.7+): Run open-weight models locally with automatic server management
  - Automatic LM Studio CLI installation
  - Auto-download and loading of models
  - Zero-cost usage reporting
  - Support for LLaMA, Mistral, Qwen, and other open-weight models
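As a rough sketch of what a local vLLM agent entry might look like (the field names below are assumptions; the shipped three_agents_vllm.yaml and two_qwen_vllm.yaml configs are authoritative):
agent:
  id: "local_qwen"
  backend:
    type: "vllm"                           # assumed backend type for the v0.0.24 vLLM integration
    model: "Qwen/Qwen3-0.6B"               # model served by the local vLLM server
    base_url: "http://localhost:8000/v1"   # vLLM's OpenAI-compatible endpoint (assumed key name)
    top_k: 40                              # vLLM-specific sampling parameter
    repetition_penalty: 1.1                # vLLM-specific sampling parameter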
Contributions adding more providers and local inference engines (e.g., SGLang) are welcome.
MassGen agents can leverage various tools to enhance their problem-solving capabilities. Both API-based and CLI-based backends support different tool capabilities.
Supported Built-in Tools by Backend:
Backend | Live Search | Code Execution | File Operations | MCP Support | Advanced Features |
---|---|---|---|---|---|
Azure OpenAI (NEW in v0.0.10) | ❌ | ✅ | ❌ | ❌ | Code interpreter, Azure deployment management |
Claude API | ✅ | ✅ | ❌ | ✅ | Web search, code interpreter, MCP integration |
Claude Code | ✅ | ✅ | ✅ | ✅ | Native Claude Code SDK, comprehensive dev tools, MCP integration |
Gemini API | ✅ | ✅ | ❌ | ✅ | Web search, code execution, MCP integration |
Grok API | ✅ | ❌ | ❌ | ✅ | Web search, MCP integration |
OpenAI API | ✅ | ✅ | ❌ | ✅ | Web search, code interpreter, MCP integration |
ZAI API | ❌ | ❌ | ❌ | ✅ | MCP integration |
Parameter | Description |
---|---|
--config | Path to a YAML configuration file with agent definitions, model parameters, backend parameters, and UI settings |
--backend | Backend type for quick setup without a config file (claude, claude_code, gemini, grok, openai, azure_openai, zai). Optional for models with default backends. |
--model | Model name for quick setup (e.g., gemini-2.5-flash, gpt-5-nano, ...). --config and --model are mutually exclusive; use one or the other. |
--system-message | System prompt for the agent in quick setup mode. Ignored if --config is provided. |
--no-display | Disable the real-time streaming UI coordination display (falls back to simple text output). |
--no-logs | Disable real-time logging. |
--debug | Enable debug mode with verbose logging (NEW in v0.0.13). Shows detailed orchestrator activities, agent messages, backend operations, and tool calls. Debug logs are saved to agent_outputs/log_{time}/massgen_debug.log. |
"<your question>" | Optional single-question input; if omitted, MassGen enters interactive chat mode. |
Quick Start Commands:
# Quick test with any supported model - no configuration needed
uv run python -m massgen.cli --model claude-3-5-sonnet-latest "What is machine learning?"
uv run python -m massgen.cli --model gemini-2.5-flash "Explain quantum computing"
uv run python -m massgen.cli --model gpt-5-nano "Summarize the latest AI developments"
Configuration:
Use the agent field to define a single agent with its backend and settings:
agent:
  id: "<agent_name>"
  backend:
    type: "azure_openai" | "chatcompletion" | "claude" | "claude_code" | "gemini" | "grok" | "openai" | "zai" | "lmstudio"  # Type of backend
    model: "<model_name>"  # Model name
    api_key: "<optional_key>"  # API key for the backend; uses env vars by default
  system_message: "..."  # System message for the single agent
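For example, a minimal filled-in single-agent config (illustrative values) could look like:
agent:
  id: "research_assistant"
  backend:
    type: "gemini"
    model: "gemini-2.5-flash"
  system_message: "You are a careful, concise research assistant."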
→ See all single agent configs
Configuration:
Use the agents field to define multiple agents, each with its own backend and config:
Quick Start Commands:
# Three powerful agents working together - Gemini, GPT-5, and Grok
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/three_agents_default.yaml \
"Analyze the pros and cons of renewable energy"
This showcases MassGen's core strength:
- Gemini 2.5 Flash - Fast research with web search
- GPT-5 Nano - Advanced reasoning with code execution
- Grok-3 Mini - Real-time information and alternative perspectives
agents:  # Multiple agents (alternative to 'agent')
  - id: "<agent1 name>"
    backend:
      type: "azure_openai" | "chatcompletion" | "claude" | "claude_code" | "gemini" | "grok" | "openai" | "zai" | "lmstudio"  # Type of backend
      model: "<model_name>"  # Model name
      api_key: "<optional_key>"  # API key for the backend; uses env vars by default
    system_message: "..."  # System message for this agent
  - id: "..."
    backend:
      type: "..."
      model: "..."
      ...
    system_message: "..."
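A filled-in version approximating the three-agent team above (illustrative; the shipped three_agents_default.yaml is authoritative):
agents:
  - id: "gemini_researcher"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
  - id: "gpt_reasoner"
    backend:
      type: "openai"
      model: "gpt-5-nano"
  - id: "grok_analyst"
    backend:
      type: "grok"
      model: "grok-3-mini"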
→ Explore more multi-agent setups
The Model Context Protocol (MCP) standardizes how applications expose tools and context to language models. From the official documentation:
MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.
MCP Configuration Parameters:
Parameter | Type | Required | Description |
---|---|---|---|
mcp_servers | dict | Yes (for MCP) | Container for MCP server definitions |
├─ type | string | Yes | Transport: "stdio" or "streamable-http" |
├─ command | string | stdio only | Command to run the MCP server |
├─ args | list | stdio only | Arguments for the command |
├─ url | string | http only | Server endpoint URL |
└─ env | dict | No | Environment variables to pass |
allowed_tools | list | No | Whitelist specific tools (if omitted, all tools are available) |
exclude_tools | list | No | Blacklist dangerous or unwanted tools |
Quick Start Commands (Check backend MCP support here):
# Weather service with GPT-5
uv run python -m massgen.cli \
--config massgen/configs/tools/mcp/gpt5_mini_mcp_example.yaml \
"What's the weather forecast for San Francisco this week?"
# Multi-tool MCP with Gemini - Search + Weather + Filesystem
uv run python -m massgen.cli \
--config massgen/configs/tools/mcp/multimcp_gemini.yaml \
"Find the best restaurants in Paris and save the recommendations to a file"
Configuration:
# Basic MCP Configuration:
agents:
  - backend:
      type: "openai"        # Your backend choice
      model: "gpt-5-mini"   # Your model choice
      # Add MCP servers here
      mcp_servers:
        weather:            # Server name (you choose this)
          type: "stdio"     # Communication type
          command: "npx"    # Command to run
          args: ["-y", "@modelcontextprotocol/server-weather"]  # MCP server package
# That's it! The agent can now check weather.

# Multiple MCP Tools Example:
agents:
  - backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      mcp_servers:
        # Web search
        search:
          type: "stdio"
          command: "npx"
          args: ["-y", "@modelcontextprotocol/server-brave-search"]
          env:
            BRAVE_API_KEY: "${BRAVE_API_KEY}"  # Set in .env file
        # HTTP-based MCP server (streamable-http transport)
        custom_api:
          type: "streamable-http"               # For HTTP/SSE servers
          url: "http://localhost:8080/mcp/sse"  # Server endpoint
      # Tool configuration (MCP tools are auto-discovered)
      allowed_tools:    # Optional: whitelist specific tools
        - "mcp__weather__get_current_weather"
        - "mcp__test_server__mcp_echo"
        - "mcp__test_server__add_numbers"
      exclude_tools:    # Optional: blacklist specific tools
        - "mcp__test_server__current_time"
MassGen provides comprehensive file system support through multiple backends, enabling agents to read, write, and manipulate files in organized workspaces.
Filesystem Configuration Parameters:
Parameter | Type | Required | Description |
---|---|---|---|
cwd | string | Yes (for file ops) | Working directory for file operations (agent-specific workspace) |
snapshot_storage | string | Yes | Directory for workspace snapshots |
agent_temporary_workspace | string | Yes | Parent directory for temporary workspaces |
Quick Start Commands:
# File operations with Claude Code
uv run python -m massgen.cli \
--config massgen/configs/tools/filesystem/claude_code_single.yaml \
"Create a Python web scraper and save results to CSV"
# Multi-agent file collaboration
uv run python -m massgen.cli \
--config massgen/configs/tools/filesystem/claude_code_context_sharing.yaml \
"Generate a comprehensive project report with charts and analysis"
Configuration:
# Basic Workspace Setup:
agents:
- id: "file-agent"
backend:
type: "claude_code" # Backend with file support
model: "claude-sonnet-4" # Your model choice
cwd: "workspace" # Isolated workspace for file operations
# Multi-Agent Workspace Isolation:
agents:
- id: "analyzer"
backend:
type: "claude_code"
cwd: "workspace1" # Agent-specific workspace
- id: "reviewer"
backend:
type: "gemini"
cwd: "workspace2" # Separate workspace
orchestrator:
snapshot_storage: "snapshots" # Shared snapshots directory
agent_temporary_workspace: "temp_workspaces" # Temporary workspace management
Available File Operations:
- Claude Code: Built-in tools (Read, Write, Edit, MultiEdit, Bash, Grep, Glob, LS, TodoWrite)
- Other Backends: Via MCP Filesystem Server
Workspace Management:
- Isolated Workspaces: Each agent's cwd is fully isolated and writable
- Snapshot Storage: Share workspace context between Claude Code agents
- Temporary Workspaces: Agents can access previous coordination results
→ View more filesystem examples
Work directly with your existing projects! User Context Paths allow you to share specific directories and files with all agents while maintaining granular permission control. This enables secure multi-agent collaboration on your real codebases, documentation, and data.
Project Integration Parameters:
Parameter | Type | Required | Description |
---|---|---|---|
context_paths | list | Yes (for project integration) | Shared directories/files for all agents |
├─ path | string | Yes | Absolute path to your project directory or file |
└─ permission | string | Yes | Access level: "read" or "write" |
Quick Start Commands:
# Code analysis and security audit
uv run python -m massgen.cli \
--config massgen/configs/tools/filesystem/fs_permissions_test.yaml \
"Analyze all Python files in this project and create a comprehensive security audit report"
# Project modernization
uv run python -m massgen.cli \
--config massgen/configs/tools/filesystem/claude_code_context_sharing.yaml \
"Review this legacy codebase and create a modernization plan with updated dependencies"
Configuration:
# Basic Project Integration:
agents:
- id: "code-reviewer"
backend:
type: "claude_code"
cwd: "workspace" # Agent's isolated work area
orchestrator:
context_paths:
- path: "/home/user/my-project/src"
permission: "read" # Agents can analyze your code
- path: "/home/user/my-project/docs"
permission: "write" # Final agent can update docs
# Advanced: Multi-Agent Project Collaboration
agents:
- id: "analyzer"
backend:
type: "gemini"
cwd: "analysis_workspace"
- id: "implementer"
backend:
type: "claude_code"
cwd: "implementation_workspace"
orchestrator:
context_paths:
- path: "/home/user/legacy-app/src"
permission: "read" # Read existing codebase
- path: "/home/user/legacy-app/tests"
permission: "write" # Write new tests
- path: "/home/user/modernized-app"
permission: "write" # Create modernized version
This showcases project integration:
- Real Project Access - Work with your actual codebases, not copies
- Secure Permissions - Granular control over what agents can read/modify
- Multi-Agent Collaboration - Multiple agents safely work on the same project
- Context Agents (during coordination): Always get read-only access, to protect your files
- Final Agent (final execution): Gets the configured permission (read or write)
Use Cases:
- Code Review: Agents analyze your source code and suggest improvements
- Documentation: Agents read project docs to understand context and generate updates
- Data Processing: Agents access shared datasets and generate analysis reports
- Project Migration: Agents examine existing projects and create modernized versions
→ Learn more about project integration
Security Considerations:
- Agent ID Safety: Avoid IDs made of "agent" plus an incremental digit (e.g., agent1, agent2); these may expose agent identities during voting
- File Access Control: Restrict file access using MCP server configurations when needed
- Path Validation: All paths are resolved to absolute paths to prevent directory traversal attacks
Claude (Recursive MCP Execution - v0.0.20+)
# Claude with advanced tool chaining
uv run python -m massgen.cli \
--config massgen/configs/tools/mcp/claude_mcp_example.yaml \
"Research and compare weather in Beijing and Shanghai"
OpenAI (GPT-5 Series with MCP - v0.0.17+)
# GPT-5 with weather and external tools
uv run python -m massgen.cli \
--config massgen/configs/tools/mcp/gpt5_mini_mcp_example.yaml \
"What's the weather of Tokyo"
Gemini (Multi-Server MCP - v0.0.15+)
# Gemini with multiple MCP services
uv run python -m massgen.cli \
--config massgen/configs/tools/mcp/multimcp_gemini.yaml \
"Find accommodations in Paris with neighborhood analysis" # (requires BRAVE_API_KEY in .env)
Claude Code (Development Tools)
# Professional development environment
uv run python -m massgen.cli \
--backend claude_code \
--model sonnet \
"Create a Flask web app with authentication"
Local Models (LM Studio - v0.0.7+)
# Run open-source models locally
uv run python -m massgen.cli \
--config massgen/configs/providers/local/lmstudio.yaml \
"Explain machine learning concepts"
→ Browse by provider | Browse by tools | Browse teams
Question Answering & Research:
# Complex research with multiple perspectives
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/gemini_4o_claude.yaml \
"What's best to do in Stockholm in October 2025"
# Specific research requirements
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/gemini_4o_claude.yaml \
"Give me all the talks on agent frameworks in Berkeley Agentic AI Summit 2025"
Creative Writing:
# Story generation with multiple creative agents
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/gemini_4o_claude.yaml \
"Write a short story about a robot who discovers music"
Development & Coding:
# Full-stack development with file operations
uv run python -m massgen.cli \
--config massgen/configs/tools/filesystem/claude_code_single.yaml \
"Create a Flask web app with authentication"
Web Automation (still in testing):
# Browser automation with screenshots and reporting
uv run python -m massgen.cli \
--config massgen/configs/tools/code-execution/multi_agent_playwright_automation.yaml \
"Browse https://github.com/Leezekun/MassGen and suggest improvements. Include screenshots in a PDF"
# Data extraction and analysis
uv run python -m massgen.cli \
--config massgen/configs/tools/code-execution/multi_agent_playwright_automation.yaml \
"Navigate to https://news.ycombinator.com, extract the top 10 stories, and create a summary report"
→ See detailed case studies with real session logs and outcomes
Multi-Turn Conversations:
# Start interactive chat (no initial question)
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/three_agents_default.yaml
# Debug mode for troubleshooting
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/three_agents_default.yaml \
--debug "Your question"
MassGen configurations are organized by features and use cases. See the Configuration Guide for detailed organization and examples.
Quick navigation:
- Basic setups: Single agent | Multi-agent
- Tool integrations: MCP servers | Web search | Filesystem
- Provider examples: OpenAI | Claude | Gemini
- Specialized teams: Creative | Research | Development
See MCP server setup guides: Discord MCP | Twitter MCP
For detailed configuration of all supported backends (OpenAI, Claude, Gemini, Grok, etc.), see:
→ Backend Configuration Guide
MassGen supports an interactive mode where you can have ongoing conversations with the system:
# Start interactive mode with a single agent (no tools enabled by default)
uv run python -m massgen.cli --model gpt-5-mini
# Start interactive mode with configuration file
uv run python -m massgen.cli \
--config massgen/configs/basic/multi/three_agents_default.yaml
Interactive Mode Features:
- Multi-turn conversations: Multiple agents collaborate to chat with you in an ongoing conversation
- Real-time coordination tracking: Live visualization of agent interactions, votes, and decision-making processes
- Interactive coordination table: Press r to view the complete history of agent coordination events and state transitions
- Real-time feedback: Displays real-time agent and system status with enhanced coordination visualization
- Clear conversation history: Type /clear to reset the conversation and start fresh
- Easy exit: Type /quit, /exit, /q, or press Ctrl+C to stop
Watch the recorded demo:
The system provides multiple ways to view and analyze results:
- Live Collaboration View: See agents working in parallel through a multi-region terminal display
- Status Updates: Real-time phase transitions, voting progress, and consensus building
- Streaming Output: Watch agents' reasoning and responses as they develop
Watch an example here:
All sessions are automatically logged with detailed information; the log files can also be viewed through the interactive UI. Logs are organized in the following directory hierarchy:
massgen_logs/
└── log_{timestamp}/
    ├── agent_outputs/
    │   ├── agent_id.txt
    │   ├── final_presentation_agent_id.txt
    │   └── system_status.txt
    ├── agent_id/
    │   └── {answer_generation_timestamp}/
    │       └── files_included_in_generated_answer
    ├── final_workspace/
    │   └── agent_id/
    │       └── {answer_generation_timestamp}/
    │           └── files_included_in_generated_answer
    └── massgen.log / massgen_debug.log
- log_{timestamp}: Main log directory, identified by session timestamp
- agent_outputs/: Text outputs from each agent
  - agent_id.txt: Raw output from each agent
  - final_presentation_agent_id.txt: Final presentation from the selected agent
  - system_status.txt: System status information
- agent_id/: Directory for each agent, containing its answer versions
  - {answer_generation_timestamp}/: Timestamp directory for each answer version
    - files_included_in_generated_answer: All relevant files in that version
- final_workspace/: Final presentation for the selected agents
  - agent_id/: Selected agent id
    - {answer_generation_timestamp}/: Timestamp directory for the final presentation
      - files_included_in_generated_answer: All relevant files in the final presentation
- massgen.log / massgen_debug.log: Main log file; massgen.log holds general logging, massgen_debug.log holds verbose debugging information
The final presentation continues to be stored in each Claude Code agent's workspace as before. After the final presentation is generated, the relevant files are copied to the final_workspace/ directory.
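For example, to print the newest session's debug log from a shell (assuming the default massgen_logs/ layout above and a session run with --debug):
# Find the most recent log_{timestamp} directory and show its debug log
latest=$(ls -dt massgen_logs/log_*/ | head -1)
cat "${latest}massgen_debug.log"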
To see how MassGen works in practice, check out these detailed case studies based on real session logs:
MassGen is currently in its foundational stage, with a focus on parallel, asynchronous multi-agent collaboration and orchestration. Our roadmap is centered on transforming this foundation into a highly robust, intelligent, and user-friendly system, while enabling frontier research and exploration. An earlier version of MassGen can be found here.
π Released: September 26, 2025
Version 0.0.24 introduces vLLM Backend Support and Backend Utility Modules, enabling high-performance local model inference and improved code organization:
- Local Model Serving: Run powerful AI models locally with vLLM for cost-effective, private inference
- Easy Setup: Ready-to-use configurations (three_agents_vllm.yaml, two_qwen_vllm.yaml)
) - Advanced Controls: Fine-tune model behavior with top_k sampling, repetition penalties, and thinking modes
- High Performance: Optimized for serving large language models at scale
- Cleaner Backend Code: Reorganized backend utilities into focused modules for easier maintenance
- Better Error Handling: Improved API parameter validation and processing across all backends
- Unified Token Management: Consistent token counting and cost tracking across different model providers
- Streamlined File Operations: Moved filesystem tools to dedicated backend utilities
- POE Integration: Access multiple AI models through POE platform with single API key
- GPT-5-Codex Support: Enhanced code generation capabilities with latest OpenAI model
- Stability Improvements: Fixed streaming issues and memory leaks for better reliability
- vLLM Setup Guide: Step-by-step instructions for running local models with vLLM
- Advanced Filesystem Case Study: Showing how agents can work together with controlled file access and workspace sharing
✅ Backend Architecture Refactoring (v0.0.23): Major code consolidation with a new base_with_mcp.py class reducing ~1,932 lines across backends, an extracted formatter module for better code organization, and improved maintainability through unified MCP integration
✅ Workspace Copy Tools via MCP (v0.0.22): Seamless file copying capabilities between workspaces, configuration organization with hierarchical structure, and enhanced file operations for large-scale collaboration
✅ Grok MCP Integration (v0.0.21): Unified backend architecture with full MCP server support, filesystem capabilities through MCP servers, and enhanced configuration files
✅ Claude Backend MCP Support (v0.0.20): Extended MCP integration to the Claude backend, full MCP protocol and filesystem support, robust error handling, and comprehensive documentation
✅ Comprehensive Coordination Tracking (v0.0.19): Complete coordination tracking and visualization system with event-based tracking, an interactive coordination table display, and advanced debugging capabilities for multi-agent collaboration patterns
✅ Comprehensive MCP Integration (v0.0.18): Extended MCP to all Chat Completions backends (Cerebras AI, Together AI, Fireworks AI, Groq, Nebius AI Studio, OpenRouter), cross-provider function calling compatibility, and 9 new MCP configuration examples
✅ OpenAI MCP Integration (v0.0.17): Extended MCP (Model Context Protocol) support to the OpenAI backend with full tool discovery and execution capabilities for GPT models, a unified MCP architecture across multiple backends, and enhanced debugging
✅ Unified Filesystem Support with MCP Integration (v0.0.16): Complete FilesystemManager class providing unified filesystem access for Gemini and Claude Code backends, with MCP-based operations for file manipulation and cross-agent collaboration
✅ MCP Integration Framework (v0.0.15): Complete MCP implementation for the Gemini backend with multi-server support, circuit breaker patterns, and a comprehensive security framework
✅ Enhanced Logging (v0.0.14): Improved logging system for debugging agents' answers, a new final answer directory structure, and detailed architecture documentation
✅ Unified Logging System (v0.0.13): Centralized logging infrastructure with debug mode and enhanced terminal display formatting
✅ Windows Platform Support (v0.0.13): Windows compatibility with improved path handling and process management
✅ Enhanced Claude Code Agent Context Sharing (v0.0.12): Claude Code agents now share workspace context by maintaining snapshots and a temporary workspace on the orchestrator's side
✅ Documentation Improvement (v0.0.12): Updated README with current features and improved setup instructions
✅ Custom System Messages (v0.0.11): Enhanced system message configuration and preservation with backend-specific system prompt customization
✅ Claude Code Backend Enhancements (v0.0.11): Improved integration with better system message handling, JSON response parsing, and coordination action descriptions
✅ Azure OpenAI Support (v0.0.10): Integration with Azure OpenAI services, including GPT-4.1 and GPT-5-chat models with async streaming
✅ MCP (Model Context Protocol) Support (v0.0.9): Integration with MCP for advanced tool capabilities in the Claude Code agent, including Discord and Twitter integration
✅ Timeout Management System (v0.0.8): Orchestrator-level timeout with graceful fallback and enhanced error messages
✅ Local Model Support (v0.0.7): Complete LM Studio integration for running open-weight models locally with automatic server management
✅ GPT-5 Series Integration (v0.0.6): Support for OpenAI's GPT-5, GPT-5-mini, and GPT-5-nano with advanced reasoning parameters
✅ Claude Code Integration (v0.0.5): Native Claude Code backend with streaming capabilities and tool support
✅ GLM-4.5 Model Support (v0.0.4): Integration with ZhipuAI's GLM-4.5 model family
✅ Foundation Architecture (v0.0.3): Complete multi-agent orchestration system with async streaming, built-in tools, and multi-backend support
✅ Extended Provider Ecosystem: Support for 15+ providers, including Cerebras AI, Together AI, Fireworks AI, Groq, Nebius AI Studio, and OpenRouter
- Advanced Agent Collaboration: Exploring improved communication patterns and consensus-building protocols to strengthen agent synergy
- Expanded Model Integration: Adding support for more frontier models and local inference engines
- Improved Performance & Scalability: Optimizing the streaming and logging mechanisms for better performance and resource management
- Enhanced Developer Experience: Introducing a more modular agent design and a comprehensive benchmarking framework for easier extension and evaluation
- Web Interface: Developing a web-based UI for better visualization and interaction with the agent ecosystem
We welcome community contributions to achieve these goals.
Version 0.0.25 builds upon the vLLM backend support and utility modules refactoring of v0.0.24 by focusing on orchestrator improvements and agent communication fixes. Key priorities include:
- Agent System Prompt Fixes: Fix, via system prompt changes, the problem where the final agent expects human feedback
- Refactor Orchestrator: Streamline orchestrator code for better maintainability and performance
- MCP Marketplace Integration: Integrate MCP Marketplace support for expanded tool ecosystem
- Refactor Send/Receive Messaging: Extract the messaging system to use a stream-chunks class for multimodal support
Key technical approach:
- Autonomous Agent Behavior: Ensure final agents complete tasks without expecting human feedback through system prompt improvements
- Orchestrator Refactoring: Extract coordination logic into separate modules for better maintainability
- Marketplace Integration: Enable tool discovery and installation from MCP Marketplace
- Messaging Architecture: Foundation for future multimodal support through unified stream chunks
For detailed milestones and technical specifications, see the full v0.0.25 roadmap.
We welcome contributions! Please see our Contributing Guidelines for details.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
⭐ Star this repo if you find it useful! ⭐
Made with ❤️ by the MassGen team
Alternative AI tools for MassGen
Similar Open Source Tools

MassGen
MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents who work in parallel, observe each other's progress, and refine their approaches to converge on the best solution to deliver a comprehensive and high-quality result. The system operates through an architecture designed for seamless multi-agent collaboration, with key features including cross-model/agent synergy, parallel processing, intelligence sharing, consensus building, and live visualization. Users can install the system, configure API settings, and run MassGen for various tasks such as question answering, creative writing, research, development & coding tasks, and web automation & browser tasks. The roadmap includes plans for advanced agent collaboration, expanded model, tool & agent integration, improved performance & scalability, enhanced developer experience, and a web interface.

tunacode
TunaCode CLI is an AI-powered coding assistant that provides a command-line interface for developers to enhance their coding experience. It offers features like model selection, parallel execution for faster file operations, and various commands for code management. The tool aims to improve coding efficiency and provide a seamless coding environment for developers.

zotero-mcp
Zotero MCP is an open-source project that integrates AI capabilities with Zotero using the Model Context Protocol. It consists of a Zotero plugin and an MCP server, enabling AI assistants to search, retrieve, and cite references from Zotero library. The project features a unified architecture with an integrated MCP server, eliminating the need for a separate server process. It provides features like intelligent search, detailed reference information, filtering by tags and identifiers, aiding in academic tasks such as literature reviews and citation management.

llm4s
LLM4S provides a simple, robust, and scalable framework for building Large Language Models (LLM) applications in Scala. It aims to leverage Scala's type safety, functional programming, JVM ecosystem, concurrency, and performance advantages to create reliable and maintainable AI-powered applications. The framework supports multi-provider integration, execution environments, error handling, Model Context Protocol (MCP) support, agent frameworks, multimodal generation, and Retrieval-Augmented Generation (RAG) workflows. It also offers observability features like detailed trace logging, monitoring, and analytics for debugging and performance insights.

R2R
R2R (RAG to Riches) is a fast and efficient framework for serving high-quality Retrieval-Augmented Generation (RAG) to end users. The framework is designed with customizable pipelines and a feature-rich FastAPI implementation, enabling developers to quickly deploy and scale RAG-based applications. R2R was conceived to bridge the gap between local LLM experimentation and scalable production solutions. **R2R is to LangChain/LlamaIndex what NextJS is to React**. A JavaScript client for R2R deployments can be found here. ### Key Features * **Deploy**: Instantly launch production-ready RAG pipelines with streaming capabilities. * **Customize**: Tailor your pipeline with intuitive configuration files. * **Extend**: Enhance your pipeline with custom code integrations. * **Autoscale**: Scale your pipeline effortlessly in the cloud using SciPhi. * **OSS**: Benefit from a framework developed by the open-source community, designed to simplify RAG deployment.

aigne-doc-smith
AIGNE DocSmith is a powerful AI-driven documentation generation tool that automates the creation of detailed, structured, and multi-language documentation directly from source code. It intelligently analyzes codebase to generate a comprehensive document structure, populates content with high-quality AI-powered generation, supports seamless translation into 12+ languages, integrates with AIGNE Hub for large language models, offers Discuss Kit publishing, automatically updates documentation with source code changes, and allows for individual document optimization.

trpc-agent-go
A powerful Go framework for building intelligent agent systems with large language models (LLMs), hierarchical planners, memory, telemetry, and a rich tool ecosystem. tRPC-Agent-Go enables the creation of autonomous or semi-autonomous agents that reason, call tools, collaborate with sub-agents, and maintain long-term state. The framework provides detailed documentation, examples, and tools for accelerating the development of AI applications.

quantalogic
QuantaLogic is a ReAct framework for building advanced AI agents that seamlessly integrates large language models with a robust tool system. It aims to bridge the gap between advanced AI models and practical implementation in business processes by enabling agents to understand, reason about, and execute complex tasks through natural language interaction. The framework includes features such as ReAct Framework, Universal LLM Support, Secure Tool System, Real-time Monitoring, Memory Management, and Enterprise Ready components.

paelladoc
PAELLADOC is an intelligent documentation system that uses AI to analyze code repositories and generate comprehensive technical documentation. It offers a modular architecture with MECE principles, interactive documentation process, key features like Orchestrator and Commands, and a focus on context for successful AI programming. The tool aims to streamline documentation creation, code generation, and product management tasks for software development teams, providing a definitive standard for AI-assisted development documentation.

conduit
Conduit is an open-source, cross-platform mobile application for Open-WebUI, providing a native mobile experience for interacting with your self-hosted AI infrastructure. It supports real-time chat, model selection, conversation management, markdown rendering, theme support, voice input, file uploads, multi-modal support, secure storage, folder management, and tools invocation. Conduit offers multiple authentication flows and follows a clean architecture pattern with Riverpod for state management, Dio for HTTP networking, WebSocket for real-time streaming, and Flutter Secure Storage for credential management.

klavis
Klavis AI is a production-ready solution for managing Multiple Communication Protocol (MCP) servers. It offers self-hosted solutions and a hosted service with enterprise OAuth support. With Klavis AI, users can easily deploy and manage over 50 MCP servers for various services like GitHub, Gmail, Google Sheets, YouTube, Slack, and more. The tool provides instant access to MCP servers, seamless authentication, and integration with AI frameworks, making it ideal for individuals and businesses looking to streamline their communication and data management workflows.

paperless-gpt
paperless-gpt is a tool designed to generate accurate and meaningful document titles and tags for paperless-ngx using Large Language Models (LLMs). It supports multiple LLM providers, including OpenAI and Ollama. With paperless-gpt, you can streamline your document management by automatically suggesting appropriate titles and tags based on the content of your scanned documents. The tool offers features like multiple LLM support, customizable prompts, easy integration with paperless-ngx, user-friendly interface for reviewing and applying suggestions, dockerized deployment, automatic document processing, and an experimental OCR feature.

simba
Simba is an open source, portable Knowledge Management System (KMS) designed to seamlessly integrate with any Retrieval-Augmented Generation (RAG) system. It features a modern UI and modular architecture, allowing developers to focus on building advanced AI solutions without the complexities of knowledge management. Simba offers a user-friendly interface to visualize and modify document chunks, supports various vector stores and embedding models, and simplifies knowledge management for developers. It is community-driven, extensible, and aims to enhance AI functionality by providing a seamless integration with RAG-based systems.

pilottai
PilottAI is a Python framework for building autonomous multi-agent systems with advanced orchestration capabilities. It provides enterprise-ready features for building scalable AI applications. The framework includes hierarchical agent systems, production-ready features like asynchronous processing and fault tolerance, advanced memory management with semantic storage, and integrations with multiple LLM providers and custom tools. PilottAI offers specialized agents for various tasks such as customer service, document processing, email handling, knowledge acquisition, marketing, research analysis, sales, social media, and web search. The framework also provides documentation, example use cases, and advanced features like memory management, load balancing, and fault tolerance.

Dive
Dive is an open-source MCP Host Desktop Application that seamlessly integrates with any LLMs supporting function calling capabilities. It offers universal LLM support, cross-platform compatibility, model context protocol for AI agent integration, OAP cloud integration, dual architecture for optimal performance, multi-language support, advanced API management, granular tool control, custom instructions, auto-update mechanism, and more. Dive provides a user-friendly interface for managing multiple AI models and tools, with recent updates introducing major architecture changes, new features, improvements, and platform availability. Users can easily download and install Dive on Windows, MacOS, and Linux, and set up MCP tools through local servers or OAP cloud services.
For similar tasks

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

onnxruntime-genai
ONNX Runtime Generative AI is a library that provides the generative AI loop for ONNX models, including inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. Users can call a high level `generate()` method, or run each iteration of the model in a loop. It supports greedy/beam search and TopP, TopK sampling to generate token sequences, has built in logits processing like repetition penalties, and allows for easy custom scoring.

jupyter-ai
Jupyter AI connects generative AI with Jupyter notebooks. It provides a user-friendly and powerful way to explore generative AI models in notebooks and improve your productivity in JupyterLab and the Jupyter Notebook. Specifically, Jupyter AI offers: * An `%%ai` magic that turns the Jupyter notebook into a reproducible generative AI playground. This works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, Kaggle, VSCode, etc.). * A native chat UI in JupyterLab that enables you to work with generative AI as a conversational assistant. * Support for a wide range of generative model providers, including AI21, Anthropic, AWS, Cohere, Gemini, Hugging Face, NVIDIA, and OpenAI. * Local model support through GPT4All, enabling use of generative AI models on consumer grade machines with ease and privacy.

khoj
Khoj is an open-source, personal AI assistant that extends your capabilities by creating always-available AI agents. You can share your notes and documents to extend your digital brain, and your AI agents have access to the internet, allowing you to incorporate real-time information. Khoj is accessible on Desktop, Emacs, Obsidian, Web, and Whatsapp, and you can share PDF, markdown, org-mode, notion files, and GitHub repositories. You'll get fast, accurate semantic search on top of your docs, and your agents can create deeply personal images and understand your speech. Khoj is self-hostable and always will be.

langchain_dart
LangChain.dart is a Dart port of the popular LangChain Python framework created by Harrison Chase. LangChain provides a set of ready-to-use components for working with language models and a standard interface for chaining them together to formulate more advanced use cases (e.g. chatbots, Q&A with RAG, agents, summarization, extraction, etc.). The components can be grouped into a few core modules: * **Model I/O:** LangChain offers a unified API for interacting with various LLM providers (e.g. OpenAI, Google, Mistral, Ollama, etc.), allowing developers to switch between them with ease. Additionally, it provides tools for managing model inputs (prompt templates and example selectors) and parsing the resulting model outputs (output parsers). * **Retrieval:** assists in loading user data (via document loaders), transforming it (with text splitters), extracting its meaning (using embedding models), storing (in vector stores) and retrieving it (through retrievers) so that it can be used to ground the model's responses (i.e. Retrieval-Augmented Generation or RAG). * **Agents:** "bots" that leverage LLMs to make informed decisions about which available tools (such as web search, calculators, database lookup, etc.) to use to accomplish the designated task. The different components can be composed together using the LangChain Expression Language (LCEL).

danswer
Danswer is an open-source Gen-AI Chat and Unified Search tool that connects to your company's docs, apps, and people. It provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud. Since you own the deployment, your user data and chats are fully in your own control. Danswer is MIT licensed and designed to be modular and easily extensible. The system also comes fully ready for production usage with user authentication, role management (admin/basic users), chat persistence, and a UI for configuring Personas (AI Assistants) and their Prompts. Danswer also serves as a Unified Search across all common workplace tools such as Slack, Google Drive, Confluence, etc. By combining LLMs and team specific knowledge, Danswer becomes a subject matter expert for the team. Imagine ChatGPT if it had access to your team's unique knowledge! It enables questions such as "A customer wants feature X, is this already supported?" or "Where's the pull request for feature Y?"

infinity
Infinity is an AI-native database designed for LLM applications, providing incredibly fast full-text and vector search capabilities. It supports a wide range of data types, including vectors, full-text, and structured data, and offers a fused search feature that combines multiple embeddings and full text. Infinity is easy to use, with an intuitive Python API and a single-binary architecture that simplifies deployment. It achieves high performance, with 0.1 milliseconds query latency on million-scale vector datasets and up to 15K QPS.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.