Shannon
The Kubernetes/Linux of AI agents: an enterprise-ready AI agent platform built with Rust for performance, Go for orchestration, Python for LLMs, and Solana for Web3 trust.
Shannon is battle-tested infrastructure for AI agents that addresses the problems teams hit at scale: runaway costs, non-deterministic failures, and security risks. It provides intelligent caching, deterministic replay of workflows with time-travel debugging, WASI sandboxing, and hot-swapping between LLM providers. Zero-configuration multi-agent setup, multiple AI patterns, and hot configuration changes help teams ship faster, while token budget control, an OPA policy engine, and multi-tenancy make it production-ready. Provider-agnostic LLM support, default-on observability, and Temporal workflow orchestration let it scale horizontally without breaking.
Stop burning money on AI tokens. Ship reliable agents that won't break in production.
Shannon is battle-tested infrastructure for AI agents that solves the problems you'll hit at scale: runaway costs, non-deterministic failures, and security nightmares. Built on Temporal workflows and WASI sandboxing, it's the platform we wished existed when our LLM bills hit $50k/month.
(Screenshot: real-time observability dashboard showing agent traffic control, metrics, and event streams)
┌──────────────────────────────────────────────────────────────────────────────┐
│                                                                              │
│       Please ⭐ star this repo to show your support and stay updated! ⭐       │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
- Zero Configuration Multi-Agent - Just describe what you want: "Analyze data, then create report" → Shannon handles dependencies automatically
- Multiple AI Patterns - ReAct, Tree-of-Thoughts, Chain-of-Thought, Debate, and Reflection (configurable via cognitive_strategy; see the sketch after this list)
- Time-Travel Debugging - Export and replay any workflow to reproduce exact agent behavior
- Hot Configuration - Change models, prompts, and policies without restarts
- WASI Sandbox - Full Python 3.11 support with bulletproof security (see the Python Code Execution guide)
- Token Budget Control - Hard limits per user/session with real-time tracking
- Policy Engine (OPA) - Define who can use which tools, models, and data
- Multi-Tenancy - Complete isolation between users, sessions, and organizations
- Human-in-the-Loop - Approval workflow for high-risk operations (complexity >0.7 or dangerous tools)
- 70% Cost Reduction - Smart caching, session management, and token optimization
- Provider Agnostic - OpenAI, Anthropic, Google, Azure, Bedrock, DeepSeek, Groq, and more
- Observable by Default - Real-time dashboard, Prometheus metrics, OpenTelemetry tracing
- Distributed by Design - Horizontal scaling with Temporal workflow orchestration
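The cognitive strategy is selected per task. As a hedged sketch only: the README does not pin down exactly where cognitive_strategy is set, so the payload placement below is an assumption modeled on the budget config shown later, and the strategy value name is illustrative.
# Hypothetical: pin one task to Tree-of-Thoughts via cognitive_strategy
curl -X POST http://localhost:8080/api/v1/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Design a caching strategy for our API",
    "config": {"cognitive_strategy": "tree_of_thoughts"}
  }'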
Model pricing is centralized in config/models.yaml - all services load from this single source for consistent cost tracking.
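All cost math keys off that file. The sketch below is only a guess at its shape (the provider and per-token price fields are assumptions), included to illustrate why a single source of truth keeps cost tracking consistent across the Go, Rust, and Python services:
# Illustrative shape only - consult the shipped config/models.yaml for the real schema
cat << 'EOF'
models:
  gpt-4o-mini:
    provider: openai
    input_per_1k: 0.00015   # USD per 1K input tokens (example value)
    output_per_1k: 0.0006   # USD per 1K output tokens (example value)
EOF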
| Challenge | Shannon | LangGraph | AutoGen | CrewAI |
|---|---|---|---|---|
| Multi-Agent Orchestration | ✅ DAG/Graph workflows | ✅ Stateful graphs | ✅ Group chat | ✅ Crew/roles |
| Agent Communication | ✅ Message passing | ✅ Tool calling | ✅ Conversations | ✅ Delegation |
| Memory & Context | ✅ Chunked storage (character-based), MMR diversity | ✅ Multiple types | ✅ Conversation history | ✅ Shared memory |
| Debugging Production Issues | ✅ Replay any workflow | ❌ Limited debugging | ❌ Basic logging | ❌ |
| Token Cost Control | ✅ Hard budget limits | ❌ | ❌ | ❌ |
| Security Sandbox | ✅ WASI isolation | ❌ | ❌ | ❌ |
| Policy Control (OPA) | ✅ Fine-grained rules | ❌ | ❌ | ❌ |
| Deterministic Replay | ✅ Time-travel debugging | ❌ | ❌ | ❌ |
| Session Persistence | ✅ Redis-backed, durable | ❌ | | |
| Multi-Language | ✅ Go/Rust/Python | | | |
| Production Metrics | ✅ Dashboard/Prometheus | ❌ | ❌ | |
- Docker and Docker Compose
- Make, curl, grpcurl
- An API key for at least one supported LLM provider
Docker Setup Instructions:
macOS:
# Install Docker Desktop from https://www.docker.com/products/docker-desktop/
# Or using Homebrew:
brew install --cask docker
Linux (Ubuntu/Debian):
# Install Docker Engine
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect
# Install Docker Compose
sudo apt-get update
sudo apt-get install docker-compose-plugin
Verify the installation:
docker --version
docker compose version
The make dev command starts all services:
- PostgreSQL: Database on port 5432
- Redis: Cache on port 6379
- Qdrant: Vector store on port 6333
- Temporal: Workflow engine on port 7233 (UI on 8088)
- Orchestrator: Go service on port 50052
- Agent Core: Rust service on port 50051
- LLM Service: Python service on port 8000
- Gateway: REST API gateway on port 8080
- Dashboard: Real-time observability UI on port 2111
git clone https://github.com/Kocoro-lab/Shannon.git
cd Shannon
# One-stop setup: creates .env, generates protobuf files
make setup
# Add your LLM API key to .env
echo "OPENAI_API_KEY=your-key-here" >> .env
# Download Python WASI interpreter for secure code execution (20MB)
./scripts/setup_python_wasi.sh
# Start all services and verify
make dev
make smoke
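Before moving on, you can sanity-check the running stack (a sketch; grpcurl requires gRPC server reflection to be enabled on the orchestrator, which the gRPC examples below assume anyway):
# All containers should be up and healthy
docker compose ps
# List the gRPC services exposed by the orchestrator
grpcurl -plaintext localhost:50052 list
Shannon provides multiple ways to interact with your AI agents: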
# Open the Shannon Dashboard in your browser
open http://localhost:2111
# The dashboard provides:
# - Visual task submission interface
# - Real-time event streaming
# - System metrics and monitoring
# - Task history and results
# For development (no auth required)
export GATEWAY_SKIP_AUTH=1
# Submit a task via API
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
  -d '{
    "query": "Analyze the sentiment of: Shannon makes AI agents simple!",
    "session_id": "demo-session-123"
  }'
# Response includes workflow_id for tracking
# {"workflow_id":"task-dev-1234567890","status":"running"}# Stream live events as your agent works (replace with your workflow_id)
curl -N http://localhost:8081/stream/sse?workflow_id=task-dev-1234567890
# You'll see human-readable events like:
# event: AGENT_THINKING
# data: {"message":"Analyzing sentiment: Shannon makes AI agents simple!"}
#
# event: TOOL_INVOKED
# data: {"message":"Processing natural language sentiment analysis"}
#
# event: AGENT_COMPLETED
# data: {"message":"Task completed successfully"}# Check final status and result
curl http://localhost:8080/api/v1/tasks/task-dev-1234567890
# Response includes status, result, tokens used, and metadata
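If you prefer to block until the task finishes, a small poll loop works (a sketch: the status field name follows the response described above, and jq is assumed to be installed):
# Poll every 2 seconds until the task leaves the "running" state
WORKFLOW_ID=task-dev-1234567890
while [ "$(curl -s http://localhost:8080/api/v1/tasks/$WORKFLOW_ID | jq -r .status)" = "running" ]; do
  sleep 2
done
curl -s http://localhost:8080/api/v1/tasks/$WORKFLOW_ID | jq .
For production, use API keys instead of GATEWAY_SKIP_AUTH: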
# Create an API key (one-time setup)
make seed-api-key # Creates test key: sk_test_123456
# Use in requests
curl -X POST http://localhost:8080/api/v1/tasks \
-H "X-API-Key: sk_test_123456" \
-H "Content-Type: application/json" \
-d '{"query":"Your task here"}'Advanced Methods: Scripts, gRPC, and Command Line (click to expand)
# Submit a simple task
./scripts/submit_task.sh "Analyze the sentiment of: 'Shannon makes AI agents simple!'"
# Check session usage and token tracking (session ID is in SubmitTask response message)
grpcurl -plaintext \
-d '{"sessionId":"YOUR_SESSION_ID"}' \
localhost:50052 shannon.orchestrator.OrchestratorService/GetSessionContext
# Export and replay a workflow history (use the workflow ID from submit_task output)
./scripts/replay_workflow.sh <WORKFLOW_ID>
# Submit via gRPC
grpcurl -plaintext \
-d '{"query":"Analyze sentiment","sessionId":"test-session"}' \
localhost:50052 shannon.orchestrator.OrchestratorService/SubmitTask
# Stream events via gRPC
grpcurl -plaintext \
-d '{"workflowId":"task-dev-1234567890"}' \
localhost:50052 shannon.orchestrator.StreamingService/StreamTaskExecution
# Connect to WebSocket for bidirectional streaming
# Via admin port (no auth):
wscat -c ws://localhost:8081/stream/ws?workflow_id=task-dev-1234567890
# Or via gateway (with auth):
# wscat -c ws://localhost:8080/api/v1/stream/ws?workflow_id=task-dev-1234567890 \
# -H "Authorization: Bearer YOUR_API_KEY"# Access Shannon Dashboard for real-time monitoring
open http://localhost:2111
# Dashboard features:
# - Real-time task execution and event streams
# - System metrics and performance graphs
# - Token usage tracking and budget monitoring
# - Agent traffic control visualization
# - Interactive command execution
# Access Temporal Web UI for workflow debugging
open http://localhost:8088
# Temporal UI provides:
# - Workflow execution history and timeline
# - Task status, retries, and failures
# - Input/output data for each step
# - Real-time workflow progress
# - Search workflows by ID, type, or status
The visual tools provide comprehensive monitoring:
- Shannon Dashboard (http://localhost:2111) - Real-time agent traffic control, metrics, and events
- Temporal UI (http://localhost:8088) - Workflow debugging and state inspection
- Combined view - Full visibility into your AI agents' behavior and system performance
The examples below showcase Shannon's unique features that set it apart from other frameworks.
Example 1: Cost-Controlled Customer Support
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
  -d '{
    "query": "Help me troubleshoot my deployment issue",
    "session_id": "user-123-session"
  }'
Key features:
- Session persistence - Maintains conversation context across requests
- Token tracking - Every request returns token usage and costs
- Policy control - Apply OPA policies for allowed actions (see Example 3)
- Result: 70% cost reduction through smart caching and session management
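Because the session persists, a follow-up request with the same session_id picks up the earlier conversation automatically (a sketch using the same endpoint as above):
# Second turn in the same session - prior context is loaded for the agent
curl -X POST http://localhost:8080/api/v1/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Apply the fix you suggested to staging first",
    "session_id": "user-123-session"
  }'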
Example 2: Debugging Production Failures
# Production agent failed at 3am? No problem.
# Export and replay the workflow in one command
./scripts/replay_workflow.sh task-prod-failure-123
# Or specify a particular run ID
./scripts/replay_workflow.sh task-prod-failure-123 abc-def-ghi
# Output shows step-by-step execution with token counts, decisions, and state changes
# Fix the issue, add a test case, never see it again
Example 3: Multi-Team Model Governance
# config/opa/policies/data-science.rego
package shannon.teams.datascience

default allow = false

allow {
    input.team == "data-science"
    input.model in ["gpt-4o", "claude-3-sonnet"]
}

max_tokens = 50000 {
    input.team == "data-science"
}

# config/opa/policies/customer-support.rego
package shannon.teams.support

default allow = false

allow {
    input.team == "support"
    input.model == "gpt-4o-mini"
}

max_tokens = 5000 {
    input.team == "support"
}

deny_tool["database_write"] {
    input.team == "support"
}
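You can sanity-check these rules locally with the OPA CLI before deploying them (a sketch; the input document shape is inferred from the rules above):
# Expect value: true - support team using its allowed model
echo '{"team":"support","model":"gpt-4o-mini"}' > /tmp/input.json
opa eval -d config/opa/policies/customer-support.rego \
  -i /tmp/input.json "data.shannon.teams.support.allow"
# Expect value: false - gpt-4o is not in the support allowlist
echo '{"team":"support","model":"gpt-4o"}' > /tmp/input.json
opa eval -d config/opa/policies/customer-support.rego \
  -i /tmp/input.json "data.shannon.teams.support.allow"
Example 4: Security-First Code Execution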
# Python code runs in isolated WASI sandbox with full standard library
./scripts/submit_task.sh "Execute Python: print('Hello from secure WASI!')"
# Even malicious code is safe
./scripts/submit_task.sh "Execute Python: import os; os.system('rm -rf /')"
# Result: OSError - system calls blocked by WASI sandbox
# Advanced: Session persistence for data analysis
./scripts/submit_task.sh "Execute Python with session 'analysis': data = [1,2,3,4,5]"
./scripts/submit_task.sh "Execute Python with session 'analysis': print(sum(data))"
# Output: 15
Example 5: Human-in-the-Loop Approval
# Configure approval for high-complexity or dangerous operations
cat > config/features.yaml << 'EOF'
workflows:
  approval:
    enabled: true
    complexity_threshold: 0.7  # Require approval for complex tasks
    dangerous_tools: ["file_delete", "database_write", "api_call"]
EOF
# Submit a complex task that triggers approval
./scripts/submit_task.sh "Delete all temporary files older than 30 days from /tmp"
# Workflow pauses and waits for human approval
# Check Temporal UI: http://localhost:8088
# Approve via signal: temporal workflow signal --workflow-id <ID> --name approval --input '{"approved":true}'
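Rejection uses the same signal channel (a sketch: the README only shows the approved field, so the reason field here is illustrative):
# Deny instead, with an optional note for the audit trail
temporal workflow signal --workflow-id <ID> --name approval \
  --input '{"approved":false,"reason":"task touches production paths"}'
Unique to Shannon: Configurable approval workflows based on complexity scoring and tool usage.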
Example 6: Multi-Agent Memory & Learning
# Agent learns from conversation and applies knowledge
SESSION="learning-session-$(date +%s)"
# Agent learns your preferences
./scripts/submit_task.sh "I prefer Python over Java for data science" "$SESSION"
./scripts/submit_task.sh "I like using pandas and numpy for analysis" "$SESSION"
./scripts/submit_task.sh "My projects usually involve machine learning" "$SESSION"
# Later, agent recalls and applies this knowledge
./scripts/submit_task.sh "What language and tools should I use for my new data project?" "$SESSION"
# Response includes personalized recommendations based on learned preferences
# Check memory storage (character-based chunking with MMR diversity)
grpcurl -plaintext -d "{\"sessionId\":\"$SESSION\"}" \
localhost:50052 shannon.orchestrator.OrchestratorService/GetSessionContext
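Unique to Shannon: Persistent memory with intelligent chunking (4 chars ≈ 1 token) and MMR diversity ranking.
The 4-characters-per-token heuristic also makes it easy to estimate what a session will store (a back-of-the-envelope sketch; notes.txt is a placeholder file):
# Rough token estimate: file size in bytes divided by 4
echo $(( $(wc -c < notes.txt) / 4 ))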
Example 7: Supervisor Workflow with Dynamic Strategy
# Complex task automatically delegates to multiple specialized agents
./scripts/submit_task.sh "Analyze our website performance, identify bottlenecks, and create an optimization plan with specific recommendations"
# Watch the orchestration in real-time
curl -N "http://localhost:8081/stream/sse?workflow_id=<WORKFLOW_ID>"
# Events show:
# - Complexity analysis (score: 0.85)
# - Strategy selection (supervisor pattern chosen)
# - Dynamic agent spawning (analyzer, investigator, planner)
# - Parallel execution with coordination
# - Synthesis and quality reflection
Unique to Shannon: Automatic workflow pattern selection based on task complexity.
Example 8: Time-Travel Debugging with State Inspection
# Production issue at 3am? Debug it step-by-step
FAILED_WORKFLOW="task-prod-failure-20250928-0300"
# Export with full state history
./scripts/replay_workflow.sh export $FAILED_WORKFLOW debug.json
# Inspect specific decision points
go run ./tools/replay -history debug.json -inspect-step 5
# Modify and test fix locally
go run ./tools/replay -history debug.json -override-activity GetLLMResponse
# Validate fix passes all historical workflows
make ci-replay
Unique to Shannon: Complete workflow state inspection and modification for debugging.
Example 9: Token Budget with Circuit Breakers
# Set strict budget with automatic fallbacks
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
-H "X-API-Key: sk_test_123456" \
  -d '{
    "query": "Generate a comprehensive market analysis report",
    "session_id": "budget-test",
    "config": {
      "budget": {
        "max_tokens": 5000,
        "fallback_model": "gpt-4o-mini",
        "circuit_breaker": {
          "threshold": 0.8,
          "cooldown_seconds": 60
        }
      }
    }
  }'
# System automatically:
# - Switches to cheaper model when 80% budget consumed
# - Implements cooldown period to prevent runaway costs
# - Returns partial results if budget exhausted
Unique to Shannon: Real-time budget enforcement with automatic degradation.
Example 10: Multi-Tenant Agent Isolation
# Each tenant gets isolated agents with separate policies
# Tenant A: Data Science team
curl -X POST http://localhost:8080/api/v1/tasks \
-H "X-API-Key: sk_tenant_a_key" \
-H "X-Tenant-ID: data-science" \
-d '{"query": "Train a model on our dataset"}'
# Tenant B: Customer Support
curl -X POST http://localhost:8080/api/v1/tasks \
-H "X-API-Key: sk_tenant_b_key" \
-H "X-Tenant-ID: support" \
-d '{"query": "Access customer database"}' # Denied by OPA policy
# Complete isolation:
# - Separate memory/vector stores per tenant
# - Independent token budgets
# - Custom model access
# - Isolated session management
Unique to Shannon: Enterprise-grade multi-tenancy with OPA policy enforcement.
More Production Examples:
- Incident Response Bot: Auto-triages alerts with budget limits
- Code Review Agent: Enforces security policies via OPA rules
- Data Pipeline Monitor: Replays failed workflows for debugging
- Compliance Auditor: Full trace of every decision and data access
- Multi-Tenant SaaS: Complete isolation between customer agents
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Client    │────▶│ Orchestrator │────▶│ Agent Core  │
└─────────────┘     │     (Go)     │     │   (Rust)    │
                    └──────────────┘     └─────────────┘
                           │                    │
                           ▼                    ▼
                    ┌──────────────┐     ┌─────────────┐
                    │   Temporal   │     │ WASI Tools  │
                    │  Workflows   │     │   Sandbox   │
                    └──────────────┘     └─────────────┘
                           │
                           ▼
                    ┌──────────────┐
                    │ LLM Service  │
                    │   (Python)   │
                    └──────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                           CLIENT LAYER                          │
├─────────────┬─────────────┬─────────────┬───────────────────────┤
│    HTTP     │    gRPC     │     SSE     │       WebSocket       │
└─────────────┴─────────────┴─────────────┴───────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                        ORCHESTRATOR (Go)                        │
│   ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌──────────┐  │
│   │   Router   │──│   Budget   │──│  Session   │──│   OPA    │  │
│   │            │  │  Manager   │  │   Store    │  │ Policies │  │
│   └────────────┘  └────────────┘  └────────────┘  └──────────┘  │
└─────────────────────────────────────────────────────────────────┘
       │                │                │                │
       ▼                ▼                ▼                ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│   Temporal   │ │    Redis     │ │  PostgreSQL  │ │    Qdrant    │
│  Workflows   │ │    Cache     │ │    State     │ │   Vectors    │
│              │ │   Sessions   │ │   History    │ │    Memory    │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                        AGENT CORE (Rust)                        │
│   ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌──────────┐  │
│   │    WASI    │──│   Policy   │──│    Tool    │──│  Agent   │  │
│   │  Sandbox   │  │  Enforcer  │  │  Registry  │  │  Comms   │  │
│   └────────────┘  └────────────┘  └────────────┘  └──────────┘  │
└─────────────────────────────────────────────────────────────────┘
                │                                │
                ▼                                ▼
┌────────────────────────────────┐ ┌─────────────────────────────────┐
│     LLM SERVICE (Python)       │ │      OBSERVABILITY LAYER        │
│ ┌────────────┐  ┌────────────┐ │ │ ┌────────────┐  ┌────────────┐  │
│ │  Provider  │  │    MCP     │ │ │ │ Prometheus │  │  OpenTel   │  │
│ │  Adapter   │  │   Tools    │ │ │ │  Metrics   │  │   Traces   │  │
│ └────────────┘  └────────────┘ │ │ └────────────┘  └────────────┘  │
└────────────────────────────────┘ └─────────────────────────────────┘
- Orchestrator (Go): Task routing, budget enforcement, session management, OPA policy evaluation
- Agent Core (Rust): WASI sandbox execution, policy enforcement, agent-to-agent communication
- LLM Service (Python): Provider abstraction (15+ LLMs), MCP tools, prompt optimization
- Gateway (Go): REST API, authentication, rate limiting, request validation
- Dashboard (React/Next.js): Real-time monitoring, metrics visualization, event streaming
- Data Layer: PostgreSQL (workflow state), Redis (session cache), Qdrant (vector memory)
- Observability: Built-in dashboard, Prometheus metrics, OpenTelemetry tracing
# Clone and configure
git clone https://github.com/Kocoro-lab/Shannon.git
cd Shannon
make setup-env
echo "OPENAI_API_KEY=sk-..." >> .env
# Launch
make dev
# Set budgets per request (see "Examples That Actually Matter" section)
# Configure in SubmitTask payload: {"budget": {"max_tokens": 5000}}
# Create your first OPA policy
cat > config/opa/policies/default.rego << EOF
package shannon

default allow = false

# Allow all for dev, restrict in prod
allow {
    input.environment == "development"
}

# Production rules
allow {
    input.environment == "production"
    input.tokens_requested < 10000
    input.model in ["gpt-4o-mini", "claude-4-haiku"]
}
EOF
# Hot reload - no restart needed!
# Something went wrong in production?
# 1. Find the workflow ID from logs
grep ERROR logs/orchestrator.log | tail -1
# 2. Export the workflow
./scripts/replay_workflow.sh export task-xxx-failed debug.json
# 3. Replay locally to see exactly what happened
./scripts/replay_workflow.sh replay debug.json
# 4. Fix, test, deploy with confidence
# config/teams.yaml
teams:
  data-science:
    models: ["gpt-4o", "claude-4-sonnet"]
    max_tokens_per_day: 1000000
    tools: ["*"]
  customer-support:
    models: ["gpt-4o-mini"]
    max_tokens_per_day: 50000
    tools: ["search", "respond", "escalate"]
  engineering:
    models: ["claude-4-sonnet", "gpt-4o"]
    max_tokens_per_day: 500000
    tools: ["code_*", "test_*", "deploy_*"]
- Platform Architecture Overview
- Multi-Agent Workflow Architecture
- Agent Core Architecture
- Pattern Selection Guide
# Run linters and formatters
make lint
make fmt
# Run smoke tests
make smoke
# View logs
make logs
# Check service status
make ps
We love contributions! Please see our Contributing Guide for details.
- Discord: Join our Discord
- Twitter/X: @shannon_agents
Now → v0.1 (Production Ready)
- ✅ Core platform stable - Go orchestrator, Rust agent-core, Python LLM service
- ✅ Deterministic replay debugging - Export and replay any workflow execution
- ✅ OPA policy enforcement - Fine-grained security and governance rules
- ✅ WebSocket streaming - Real-time agent communication with event filtering and replay
- ✅ SSE streaming - Server-sent events for browser-native streaming
- ✅ MCP integration - Model Context Protocol for standardized tool interfaces
- ✅ WASI sandbox - Secure code execution environment with resource limits
- ✅ Multi-agent orchestration - DAG workflows with parallel execution
- ✅ Vector memory - Qdrant-based semantic search and context retrieval
- ✅ Circuit breaker patterns - Automatic failure recovery and degradation
- ✅ Multi-provider LLM support - OpenAI, Anthropic, Google, DeepSeek, and more
- ✅ Token budget management - Hard limits with real-time tracking
- ✅ Session management - Durable state with Redis/PostgreSQL persistence
- 🚧 LangGraph adapter - Bridge to LangChain ecosystem (integration framework complete)
- 🚧 AutoGen adapter - Bridge to Microsoft AutoGen multi-agent conversations
v0.2
- [ ] Enterprise SSO - SAML/OAuth integration with existing identity providers
- [ ] Natural language policies - Human-readable policy definitions with AI assistance
- [ ] Enhanced monitoring - Custom dashboards and alerting rules
- [ ] Advanced caching - Multi-level caching with semantic deduplication
- [ ] Real-time collaboration - Multi-user agent sessions with shared context
- [ ] Plugin ecosystem - Third-party tool and integration marketplace
- [ ] Workflow marketplace - Community-contributed agent templates and patterns
- [ ] Edge deployment - WASM execution in browser environments
v0.3
- [ ] Autonomous agent swarms - Self-organizing multi-agent systems
- [ ] Cross-organization federation - Secure agent communication across tenants
- [ ] Predictive scaling - ML-based resource allocation and optimization
- [ ] Blockchain integration - Proof-of-execution and decentralized governance
- [ ] Advanced personalization - User-specific LoRA adapters and preferences
v0.4
- [ ] Continuous learning - Automated prompt and strategy optimization
- [ ] Multi-agent marketplaces - Economic incentives and reputation systems
- [ ] Advanced reasoning - Hybrid symbolic + neural approaches
- [ ] Global deployment - Multi-region, multi-cloud architecture
- [ ] Regulatory compliance - SOC 2, GDPR, HIPAA automation
- [ ] AI safety frameworks - Constitutional AI and alignment mechanisms
- Python Code Execution - Secure Python execution via WASI sandbox
- Multi-Agent Workflows - Orchestration patterns and best practices
- Pattern Usage Guide - ReAct, Tree-of-Thoughts, Debate patterns
- Streaming APIs - Real-time agent output streaming
- Authentication & Access Control - Multi-tenancy and OPA policies
- Agent Core API - Rust service endpoints
- Orchestrator Service - Workflow management and patterns
- LLM Service API - Provider abstraction
- 🐛 Found a bug? Open an issue
- 💡 Have an idea? Start a discussion
- 💬 Need help? Join our Discord
- ⭐ Like the project? Give us a star!
We're building decentralized trust infrastructure with Solana blockchain:
- Cryptographic Verification: On-chain attestation of AI agent actions and results
- Immutable Audit Trail: Blockchain-based proof of task execution
- Smart Contract Interoperability: Enable AI agents to interact with DeFi and Web3 protocols
- Token-Gated Capabilities: Control agent permissions through blockchain tokens
- Decentralized Reputation: Build trust through verifiable on-chain agent performance
Stay tuned for our Web3 trust layer - bringing transparency and verifiability to AI systems!
Shannon builds upon and integrates amazing work from the open-source community:
- Agent Traffic Control - The original inspiration for our retro terminal UI design and agent visualization concept
- Model Context Protocol (MCP) - Anthropic's protocol for standardized LLM-tool interactions
- Claude Code - Used extensively in developing Shannon's codebase
- Temporal - The bulletproof workflow orchestration engine powering Shannon's reliability
- LangGraph - Inspiration for stateful agent architectures
- AutoGen - Microsoft's multi-agent conversation framework
- WASI - WebAssembly System Interface for secure code execution
- Open Policy Agent - Policy engine for fine-grained access control
Special thanks to all our contributors and the broader AI agent community for feedback, bug reports, and feature suggestions.
MIT License - Use it anywhere, modify anything, zero restrictions. See LICENSE.
Stop debugging AI failures. Start shipping reliable agents.
Discord • GitHub
If Shannon saves you time or money, let us know! We love success stories.
Twitter/X: @shannon_agents