
Alphora


A Production-Ready Framework for Building Composable AI Agents

Build powerful, modular, and maintainable AI agent applications with ease.

Docs · Quick Start · Examples · 中文


What is Alphora?

Alphora is a full-stack framework for building production AI agents. It provides everything you need: agent orchestration, prompt engineering, tool execution, memory management, streaming, and deployment—all with an async-first, OpenAI-compatible design.

from alphora.agent import ReActAgent
from alphora.models import OpenAILike
from alphora.sandbox import Sandbox
from alphora.tools import tool

@tool
def search_database(query: str) -> str:
    """Search the product database."""
    return f"Found 3 results for: {query}"


# Run agent-generated code in an isolated Docker container
sandbox = Sandbox.create_docker()

agent = ReActAgent(
    llm=OpenAILike(model_name="gpt-4"),
    tools=[search_database],
    system_prompt="You are a helpful assistant.",
    sandbox=sandbox
)

response = await agent.run("Find laptops under $1000")

Installation

pip install alphora

Features

Alphora is packed with features for building sophisticated AI agents:

Agent System

  • Agent Derivation — Child agents inherit LLM, memory, and config from parents. Build hierarchies that share context.
  • ReAct Loop — Built-in reasoning-action loop with automatic tool orchestration, retry logic, and iteration control.
  • Streaming First — Native async streaming with OpenAI SSE format. Multiple content types: char, think, result, sql, chart. (See the sketch after this list.)
  • Debug Tracing — Built-in visual debugger for agent execution flow, LLM calls, and tool invocations.
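
The streaming call itself isn't shown in this README; here is a rough sketch. The run_stream name and the chunk attributes are assumptions; only the content types are documented.

from alphora.agent import ReActAgent
from alphora.models import OpenAILike

agent = ReActAgent(llm=OpenAILike(model_name="gpt-4"))

# Hedged sketch: run_stream and the chunk fields are assumptions; the
# README documents only the content types (char, think, result, ...).
async for chunk in agent.run_stream("Summarize today's sales"):
    if chunk.type == "think":
        print(f"[thinking] {chunk.content}")       # reasoning tokens
    elif chunk.type == "char":
        print(chunk.content, end="", flush=True)   # answer tokens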

Model Layer

  • OpenAI Compatible — Works with any OpenAI-compatible API: GPT, Claude, Qwen, DeepSeek, local models.
  • Multimodal Support — Unified Message class for text, images, audio, and video inputs.
  • Load Balancing — Built-in round-robin/random load balancing across multiple LLM backends.
  • Thinking Mode — Support for reasoning models (Qwen3, etc.) with separate thinking/content streams.
  • Embedding API — Unified text embedding interface with batch processing (see the sketch after this list).
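
The embedding interface is only named here; a minimal sketch under assumptions (the Embedding class name, constructor, and aembed method are guesses, not documented API):

from alphora.models import Embedding  # import path is an assumption

# Hedged sketch: batch-embed several texts in one call.
embedder = Embedder = Embedding(model_name="text-embedding-3-small")
vectors = await embedder.aembed(["hello", "world"])  # one vector per input
print(len(vectors), len(vectors[0]))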

Tool System

  • Zero-Config Tools — @tool decorator auto-generates OpenAI function calling schema from type hints and docstrings.
  • Type Safety — Pydantic V2 validation for all tool parameters. Automatic error feedback to LLM.
  • Async Native — Async tools run natively; sync tools auto-execute in thread pool.
  • Parallel Execution — Execute multiple tool calls concurrently for better performance.
  • Instance Methods — Register class methods as tools with access to self context (DB connections, user state, etc.); see the sketch after this list.
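
A minimal sketch of the instance-method pattern. Whether @tool can decorate methods exactly like this is an assumption; the README only states that bound methods can be registered with access to self.

from alphora.tools import tool, ToolRegistry

class OrderService:
    def __init__(self, db: dict):
        self.db = db  # per-instance state reachable via self

    @tool
    def lookup_order(self, order_id: str) -> str:
        """Look up the status of an order by ID."""
        return self.db.get(order_id, "order not found")

service = OrderService(db={"A-1": "shipped"})
registry = ToolRegistry()
registry.register(service.lookup_order)  # bound method keeps its self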

Prompt Engine

  • Jinja2 Templates — Dynamic prompts with variable interpolation, conditionals, loops, and includes.
  • Long Text Continuation — Auto-detect truncation and continue generation to bypass token limits.
  • Parallel Prompts — Execute multiple prompts concurrently with ParallelPrompt (see the sketch after this list).
  • Post-Processors — Transform streaming output with pluggable processor pipeline.
  • Template Files — Load prompts from external files for better organization.
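
A sketch of ParallelPrompt under assumptions: the import path, constructor, and unpacked return shape are guesses; only the class name and create_prompt are documented.

from alphora.agent import BaseAgent
from alphora.models import OpenAILike
from alphora.prompter import ParallelPrompt  # import path is an assumption

agent = BaseAgent(llm=OpenAILike(model_name="gpt-4"))
summarize = agent.create_prompt(user_prompt="Summarize: {{text}}")
headline = agent.create_prompt(user_prompt="Write a headline for: {{text}}")

# Hedged sketch: run both prompts concurrently over the same input.
batch = ParallelPrompt([summarize, headline])
summary, title = await batch.acall(text="Alphora is a full-stack agent framework.")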

Memory & Storage

  • Session Memory — Multi-session conversation management with full OpenAI message format support.
  • Tool Call Tracking — Complete function calling chain management with validation.
  • Pin/Tag System — Protect important messages from being trimmed or modified (pinning and undo are sketched after this list).
  • Undo/Redo — Rollback conversation operations when needed.
  • Multiple Backends — In-memory, JSON file, SQLite storage options.
  • TTL Support — Automatic session cleanup with time-to-live.
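
Pinning and undo in one hedged sketch. The pin and undo signatures below are assumptions inferred from the feature list; the add_user calls follow the Quick Start.

from alphora.memory import MemoryManager

memory = MemoryManager()
memory.add_user(session_id="s1", content="My account ID is 42.")

# Assumed API: protect the first message from trimming, then roll back
# the most recent operation.
memory.pin(session_id="s1", index=0)
memory.add_user(session_id="s1", content="Ignore everything above.")
memory.undo(session_id="s1")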

Sandbox

  • Secure Execution — Run agent-generated code in isolated environments (see the sketch after this list).
  • File Isolation — Sandboxed file system for safe file operations.
  • Resource Tracking — Monitor and limit compute resources.
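
A sketch of direct sandbox use. The run call is an assumption; the README only shows a Sandbox being passed to an agent, as in the first example above.

from alphora.sandbox import Sandbox

sandbox = Sandbox.create_docker()  # documented constructor

# Assumed API: execute a code snippet inside the isolated container.
result = sandbox.run("print(2 + 2)")
print(result)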

Deployment

  • One-Line API — Publish any agent as an OpenAI-compatible REST API with publish_agent_api().
  • FastAPI Integration — Built on FastAPI with automatic OpenAPI documentation.
  • SSE Streaming — Server-Sent Events for real-time streaming responses.
  • Session Management — Built-in session handling with configurable TTL.

Quick Start

1. Basic Agent

from alphora.agent import BaseAgent
from alphora.models import OpenAILike

agent = BaseAgent(llm=OpenAILike(model_name="gpt-4"))

prompt = agent.create_prompt(
    system_prompt="You are a helpful assistant.",
    user_prompt="{{query}}"
)

response = await prompt.acall(query="What is Python?")

2. Tools with @tool Decorator

from alphora.tools import tool, ToolRegistry, ToolExecutor

@tool
def get_weather(city: str, unit: str = "celsius") -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 22°{unit[0].upper()}, Sunny"

@tool
async def search_docs(query: str, limit: int = 5) -> list:
    """Search internal documents."""
    return [{"title": "Result 1", "score": 0.95}]

registry = ToolRegistry()
registry.register(get_weather)
registry.register(search_docs)

# Get OpenAI-compatible schema
tools_schema = registry.get_openai_tools_schema()
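
ToolExecutor is imported above but not exercised; a hedged sketch of dispatching a single tool call through it (the constructor and execute signature are assumptions):

# Assumed API: route a model-requested tool call through the executor,
# which validates arguments against the registered schema.
executor = ToolExecutor(registry)
result = await executor.execute(name="get_weather", arguments={"city": "Tokyo"})
print(result)  # "Weather in Tokyo: 22°C, Sunny"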

3. ReAct Agent (Auto Tool Loop)

from alphora.agent import ReActAgent
from alphora.models import OpenAILike

llm = OpenAILike(model_name="gpt-4")

agent = ReActAgent(
    llm=llm,
    tools=[get_weather, search_docs],
    system_prompt="You are a helpful assistant.",
    max_iterations=10
)

# Agent automatically handles tool calling loop
result = await agent.run("What's the weather in Tokyo?")

4. Agent Derivation (Shared Context)

from alphora.agent import BaseAgent
from alphora.memory import MemoryManager

# Parent with shared resources (llm as defined in step 3)
parent = BaseAgent(
    llm=llm,
    memory=MemoryManager(),
    config={"project": "demo"}
)

# Children inherit llm, memory, and config from the parent.
# ResearchAgent and AnalysisAgent stand in for your own BaseAgent subclasses.
researcher = parent.derive(ResearchAgent)
analyst = parent.derive(AnalysisAgent)

# All agents share the same memory
parent.memory.add_user(session_id="s1", content="Hello")
# researcher and analyst can see this message

5. Multimodal Messages

from alphora.models.message import Message

# Create a multimodal message; base64_data holds base64-encoded image bytes
msg = Message()
msg.add_text("What's in this image?")
msg.add_image(base64_data, format="png")

response = await llm.ainvoke(msg)  # llm as defined in step 3

6. Load Balancing

from alphora.models import OpenAILike

# Primary LLM
llm1 = OpenAILike(model_name="gpt-4", api_key="key1", base_url="https://api1.com/v1")

# Backup LLM
llm2 = OpenAILike(model_name="gpt-4", api_key="key2", base_url="https://api2.com/v1")

# Combine with automatic load balancing
llm = llm1 + llm2

response = await llm.ainvoke("Hello")  # Auto round-robin

7. Memory Management

from alphora.memory import MemoryManager

memory = MemoryManager()

# Add conversation
memory.add_user(session_id="user_123", content="Hello")
memory.add_assistant(session_id="user_123", content="Hi there!")

# Add tool results (tool_output is the return value of a prior tool call)
memory.add_tool_result(session_id="user_123", result=tool_output)

# Build history for LLM
history = memory.build_history(session_id="user_123")
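
To close the loop, the built history can be fed back to the model. Whether ainvoke accepts an OpenAI-format message list directly is an assumption; verify against the Models layer docs.

from alphora.models import OpenAILike

llm = OpenAILike(model_name="gpt-4")

# Assumption: ainvoke accepts the message list that build_history returns.
response = await llm.ainvoke(history)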

8. Deploy as API

from alphora.server import publish_agent_api, APIPublisherConfig

config = APIPublisherConfig(
    path="/chat",
    api_title="My Agent API",
    memory_ttl=3600
)

app = publish_agent_api(agent=agent, method="run", config=config)

# Run: uvicorn main:app --port 8000
curl -X POST http://localhost:8000/chat/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "stream": true}'
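
Since the published endpoint speaks the OpenAI protocol, the stock openai client also works. The model value and api_key placeholder below are assumptions the server may ignore.

from openai import OpenAI

# base_url maps to the configured path; the client appends /chat/completions.
client = OpenAI(base_url="http://localhost:8000/chat/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="agent",  # model name is an assumption; the server may ignore it
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)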

Examples

  • ChatExcel — Data analysis agent with sandbox code execution
  • RAG Agent — Retrieval-augmented generation with vector search
  • Multi-Agent — Hierarchical agents with tool-as-agent pattern
  • Streaming Chat — Real-time chat with thinking mode

Configuration

# Environment variables
export LLM_API_KEY="your-api-key"
export LLM_BASE_URL="https://api.openai.com/v1"
export DEFAULT_LLM="gpt-4"

# Optional: Embedding
export EMBEDDING_API_KEY="your-key"
export EMBEDDING_BASE_URL="https://api.openai.com/v1"

# Programmatic configuration
from alphora.models import OpenAILike

llm = OpenAILike(
    model_name="gpt-4",
    api_key="sk-xxx",
    base_url="https://api.openai.com/v1",
    temperature=0.7,
    max_tokens=4096,
    is_multimodal=True  # Enable vision
)

Documentation

For detailed system design, component relationships, and implementation patterns, see the Architecture Guide.

Component Overview

  • Agent — Core agent lifecycle, derivation, ReAct loop
  • Prompter — Jinja2 templates, LLM invocation, streaming
  • Models — LLM interface, multimodal, load balancing
  • Tools — @tool decorator, registry, parallel execution
  • Memory — Session management, history, pin/tag system
  • Storage — Persistent backends (memory, JSON, SQLite)
  • Sandbox — Secure code execution environment
  • Server — API publishing, SSE streaming
  • Postprocess — Stream transformation pipeline

Contributors

Crafted by the AlphaData Team.

  • Tian Tian — Project Lead & Core Dev
  • Yuhang Liang — Developer
  • Jianhui Shi — Developer
  • Yingdi Liu — Developer
  • Qiuyang He — Developer
  • LiuJX — Developer
  • Cjdddd — Developer
  • Weiyu Wang — Developer

License

This project is licensed under the Apache License 2.0.
See LICENSE for details.

Contributions require acceptance of the Contributor License Agreement (CLA).
