hub
High-scale LLM gateway, written in Rust. OpenTelemetry-based observability included.
Stars: 129
Hub is an open-source, high-performance LLM gateway written in Rust. It acts as a smart proxy for LLM applications, centralizing control and tracing of all LLM calls. It exposes a single API for connecting to any LLM provider and is released under the Apache 2.0 license.
README:
Open-source, high-performance LLM gateway written in Rust. Connect to any LLM provider with a single API. Observability Included.
Traceloop Hub is a next-gen high-performance LLM gateway written in Rust that centralizes control and tracing of all LLM calls. It provides a unified OpenAI-compatible API for connecting to multiple LLM providers with observability built-in.
- Multi-Provider Support: OpenAI, Anthropic, Azure OpenAI, Google VertexAI, AWS Bedrock
- OpenAI Compatible API: Drop-in replacement for OpenAI API calls
- Two Deployment Modes:
  - YAML Mode: Simple static configuration with config files
  - Database Mode: Dynamic configuration with PostgreSQL and Management API
- Built-in Observability: OpenTelemetry tracing and Prometheus metrics
- High Performance: Written in Rust with async/await support
- Hot Reload: Dynamic configuration updates (Database mode)
- Pipeline System: Extensible request/response processing
- Unified Architecture: Single crate structure with integrated Management API
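Because the gateway is a drop-in replacement for the OpenAI API, a client only needs to point at hub's base URL. A minimal sketch using only the Python standard library (the port, model key, and endpoint path follow the examples in this README; whether your deployment requires an auth header is not shown and is an assumption left to the reader):

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, content: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request aimed at the gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{base_url}/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("http://localhost:3000", "gpt-4", "Hello!")
# urllib.request.urlopen(req) would send it to a running gateway.
```

Any OpenAI-compatible SDK works the same way: override its base URL to the gateway's port 3000 and keep the rest of the client code unchanged.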
```shell
# YAML Mode (simple deployment)
docker run -p 3000:3000 -v $(pwd)/config.yaml:/app/config.yaml traceloop/hub

# Database Mode (with management API)
docker run -p 3000:3000 -p 8080:8080 \
  -e HUB_MODE=database \
  -e DATABASE_URL=postgresql://user:pass@host:5432/db \
  traceloop/hub
```

```shell
# Clone and build
git clone https://github.com/traceloop/hub.git
cd hub
cargo build --release

# YAML Mode
./target/release/hub

# Database Mode
HUB_MODE=database DATABASE_URL=postgresql://user:pass@host:5432/db ./target/release/hub
```

The project uses a unified single-crate architecture:
```
hub/
├── src/                  # Main application code
│   ├── main.rs           # Application entry point
│   ├── lib.rs            # Library exports
│   ├── config/           # Configuration management
│   ├── providers/        # LLM provider implementations
│   ├── models/           # Data models
│   ├── pipelines/        # Request processing pipelines
│   ├── routes.rs         # HTTP routing
│   ├── state.rs          # Application state management
│   ├── management/       # Management API (Database mode)
│   │   ├── api/          # REST API endpoints
│   │   ├── db/           # Database models and repositories
│   │   ├── services/     # Business logic
│   │   └── dto.rs        # Data transfer objects
│   └── types/            # Shared type definitions
├── migrations/           # Database migrations
├── helm/                 # Kubernetes deployment
├── tests/                # Integration tests
└── docs/                 # Documentation
```
YAML Mode is perfect for simple deployments and development environments.
Features:
- Static configuration via config.yaml
- No external dependencies
- Simple provider and model setup
- No management API
- Single port (3000)
Example config.yaml:

```yaml
providers:
  - key: openai
    type: openai
    api_key: sk-...

models:
  - key: gpt-4
    type: gpt-4
    provider: openai

pipelines:
  - name: chat
    type: Chat
    plugins:
      - ModelRouter:
          models: [gpt-4]
```

Database Mode is ideal for production environments requiring dynamic configuration.
Features:
- PostgreSQL-backed configuration
- REST Management API (/api/v1/management/*)
- Hot reload without restarts
- Configuration polling and synchronization
- SecretObject system for credential management
- Dual ports (3000 for Gateway, 8080 for Management)
Setup:
- Set up a PostgreSQL database
- Run migrations: sqlx migrate run
- Set environment variables: HUB_MODE=database DATABASE_URL=postgresql://user:pass@host:5432/db
Port 3000:
- POST /api/v1/chat/completions - Chat completions
- POST /api/v1/completions - Text completions
- POST /api/v1/embeddings - Text embeddings
- GET /health - Health check
- GET /metrics - Prometheus metrics
- GET /swagger-ui - OpenAPI documentation
Port 8080:
- GET /health - Management API health check
- GET|POST|PUT|DELETE /api/v1/management/providers - Provider management
- GET|POST|PUT|DELETE /api/v1/management/model-definitions - Model management
- GET|POST|PUT|DELETE /api/v1/management/pipelines - Pipeline management
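In Database mode these endpoints drive configuration at runtime. A hedged sketch of registering a provider through the Management API — the request body fields here are assumptions for illustration, not the confirmed schema (the gateway's /swagger-ui endpoint documents the authoritative shapes):

```python
import json
import urllib.request

MANAGEMENT_BASE = "http://localhost:8080/api/v1/management"

def create_provider_request(key: str, provider_type: str, api_key: str) -> urllib.request.Request:
    # NOTE: the body fields below are illustrative assumptions,
    # not hub's confirmed Management API schema.
    body = {"key": key, "provider_type": provider_type, "config": {"api_key": api_key}}
    return urllib.request.Request(
        f"{MANAGEMENT_BASE}/providers",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = create_provider_request("openai", "openai", "sk-...")
# urllib.request.urlopen(req) would POST it to a running Management API.
```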
OpenAI:
```yaml
providers:
  - key: openai
    type: openai
    api_key: sk-...
    # Optional
    organization_id: org-...
    base_url: https://api.openai.com/v1
```

Anthropic:
```yaml
providers:
  - key: anthropic
    type: anthropic
    api_key: sk-ant-...
```

Azure OpenAI:
```yaml
providers:
  - key: azure
    type: azure
    api_key: your-key
    resource_name: your-resource
    api_version: "2023-05-15"
```

AWS Bedrock:
```yaml
providers:
  - key: bedrock
    type: bedrock
    region: us-east-1
    # Uses IAM roles or AWS credentials
```

Google VertexAI:
```yaml
providers:
  - key: vertexai
    type: vertexai
    project_id: your-project
    location: us-central1
    # Uses service account JSON or API key
```

Helm deployment:
```shell
# YAML Mode
helm install hub ./helm

# Database Mode
helm install hub ./helm \
  --set management.enabled=true \
  --set management.database.host=postgres \
  --set management.database.existingSecret=postgres-secret
```

Docker Compose:
```yaml
version: '3.8'
services:
  # YAML Mode
  hub-yaml:
    image: traceloop/hub
    ports:
      - "3000:3000"
    volumes:
      - ./config.yaml:/app/config.yaml

  # Database Mode
  hub-database:
    image: traceloop/hub
    ports:
      - "3000:3000"
      - "8080:8080"
    environment:
      - HUB_MODE=database
      - DATABASE_URL=postgresql://hub:password@postgres:5432/hub
    depends_on:
      - postgres

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=hub
      - POSTGRES_USER=hub
      - POSTGRES_PASSWORD=password
```

Environment variables:

| Variable | Description | Default | Required |
|---|---|---|---|
| HUB_MODE | Deployment mode: yaml or database | yaml | No |
| CONFIG_FILE_PATH | Path to YAML config file | config.yaml | YAML mode |
| DATABASE_URL | PostgreSQL connection string | - | Database mode |
| DB_POLL_INTERVAL_SECONDS | Config polling interval | 30 | No |
| PORT | Gateway server port | 3000 | No |
| MANAGEMENT_PORT | Management API port | 8080 | Database mode |
| TRACE_CONTENT_ENABLED | Enable request/response tracing | true | No |
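The mode/variable contract above can be summarized as a small validation rule. This sketch mirrors the table, not hub's own code (the function name and return shape are illustrative):

```python
def resolve_mode(env: dict) -> tuple:
    """Apply the defaults from the table: HUB_MODE defaults to 'yaml';
    database mode additionally requires DATABASE_URL; YAML mode reads
    CONFIG_FILE_PATH, defaulting to config.yaml."""
    mode = env.get("HUB_MODE", "yaml")
    if mode not in ("yaml", "database"):
        raise ValueError(f"unknown HUB_MODE: {mode}")
    if mode == "database" and not env.get("DATABASE_URL"):
        raise ValueError("database mode requires DATABASE_URL")
    config = env.get("CONFIG_FILE_PATH", "config.yaml") if mode == "yaml" else None
    return mode, config

print(resolve_mode({}))  # ('yaml', 'config.yaml')
```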
- Rust 1.87+
- PostgreSQL 12+ (for database mode)
- sqlx-cli (for migrations)
```shell
# Build OSS version
cargo build

# Test
cargo test

# Format
cargo fmt

# Lint
cargo clippy

# Run YAML mode
cargo run

# Run database mode
HUB_MODE=database DATABASE_URL=postgresql://... cargo run
```

```shell
# Install sqlx-cli
cargo install sqlx-cli --no-default-features --features postgres

# Run migrations
sqlx migrate run

# Use setup script for complete setup
./scripts/setup-db.sh
```

The project follows a unified single-crate architecture:
- src/main.rs: Application entry point with mode detection
- src/lib.rs: Library exports for all modules
- src/config/: Configuration management and validation
- src/providers/: LLM provider implementations
- src/models/: Request/response data models
- src/pipelines/: Request processing pipelines
- src/management/: Management API (Database mode)
- src/types/: Shared type definitions
- src/state.rs: Thread-safe application state
- src/routes.rs: Dynamic HTTP routing
- Hot Reload: Configuration changes without restarts (Database mode)
- Atomic Updates: Thread-safe configuration updates
- Dynamic Routing: Pipeline-based request steering
- Comprehensive Testing: Integration tests with testcontainers
- OpenAPI Documentation: Auto-generated API specs
Configure in your pipeline:
```yaml
pipelines:
  - name: traced-chat
    type: Chat
    plugins:
      - Tracing:
          endpoint: http://jaeger:14268/api/traces
          api_key: your-key
      - ModelRouter:
          models: [gpt-4]
```

Available at /metrics:
- Request counts and latencies
- Provider-specific metrics
- Error rates
- Active connections
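Any Prometheus-aware tool can scrape this endpoint directly; for a quick look without one, here is a minimal parser for the text exposition format (the sample metric name is illustrative, not necessarily one hub emits):

```python
def parse_prometheus(text: str) -> dict:
    """Parse simple `name value` lines of the Prometheus text format,
    skipping comment lines; labels, if present, stay inside the name key."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            continue
    return metrics

sample = "# HELP requests_total Total requests\nrequests_total 42\n"
print(parse_prometheus(sample))  # {'requests_total': 42.0}
```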
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Client App    │───▶│   Traceloop Hub  │───▶│  LLM Provider   │
└─────────────────┘    │                  │    │ (OpenAI, etc.)  │
                       │  ┌─────────────┐ │    └─────────────────┘
                       │  │ Config Mode │ │
                       │  │ YAML | DB   │ │    ┌─────────────────┐
                       │  └─────────────┘ │───▶│  Observability  │
                       │                  │    │ (OTel, Metrics) │
                       │  ┌─────────────┐ │    └─────────────────┘
                       │  │ Management  │ │
                       │  │  API (DB)   │ │
                       │  └─────────────┘ │
                       └──────────────────┘
```
Licensed under the Apache License, Version 2.0. See LICENSE for details.
We welcome contributions! Please see our Contributing Guide for details.
Similar Open Source Tools
mcp-prompts
mcp-prompts is a Python library that provides a collection of prompts for generating creative writing ideas. It includes a variety of prompts such as story starters, character development, plot twists, and more. The library is designed to inspire writers and help them overcome writer's block by offering unique and engaging prompts to spark creativity. With mcp-prompts, users can access a wide range of writing prompts to kickstart their imagination and enhance their storytelling skills.
sandbox
AIO Sandbox is an all-in-one agent sandbox environment that combines Browser, Shell, File, MCP operations, and VSCode Server in a single Docker container. It provides a unified, secure execution environment for AI agents and developers, with features like unified file system, multiple interfaces, secure execution, zero configuration, and agent-ready MCP-compatible APIs. The tool allows users to run shell commands, perform file operations, automate browser tasks, and integrate with various development tools and services.
DeepTutor
DeepTutor is an AI-powered personalized learning assistant that offers a suite of modules for massive document knowledge Q&A, interactive learning visualization, knowledge reinforcement with practice exercise generation, deep research, and idea generation. The tool supports multi-agent collaboration, dynamic topic queues, and structured outputs for various tasks. It provides a unified system entry for activity tracking, knowledge base management, and system status monitoring. DeepTutor is designed to streamline learning and research processes by leveraging AI technologies and interactive features.
mcp-ts-template
The MCP TypeScript Server Template is a production-grade framework for building powerful and scalable Model Context Protocol servers with TypeScript. It features built-in observability, declarative tooling, robust error handling, and a modular, DI-driven architecture. The template is designed to be AI-agent-friendly, providing detailed rules and guidance for developers to adhere to best practices. It enforces architectural principles like 'Logic Throws, Handler Catches' pattern, full-stack observability, declarative components, and dependency injection for decoupling. The project structure includes directories for configuration, container setup, server resources, services, storage, utilities, tests, and more. Configuration is done via environment variables, and key scripts are available for development, testing, and publishing to the MCP Registry.
frankenterm
A swarm-native terminal platform designed to replace legacy terminal workflows for massive AI agent orchestration. `ft` is a full terminal platform for agent swarms with first-class observability, deterministic eventing, policy-gated automation, and machine-native control surfaces. It offers perfect observability, intelligent detection, event-driven automation, Robot Mode API, lexical + hybrid search, and a policy engine for safe multi-agent control. The platform is actively expanding with concepts learned from Ghostty and Zellij, purpose-built subsystems for agent swarms, and integrations from other projects like `/dp/asupersync`, `/dp/frankensqlite`, and `/frankentui`.
VT.ai
VT.ai is a multimodal AI platform that offers dynamic conversation routing with SemanticRouter, multi-modal interactions (text/image/audio), an assistant framework with code interpretation, real-time response streaming, cross-provider model switching, and local model support with Ollama integration. It supports various AI providers such as OpenAI, Anthropic, Google Gemini, Groq, Cohere, and OpenRouter, providing a wide range of core capabilities for AI orchestration.
fluid.sh
fluid.sh is a tool designed to manage and debug VMs using AI agents in isolated environments before applying changes to production. It provides a workflow where AI agents work autonomously in sandbox VMs, and human approval is required before any changes are made to production. The tool offers features like autonomous execution, full VM isolation, human-in-the-loop approval workflow, Ansible export, and a Python SDK for building autonomous agents.
morgana-form
MorGana Form is a full-stack form builder project developed using Next.js, React, TypeScript, Ant Design, PostgreSQL, and other technologies. It allows users to quickly create and collect data through survey forms. The project structure includes components, hooks, utilities, pages, constants, Redux store, themes, types, server-side code, and component packages. Environment variables are required for database settings, NextAuth login configuration, and file upload services. Additionally, the project integrates an AI model for form generation using the Ali Qianwen model API.
rtk
RTK is a lightweight and flexible tool for real-time kinematic positioning. It provides accurate positioning data by combining data from GPS satellites with a reference station. RTK is commonly used in surveying, agriculture, construction, and drone navigation. The tool offers real-time corrections to improve the accuracy of GPS data, making it ideal for applications requiring precise location information. With RTK, users can achieve centimeter-level accuracy in their positioning data, enabling them to perform tasks that demand high precision and reliability.
easyclaw
EasyClaw is a desktop application that simplifies the usage of OpenClaw, a powerful agent runtime, by providing a user-friendly interface for non-programmers. Users can write rules in plain language, configure multiple LLM providers and messaging channels, manage API keys, and interact with the agent through a local web panel. The application ensures data privacy by keeping all information on the user's machine and offers features like natural language rules, multi-provider LLM support, Gemini CLI OAuth, proxy support, messaging integration, token tracking, speech-to-text, file permissions control, and more. EasyClaw aims to lower the barrier of entry for utilizing OpenClaw by providing a user-friendly cockpit for managing the engine.
claude-code-orchestrator-kit
The Claude Code Orchestrator Kit is a professional automation and orchestration system for Claude Code, featuring 39 AI agents, 38 skills, 25 slash commands, auto-optimized MCP, Beads issue tracking, Gastown multi-agent orchestration, ready-to-use prompts, and quality gates. It transforms Claude Code into an intelligent orchestration system by delegating complex tasks to specialized sub-agents, preserving context and enabling indefinite work sessions.
solo-server
Solo Server is a lightweight server designed for managing hardware-aware inference. It provides seamless setup through a simple CLI and HTTP servers, an open model registry for pulling models from platforms like Ollama and Hugging Face, cross-platform compatibility for effortless deployment of AI models on hardware, and a configurable framework that auto-detects hardware components (CPU, GPU, RAM) and sets optimal configurations.
topsha
LocalTopSH is an AI Agent Framework designed for companies and developers who require 100% on-premise AI agents with data privacy. It supports various OpenAI-compatible LLM backends and offers production-ready security features. The framework allows simple deployment using Docker compose and ensures that data stays within the user's network, providing full control and compliance. With cost-effective scaling options and compatibility in regions with restrictions, LocalTopSH is a versatile solution for deploying AI agents on self-hosted infrastructure.
lihil
Lihil is a performant, productive, and professional web framework designed to make Python the mainstream programming language for web development. It is 100% test covered and strictly typed, offering fast performance, ergonomic API, and built-in solutions for common problems. Lihil is suitable for enterprise web development, delivering robust and scalable solutions with best practices in microservice architecture and related patterns. It features dependency injection, OpenAPI docs generation, error response generation, data validation, message system, testability, and strong support for AI features. Lihil is ASGI compatible and uses starlette as its ASGI toolkit, ensuring compatibility with starlette classes and middlewares. The framework follows semantic versioning and has a roadmap for future enhancements and features.
For similar tasks
earl
AI-safe CLI for AI agents. Earl sits between your agent and external services, ensuring secrets stay in the OS keychain, requests follow reviewed templates, and outbound traffic obeys egress rules.
aiges
AIGES is a core component of the Athena Serving Framework, designed as a universal encapsulation tool for AI developers to deploy AI algorithm models and engines quickly. By integrating AIGES, you can deploy AI algorithm models and engines rapidly and host them on the Athena Serving Framework, utilizing supporting auxiliary systems for networking, distribution strategies, data processing, etc. The Athena Serving Framework aims to accelerate the cloud service of AI algorithm models and engines, providing multiple guarantees for cloud service stability through cloud-native architecture. You can efficiently and securely deploy, upgrade, scale, operate, and monitor models and engines without focusing on underlying infrastructure and service-related development, governance, and operations.
holoinsight
HoloInsight is a cloud-native observability platform that provides low-cost and high-performance monitoring services for cloud-native applications. It offers deep insights through real-time log analysis and AI integration. The platform is designed to help users gain a comprehensive understanding of their applications' performance and behavior in the cloud environment. HoloInsight is easy to deploy using Docker and Kubernetes, making it a versatile tool for monitoring and optimizing cloud-native applications. With a focus on scalability and efficiency, HoloInsight is suitable for organizations looking to enhance their observability and monitoring capabilities in the cloud.
awesome-AIOps
awesome-AIOps is a curated list of academic researches and industrial materials related to Artificial Intelligence for IT Operations (AIOps). It includes resources such as competitions, white papers, blogs, tutorials, benchmarks, tools, companies, academic materials, talks, workshops, papers, and courses covering various aspects of AIOps like anomaly detection, root cause analysis, incident management, microservices, dependency tracing, and more.
OpenLLM
OpenLLM is a platform that helps developers run any open-source Large Language Models (LLMs) as OpenAI-compatible API endpoints, locally and in the cloud. It supports a wide range of LLMs, provides state-of-the-art serving and inference performance, and simplifies cloud deployment via BentoML. Users can fine-tune, serve, deploy, and monitor any LLMs with ease using OpenLLM. The platform also supports various quantization techniques, serving fine-tuning layers, and multiple runtime implementations. OpenLLM seamlessly integrates with other tools like OpenAI Compatible Endpoints, LlamaIndex, LangChain, and Transformers Agents. It offers deployment options through Docker containers, BentoCloud, and provides a community for collaboration and contributions.
laravel-slower
Laravel Slower is a powerful package designed for Laravel developers to optimize the performance of their applications by identifying slow database queries and providing AI-driven suggestions for optimal indexing strategies and performance improvements. It offers actionable insights for debugging and monitoring database interactions, enhancing efficiency and scalability.
genkit
Firebase Genkit (beta) is a framework with powerful tooling to help app developers build, test, deploy, and monitor AI-powered features with confidence. Genkit is cloud optimized and code-centric, integrating with many services that have free tiers to get started. It provides unified API for generation, context-aware AI features, evaluation of AI workflow, extensibility with plugins, easy deployment to Firebase or Google Cloud, observability and monitoring with OpenTelemetry, and a developer UI for prototyping and testing AI features locally. Genkit works seamlessly with Firebase or Google Cloud projects through official plugins and templates.
For similar jobs
minio
MinIO is a High Performance Object Storage released under GNU Affero General Public License v3.0. It is API compatible with Amazon S3 cloud storage service. Use MinIO to build high performance infrastructure for machine learning, analytics and application data workloads.
ai-on-gke
This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers: Infrastructure orchestration that support GPUs and TPUs for training and serving workloads at scale Flexible integration with distributed computing and data processing frameworks Support for multiple teams on the same infrastructure to maximize utilization of resources
kong
Kong, or Kong API Gateway, is a cloud-native, platform-agnostic, scalable API Gateway distinguished for its high performance and extensibility via plugins. It also provides advanced AI capabilities with multi-LLM support. By providing functionality for proxying, routing, load balancing, health checking, authentication (and more), Kong serves as the central layer for orchestrating microservices or conventional API traffic with ease. Kong runs natively on Kubernetes thanks to its official Kubernetes Ingress Controller.
AI-in-a-Box
AI-in-a-Box is a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction, while maintaining the highest standards of quality and efficiency. It provides essential guidance on the responsible use of AI and LLM technologies, specific security guidance for Generative AI (GenAI) applications, and best practices for scaling OpenAI applications within Azure. The available accelerators include: Azure ML Operationalization in-a-box, Edge AI in-a-box, Doc Intelligence in-a-box, Image and Video Analysis in-a-box, Cognitive Services Landing Zone in-a-box, Semantic Kernel Bot in-a-box, NLP to SQL in-a-box, Assistants API in-a-box, and Assistants API Bot in-a-box.
awsome-distributed-training
This repository contains reference architectures and test cases for distributed model training with Amazon SageMaker Hyperpod, AWS ParallelCluster, AWS Batch, and Amazon EKS. The test cases cover different types and sizes of models as well as different frameworks and parallel optimizations (Pytorch DDP/FSDP, MegatronLM, NemoMegatron...).
generative-ai-cdk-constructs
The AWS Generative AI Constructs Library is an open-source extension of the AWS Cloud Development Kit (AWS CDK) that provides multi-service, well-architected patterns for quickly defining solutions in code to create predictable and repeatable infrastructure, called constructs. The goal of AWS Generative AI CDK Constructs is to help developers build generative AI solutions using pattern-based definitions for their architecture. The patterns defined in AWS Generative AI CDK Constructs are high level, multi-service abstractions of AWS CDK constructs that have default configurations based on well-architected best practices. The library is organized into logical modules using object-oriented techniques to create each architectural pattern model.
model_server
OpenVINO™ Model Server (OVMS) is a high-performance system for serving models. Implemented in C++ for scalability and optimized for deployment on Intel architectures, the model server uses the same architecture and API as TensorFlow Serving and KServe while applying OpenVINO for inference execution. Inference service is provided via gRPC or REST API, making deploying new algorithms and AI experiments easy.
dify-helm
Deploy langgenius/dify, an LLM based chat bot app on kubernetes with helm chart.