
hub
High-scale LLM gateway, written in Rust. OpenTelemetry-based observability included.
Stars: 129

Hub is an open-source, high-performance LLM gateway written in Rust. It serves as a smart proxy for LLM applications, centralizing control and tracing of all LLM calls. Built for efficiency, it provides a single API for connecting to any LLM provider, and is completely open-source under the Apache 2.0 license.
README:
Open-source, high-performance LLM gateway written in Rust. Connect to any LLM provider with a single API. Observability included.
Traceloop Hub is a next-generation, high-performance LLM gateway written in Rust that centralizes control and tracing of all LLM calls. It provides a unified, OpenAI-compatible API for connecting to multiple LLM providers, with observability built in.
- Multi-Provider Support: OpenAI, Anthropic, Azure OpenAI, Google VertexAI, AWS Bedrock
- OpenAI Compatible API: Drop-in replacement for OpenAI API calls
- Two Deployment Modes:
  - YAML Mode: Simple static configuration with config files
  - Database Mode: Dynamic configuration with PostgreSQL and Management API
- Built-in Observability: OpenTelemetry tracing and Prometheus metrics
- High Performance: Written in Rust with async/await support
- Hot Reload: Dynamic configuration updates (Database mode)
- Pipeline System: Extensible request/response processing
- Unified Architecture: Single crate structure with integrated Management API
# YAML Mode (simple deployment)
docker run -p 3000:3000 -v $(pwd)/config.yaml:/app/config.yaml traceloop/hub

# Database Mode (with management API)
docker run -p 3000:3000 -p 8080:8080 \
  -e HUB_MODE=database \
  -e DATABASE_URL=postgresql://user:pass@host:5432/db \
  traceloop/hub
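
Once the gateway is up, any OpenAI-compatible client can point at it. A minimal sketch with curl, assuming a model with key gpt-4 is configured (as in the example config below):

curl http://localhost:3000/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'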
# Clone and build
git clone https://github.com/traceloop/hub.git
cd hub
cargo build --release
# YAML Mode
./target/release/hub
# Database Mode
HUB_MODE=database DATABASE_URL=postgresql://user:pass@host:5432/db ./target/release/hub
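
By default the gateway reads config.yaml from the working directory; the path can be overridden via CONFIG_FILE_PATH (see the environment variables table below). A sketch with an illustrative path:

CONFIG_FILE_PATH=/etc/hub/config.yaml ./target/release/hub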
The project uses a unified single-crate architecture:
hub/
├── src/ # Main application code
│ ├── main.rs # Application entry point
│ ├── lib.rs # Library exports
│ ├── config/ # Configuration management
│ ├── providers/ # LLM provider implementations
│ ├── models/ # Data models
│ ├── pipelines/ # Request processing pipelines
│ ├── routes.rs # HTTP routing
│ ├── state.rs # Application state management
│ ├── management/ # Management API (Database mode)
│ │ ├── api/ # REST API endpoints
│ │ ├── db/ # Database models and repositories
│ │ ├── services/ # Business logic
│ │ └── dto.rs # Data transfer objects
│ └── types/ # Shared type definitions
├── migrations/ # Database migrations
├── helm/ # Kubernetes deployment
├── tests/ # Integration tests
└── docs/ # Documentation
YAML Mode
Perfect for simple deployments and development environments.
Features:
- Static configuration via config.yaml
- No external dependencies
- Simple provider and model setup
- No management API
- Single port (3000)
Example config.yaml:
providers:
  - key: openai
    type: openai
    api_key: sk-...
models:
  - key: gpt-4
    type: gpt-4
    provider: openai
pipelines:
  - name: chat
    type: Chat
    plugins:
      - ModelRouter:
          models: [gpt-4]
Database Mode
Ideal for production environments requiring dynamic configuration.
Features:
- PostgreSQL-backed configuration
- REST Management API (/api/v1/management/*)
- Hot reload without restarts
- Configuration polling and synchronization
- SecretObject system for credential management
- Dual ports (3000 for Gateway, 8080 for Management)
Setup:
1. Set up a PostgreSQL database
2. Run migrations: sqlx migrate run
3. Set environment variables: HUB_MODE=database DATABASE_URL=postgresql://user:pass@host:5432/db
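
An end-to-end sketch of these steps (createdb comes from the standard PostgreSQL client tools; database name and credentials are illustrative):

createdb hub
sqlx migrate run
HUB_MODE=database DATABASE_URL=postgresql://hub:password@localhost:5432/hub ./target/release/hub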
Port 3000 (Gateway):
- POST /api/v1/chat/completions - Chat completions
- POST /api/v1/completions - Text completions
- POST /api/v1/embeddings - Text embeddings
- GET /health - Health check
- GET /metrics - Prometheus metrics
- GET /swagger-ui - OpenAPI documentation
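
Because the API is OpenAI-compatible, requests follow the OpenAI shapes; a sketch of an embeddings call, assuming a hypothetical embeddings model key text-embed is configured:

curl http://localhost:3000/api/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embed", "input": "Hello, world"}'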
Port 8080 (Management, Database mode):
- GET /health - Management API health check
- GET|POST|PUT|DELETE /api/v1/management/providers - Provider management
- GET|POST|PUT|DELETE /api/v1/management/model-definitions - Model management
- GET|POST|PUT|DELETE /api/v1/management/pipelines - Pipeline management
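
Providers can then be registered at runtime through the Management API; a hedged sketch (the JSON field names are assumptions mirroring the YAML provider keys, not a confirmed schema):

curl -X POST http://localhost:8080/api/v1/management/providers \
  -H "Content-Type: application/json" \
  -d '{"key": "openai", "type": "openai", "api_key": "sk-..."}'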
OpenAI:
providers:
  - key: openai
    type: openai
    api_key: sk-...
    # Optional
    organization_id: org-...
    base_url: https://api.openai.com/v1

Anthropic:
providers:
  - key: anthropic
    type: anthropic
    api_key: sk-ant-...

Azure OpenAI:
providers:
  - key: azure
    type: azure
    api_key: your-key
    resource_name: your-resource
    api_version: "2023-05-15"

AWS Bedrock:
providers:
  - key: bedrock
    type: bedrock
    region: us-east-1
    # Uses IAM roles or AWS credentials

Google VertexAI:
providers:
  - key: vertexai
    type: vertexai
    project_id: your-project
    location: us-central1
    # Uses service account JSON or API key
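
A configured provider is referenced from a model definition by its key, as in the example config above; a sketch for Anthropic (the model type string here is illustrative, not a confirmed identifier):

models:
  - key: claude
    type: claude-3-opus  # illustrative model type
    provider: anthropic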
# YAML Mode
helm install hub ./helm

# Database Mode
helm install hub ./helm \
  --set management.enabled=true \
  --set management.database.host=postgres \
  --set management.database.existingSecret=postgres-secret
version: '3.8'
services:
  # YAML Mode
  hub-yaml:
    image: traceloop/hub
    ports:
      - "3000:3000"
    volumes:
      - ./config.yaml:/app/config.yaml
  # Database Mode
  hub-database:
    image: traceloop/hub
    ports:
      - "3000:3000"
      - "8080:8080"
    environment:
      - HUB_MODE=database
      - DATABASE_URL=postgresql://hub:password@postgres:5432/hub
    depends_on:
      - postgres
  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=hub
      - POSTGRES_USER=hub
      - POSTGRES_PASSWORD=password
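
To bring up the Database mode stack from this file (the YAML mode service can be started on its own the same way):

docker compose up -d postgres hub-database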
Variable | Description | Default | Required
---|---|---|---
HUB_MODE | Deployment mode: yaml or database | yaml | No
CONFIG_FILE_PATH | Path to YAML config file | config.yaml | YAML mode
DATABASE_URL | PostgreSQL connection string | - | Database mode
DB_POLL_INTERVAL_SECONDS | Config polling interval | 30 | No
PORT | Gateway server port | 3000 | No
MANAGEMENT_PORT | Management API port | 8080 | Database mode
TRACE_CONTENT_ENABLED | Enable request/response tracing | true | No
Requirements:
- Rust 1.87+
- PostgreSQL 12+ (for database mode)
- sqlx-cli (for migrations)
# Build OSS version
cargo build
# Test
cargo test
# Format
cargo fmt
# Lint
cargo clippy
# Run YAML mode
cargo run
# Run database mode
HUB_MODE=database DATABASE_URL=postgresql://... cargo run
# Install sqlx-cli
cargo install sqlx-cli --no-default-features --features postgres
# Run migrations
sqlx migrate run
# Use setup script for complete setup
./scripts/setup-db.sh
The project follows a unified single-crate architecture:
- src/main.rs: Application entry point with mode detection
- src/lib.rs: Library exports for all modules
- src/config/: Configuration management and validation
- src/providers/: LLM provider implementations
- src/models/: Request/response data models
- src/pipelines/: Request processing pipelines
- src/management/: Management API (Database mode)
- src/types/: Shared type definitions
- src/state.rs: Thread-safe application state
- src/routes.rs: Dynamic HTTP routing
- Hot Reload: Configuration changes without restarts (Database mode)
- Atomic Updates: Thread-safe configuration updates
- Dynamic Routing: Pipeline-based request steering
- Comprehensive Testing: Integration tests with testcontainers
- OpenAPI Documentation: Auto-generated API specs
Configure tracing in your pipeline:
pipelines:
  - name: traced-chat
    type: Chat
    plugins:
      - Tracing:
          endpoint: http://jaeger:14268/api/traces
          api_key: your-key
      - ModelRouter:
          models: [gpt-4]
Metrics available at /metrics:
- Request counts and latencies
- Provider-specific metrics
- Error rates
- Active connections
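
A minimal Prometheus scrape sketch for these metrics, assuming the gateway is reachable at localhost:3000 (Prometheus scrapes /metrics by default):

scrape_configs:
  - job_name: hub
    static_configs:
      - targets: ['localhost:3000']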
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Client App │───▶│ Traceloop Hub │───▶│ LLM Provider │
└─────────────────┘ │ │ │ (OpenAI, etc.) │
│ ┌─────────────┐ │ └─────────────────┘
│ │ Config Mode │ │
│ │ YAML | DB │ │ ┌─────────────────┐
│ └─────────────┘ │───▶│ Observability │
│ │ │ (OTel, Metrics) │
│ ┌─────────────┐ │ └─────────────────┘
│ │ Management │ │
│ │ API (DB) │ │
│ └─────────────┘ │
└──────────────────┘
Licensed under the Apache License, Version 2.0. See LICENSE for details.
We welcome contributions! Please see our Contributing Guide for details.
Similar Open Source Tools


gonzo
Gonzo is a powerful, real-time log analysis terminal UI tool inspired by k9s. It allows users to analyze log streams with beautiful charts, AI-powered insights, and advanced filtering directly from the terminal. The tool provides features like live streaming log processing, OTLP support, interactive dashboard with real-time charts, advanced filtering options including regex support, and AI-powered insights such as pattern detection, anomaly analysis, and root cause suggestions. Users can also configure AI models from providers like OpenAI, LM Studio, and Ollama for intelligent log analysis. Gonzo is built with Bubble Tea, Lipgloss, Cobra, Viper, and OpenTelemetry, following a clean architecture with separate modules for TUI, log analysis, frequency tracking, OTLP handling, and AI integration.

wealth-tracker
Wealth Tracker is a personal finance management tool designed to help users track their income, expenses, and investments in one place. With intuitive features and customizable categories, users can easily monitor their financial health and make informed decisions. The tool provides detailed reports and visualizations to analyze spending patterns and set financial goals. Whether you are budgeting, saving for a big purchase, or planning for retirement, Wealth Tracker offers a comprehensive solution to manage your money effectively.

mcp-context-forge
MCP Context Forge is a powerful tool for generating context-aware data for machine learning models. It provides functionalities to create diverse datasets with contextual information, enhancing the performance of AI algorithms. The tool supports various data formats and allows users to customize the context generation process easily. With MCP Context Forge, users can efficiently prepare training data for tasks requiring contextual understanding, such as sentiment analysis, recommendation systems, and natural language processing.

google_workspace_mcp
The Google Workspace MCP Server is a production-ready server that integrates major Google Workspace services with AI assistants. It supports single-user and multi-user authentication via OAuth 2.1, making it a powerful backend for custom applications. Built with FastMCP for optimal performance, it features advanced authentication handling, service caching, and streamlined development patterns. The server provides full natural language control over Google Calendar, Drive, Gmail, Docs, Sheets, Slides, Forms, Tasks, and Chat through all MCP clients, AI assistants, and developer tools. It supports free Google accounts and Google Workspace plans with expanded app options like Chat & Spaces. The server also offers private cloud instance options.

klavis
Klavis AI is a production-ready solution for managing Model Context Protocol (MCP) servers. It offers self-hosted solutions and a hosted service with enterprise OAuth support. With Klavis AI, users can easily deploy and manage over 50 MCP servers for various services like GitHub, Gmail, Google Sheets, YouTube, Slack, and more. The tool provides instant access to MCP servers, seamless authentication, and integration with AI frameworks, making it ideal for individuals and businesses looking to streamline their communication and data management workflows.

ChatPilot
ChatPilot is a chat agent tool that enables AgentChat conversations, supports Google search, URL conversation (RAG), and code interpreter functionality, replicates Kimi Chat (file, drag and drop; URL, send out), and supports OpenAI/Azure API. It is based on LangChain and implements ReAct and OpenAI Function Call for agent Q&A dialogue. The tool supports various automatic tools such as online search using Google Search API, URL parsing tool, Python code interpreter, and enhanced RAG file Q&A with query rewriting support. It also allows front-end and back-end service separation using Svelte and FastAPI, respectively. Additionally, it supports voice input/output, image generation, user management, permission control, and chat record import/export.

langchain4j-aideepin-web
The langchain4j-aideepin-web repository is the frontend project of langchain4j-aideepin, an open-source, offline deployable retrieval enhancement generation (RAG) project based on large language models such as ChatGPT and application frameworks such as Langchain4j. It includes features like registration & login, multi-sessions (multi-roles), image generation (text-to-image, image editing, image-to-image), suggestions, quota control, knowledge base (RAG) based on large models, model switching, and search engine switching.

ailab
The 'ailab' project is an experimental ground for code generation combining AI (especially coding agents) and Deno. It aims to manage configuration files defining coding rules and modes in Deno projects, enhancing the quality and efficiency of code generation by AI. The project focuses on defining clear rules and modes for AI coding agents, establishing best practices in Deno projects, providing mechanisms for type-safe code generation and validation, applying test-driven development (TDD) workflow to AI coding, and offering implementation examples utilizing design patterns like adapter pattern.

inspector
A developer tool for testing and debugging Model Context Protocol (MCP) servers. It allows users to test the compliance of their MCP servers with the latest MCP specs, supports various transports like STDIO, SSE, and Streamable HTTP, features an LLM Playground for testing server behavior against different models, provides comprehensive logging and error reporting for MCP server development, and offers a modern developer experience with multiple server connections and saved configurations. The tool is built using Next.js and integrates MCP capabilities, AI SDKs from OpenAI, Anthropic, and Ollama, and various technologies like Node.js, TypeScript, and Next.js.

chatgpt-webui
ChatGPT WebUI is a user-friendly web graphical interface for various LLMs like ChatGPT, providing simplified features such as core ChatGPT conversation and document retrieval dialogues. It has been optimized for better RAG retrieval accuracy and supports various search engines. Users can deploy local language models easily and interact with different LLMs like GPT-4, Azure OpenAI, and more. The tool offers powerful functionalities like GPT4 API configuration, system prompt setup for role-playing, and basic conversation features. It also provides a history of conversations, customization options, and a seamless user experience with themes, dark mode, and PWA installation support.

meet-libai
The 'meet-libai' project aims to promote and popularize the cultural heritage of the Chinese poet Li Bai by constructing a knowledge graph of Li Bai and training a professional AI agent using large models. The project includes features such as data preprocessing, knowledge graph construction, question-answering system development, and visualization exploration of the graph structure. It also provides code implementations for large models and RAG retrieval enhancement.

readme-ai
README-AI is a developer tool that auto-generates README.md files using a combination of data extraction and generative AI. It streamlines documentation creation and maintenance, enhancing developer productivity. This project aims to enable all skill levels, across all domains, to better understand, use, and contribute to open-source software. It offers flexible README generation, supports multiple large language models (LLMs), provides customizable output options, works with various programming languages and project types, and includes an offline mode for generating boilerplate README files without external API calls.

gpt-load
GPT-Load is a high-performance, enterprise-grade AI API transparent proxy service designed for enterprises and developers needing to integrate multiple AI services. Built with Go, it features intelligent key management, load balancing, and comprehensive monitoring capabilities for high-concurrency production environments. The tool serves as a transparent proxy service, preserving native API formats of various AI service providers like OpenAI, Google Gemini, and Anthropic Claude. It supports dynamic configuration, distributed leader-follower deployment, and a Vue 3-based web management interface. GPT-Load is production-ready with features like dual authentication, graceful shutdown, and error recovery.

one
ONE is a modern web and AI agent development toolkit that empowers developers to build AI-powered applications with high performance, beautiful UI, AI integration, responsive design, type safety, and great developer experience. It is perfect for building modern web applications, from simple landing pages to complex AI-powered platforms.

VimLM
VimLM is an AI-powered coding assistant for Vim that integrates AI for code generation, refactoring, and documentation directly into your Vim workflow. It offers native Vim integration with split-window responses and intuitive keybindings, offline first execution with MLX-compatible models, contextual awareness with seamless integration with codebase and external resources, conversational workflow for iterating on responses, project scaffolding for generating and deploying code blocks, and extensibility for creating custom LLM workflows with command chains.
For similar tasks


aiges
AIGES is a core component of the Athena Serving Framework, designed as a universal encapsulation tool for AI developers to deploy AI algorithm models and engines quickly. By integrating AIGES, you can deploy AI algorithm models and engines rapidly and host them on the Athena Serving Framework, utilizing supporting auxiliary systems for networking, distribution strategies, data processing, etc. The Athena Serving Framework aims to accelerate the cloud service of AI algorithm models and engines, providing multiple guarantees for cloud service stability through cloud-native architecture. You can efficiently and securely deploy, upgrade, scale, operate, and monitor models and engines without focusing on underlying infrastructure and service-related development, governance, and operations.

holoinsight
HoloInsight is a cloud-native observability platform that provides low-cost and high-performance monitoring services for cloud-native applications. It offers deep insights through real-time log analysis and AI integration. The platform is designed to help users gain a comprehensive understanding of their applications' performance and behavior in the cloud environment. HoloInsight is easy to deploy using Docker and Kubernetes, making it a versatile tool for monitoring and optimizing cloud-native applications. With a focus on scalability and efficiency, HoloInsight is suitable for organizations looking to enhance their observability and monitoring capabilities in the cloud.

awesome-AIOps
awesome-AIOps is a curated list of academic researches and industrial materials related to Artificial Intelligence for IT Operations (AIOps). It includes resources such as competitions, white papers, blogs, tutorials, benchmarks, tools, companies, academic materials, talks, workshops, papers, and courses covering various aspects of AIOps like anomaly detection, root cause analysis, incident management, microservices, dependency tracing, and more.

OpenLLM
OpenLLM is a platform that helps developers run any open-source Large Language Models (LLMs) as OpenAI-compatible API endpoints, locally and in the cloud. It supports a wide range of LLMs, provides state-of-the-art serving and inference performance, and simplifies cloud deployment via BentoML. Users can fine-tune, serve, deploy, and monitor any LLMs with ease using OpenLLM. The platform also supports various quantization techniques, serving fine-tuning layers, and multiple runtime implementations. OpenLLM seamlessly integrates with other tools like OpenAI Compatible Endpoints, LlamaIndex, LangChain, and Transformers Agents. It offers deployment options through Docker containers, BentoCloud, and provides a community for collaboration and contributions.

laravel-slower
Laravel Slower is a powerful package designed for Laravel developers to optimize the performance of their applications by identifying slow database queries and providing AI-driven suggestions for optimal indexing strategies and performance improvements. It offers actionable insights for debugging and monitoring database interactions, enhancing efficiency and scalability.

genkit
Firebase Genkit (beta) is a framework with powerful tooling to help app developers build, test, deploy, and monitor AI-powered features with confidence. Genkit is cloud optimized and code-centric, integrating with many services that have free tiers to get started. It provides unified API for generation, context-aware AI features, evaluation of AI workflow, extensibility with plugins, easy deployment to Firebase or Google Cloud, observability and monitoring with OpenTelemetry, and a developer UI for prototyping and testing AI features locally. Genkit works seamlessly with Firebase or Google Cloud projects through official plugins and templates.

llmops-workshop
LLMOps Workshop is a course designed to help users build, evaluate, monitor, and deploy Large Language Model solutions efficiently using Azure AI, Azure Machine Learning Prompt Flow, Content Safety, and Azure OpenAI. The workshop covers various aspects of LLMOps to help users master the process.
For similar jobs

minio
MinIO is a High Performance Object Storage released under GNU Affero General Public License v3.0. It is API compatible with Amazon S3 cloud storage service. Use MinIO to build high performance infrastructure for machine learning, analytics and application data workloads.

ai-on-gke
This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers: infrastructure orchestration that supports GPUs and TPUs for training and serving workloads at scale; flexible integration with distributed computing and data processing frameworks; and support for multiple teams on the same infrastructure to maximize utilization of resources.

kong
Kong, or Kong API Gateway, is a cloud-native, platform-agnostic, scalable API Gateway distinguished for its high performance and extensibility via plugins. It also provides advanced AI capabilities with multi-LLM support. By providing functionality for proxying, routing, load balancing, health checking, authentication (and more), Kong serves as the central layer for orchestrating microservices or conventional API traffic with ease. Kong runs natively on Kubernetes thanks to its official Kubernetes Ingress Controller.

AI-in-a-Box
AI-in-a-Box is a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction, while maintaining the highest standards of quality and efficiency. It provides essential guidance on the responsible use of AI and LLM technologies, specific security guidance for Generative AI (GenAI) applications, and best practices for scaling OpenAI applications within Azure. The available accelerators include: Azure ML Operationalization in-a-box, Edge AI in-a-box, Doc Intelligence in-a-box, Image and Video Analysis in-a-box, Cognitive Services Landing Zone in-a-box, Semantic Kernel Bot in-a-box, NLP to SQL in-a-box, Assistants API in-a-box, and Assistants API Bot in-a-box.

awsome-distributed-training
This repository contains reference architectures and test cases for distributed model training with Amazon SageMaker Hyperpod, AWS ParallelCluster, AWS Batch, and Amazon EKS. The test cases cover different types and sizes of models as well as different frameworks and parallel optimizations (Pytorch DDP/FSDP, MegatronLM, NemoMegatron...).

generative-ai-cdk-constructs
The AWS Generative AI Constructs Library is an open-source extension of the AWS Cloud Development Kit (AWS CDK) that provides multi-service, well-architected patterns for quickly defining solutions in code to create predictable and repeatable infrastructure, called constructs. The goal of AWS Generative AI CDK Constructs is to help developers build generative AI solutions using pattern-based definitions for their architecture. The patterns defined in AWS Generative AI CDK Constructs are high level, multi-service abstractions of AWS CDK constructs that have default configurations based on well-architected best practices. The library is organized into logical modules using object-oriented techniques to create each architectural pattern model.

model_server
OpenVINO™ Model Server (OVMS) is a high-performance system for serving models. Implemented in C++ for scalability and optimized for deployment on Intel architectures, the model server uses the same architecture and API as TensorFlow Serving and KServe while applying OpenVINO for inference execution. Inference service is provided via gRPC or REST API, making deploying new algorithms and AI experiments easy.

dify-helm
Deploy langgenius/dify, an LLM-based chatbot app, on Kubernetes with a Helm chart.