ai-gateway
Govern, Secure, and Optimize your AI Traffic. AI Gateway provides a unified interface to all LLMs using the OpenAI API format, with a focus on performance and reliability. Built in Rust.
Stars: 74
LangDB AI Gateway is an open-source enterprise AI gateway built in Rust. It provides a unified interface to all LLMs using the OpenAI API format, focusing on high performance, enterprise readiness, and data control. The gateway offers features like comprehensive usage analytics, cost tracking, rate limiting, data ownership, and detailed logging. It supports various LLM providers and provides OpenAI-compatible endpoints for chat completions, model listing, embeddings generation, and image generation. Users can configure advanced settings, such as rate limiting, cost control, dynamic model routing, and observability with OpenTelemetry tracing. The gateway can be run with Docker Compose and integrated with MCP tools for server communication.
README:
Govern, Secure, and Optimize your AI Traffic. LangDB AI Gateway provides a unified interface to all LLMs using the OpenAI API format. Built with performance and reliability in mind.
🚀 High Performance
- Built in Rust for maximum speed and reliability
- Seamless integration with any framework (Langchain, Vercel AI SDK, CrewAI, etc.)
- Integration with any MCP server (https://docs.langdb.ai/ai-gateway/features/mcp-support)
📊 Enterprise Ready
- Comprehensive usage analytics and cost tracking
- Rate limiting and cost control
- Advanced routing, load balancing and failover
- Evaluations
🔒 Data Control
- Full ownership of your LLM usage data
- Detailed logging and tracing
🌟 Hosted Version - Get started in minutes with our fully managed solution
- Zero infrastructure management
- Automatic updates and maintenance
- Pay-as-you-go pricing
💼 Enterprise Version - Enhanced features for large-scale deployments
- Advanced team management and access controls
- Custom security guardrails and compliance features
- Intuitive monitoring dashboard
- Priority support and SLA guarantees
- Custom deployment options
Contact our team to learn more about enterprise solutions.
Choose one of these installation methods:
Run with Docker:
docker run -it \
  -p 8080:8080 \
  -e LANGDB_KEY=your-langdb-key-here \
  langdb/ai-gateway serve
Install from crates.io:
export RUSTFLAGS="--cfg tracing_unstable --cfg aws_sdk_unstable"
cargo install ai-gateway
export LANGDB_KEY=your-langdb-key-here
ai-gateway serve
Test the gateway with a simple chat completion:
# Chat completion with GPT-4o mini
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'
# Or try Claude
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-opus",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
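The same request can be issued from a script. The following is a minimal sketch using only the Python standard library, assuming the gateway is listening on localhost:8080 as in the commands above. `build_chat_request` and `send` are illustrative helper names, not part of the gateway.

```python
import json
import urllib.request

# Assumption: the gateway was started locally on the default port shown above.
GATEWAY_URL = "http://localhost:8080"


def build_chat_request(model, prompt, base_url=GATEWAY_URL):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Against a running gateway, `send(build_chat_request("gpt-4o-mini", "What is the capital of France?"))` should return an OpenAI-format response, where the reply text is normally found under `choices[0].message.content`.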
LangDB AI Gateway currently supports the following LLM providers. Find all the available models here.
| Provider |
|---|
| OpenAI |
| Google Gemini |
| Anthropic |
| DeepSeek |
| TogetherAI |
| XAI |
| Meta (provided by Bedrock) |
| Cohere (provided by Bedrock) |
| Mistral (provided by Bedrock) |
The gateway provides the following OpenAI-compatible endpoints:
- POST /v1/chat/completions - Chat completions
- GET /v1/models - List available models
- POST /v1/embeddings - Generate embeddings
- POST /v1/images/generations - Generate images
Create a config.yaml file:
providers:
  openai:
    api_key: "your-openai-key-here"
  anthropic:
    api_key: "your-anthropic-key-here"
  # Supports mustache style variables
  gemini:
    api_key: {{LANGDB_GEMINI_API_KEY}}
http:
  host: "0.0.0.0"
  port: 8080
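To make the mustache-style placeholder concrete, here is a rough sketch of how a `{{NAME}}` reference could be filled in before the YAML is parsed. It assumes the values come from a plain mapping such as the process environment. How the gateway itself resolves these variables is not described here, and `render_mustache` is a hypothetical helper.

```python
import re

# Matches {{NAME}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_mustache(text, variables):
    """Replace every {{NAME}} in text with variables[NAME]."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # Fail loudly rather than leave a literal placeholder in the config.
            raise KeyError("no value provided for " + name)
        return variables[name]

    return PLACEHOLDER.sub(substitute, text)
```

Passing `os.environ` as `variables` would read the key from the environment, which keeps secrets out of the config file itself.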
# Run with custom host and port
ai-gateway serve --host 0.0.0.0 --port 3000
# Run with CORS origins
ai-gateway serve --cors-origins "http://localhost:3000,http://example.com"
# Run with rate limiting
ai-gateway serve --rate-hourly 1000
# Run with cost limits
ai-gateway serve --cost-daily 100.0 --cost-monthly 1000.0
# Run with custom database connections
ai-gateway serve --clickhouse-url "clickhouse://localhost:9000"
Download the sample configuration from our repo.
- Copy the example config file:
curl -sL https://raw.githubusercontent.com/langdb/ai-gateway/main/config.sample.yaml -o config.sample.yaml
cp config.sample.yaml config.yaml
Command line options will override corresponding config file settings when both are specified.
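The precedence rule can be pictured as a simple merge. This is only a sketch of the idea, assuming flat settings and that unset command line options are represented as `None`. `merge_settings` is a hypothetical name and does not reflect the gateway's internals.

```python
def merge_settings(config_file, cli_options):
    """Return config file settings with explicitly set CLI options on top."""
    merged = dict(config_file)
    for key, value in cli_options.items():
        # None means the flag was not passed, so the config file value stays.
        if value is not None:
            merged[key] = value
    return merged
```

For example, a config file with `port: 8080` combined with `--port 3000` on the command line would end up serving on port 3000, while the host from the config file is left untouched.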
Rate limiting helps prevent API abuse by limiting the number of requests within a time window. Configure rate limits using:
# Limit to 1000 requests per hour
ai-gateway serve --rate-hourly 1000
Or in config.yaml:
rate_limit:
  hourly: 1000
  daily: 10000
  monthly: 100000
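One common way to limit requests within a time window is a fixed-window counter. The sketch below illustrates that technique only. The algorithm the gateway actually uses is not specified here, and `FixedWindowLimiter` is a hypothetical class.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests in each window of `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        # Start a fresh window on the first request or once the old one expires.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True
```

An hourly limit of 1000 would correspond to `FixedWindowLimiter(1000, 3600)`. A rejected call is the point at which a server would answer with a 429.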
Cost control helps manage API spending by setting daily, monthly, or total cost limits. Configure cost limits using:
# Set daily, monthly, and total limits
ai-gateway serve \
  --cost-daily 100.0 \
  --cost-monthly 1000.0 \
  --cost-total 5000.0
Or in config.yaml:
cost_control:
  daily: 100.0    # $100 per day
  monthly: 1000.0 # $1000 per month
  total: 5000.0   # $5000 total
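As an illustration of how several cost limits interact, the sketch below checks accumulated spend against each configured period. It assumes spend is tracked per period in dollars and that a limit counts as hit once spend reaches it. `exceeded_limits` is a hypothetical helper, not the gateway's implementation.

```python
def exceeded_limits(spend, limits):
    """Return the periods whose accumulated spend has reached the limit."""
    return [
        period
        for period, limit in limits.items()
        # A period with no recorded spend is treated as 0.0.
        if spend.get(period, 0.0) >= limit
    ]
```

With the limits above, a day that has already cost $100 would be reported as `["daily"]` even though the monthly and total budgets still have room.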
When a cost limit is reached, the API will return a 429 response with a message indicating the limit has been exceeded.
When a rate limit is exceeded, the API will return a 429 (Too Many Requests) response.
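Clients usually need to handle these 429 responses. A minimal retry sketch with exponential backoff follows, assuming the request is made with `urllib` so that a 429 surfaces as `urllib.error.HTTPError`. `call_with_backoff` and its parameters are illustrative names.

```python
import time
import urllib.error


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying on HTTP 429 with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as error:
            # Re-raise anything that is not a 429, and the final 429 as well.
            if error.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Retrying is mainly useful for rate limits, which reset when the window rolls over. A 429 caused by a cost limit is unlikely to clear after a few seconds, so a caller may prefer to inspect the response message before retrying.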
LangDB AI Gateway empowers you to implement sophisticated routing strategies for your LLM requests. By utilizing features such as fallback routing, script-based routing, and latency-based routing, you can optimize your AI traffic to balance cost, speed, and availability.
Here's an example of a dynamic routing configuration. The router type can be fallback, script, optimized, percentage, or latency:
{
  "model": "router/dynamic",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the formula of a square plot?" }
  ],
  "router": {
    "router": "router",
    "type": "fallback",
    "targets": [
      { "model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 },
      { "model": "deepseek/deepseek-chat", "frequency_penalty": 1, "presence_penalty": 0.6 }
    ]
  },
  "stream": false
}
This configuration demonstrates how you can define multiple targets with specific parameters to ensure your requests are handled by the most suitable models. For more detailed information, explore our routing documentation.
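To illustrate what fallback routing means, the sketch below tries each target in order and returns the first successful result. The gateway performs its routing server-side. This client-side version only demonstrates the idea, and `route_with_fallback` and `call_model` are hypothetical names.

```python
def route_with_fallback(targets, call_model):
    """Try each target in order and return (model, result) for the first success."""
    errors = []
    for target in targets:
        try:
            return target["model"], call_model(target)
        except Exception as error:
            # Remember why this target failed, then move on to the next one.
            errors.append((target["model"], error))
    raise RuntimeError(f"all targets failed: {errors}")
```

Here `call_model` stands for whatever function sends a single target's request. With the two targets from the configuration above, a failure of `openai/gpt-4o-mini` would lead to `deepseek/deepseek-chat` being tried next.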
The gateway supports OpenTelemetry tracing with ClickHouse as the storage backend. All traces are stored in the langdb.traces table.
- Create the traces table in ClickHouse:
# Create langdb database if it doesn't exist
clickhouse-client --query "CREATE DATABASE IF NOT EXISTS langdb"
# Import the traces table schema
clickhouse-client --query "$(cat sql/traces.sql)"
- Enable tracing by providing the ClickHouse URL when running the server:
ai-gateway serve --clickhouse-url "clickhouse://localhost:9000"
You can also set the URL in your config.yaml:
clickhouse:
  url: "http://localhost:8123"
The traces are stored in the langdb.traces table. Here are some example queries:
-- Get recent traces
SELECT
    trace_id,
    operation_name,
    start_time_us,
    finish_time_us,
    (finish_time_us - start_time_us) AS duration_us
FROM langdb.traces
WHERE finish_date >= today() - 1
ORDER BY finish_time_us DESC
LIMIT 10;
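Queries like this can also be run from a script through ClickHouse's HTTP interface. The sketch below assumes ClickHouse is reachable at http://localhost:8123 without authentication and reuses the column names from the query above. `build_query_url`, `parse_rows`, and `fetch_rows` are illustrative helpers.

```python
import json
import urllib.parse
import urllib.request

# Assumption: ClickHouse HTTP interface on the default local port, no auth.
CLICKHOUSE_HTTP = "http://localhost:8123"

# Slowest traces of the last day, one JSON object per output line.
SLOWEST_TRACES = """
SELECT trace_id, operation_name, (finish_time_us - start_time_us) AS duration_us
FROM langdb.traces
WHERE finish_date >= today() - 1
ORDER BY duration_us DESC
LIMIT 5
FORMAT JSONEachRow
"""


def build_query_url(sql, base_url=CLICKHOUSE_HTTP):
    """Encode a SQL statement into the `query` URL parameter."""
    return f"{base_url}/?{urllib.parse.urlencode({'query': sql.strip()})}"


def parse_rows(body):
    """Parse JSONEachRow output: one JSON object per non-empty line."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


def fetch_rows(sql):
    """Run a read-only query and return the rows as dictionaries."""
    with urllib.request.urlopen(build_query_url(sql)) as response:
        return parse_rows(response.read().decode("utf-8"))
```

Calling `fetch_rows(SLOWEST_TRACES)` against a populated instance would list the five longest operations, which is a quick way to spot slow providers or models.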
Did you know you can call LangDB APIs directly within ClickHouse? Check out our UDF documentation to learn how to use LLMs in your SQL queries!
For a complete setup including ClickHouse for analytics and tracing, follow these steps:
- Start the services using Docker Compose:
docker-compose up -d
This will start:
- ClickHouse server on port 8123 (HTTP)
- All necessary configurations will be loaded from docker/clickhouse/server/config.d
- Build and run the gateway:
ai-gateway
The gateway will now be running with full analytics and logging capabilities, storing data in ClickHouse.
Example request that uses an MCP server:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Ping the server using the tool and return the response"}],
    "mcp_servers": [{"server_url": "http://localhost:3004"}]
  }'
To get started with development:
- Clone the repository
- Copy config.sample.yaml to config.yaml and configure as needed
- Run cargo build to compile
- Run cargo test to run tests
We welcome contributions! Please check out our Contributing Guide for guidelines on:
- How to submit issues
- How to submit pull requests
- Code style conventions
- Development workflow
- Testing requirements
The gateway uses tracing for logging. Set the RUST_LOG environment variable to control log levels:
RUST_LOG=debug cargo run serve # For detailed logs
RUST_LOG=info cargo run serve # For standard logs
This project is released under the Apache License 2.0. See the license file for more information.
Alternative AI tools for ai-gateway
Similar Open Source Tools
scylla
Scylla is an intelligent proxy pool tool designed for humanities, enabling users to extract content from the internet and build their own Large Language Models in the AI era. It features automatic proxy IP crawling and validation, an easy-to-use JSON API, a simple web-based user interface, HTTP forward proxy server, Scrapy and requests integration, and headless browser crawling. Users can start using Scylla with just one command, making it a versatile tool for various web scraping and content extraction tasks.
e2m
E2M is a Python library that can parse and convert various file types into Markdown format. It supports the conversion of multiple file formats, including doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, and m4a. The ultimate goal of the E2M project is to provide high-quality data for Retrieval-Augmented Generation (RAG) and model training or fine-tuning. The core architecture consists of a Parser responsible for parsing various file types into text or image data, and a Converter responsible for converting text or image data into Markdown format.
langchainrb
Langchain.rb is a Ruby library that makes it easy to build LLM-powered applications. It provides a unified interface to a variety of LLMs, vector search databases, and other tools, making it easy to build and deploy RAG (Retrieval Augmented Generation) systems and assistants. Langchain.rb is open source and available under the MIT License.
langcorn
LangCorn is an API server that enables you to serve LangChain models and pipelines with ease, leveraging the power of FastAPI for a robust and efficient experience. It offers features such as easy deployment of LangChain models and pipelines, ready-to-use authentication functionality, high-performance FastAPI framework for serving requests, scalability and robustness for language processing applications, support for custom pipelines and processing, well-documented RESTful API endpoints, and asynchronous processing for faster response times.
client-python
The Mistral Python Client is a tool inspired by cohere-python that allows users to interact with the Mistral AI API. It provides functionalities to access and utilize the AI capabilities offered by Mistral. Users can easily install the client using pip and manage dependencies using poetry. The client includes examples demonstrating how to use the API for various tasks, such as chat interactions. To get started, users need to obtain a Mistral API Key and set it as an environment variable. Overall, the Mistral Python Client simplifies the integration of Mistral AI services into Python applications.
json-repair
JSON Repair is a toolkit designed to address JSON anomalies that can arise from Large Language Models (LLMs). It offers a comprehensive solution for repairing JSON strings, ensuring accuracy and reliability in your data processing. With its user-friendly interface and extensive capabilities, JSON Repair empowers developers to seamlessly integrate JSON repair into their workflows.
pocketgroq
PocketGroq is a tool that provides advanced functionalities for text generation, web scraping, web search, and AI response evaluation. It includes features like an Autonomous Agent for answering questions, web crawling and scraping capabilities, enhanced web search functionality, and flexible integration with Ollama server. Users can customize the agent's behavior, evaluate responses using AI, and utilize various methods for text generation, conversation management, and Chain of Thought reasoning. The tool offers comprehensive methods for different tasks, such as initializing RAG, error handling, and tool management. PocketGroq is designed to enhance development processes and enable the creation of AI-powered applications with ease.
mergoo
Mergoo is a library for easily merging multiple LLM experts and efficiently training the merged LLM. With Mergoo, you can efficiently integrate the knowledge of different generic or domain-based LLM experts. Mergoo supports several merging methods, including Mixture-of-Experts, Mixture-of-Adapters, and Layer-wise merging. It also supports various base models, including LLaMa, Mistral, and BERT, and trainers, including Hugging Face Trainer, SFTrainer, and PEFT. Mergoo provides flexible merging for each layer and supports training choices such as only routing MoE layers or fully fine-tuning the merged LLM.
candle-vllm
Candle-vllm is an efficient and easy-to-use platform designed for inference and serving local LLMs, featuring an OpenAI compatible API server. It offers a highly extensible trait-based system for rapid implementation of new module pipelines, streaming support in generation, efficient management of key-value cache with PagedAttention, and continuous batching. The tool supports chat serving for various models and provides a seamless experience for users to interact with LLMs through different interfaces.
clarifai-python
The Clarifai Python SDK offers a comprehensive set of tools for integrating Clarifai's AI platform into your applications, covering computer vision capabilities such as classification, detection, and segmentation, and natural language capabilities such as classification, summarisation, generation, and Q&A. With just a few lines of code, you can leverage cutting-edge artificial intelligence to unlock valuable insights from visual and textual content.
starknet-agent-kit
starknet-agent-kit is a NestJS-based toolkit for creating AI agents that can interact with the Starknet blockchain. It allows users to perform various actions such as retrieving account information, creating accounts, transferring assets, playing with DeFi, interacting with dApps, and executing RPC read methods. The toolkit provides a secure environment for developing AI agents while emphasizing caution when handling sensitive information. Users can make requests to the Starknet agent via API endpoints and utilize tools from Langchain directly.
agentlang
AgentLang is an open-source programming language and framework designed for solving complex tasks with the help of AI agents. It allows users to build business applications rapidly from high-level specifications, making it more efficient than traditional programming languages. The language is data-oriented and declarative, with a syntax that is intuitive and closer to natural languages. AgentLang introduces innovative concepts such as first-class AI agents, graph-based hierarchical data model, zero-trust programming, declarative dataflow, resolvers, interceptors, and entity-graph-database mapping.
mcp-framework
MCP-Framework is a TypeScript framework for building Model Context Protocol (MCP) servers with automatic directory-based discovery for tools, resources, and prompts. It provides powerful abstractions, simple server setup, and a CLI for rapid development and project scaffolding.
UHGEval
UHGEval is a comprehensive framework designed for evaluating the hallucination phenomena. It includes UHGEval, a framework for evaluating hallucination, XinhuaHallucinations dataset, and UHGEval-dataset pipeline for creating XinhuaHallucinations. The framework offers flexibility and extensibility for evaluating common hallucination tasks, supporting various models and datasets. Researchers can use the open-source pipeline to create customized datasets. Supported tasks include QA, dialogue, summarization, and multi-choice tasks.
ai00_server
AI00 RWKV Server is an inference API server for the RWKV language model based upon the web-rwkv inference engine. It supports VULKAN parallel and concurrent batched inference and can run on all GPUs that support VULKAN. No need for Nvidia cards!!! AMD cards and even integrated graphics can be accelerated!!! No need for bulky pytorch, CUDA and other runtime environments, it's compact and ready to use out of the box! Compatible with OpenAI's ChatGPT API interface. 100% open source and commercially usable, under the MIT license. If you are looking for a fast, efficient, and easy-to-use LLM API server, then AI00 RWKV Server is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.
For similar tasks
AIG-ModelMatching-For-MSFS
This tool is an AIG install for MSFS ONLY EXCLUDING offline AI flight plans. It provides a solution to model matching for online networks along with providing a tool to inject live traffic to your simulator, directly from Flightradar24. The tool is designed for use with online virtual traffic networks like VATSIM, but it will also work for offline traffic. A VMR File for VATSIM usage has been included in the folder.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM:
- Set LLM usage limits for users on different pricing tiers
- Track LLM usage on a per user and per organization basis
- Block or redact requests containing PIIs
- Improve LLM reliability with failovers, retries and caching
- Distribute API keys with rate limits and cost limits for internal development/production use cases
- Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.