
MCP-Bridge
A middleware to provide an openAI compatible endpoint that can call MCP tools
Stars: 308

MCP-Bridge is a middleware tool designed to provide an openAI compatible endpoint for calling MCP tools. It acts as a bridge between the OpenAI API and MCP tools, allowing developers to leverage MCP tools through the OpenAI API interface. The tool facilitates the integration of MCP tools with the OpenAI API by providing endpoints for interaction. It supports non-streaming and streaming chat completions with MCP, as well as non-streaming completions without MCP. The tool is designed to work with inference engines that support tool call functionalities, such as vLLM and ollama. Installation can be done using Docker or manually, and the application can be run to interact with the OpenAI API. Configuration involves editing the config.json file to add new MCP servers. Contributions to the tool are welcome under the MIT License.
README:
MCP-Bridge acts as a bridge between the OpenAI API and MCP (MCP) tools, allowing developers to leverage MCP tools through the OpenAI API interface.
MCP-Bridge is designed to facilitate the integration of MCP tools with the OpenAI API. It provides a set of endpoints that can be used to interact with MCP tools in a way that is compatible with the OpenAI API. This allows you to use any client with any MCP tool without explicit support for MCP. For example, see this example of using Open Web UI with the official MCP fetch tool.
working features:
-
non streaming chat completions with MCP
-
streaming chat completions with MCP
-
non streaming completions without MCP
-
MCP tools
-
MCP sampling
-
SSE Bridge for external clients
planned features:
-
streaming completions are not implemented yet
-
MCP resources are planned to be supported
The recommended way to install MCP-Bridge is to use Docker. See the example compose.yml file for an example of how to set up docker.
Note that this requires an inference engine with tool call support. I have tested this with vLLM with success, though ollama should also be compatible.
-
Clone the repository
-
Edit the compose.yml file
You will need to add a reference to the config.json file in the compose.yml file. Pick any of
- add the config.json file to the same directory as the compose.yml file and use a volume mount (you will need to add the volume manually)
- add a http url to the environment variables to download the config.json file from a url
- add the config json directly as an environment variable
see below for an example of each option:
environment:
- MCP_BRIDGE__CONFIG__FILE=config.json # mount the config file for this to work
- MCP_BRIDGE__CONFIG__HTTP_URL=http://10.88.100.170:8888/config.json
- MCP_BRIDGE__CONFIG__JSON={"inference_server":{"base_url":"http://example.com/v1","api_key":"None"},"mcp_servers":{"fetch":{"command":"uvx","args":["mcp-server-fetch"]}}}
The mount point for using the config file would look like:
volumes:
- ./config.json:/mcp_bridge/config.json
- run the service
docker-compose up --build -d
If you want to run the application without docker, you will need to install the requirements and run the application manually.
-
Clone the repository
-
Set up a dependencies:
uv sync
- Create a config.json file in the root directory
Here is an example config.json file:
{
"inference_server": {
"base_url": "http://example.com/v1",
"api_key": "None"
},
"mcp_servers": {
"fetch": {
"command": "uvx",
"args": ["mcp-server-fetch"]
}
}
}
- Run the application:
uv run mcp_bridge/main.py
Once the application is running, you can interact with it using the OpenAI API.
View the documentation at http://yourserver:8000/docs. There is an endpoint to list all the MCP tools available on the server, which you can use to test the application configuration.
MCP-Bridge exposes many rest api endpoints for interacting with all of the native MCP primatives. This lets you outsource the complexity of dealing with MCP servers to MCP-Bridge without comprimising on functionality. See the openapi docs for examples of how to use this functionality.
MCP-Bridge also provides an SSE bridge for external clients. This lets external chat apps with explicit MCP support use MCP-Bridge as a MCP server. Point your client at the SSE endpoint (http://yourserver:8000/mcp-server/sse) and you should be able to see all the MCP tools available on the server.
This also makes it easy to test if your configuration is working correctly. You can use wong2/mcp-cli to test your configuration. npx @wong2/mcp-cli --sse http://localhost:8000/mcp-server/sse
If you want to use the tools inside of claude desktop or other STDIO
only MCP clients, you can do this with a tool such as lightconetech/mcp-gateway
To add new MCP servers, edit the config.json file.
an example config.json file with most of the options explicitly set:
{
"inference_server": {
"base_url": "http://localhost:8000/v1",
"api_key": "None"
},
"sampling": {
"timeout": 10,
"models": [
{
"model": "gpt-4o",
"intelligence": 0.8,
"cost": 0.9,
"speed": 0.3
},
{
"model": "gpt-4o-mini",
"intelligence": 0.4,
"cost": 0.1,
"speed": 0.7
}
]
},
"mcp_servers": {
"fetch": {
"command": "uvx",
"args": [
"mcp-server-fetch"
]
}
},
"network": {
"host": "0.0.0.0",
"port": 9090
},
"logging": {
"log_level": "DEBUG"
}
}
Section | Description |
---|---|
inference_server | The inference server configuration |
mcp_servers | The MCP servers configuration |
network | uvicorn network configuration |
logging | The logging configuration |
If you encounter any issues please open an issue or join the discord.
There is also documentation available here.
The application sits between the OpenAI API and the inference engine. An incoming request is modified to include tool definitions for all MCP tools available on the MCP servers. The request is then forwarded to the inference engine, which uses the tool definitions to create tool calls. MCP bridge then manage the calls to the tools. The request is then modified to include the tool call results, and is returned to the inference engine again so the LLM can create a response. Finally, the response is returned to the OpenAI API.
sequenceDiagram
participant OpenWebUI as Open Web UI
participant MCPProxy as MCP Proxy
participant MCPserver as MCP Server
participant InferenceEngine as Inference Engine
OpenWebUI ->> MCPProxy: Request
MCPProxy ->> MCPserver: list tools
MCPserver ->> MCPProxy: list of tools
MCPProxy ->> InferenceEngine: Forward Request
InferenceEngine ->> MCPProxy: Response
MCPProxy ->> MCPserver: call tool
MCPserver ->> MCPProxy: tool response
MCPProxy ->> InferenceEngine: llm uses tool response
InferenceEngine ->> MCPProxy: Response
MCPProxy ->> OpenWebUI: Return Response
Contributions to MCP-Bridge are welcome! To contribute, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them.
- Push your changes to your fork.
- Create a pull request to the main repository.
MCP-Bridge is licensed under the MIT License. See the LICENSE file for more information.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for MCP-Bridge
Similar Open Source Tools

MCP-Bridge
MCP-Bridge is a middleware tool designed to provide an openAI compatible endpoint for calling MCP tools. It acts as a bridge between the OpenAI API and MCP tools, allowing developers to leverage MCP tools through the OpenAI API interface. The tool facilitates the integration of MCP tools with the OpenAI API by providing endpoints for interaction. It supports non-streaming and streaming chat completions with MCP, as well as non-streaming completions without MCP. The tool is designed to work with inference engines that support tool call functionalities, such as vLLM and ollama. Installation can be done using Docker or manually, and the application can be run to interact with the OpenAI API. Configuration involves editing the config.json file to add new MCP servers. Contributions to the tool are welcome under the MIT License.

otto-m8
otto-m8 is a flowchart based automation platform designed to run deep learning workloads with minimal to no code. It provides a user-friendly interface to spin up a wide range of AI models, including traditional deep learning models and large language models. The tool deploys Docker containers of workflows as APIs for integration with existing workflows, building AI chatbots, or standalone applications. Otto-m8 operates on an Input, Process, Output paradigm, simplifying the process of running AI models into a flowchart-like UI.

AutoNode
AutoNode is a self-operating computer system designed to automate web interactions and data extraction processes. It leverages advanced technologies like OCR (Optical Character Recognition), YOLO (You Only Look Once) models for object detection, and a custom site-graph to navigate and interact with web pages programmatically. Users can define objectives, create site-graphs, and utilize AutoNode via API to automate tasks on websites. The tool also supports training custom YOLO models for object detection and OCR for text recognition on web pages. AutoNode can be used for tasks such as extracting product details, automating web interactions, and more.

Figma-Context-MCP
Figma-Context-MCP is a plugin for Figma that allows users to easily manage and switch between multiple design contexts within a single Figma file. This tool simplifies the process of working on different design variations or versions by providing a seamless way to organize and switch between them. With Figma-Context-MCP, designers can streamline their workflow and improve collaboration by keeping all design contexts in one place and easily accessible. This plugin enhances productivity and efficiency for Figma users who frequently work on multiple design iterations or versions within a project.

langchain
LangChain is a framework for developing Elixir applications powered by language models. It enables applications to connect language models to other data sources and interact with the environment. The library provides components for working with language models and off-the-shelf chains for specific tasks. It aims to assist in building applications that combine large language models with other sources of computation or knowledge. LangChain is written in Elixir and is not aimed for parity with the JavaScript and Python versions due to differences in programming paradigms and design choices. The library is designed to make it easy to integrate language models into applications and expose features, data, and functionality to the models.

ai-component-generator
AI Component Generator with ChatGPT is a project that utilizes OpenAI's ChatGPT and Vercel Edge functions to generate various UI components based on user input. It allows users to export components in HTML format or choose combinations of Tailwind CSS, Next.js, React.js, or Material UI. The tool can be used to quickly bootstrap projects and create custom UI components. Users can run the project locally with Next.js and TailwindCSS, and customize ChatGPT prompts to generate specific components or code snippets. The project is open for contributions and aims to simplify the process of creating UI components with AI assistance.

neo4j-graphrag-python
The Neo4j GraphRAG package for Python is an official repository that provides features for creating and managing vector indexes in Neo4j databases. It aims to offer developers a reliable package with long-term commitment, maintenance, and fast feature updates. The package supports various Python versions and includes functionalities for creating vector indexes, populating them, and performing similarity searches. It also provides guidelines for installation, examples, and development processes such as installing dependencies, making changes, and running tests.

aiconfig
AIConfig is a framework that makes it easy to build generative AI applications for production. It manages generative AI prompts, models and model parameters as JSON-serializable configs that can be version controlled, evaluated, monitored and opened in a local editor for rapid prototyping. It allows you to store and iterate on generative AI behavior separately from your application code, offering a streamlined AI development workflow.

mastra
Mastra is an opinionated Typescript framework designed to help users quickly build AI applications and features. It provides primitives such as workflows, agents, RAG, integrations, syncs, and evals. Users can run Mastra locally or deploy it to a serverless cloud. The framework supports various LLM providers, offers tools for building language models, workflows, and accessing knowledge bases. It includes features like durable graph-based state machines, retrieval-augmented generation, integrations, syncs, and automated tests for evaluating LLM outputs.

Trinity
Trinity is an Explainable AI (XAI) Analysis and Visualization tool designed for Deep Learning systems or other models performing complex classification or decoding. It provides performance analysis through interactive 3D projections that are hyper-dimensional aware, allowing users to explore hyperspace, hypersurface, projections, and manifolds. Trinity primarily works with JSON data formats and supports the visualization of FeatureVector objects. Users can analyze and visualize data points, correlate inputs with classification results, and create custom color maps for better data interpretation. Trinity has been successfully applied to various use cases including Deep Learning Object detection models, COVID gene/tissue classification, Brain Computer Interface decoders, and Large Language Model (ChatGPT) Embeddings Analysis.

odoo-expert
RAG-Powered Odoo Documentation Assistant is a comprehensive documentation processing and chat system that converts Odoo's documentation to a searchable knowledge base with an AI-powered chat interface. It supports multiple Odoo versions (16.0, 17.0, 18.0) and provides semantic search capabilities powered by OpenAI embeddings. The tool automates the conversion of RST to Markdown, offers real-time semantic search, context-aware AI-powered chat responses, and multi-version support. It includes a Streamlit-based web UI, REST API for programmatic access, and a CLI for document processing and chat. The system operates through a pipeline of data processing steps and an interface layer for UI and API access to the knowledge base.

julius-gpt
julius-gpt is a Node.js CLI and API tool that enables users to generate content such as blog posts and landing pages using Large Language Models (LLMs) like OpenAI. It supports generating text in multiple languages provided by the available LLMs. The tool offers different modes for content generation, including automatic, interactive, or using a content template. Users can fine-tune the content generation process with completion parameters and create SEO-friendly content with post titles, descriptions, and slugs. Additionally, users can publish content on WordPress and access upcoming features like image generation and RAG. The tool also supports custom prompts for personalized content generation and offers various commands for WordPress-related tasks.

promptwright
Promptwright is a Python library designed for generating large synthetic datasets using a local LLM and various LLM service providers. It offers flexible interfaces for generating prompt-led synthetic datasets. The library supports multiple providers, configurable instructions and prompts, YAML configuration for tasks, command line interface for running tasks, push to Hugging Face Hub for dataset upload, and system message control. Users can define generation tasks using YAML configuration or Python code. Promptwright integrates with LiteLLM to interface with LLM providers and supports automatic dataset upload to Hugging Face Hub.

promptwright
Promptwright is a Python library designed for generating large synthetic datasets using local LLM and various LLM service providers. It offers flexible interfaces for generating prompt-led synthetic datasets. The library supports multiple providers, configurable instructions and prompts, YAML configuration, command line interface, push to Hugging Face Hub, and system message control. Users can define generation tasks using YAML configuration files or programmatically using Python code. Promptwright integrates with LiteLLM for LLM providers and supports automatic dataset upload to Hugging Face Hub. The library is not responsible for the content generated by models and advises users to review the data before using it in production environments.

ActionWeaver
ActionWeaver is an AI application framework designed for simplicity, relying on OpenAI and Pydantic. It supports both OpenAI API and Azure OpenAI service. The framework allows for function calling as a core feature, extensibility to integrate any Python code, function orchestration for building complex call hierarchies, and telemetry and observability integration. Users can easily install ActionWeaver using pip and leverage its capabilities to create, invoke, and orchestrate actions with the language model. The framework also provides structured extraction using Pydantic models and allows for exception handling customization. Contributions to the project are welcome, and users are encouraged to cite ActionWeaver if found useful.

rag-experiment-accelerator
The RAG Experiment Accelerator is a versatile tool that helps you conduct experiments and evaluations using Azure AI Search and RAG pattern. It offers a rich set of features, including experiment setup, integration with Azure AI Search, Azure Machine Learning, MLFlow, and Azure OpenAI, multiple document chunking strategies, query generation, multiple search types, sub-querying, re-ranking, metrics and evaluation, report generation, and multi-lingual support. The tool is designed to make it easier and faster to run experiments and evaluations of search queries and quality of response from OpenAI, and is useful for researchers, data scientists, and developers who want to test the performance of different search and OpenAI related hyperparameters, compare the effectiveness of various search strategies, fine-tune and optimize parameters, find the best combination of hyperparameters, and generate detailed reports and visualizations from experiment results.
For similar tasks

MCP-Bridge
MCP-Bridge is a middleware tool designed to provide an openAI compatible endpoint for calling MCP tools. It acts as a bridge between the OpenAI API and MCP tools, allowing developers to leverage MCP tools through the OpenAI API interface. The tool facilitates the integration of MCP tools with the OpenAI API by providing endpoints for interaction. It supports non-streaming and streaming chat completions with MCP, as well as non-streaming completions without MCP. The tool is designed to work with inference engines that support tool call functionalities, such as vLLM and ollama. Installation can be done using Docker or manually, and the application can be run to interact with the OpenAI API. Configuration involves editing the config.json file to add new MCP servers. Contributions to the tool are welcome under the MIT License.

simple-openai
Simple-OpenAI is a Java library that provides a simple way to interact with the OpenAI API. It offers consistent interfaces for various OpenAI services like Audio, Chat Completion, Image Generation, and more. The library uses CleverClient for HTTP communication, Jackson for JSON parsing, and Lombok to reduce boilerplate code. It supports asynchronous requests and provides methods for synchronous calls as well. Users can easily create objects to communicate with the OpenAI API and perform tasks like text-to-speech, transcription, image generation, and chat completions.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.