
MCPJungle
Self-hosted MCP Gateway and Registry for AI agents
Stars: 479

MCPJungle is a self-hosted MCP Gateway for private AI agents that acts as a registry for Model Context Protocol (MCP) servers. Developers use it to manage servers and tools centrally, while MCP clients discover and consume those tools through a single 'Gateway' MCP server. It suits developers using MCP clients like Claude and Cursor, teams building production-grade AI agents, and organisations that want to manage all client-server interactions in one place. The README below covers the quick start, installation, usage, server and client setup, connecting Claude and Cursor, enabling/disabling tools, tool groups, authentication, and enterprise features such as access control and OpenTelemetry metrics. Current limitations include the lack of long-running connections to servers and no support for the OAuth flow. Contributions are welcome.
README:
Self-hosted MCP Gateway for your private AI agents
MCPJungle is a single source-of-truth registry for all Model Context Protocol Servers running in your Organisation.
🧑‍💻 Developers use it to register & manage MCP servers and the tools they provide from a central place.
🤖 MCP Clients use it to discover and consume all these tools from a single "Gateway" MCP Server.
MCPJungle is the only MCP Server your AI agents need to connect to!
- Developers using MCP Clients like Claude & Cursor that need to access MCP servers for tool-calling
- Developers building production-grade AI Agents that need to access MCP servers with built-in security, privacy and Access Control.
- Organisations wanting to view & manage all MCP client-server interactions from a central place. Hosted in their own datacenter 🔒
- Quick Start guide
- Installation
- Usage
- Limitations
- Contributing
This quickstart guide will show you how to:
- Start the MCPJungle server locally using docker compose
- Register a simple MCP server in mcpjungle
- Connect your Claude to mcpjungle to access your MCP tools
curl -O https://raw.githubusercontent.com/mcpjungle/MCPJungle/refs/heads/main/docker-compose.yaml
docker compose up -d
Download the mcpjungle CLI on your local machine, either using brew or directly from the Releases Page.
brew install mcpjungle/mcpjungle/mcpjungle
The CLI lets you manage everything in mcpjungle.
Next, let's add an MCP server to mcpjungle using the CLI. For this example, we'll use context7.
mcpjungle register --name context7 --url https://mcp.context7.com/mcp
Use the following configuration for your Claude MCP servers config:
{
"mcpServers": {
"mcpjungle": {
"command": "npx",
"args": [
"mcp-remote",
"http://localhost:8080/mcp",
"--allow-http"
]
}
}
}
Once mcpjungle is added as an MCP server to Claude, try asking it the following:
Use context7 to get the documentation for `/lodash/lodash`
Claude will then attempt to call the context7__get-library-docs tool via MCPJungle, which will return the documentation for the Lodash library.
Congratulations! 🎉
You have successfully registered a remote MCP server in MCPJungle and called one of its tools via Claude.
You can now proceed to play around with mcpjungle and explore the documentation & CLI for more details.
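For example, you can inspect the tool you just called from your terminal as well (the usage command is covered in the Usage examples later in this README):
# check usage details for the context7 documentation tool
mcpjungle usage context7__get-library-docs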
[!WARNING] MCPJungle is BETA software.
We're actively working to make it production-ready. You can provide your feedback by starting a discussion in this repository.
MCPJungle is shipped as a stand-alone binary.
You can either download it from the Releases Page or use Homebrew to install it:
brew install mcpjungle/mcpjungle/mcpjungle
Verify your installation by running
mcpjungle version
[!IMPORTANT] On macOS, you will have to use Homebrew because the compiled binary is not notarized yet.
MCPJungle provides a Docker image which is useful for running the registry server (more about it later).
docker pull mcpjungle/mcpjungle
MCPJungle has a Client-Server architecture and the binary lets you run both the Server and the Client.
The MCPJungle server is responsible for managing all the MCP servers registered in it and providing a unified MCP gateway for AI Agents to discover and call tools provided by these registered servers.
The gateway itself runs over the streamable HTTP transport and is accessible at the /mcp endpoint.
For running the MCPJungle server locally, docker compose is the recommended way:
# docker-compose.yaml is optimized for individuals running mcpjungle on their local machines for personal use.
# mcpjungle will run in `development` mode by default.
curl -O https://raw.githubusercontent.com/mcpjungle/MCPJungle/refs/heads/main/docker-compose.yaml
docker compose up -d
# docker-compose.prod.yaml is optimized for orgs deploying mcpjungle on a remote server for multiple users.
# mcpjungle will run in `production` mode by default, which enables enterprise features.
curl -O https://raw.githubusercontent.com/mcpjungle/MCPJungle/refs/heads/main/docker-compose.prod.yaml
docker compose -f docker-compose.prod.yaml up -d
This will start the MCPJungle server along with a persistent Postgres database container.
You can quickly verify that the server is running:
curl http://localhost:8080/health
If you plan on registering stdio-based MCP servers that rely on npx or uvx, use mcpjungle's stdio-tagged Docker image instead.
MCPJUNGLE_IMAGE_TAG=latest-stdio docker compose up -d
[!NOTE] If you're using docker-compose.yaml, this is already the default image tag. You only need to specify the stdio image tag if you're using docker-compose.prod.yaml.
This image is significantly larger, but it is very convenient and recommended for running locally when you rely on stdio-based MCP servers.
For example, if you only want to register remote MCP servers like context7 and deepwiki, you can use the standard (minimal) image.
But if you also want to use stdio-based servers like filesystem, time, github, etc., you should use the stdio-tagged image instead.
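If you prefer to pull the image explicitly, a minimal sketch using the stdio tag shown in the MCPJUNGLE_IMAGE_TAG example above:
# pull the stdio-tagged image instead of the default minimal one
docker pull mcpjungle/mcpjungle:latest-stdio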
[!NOTE] If your stdio servers rely on tools other than npx or uvx, you will have to create a custom Docker image that includes those dependencies along with the mcpjungle binary.
Production Deployment
The default MCPJungle Docker image is very lightweight - it only contains a minimal base image and the mcpjungle binary.
It is therefore suitable and recommended for production deployments.
For the database, we recommend you deploy a separate Postgres DB cluster and supply its endpoint to mcpjungle (see Database section below).
You can see the definitions of the standard Docker image and the stdio Docker image.
You can also run the server directly on your host machine using the binary:
mcpjungle start
This starts the main registry server and MCP gateway, accessible on port 8080 by default.
The mcpjungle server relies on a database and, by default, creates a SQLite DB in the current working directory.
This is okay when you're just testing things out locally.
Alternatively, you can supply a DSN for a PostgreSQL database to the server:
export DATABASE_URL=postgres://admin:root@localhost:5432/mcpjungle_db
# run as a container (pass the DSN into the container)
docker run -e DATABASE_URL mcpjungle/mcpjungle:latest
# or run directly
mcpjungle start
Once the server is up, you can use the mcpjungle CLI to interact with it.
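For example, a quick way to confirm the CLI can reach the server (this command is shown again in the examples below; the list will be empty until you register an MCP server):
# list all tools currently registered in the gateway
mcpjungle list tools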
MCPJungle currently supports MCP servers using stdio and Streamable HTTP Transports.
Let's see how to register them in mcpjungle.
Let's say you're already running a streamable HTTP MCP server locally at http://127.0.0.1:8000/mcp which provides basic math tools like add, subtract, etc.
You can register this MCP server with MCPJungle:
mcpjungle register --name calculator --description "Provides some basic math tools" --url http://127.0.0.1:8000/mcp
If you used docker compose to run the server, and you're not on Linux, you will have to use host.docker.internal instead of your local loopback address.
mcpjungle register --name calculator --description "Provides some basic math tools" --url http://host.docker.internal:8000/mcp
The registry will now start tracking this MCP server and load its tools.
You can also provide a configuration file to register the MCP server:
cat ./calculator.json
{
"name": "calculator",
"transport": "streamable_http",
"description": "Provides some basic math tools",
"url": "http://127.0.0.1:8000/mcp"
}
mcpjungle register -c ./calculator.json
All tools provided by this server are now accessible via MCPJungle:
mcpjungle list tools
# Check tool usage
mcpjungle usage calculator__multiply
# Call a tool
mcpjungle invoke calculator__multiply --input '{"a": 100, "b": 50}'
[!NOTE] A tool in MCPJungle must be referred to by its canonical name, which follows the pattern <mcp-server-name>__<tool-name>. Server name and tool name are separated by a double underscore __.
e.g. If you register an MCP server github which provides a tool called git_commit, you can invoke it in MCPJungle using the name github__git_commit.
Your MCP client must also use this canonical name to call the tool via MCPJungle.
The config file format for registering a Streamable HTTP-based MCP server is:
{
"name": "<name of your mcp server>",
"transport": "streamable_http",
"description": "<description>",
"url": "<url of the mcp server>",
"bearer_token": "<optional bearer token for authentication>"
}
Here's an example configuration file (let's call it filesystem.json) for an MCP server that uses the STDIO transport:
{
"name": "filesystem",
"transport": "stdio",
"description": "filesystem mcp server",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
}
You can register this MCP server in MCPJungle by providing the configuration file:
mcpjungle register -c ./filesystem.json
The config file format for registering a STDIO-based MCP server is:
{
"name": "<name of your mcp server>",
"transport": "stdio",
"description": "<description>",
"command": "<command to run the mcp server, eg- 'npx', 'uvx'>",
"args": ["arguments", "to", "pass", "to", "the", "command"],
"env": {
"KEY": "value"
}
}
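As another sketch, a stdio server launched via uvx follows the same format; the package name below (mcp-server-time) is illustrative, so substitute whichever uvx-runnable server you actually use:
{
"name": "time",
"transport": "stdio",
"description": "time mcp server (illustrative example)",
"command": "uvx",
"args": ["mcp-server-time"]
}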
You can also watch a quick video on How to register a STDIO-based MCP server.
[!TIP] If your STDIO server fails or throws errors for some reason, check the mcpjungle server's logs to view its stderr output.
Limitation 🚧
MCPJungle creates a new connection when a tool is called. This means a new sub-process for a STDIO mcp server is started for every tool call.
This has some performance overhead but ensures that there are no memory leaks.
But it also means that currently MCPJungle doesn't support stateful connections with your MCP server.
We want to hear your feedback to improve this mechanism, feel free to create an issue, start a discussion or just reach out on Discord.
You can remove an MCP server from mcpjungle.
mcpjungle deregister calculator
mcpjungle deregister filesystem
Once removed, this MCP server and its tools are no longer available to you or your MCP clients.
Assuming that MCPJungle is running on http://localhost:8080, use the following configurations to connect to it:
{
"mcpServers": {
"mcpjungle": {
"command": "npx",
"args": [
"mcp-remote",
"http://localhost:8080/mcp",
"--allow-http"
]
}
}
}
{
"mcpServers": {
"mcpjungle": {
"url": "http://localhost:8080/mcp"
}
}
}
You can watch a quick video on How to connect Cursor to MCPJungle.
You can enable or disable a specific tool or all the tools provided by an MCP Server.
If a tool is disabled, it is not available via the MCPJungle Proxy, so no MCP clients can view or call it.
# disable the `get-library-docs` tool provided by the `context7` MCP server
mcpjungle disable context7__get-library-docs
# re-enable the tool
mcpjungle enable context7__get-library-docs
# disable all tools provided by the `context7` MCP server
mcpjungle disable context7
# re-enable all tools of `context7`
mcpjungle enable context7
A disabled tool is still accessible via mcpjungle's HTTP API, so humans can still manage it from the CLI (or any other HTTP client).
[!NOTE] When a new server is registered in MCPJungle, all its tools are enabled by default.
As you add more MCP servers to MCPJungle, the number of tools available through the Gateway can grow significantly.
If your MCP client is exposed to hundreds of tools through the gateway MCP, its performance may degrade.
MCPJungle allows you to expose only a subset of all available tools to your MCP clients using Tool Groups.
You can create a new group and only include specific tools that you wish to expose.
Once a group is created, mcpjungle returns a unique endpoint for it.
You can then configure your MCP client to use this group-specific endpoint instead of the main gateway endpoint.
You can create a new tool group by providing a JSON configuration file to the create group command.
You must specify a unique name for the group and a list of included_tools that you want to expose via its MCP proxy.
Here is an example of a tool group configuration file (claude-tools-group.json):
{
"name": "claude-tools",
"description": "This group only contains tools for Claude Desktop to use",
"included_tools": [
"filesystem__read_file",
"deepwiki__read_wiki_contents",
"time__get_current_time"
]
}
Instead of exposing 20 tools across all MCP servers, this group only exposes 3 handpicked ones.
You can create this group in mcpjungle:
$ mcpjungle create group -c ./claude-tools-group.json
Tool Group claude-tools created successfully
It is now accessible at the following streamable http endpoint:
http://127.0.0.1:8080/v0/groups/claude-tools/mcp
You can then configure Claude (or any other MCP client) to use this group-specific endpoint to access the MCP server.
The client will then ONLY see and be able to use these 3 tools and will not be aware of any other tools registered in MCPJungle.
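For example, a Cursor-style configuration pointing at this group endpoint (a sketch based on the Cursor config shown earlier; the server key name "mcpjungle-claude-tools" and the host/port are assumptions to adjust for your setup):
{
"mcpServers": {
"mcpjungle-claude-tools": {
"url": "http://127.0.0.1:8080/v0/groups/claude-tools/mcp"
}
}
}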
[!TIP] You can run mcpjungle list tools to view all available tools and pick the ones you want to include in your group.
You can also watch a Video on using Tool Groups.
You can currently perform operations like listing all groups, viewing details of a specific group and deleting a group.
# list all tool groups
mcpjungle list groups
# view details of a specific group
mcpjungle get group claude-tools
# delete a group
mcpjungle delete group claude-tools
[!NOTE] If a tool is included in a group but is later disabled globally or deleted, then it will not be available via the group's MCP endpoint.
But if the tool is re-enabled or added again later, it will automatically become available in the group again.
Limitations 🚧
- Currently, you cannot update an existing tool group. You must delete the group and create a new one with the modified configuration file.
- In production mode, currently only an admin can create a Tool Group. We're working on allowing standard users to create their own groups as well.
MCPJungle currently supports authentication if your Streamable HTTP MCP Server accepts static tokens for auth.
This is useful when using SaaS-provided MCP Servers like HuggingFace, Stripe, etc. which require your API token for authentication.
You can supply your token while registering the MCP server:
# If you specify the `--bearer-token` flag, MCPJungle will add the `Authorization: Bearer <token>` header to all requests made to this MCP server.
mcpjungle register --name huggingface --description "HuggingFace MCP Server" --url https://huggingface.co/mcp --bearer-token <your-hf-api-token>
Or from your configuration file:
{
"name": "huggingface",
"transport": "streamable_http",
"url": "https://huggingface.co/mcp",
"description": "hugging face mcp server",
"bearer_token": "<your-hf-api-token>"
}
Support for the OAuth flow is coming soon!
If you're running MCPJungle in your organisation, we recommend running the server in production mode:
# enable enterprise features by running in production mode
mcpjungle start --prod
# you can also specify the server mode as environment variable (valid values are `development` and `production`)
export SERVER_MODE=production
mcpjungle start
# Or use the production docker compose file as described above
docker compose -f docker-compose.prod.yaml up -d
By default, the mcpjungle server runs in development mode, which is ideal for individuals running it locally.
In Production mode, the server enforces stricter security policies and will provide additional features like Authentication, ACLs, observability and more.
After starting the server in production mode, you must initialize it by running the following command on your client machine:
mcpjungle init-server
This will create an admin user in the server and store its API access token in your home directory (~/.mcpjungle.conf).
You can then use the mcpjungle CLI to make authenticated requests to the server.
In development mode, all MCP clients have full access to all the MCP servers registered in the MCPJungle Proxy.
In production mode, you can control which MCP clients can access which MCP servers.
Suppose you have registered 2 MCP servers, calculator and github, in MCPJungle in production mode.
By default, no MCP client can access these servers. You must create an MCP Client in mcpjungle and explicitly allow it to access the MCP servers.
# Create a new MCP client for your Cursor IDE to use. It can access the calculator and github MCP servers
mcpjungle create mcp-client cursor-local --allow "calculator, github"
MCP client 'cursor-local' created successfully!
Servers accessible: calculator,github
Access token: 1YHf2LwE1LXtp5lW_vM-gmdYHlPHdqwnILitBhXE4Aw
Send this token in the `Authorization: Bearer {token}` HTTP header.
MCPJungle creates an access token for your client.
Configure your client or agent to send this token in the Authorization header when making requests to the mcpjungle proxy.
For example, you can add the following configuration in Cursor to connect to MCPJungle:
{
"mcpServers": {
"mcpjungle": {
"url": "http://localhost:8080/mcp",
"headers": {
"Authorization": "Bearer 1YHf2LwE1LXtp5lW_vM-gmdYHlPHdqwnILitBhXE4Aw"
}
}
}
}
A client that has access to a particular server this way can view and call all the tools provided by that server.
[!NOTE] If you don't specify the --allow flag, the MCP client will not be able to access any MCP servers.
MCPJungle supports Prometheus-compatible OpenTelemetry Metrics for observability.
- In production mode, OpenTelemetry is enabled by default.
- In development mode, telemetry is disabled by default. You can enable it by setting the OTEL_ENABLED environment variable to true before starting the server:
# enable OpenTelemetry metrics
export OTEL_ENABLED=true
# optionally, set additional attributes to be added to all metrics
export OTEL_RESOURCE_ATTRIBUTES=deployment.environment.name=production
# start the server
mcpjungle start
Once the mcpjungle server is started, metrics are available at the /metrics endpoint.
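For example, assuming the default port 8080 and that telemetry is enabled as described above, you can scrape the metrics with curl:
# fetch Prometheus-compatible metrics from the mcpjungle server
curl http://localhost:8080/metrics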
We're not perfect yet, but we're working hard to get there!
When you call a tool in a Streamable HTTP server, mcpjungle creates a new connection to the server to serve the request.
When you call a tool in a STDIO server, mcpjungle creates a new connection and starts a new sub-process to run this server.
After servicing your request, it terminates this sub-process.
So a new stdio server process is started for every tool call.
This has some performance overhead but ensures that there are no memory leaks.
It also means that if you rely on stateful connections with your MCP server, mcpjungle currently cannot provide that.
We plan on improving this mechanism in future releases and are open to ideas from the community!
This is a work in progress.
We're collecting more feedback on how people use OAuth with MCP servers, so feel free to start a Discussion or open an issue to share your use case.
We welcome contributions from the community!
- For contribution guidelines and standards, see CONTRIBUTION.md
- For development setup and technical details, see DEVELOPMENT.md
Join our Discord community to connect with other contributors and maintainers.
Alternative AI tools for MCPJungle
Similar Open Source Tools

ray-llm
RayLLM (formerly known as Aviary) is an LLM serving solution that makes it easy to deploy and manage a variety of open source LLMs, built on Ray Serve. It provides an extensive suite of pre-configured open source LLMs, with defaults that work out of the box. RayLLM supports Transformer models hosted on Hugging Face Hub or present on local disk. It simplifies the deployment of multiple LLMs, the addition of new LLMs, and offers unique autoscaling support, including scale-to-zero. RayLLM fully supports multi-GPU & multi-node model deployments and offers high performance features like continuous batching, quantization and streaming. It provides a REST API that is similar to OpenAI's to make it easy to migrate and cross test them. RayLLM supports multiple LLM backends out of the box, including vLLM and TensorRT-LLM.

aiac
AIAC is a library and command line tool to generate Infrastructure as Code (IaC) templates, configurations, utilities, queries, and more via LLM providers such as OpenAI, Amazon Bedrock, and Ollama. Users can define multiple 'backends' targeting different LLM providers and environments using a simple configuration file. The tool allows users to ask a model to generate templates for different scenarios and composes an appropriate request to the selected provider, storing the resulting code to a file and/or printing it to standard output.

vectorflow
VectorFlow is an open source, high throughput, fault tolerant vector embedding pipeline. It provides a simple API endpoint for ingesting large volumes of raw data, processing, and storing or returning the vectors quickly and reliably. The tool supports text-based files like TXT, PDF, HTML, and DOCX, and can be run locally with Kubernetes in production. VectorFlow offers functionalities like embedding documents, running chunking schemas, custom chunking, and integrating with vector databases like Pinecone, Qdrant, and Weaviate. It enforces a standardized schema for uploading data to a vector store and supports features like raw embeddings webhook, chunk validation webhook, S3 endpoint, and telemetry. The tool can be used with the Python client and provides detailed instructions for running and testing the functionalities.

smartcat
Smartcat is a CLI interface that brings language models into the Unix ecosystem, allowing power users to leverage the capabilities of LLMs in their daily workflows. It features a minimalist design, seamless integration with terminal and editor workflows, and customizable prompts for specific tasks. Smartcat currently supports OpenAI, Mistral AI, and Anthropic APIs, providing access to a range of language models. With its ability to manipulate file and text streams, integrate with editors, and offer configurable settings, Smartcat empowers users to automate tasks, enhance code quality, and explore creative possibilities.

dravid
Dravid (DRD) is an advanced, AI-powered CLI coding framework designed to follow user instructions until the job is completed, including fixing errors. It can generate code, fix errors, handle image queries, manage file operations, integrate with external APIs, and provide a development server with error handling. Dravid is extensible and requires Python 3.7+ and CLAUDE_API_KEY. Users can interact with Dravid through CLI commands for various tasks like creating projects, asking questions, generating content, handling metadata, and file-specific queries. It supports use cases like Next.js project development, working with existing projects, exploring new languages, Ruby on Rails project development, and Python project development. Dravid's project structure includes directories for source code, CLI modules, API interaction, utility functions, AI prompt templates, metadata management, and tests. Contributions are welcome, and development setup involves cloning the repository, installing dependencies with Poetry, setting up environment variables, and using Dravid for project enhancements.

seer
Seer is a service that provides AI capabilities to Sentry by running inference on Sentry issues and providing user insights. It is currently in early development and not yet compatible with self-hosted Sentry instances. The tool requires access to internal Sentry resources and is intended for internal Sentry employees. Users can set up the environment, download model artifacts, integrate with local Sentry, run evaluations for Autofix AI agent, and deploy to a sandbox staging environment. Development commands include applying database migrations, creating new migrations, running tests, and more. The tool also supports VCRs for recording and replaying HTTP requests.

call-gpt
Call GPT is a voice application that utilizes Deepgram for Speech to Text, elevenlabs for Text to Speech, and OpenAI for GPT prompt completion. It allows users to chat with ChatGPT on the phone, providing better transcription, understanding, and speaking capabilities than traditional IVR systems. The app returns responses with low latency, allows user interruptions, maintains chat history, and enables GPT to call external tools. It coordinates data flow between Deepgram, OpenAI, ElevenLabs, and Twilio Media Streams, enhancing voice interactions.

aisuite
Aisuite is a simple, unified interface to multiple Generative AI providers. It allows developers to easily interact with various Language Model (LLM) providers like OpenAI, Anthropic, Azure, Google, AWS, and more through a standardized interface. The library focuses on chat completions and provides a thin wrapper around python client libraries, enabling creators to test responses from different LLM providers without changing their code. Aisuite maximizes stability by using HTTP endpoints or SDKs for making calls to the providers. Users can install the base package or specific provider packages, set up API keys, and utilize the library to generate chat completion responses from different models.

llamafile
llamafile is a tool that enables users to distribute and run Large Language Models (LLMs) with a single file. It combines llama.cpp with Cosmopolitan Libc to create a framework that simplifies the complexity of LLMs into a single-file executable called a 'llamafile'. Users can run these executable files locally on most computers without the need for installation, making open LLMs more accessible to developers and end users. llamafile also provides example llamafiles for various LLM models, allowing users to try out different LLMs locally. The tool supports multiple CPU microarchitectures, CPU architectures, and operating systems, making it versatile and easy to use.

slack-bot
The Slack Bot is a tool designed to enhance the workflow of development teams by integrating with Jenkins, GitHub, GitLab, and Jira. It allows for custom commands, macros, crons, and project-specific commands to be implemented easily. Users can interact with the bot through Slack messages, execute commands, and monitor job progress. The bot supports features like starting and monitoring Jenkins jobs, tracking pull requests, querying Jira information, creating buttons for interactions, generating images with DALL-E, playing quiz games, checking weather, defining custom commands, and more. Configuration is managed via YAML files, allowing users to set up credentials for external services, define custom commands, schedule cron jobs, and configure VCS systems like Bitbucket for automated branch lookup in Jenkins triggers.

cagent
cagent is a powerful and easy-to-use multi-agent runtime that orchestrates AI agents with specialized capabilities and tools, allowing users to quickly build, share, and run a team of virtual experts to solve complex problems. It supports creating agents with YAML configuration, improving agents with MCP servers, and delegating tasks to specialists. Key features include multi-agent architecture, rich tool ecosystem, smart delegation, YAML configuration, advanced reasoning tools, and support for multiple AI providers like OpenAI, Anthropic, Gemini, and Docker Model Runner.

cog-comfyui
Cog-comfyui allows users to run ComfyUI workflows on Replicate. ComfyUI is a visual programming tool for creating and sharing generative art workflows. With cog-comfyui, users can access a variety of pre-trained models and custom nodes to create their own unique artworks. The tool is easy to use and does not require any coding experience. Users simply need to upload their API JSON file and any necessary input files, and then click the "Run" button. Cog-comfyui will then generate the output image or video file.

chroma
Chroma is an open-source embedding database that simplifies building LLM apps by enabling the integration of knowledge, facts, and skills for LLMs. The Ruby client for Chroma Database, chroma-rb, facilitates connecting to Chroma's database via its API. Users can configure the host, check server version, create collections, and add embeddings. The gem supports Chroma Database version 0.3.22 or newer, requiring Ruby 3.1.4 or later. It can be used with the hosted Chroma service at trychroma.com by setting configuration options like api_key, tenant, and database. Additionally, the gem provides integration with Jupyter Notebook for creating embeddings using Ollama and Nomic embed text with a Ruby HTTP client.

dir-assistant
Dir-assistant is a tool that allows users to interact with their current directory's files using local or API Language Models (LLMs). It supports various platforms and provides API support for major LLM APIs. Users can configure and customize their local LLMs and API LLMs using the tool. Dir-assistant also supports model downloads and configurations for efficient usage. It is designed to enhance file interaction and retrieval using advanced language models.

cog-comfyui
Cog-ComfyUI is a tool designed to run ComfyUI workflows on Replicate. It allows users to easily integrate their own workflows into their app or website using the Replicate API. The tool includes popular model weights and custom nodes, with the option to request more custom nodes or models. Users can get their API JSON, gather input files, and use custom LoRAs from CivitAI or HuggingFace. Additionally, users can run their workflows and set up their own dedicated instances for better performance and control. The tool provides options for private deployments, forking using Cog, or creating new models from the train tab on Replicate. It also offers guidance on developing locally and running the Web UI from a Cog container.
For similar tasks

octelium
Octelium is a free and open source, self-hosted, unified zero trust secure access platform that operates as a modern zero-config remote access VPN, a comprehensive Zero Trust Network Access (ZTNA)/BeyondCorp platform, an ngrok/Cloudflare Tunnel alternative, an API gateway, an AI/LLM gateway, a PaaS-like platform, a Kubernetes gateway/ingress, and a homelab infrastructure. It provides scalable zero trust architecture for identity-based, application-layer aware secure access via private client-based access over WireGuard/QUIC tunnels and public clientless access, with context-aware access control. Octelium offers dynamic secretless access, fine-grained access control, identity-based routing, continuous strong authentication, OpenTelemetry-native auditing, passwordless SSH, effortless deployment of containerized applications, centralized management, and more. It is open source, designed for self-hosting, and provides a commercial license option for businesses.
For similar jobs

promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".

leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

AI-YinMei
AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.