
mcp-hub
A centralized manager for Model Context Protocol (MCP) servers with dynamic server management and monitoring
Stars: 51

MCP Hub is a centralized manager for Model Context Protocol (MCP) servers, offering dynamic server management and monitoring, REST API for tool execution and resource access, MCP Server marketplace integration, real-time server status tracking, client connection management, and process lifecycle handling. It acts as a central management server connecting to and managing multiple MCP servers, providing unified API endpoints for client access, handling server lifecycle and health monitoring, and routing requests between clients and MCP servers.
README:
A centralized manager for Model Context Protocol (MCP) servers that provides:
- Dynamic MCP server management and monitoring
- REST API for tool execution and resource access
- MCP Server marketplace (using Cline marketplace)
- Real-time server status tracking
- Client connection management
- Process lifecycle handling
Hub Server (MCP Hub)
- Central management server that connects to and manages multiple MCP servers
- Provides unified API endpoints for clients to access MCP server capabilities
- Handles server lifecycle, health monitoring, and client connections
- Routes requests between clients and appropriate MCP servers
MCP Servers
- Individual servers that provide specific tools and resources
- Each server has its own capabilities (tools, resources, templates)
- Connected to and managed by the Hub server
- Process requests from clients through the Hub
Install globally via npm:
npm install -g mcp-hub
Start the hub server:
mcp-hub --port 3000 --config path/to/config.json
Options:
--port Port to run the server on (default: 3000)
--config Path to config file (required)
--watch Watch config file for changes (default: false)
--shutdown-delay Delay in milliseconds before shutting down when no clients are connected (default: 0)
-h, --help Show help information
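For example, a typical invocation combining these options might look like the following (the config path is illustrative):
mcp-hub --port 3000 --config ~/.config/mcp-hub/config.json --watch --shutdown-delay 5000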
The server outputs JSON-formatted status messages on startup and state changes:
{
  "status": "ready",
  "server_id": "mcp-hub",
  "version": "1.0.0",
  "port": 3000,
  "pid": 12345,
  "servers": [],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
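If you are scripting against this output, a minimal sketch like the one below can wait for readiness (it assumes jq is installed and that status messages are emitted as JSON lines on stdout):
mcp-hub --port 3000 --config ./config.json | jq -r 'select(.status == "ready") | "hub ready on port \(.port)"'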
coming...
Just add it to your NixOS flake.nix or home-manager:
inputs = {
  mcp-hub.url = "github:ravitemer/mcp-hub";
  ...
}
To integrate mcp-hub into your NixOS/Home Manager configuration, add the following to your environment.systemPackages or home.packages respectively:
inputs.mcp-hub.packages."${system}".default
If you want to use mcphub.nvim without having the mcp-hub server in your PATH, you can link the server under the hood by adding the mcp-hub Nix store path to the cmd option in the plugin config, like so:
Nixvim example:
{ mcphub-nvim, mcp-hub, ... }:
{
  extraPlugins = [mcphub-nvim];
  extraConfigLua = ''
    require("mcphub").setup({
      port = 3000,
      config = vim.fn.expand("~/mcp-hub/mcp-servers.json"),
      cmd = "${mcp-hub}/bin/mcp-hub"
    })
  '';
}
# where
{
  # For nixpkgs (not available yet)
  mcp-hub = pkgs.mcp-hub;
  # For flakes
  mcp-hub = inputs.mcp-hub.packages."${system}".default;
}
MCP Hub uses a JSON configuration file to define managed servers:
{
  "mcpServers": {
    "stdio-server": {
      "command": "npx",
      "args": ["example-server"],
      "env": {
        "API_KEY": "",        // Will use process.env.API_KEY
        "DEBUG": "true",      // Will use this value
        "SECRET_TOKEN": null  // Will use process.env.SECRET_TOKEN
      },
      "disabled": false
    },
    "sse-server": {
      "url": "https://api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer token",
        "Content-Type": "application/json"
      },
      "disabled": false
    }
  }
}
MCP Hub supports two types of servers: STDIO (local) and SSE (remote). The server type is automatically determined from the configuration fields provided.
STDIO server fields:
- command: Command to start the local MCP server
- args: Array of command-line arguments
- env: Environment variables for the server. If a variable is set to a falsy value (empty string, null, undefined), it falls back to the corresponding system environment variable if available.
- disabled: Whether the server is disabled (default: false)
SSE server fields:
- url: The URL of the remote SSE server endpoint
- headers: Optional HTTP headers for the SSE connection (e.g., for authentication)
- disabled: Whether the server is disabled (default: false)
The server type (STDIO or SSE) is determined by the presence of specific fields:
- If command is present → STDIO server
- If url is present → SSE server
Note: A server configuration cannot mix STDIO and SSE fields - it must be one type or the other.
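Because mixing the two types is invalid, a quick sanity check can be useful. This hypothetical jq one-liner lists any server entries that define both command and url (it assumes your config file is plain JSON, since jq cannot parse // comments):
jq -r '.mcpServers | to_entries[] | select(.value.command and .value.url) | .key' path/to/config.json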
The ravitemer/mcphub.nvim plugin provides seamless integration with Neovim, allowing direct interaction with MCP Hub from your editor:
- Execute MCP tools directly from Neovim
- Access MCP resources within your editing workflow
- Real-time status updates in Neovim
- Auto-install MCP servers through the marketplace integration
MCP Hub uses structured JSON logging for all events:
{
  "type": "error",
  "code": "TOOL_ERROR",
  "message": "Failed to execute tool",
  "data": {
    "server": "example-server",
    "tool": "example-tool",
    "error": "Invalid parameters"
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
Log levels include:
- info: Normal operational messages
- warn: Warning conditions
- debug: Detailed debug information
- error: Error conditions (includes error code and details)
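Since every log event is a JSON object with a type field, the stream is straightforward to filter. A minimal sketch, assuming the log events are written as JSON lines to stdout:
mcp-hub --port 3000 --config ./config.json | jq 'select(.type == "error")'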
GET /api/health
Response:
{
  "status": "ok",
  "server_id": "mcp-hub",
  "version": "1.0.0",
  "activeClients": 2,
  "timestamp": "2024-02-20T05:55:00.000Z",
  "servers": []
}
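For a quick liveness probe, assuming the hub is running locally on the default port:
curl -s http://localhost:3000/api/health | jq -r .status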
GET /api/servers
POST /api/servers/info
Content-Type: application/json
{
  "server_name": "example-server"
}
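A sample request, assuming a local hub on port 3000:
curl -s -X POST http://localhost:3000/api/servers/info \
  -H 'Content-Type: application/json' \
  -d '{"server_name": "example-server"}'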
POST /api/servers/refresh
Content-Type: application/json
{
  "server_name": "example-server"
}
Response:
{
  "status": "ok",
  "server": {
    "name": "example-server",
    "capabilities": {
      "tools": ["tool1", "tool2"],
      "resources": ["resource1", "resource2"],
      "resourceTemplates": []
    }
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
POST /api/refresh
Response:
{
  "status": "ok",
  "servers": [
    {
      "name": "example-server",
      "capabilities": {
        "tools": ["tool1", "tool2"],
        "resources": ["resource1", "resource2"],
        "resourceTemplates": []
      }
    }
  ],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
POST /api/servers/start
Content-Type: application/json
{
  "server_name": "example-server"
}
Response:
{
  "status": "ok",
  "server": {
    "name": "example-server",
    "status": "connected",
    "uptime": 123
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
POST /api/servers/stop?disable=true|false
Content-Type: application/json
{
  "server_name": "example-server"
}
The optional disable query parameter can be set to true to disable the server in the configuration.
Response:
{
  "status": "ok",
  "server": {
    "name": "example-server",
    "status": "disconnected",
    "uptime": 0
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
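For example, to stop a server and also mark it disabled in the config (local hub assumed):
curl -s -X POST 'http://localhost:3000/api/servers/stop?disable=true' \
  -H 'Content-Type: application/json' \
  -d '{"server_name": "example-server"}'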
POST /api/client/register
{
  "clientId": "unique_client_id"
}
POST /api/client/unregister
{
  "clientId": "unique_client_id"
}
GET /api/marketplace
Query Parameters:
- search: Filter by name, description, or tags
- category: Filter by category
- tags: Filter by comma-separated tags
- sort: Sort by "newest", "stars", or "name"
Response:
{
  "items": [
    {
      "mcpId": "github.com/user/repo/server",
      "name": "Example Server",
      "description": "Description here",
      "category": "search",
      "tags": ["search", "ai"],
      "githubStars": 100,
      "isRecommended": true,
      "createdAt": "2024-02-20T05:55:00.000Z"
    }
  ],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
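For example, to search the marketplace and sort by stars (local hub assumed, the search term is illustrative):
curl -s 'http://localhost:3000/api/marketplace?search=github&sort=stars' | jq '.items[].name'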
POST /api/marketplace/details
Content-Type: application/json
{
  "mcpId": "github.com/user/repo/server"
}
Response:
{
  "server": {
    "mcpId": "github.com/user/repo/server",
    "name": "Example Server",
    "description": "Description here",
    "githubUrl": "https://github.com/user/repo",
    "readmeContent": "# Server Documentation...",
    "llmsInstallationContent": "Installation guide..."
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
POST /api/servers/tools
Content-Type: application/json
{
  "server_name": "example-server",
  "tool": "tool_name",
  "arguments": {}
}
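A sample tool invocation, assuming a local hub and the placeholder names from the request body above:
curl -s -X POST http://localhost:3000/api/servers/tools \
  -H 'Content-Type: application/json' \
  -d '{"server_name": "example-server", "tool": "tool_name", "arguments": {}}'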
POST /api/servers/resources
Content-Type: application/json
{
  "server_name": "example-server",
  "uri": "resource://uri"
}
POST /api/servers/prompts
Content-Type: application/json
{
  "server_name": "example-server",
  "prompt": "prompt_name",
  "arguments": {}
}
Response:
{
  "result": {
    "messages": [
      {
        "role": "assistant",
        "content": {
          "type": "text",
          "text": "Text response example"
        }
      },
      {
        "role": "assistant",
        "content": {
          "type": "image",
          "data": "base64_encoded_image_data",
          "mimeType": "image/png"
        }
      }
    ]
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
POST /api/restart
Reloads the configuration file and restarts all MCP servers.
Response:
{
  "status": "ok",
  "timestamp": "2024-02-20T05:55:00.000Z"
}
The Hub Server provides real-time updates via Server-Sent Events (SSE) at /api/events. Connect to this endpoint to receive real-time updates about server status, client connections, and capability changes.
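For example, you can watch the event stream from a terminal (curl's -N flag disables buffering so events appear as they arrive; local hub assumed):
curl -N http://localhost:3000/api/events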
- server_info - Initial connection information
{
  "server_id": "mcp-hub",
  "version": "1.0.0",
  "status": "connected",
  "pid": 12345,
  "port": 3000,
  "activeClients": 1,
  "timestamp": "2024-02-20T05:55:00.000Z"
}
- server_ready - Server started and ready
{
  "status": "ready",
  "server_id": "mcp-hub",
  "version": "1.0.0",
  "port": 3000,
  "pid": 12345,
  "servers": [],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
- client_registered/unregistered - Client connection events
{
  "activeClients": 2,
  "clientId": "client_123",
  "timestamp": "2024-02-20T05:55:00.000Z"
}
- tool_list_changed - Server's tools list has changed
{
  "type": "TOOL",
  "server": "example-server",
  "tools": ["tool1", "tool2"],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
- resource_list_changed - Server's resources list has changed
{
  "type": "RESOURCE",
  "server": "example-server",
  "resources": ["resource1", "resource2"],
  "resourceTemplates": [],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
- prompt_list_changed - Server's prompts list has changed
{
  "type": "PROMPT",
  "server": "example-server",
  "prompts": ["prompt1", "prompt2"],
  "timestamp": "2024-02-20T05:55:00.000Z"
}
MCP Hub implements a comprehensive error handling system with custom error classes for different types of errors:
- ConfigError: Configuration-related errors (invalid config, missing fields)
- ConnectionError: Server connection issues (failed connections, transport errors)
- ServerError: Server startup/initialization problems
- ToolError: Tool execution failures
- ResourceError: Resource access issues
- ValidationError: Request validation errors
Each error includes:
- Error code for easy identification
- Detailed error message
- Additional context in the details object
- Stack trace for debugging
Example error structure:
{
  "code": "CONNECTION_ERROR",
  "message": "Failed to communicate with server",
  "details": {
    "server": "example-server",
    "error": "connection timeout"
  },
  "timestamp": "2024-02-20T05:55:00.000Z"
}
Errors fall into the following categories:

Configuration Errors
- Invalid config format
- Missing required fields
- Environment variable issues

Server Management Errors
- Connection failures
- Lost connections
- Capability fetch issues
- Server startup problems

Request Processing Errors
- Invalid parameters
- Server availability
- Tool execution failures
- Resource access issues

Client Management Errors
- Registration failures
- Duplicate registrations
- Invalid client IDs
sequenceDiagram
    participant H as Hub Server
    participant M1 as MCP Server 1
    participant M2 as MCP Server 2
    participant C as Client
    Note over H: Server Start
    activate H
    H->>+M1: Connect
    M1-->>-H: Connected + Capabilities
    H->>+M2: Connect
    M2-->>-H: Connected + Capabilities
    Note over C,H: Client Interactions
    C->>H: Register Client
    H-->>C: Servers List & Capabilities
    C->>H: Call Tool (M1)
    H->>M1: Execute Tool
    M1-->>H: Tool Result
    H-->>C: Response
    C->>H: Access Resource (M2)
    H->>M2: Get Resource
    M2-->>H: Resource Data
    H-->>C: Response
    Note over H: Server Management
    H->>H: Monitor Server Health
    H->>H: Track Server Status
    H->>H: Update Capabilities
    Note over H: Shutdown Process
    C->>H: Unregister
    H->>M1: Disconnect
    H->>M2: Disconnect
    deactivate H
The Hub Server coordinates communication between clients and MCP servers:
- Starts and connects to configured MCP servers
- Manages client registrations
- Routes tool execution and resource requests
- Handles server monitoring and health checks
- Performs clean shutdown of all connections
flowchart TB
    A[Hub Server Start] --> B{Config Available?}
    B -->|Yes| C[Load Server Configs]
    B -->|No| D[Use Default Settings]
    C --> E[Initialize Connections]
    D --> E
    E --> F{For Each MCP Server}
    F -->|Enabled| G[Attempt Connection]
    F -->|Disabled| H[Skip Server]
    G --> I{Connection Status}
    I -->|Success| J[Fetch Capabilities]
    I -->|Failure| K[Log Error]
    J --> L[Store Server Info]
    K --> M[Mark Server Unavailable]
    L --> N[Monitor Health]
    M --> N
    N --> O{Health Check}
    O -->|Healthy| P[Update Capabilities]
    O -->|Unhealthy| Q[Attempt Reconnect]
    Q -->|Success| P
    Q -->|Failure| R[Update Status]
    P --> N
    R --> N
The Hub Server actively manages MCP servers through:
- Configuration-based server initialization
- Connection and capability discovery
- Health monitoring and status tracking
- Automatic reconnection attempts
- Server state management
sequenceDiagram
    participant C as Client
    participant H as Hub Server
    participant M as MCP Server
    Note over C,H: Tool Execution Flow
    C->>H: POST /api/servers/tools
    H->>H: Validate Request
    H->>H: Check Server Status
    alt Server Not Connected
        H-->>C: Error: Server Unavailable
    else Server Connected
        H->>M: Execute Tool
        alt Tool Success
            M-->>H: Tool Result
            H-->>C: Success Response
        else Tool Error
            M-->>H: Error Details
            H-->>C: Error Response
        end
    end
    Note over C,H: Resource Access Flow
    C->>H: POST /api/servers/resources
    H->>H: Validate URI
    H->>H: Check Server Status
    alt Valid Resource
        H->>M: Request Resource
        M-->>H: Resource Data
        H-->>C: Resource Content
    else Invalid Resource
        H-->>C: 404 Not Found
    end
All client requests follow a standardized flow:
- Request validation
- Server status verification
- Request routing to appropriate MCP server
- Response handling and error management
Requirements:
- Node.js >= 18.0.0
Roadmap:
- [ ] Implement custom marketplace rather than depending on mcp-marketplace
Acknowledgements:
- Cline mcp-marketplace - For providing the MCP server marketplace endpoints that power MCP Hub's marketplace integration
Similar Open Source Tools


python-utcp
The Universal Tool Calling Protocol (UTCP) is a secure and scalable standard for defining and interacting with tools across various communication protocols. UTCP emphasizes scalability, extensibility, interoperability, and ease of use. It offers a modular core with a plugin-based architecture, making it extensible, testable, and easy to package. The repository contains the complete UTCP Python implementation with core components and protocol-specific plugins for HTTP, CLI, Model Context Protocol, file-based tools, and more.

firecrawl-mcp-server
Firecrawl MCP Server is a Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities. It supports features like scrape, crawl, search, extract, and batch scrape. It provides web scraping with JS rendering, URL discovery, web search with content extraction, automatic retries with exponential backoff, credit usage monitoring, comprehensive logging system, support for cloud and self-hosted FireCrawl instances, mobile/desktop viewport support, and smart content filtering with tag inclusion/exclusion. The server includes configurable parameters for retry behavior and credit usage monitoring, rate limiting and batch processing capabilities, and tools for scraping, batch scraping, checking batch status, searching, crawling, and extracting structured information from web pages.

jambo
Jambo is a Python package that automatically converts JSON Schema definitions into Pydantic models. It streamlines schema validation and enforces type safety using Pydantic's validation features. The tool supports various JSON Schema features like strings, integers, floats, booleans, arrays, nested objects, and more. It enforces constraints such as minLength, maxLength, pattern, minimum, maximum, uniqueItems, and provides a zero-config approach for generating models. Jambo is designed to simplify the process of dynamically generating Pydantic models for AI frameworks.

step-free-api
The StepChat Free service provides high-speed streaming output, multi-turn dialogue support, online search support, long document interpretation, and image parsing. It offers zero-configuration deployment, multi-token support, and automatic session trace cleaning. It is fully compatible with the ChatGPT interface. Additionally, it provides seven other free APIs for various services. The repository includes a disclaimer about using reverse APIs and encourages users to avoid commercial use to prevent service pressure on the official platform. It offers online testing links, showcases different demos, and provides deployment guides for Docker, Docker-compose, Render, Vercel, and native deployments. The repository also includes information on using multiple accounts, optimizing Nginx reverse proxy, and checking the liveliness of refresh tokens.

qwen-free-api
Qwen AI Free service supports high-speed streaming output, multi-turn dialogue, watermark-free AI drawing, long document interpretation, image parsing, zero-configuration deployment, multi-token support, automatic session trace cleaning. It is fully compatible with the ChatGPT interface. The repository provides various free APIs for different AI services. Users can access the service through different deployment methods like Docker, Docker-compose, Render, Vercel, and native deployment. It offers interfaces for chat completions, AI drawing, document interpretation, image parsing, and token checking. Users need to provide 'login_tongyi_ticket' for authorization. The project emphasizes research, learning, and personal use only, discouraging commercial use to avoid service pressure on the official platform.

spark-free-api
Spark AI Free 服务 provides high-speed streaming output, multi-turn dialogue support, AI drawing support, long document interpretation, and image parsing. It offers zero-configuration deployment, multi-token support, and automatic session trace cleaning. It is fully compatible with the ChatGPT interface. The repository includes multiple free-api projects for various AI services. Users can access the API for tasks such as chat completions, AI drawing, document interpretation, image analysis, and ssoSessionId live checking. The project also provides guidelines for deployment using Docker, Docker-compose, Render, Vercel, and native deployment methods. It recommends using custom clients for faster and simpler access to the free-api series projects.

firecrawl-mcp-server
Firecrawl MCP Server is a Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities. It offers features such as web scraping, crawling, and discovery, search and content extraction, deep research and batch scraping, automatic retries and rate limiting, cloud and self-hosted support, and SSE support. The server can be configured to run with various tools like Cursor, Windsurf, SSE Local Mode, Smithery, and VS Code. It supports environment variables for cloud API and optional configurations for retry settings and credit usage monitoring. The server includes tools for scraping, batch scraping, mapping, searching, crawling, and extracting structured data from web pages. It provides detailed logging and error handling functionalities for robust performance.

glm-free-api
GLM AI Free service provides high-speed streaming output, multi-turn dialogue support, intelligent agent dialogue support, AI drawing support, online search support, long document interpretation support, image parsing support. It offers zero-configuration deployment, multi-token support, and automatic session trace cleaning. It is fully compatible with the ChatGPT interface. The repository also includes six other free APIs for various services like Moonshot AI, StepChat, Qwen, Metaso, Spark, and Emohaa. The tool supports tasks such as chat completions, AI drawing, document interpretation, image parsing, and refresh token survival check.

lego-ai-parser
Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements. It is built on top of FastAPI, ready to set up as a server, and make calls from any language. It supports preset parsers for Google Local Results, Amazon Listings, Etsy Listings, Wayfair Listings, BestBuy Listings, Costco Listings, Macy's Listings, and Nordstrom Listings. Users can also design custom parsers by providing prompts, examples, and details about the OpenAI model under the classifier key.

RagaAI-Catalyst
RagaAI Catalyst is a comprehensive platform designed to enhance the management and optimization of LLM projects. It offers features such as project management, dataset management, evaluation management, trace management, prompt management, synthetic data generation, and guardrail management. These functionalities enable efficient evaluation and safeguarding of LLM applications.

beelzebub
Beelzebub is an advanced honeypot framework designed to provide a highly secure environment for detecting and analyzing cyber attacks. It offers a low code approach for easy implementation and utilizes virtualization techniques powered by OpenAI Generative Pre-trained Transformer. Key features include OpenAI Generative Pre-trained Transformer acting as Linux virtualization, SSH Honeypot, HTTP Honeypot, TCP Honeypot, Prometheus openmetrics integration, Docker integration, RabbitMQ integration, and kubernetes support. Beelzebub allows easy configuration for different services and ports, enabling users to create custom honeypot scenarios. The roadmap includes developing Beelzebub into a robust PaaS platform. The project welcomes contributions and encourages adherence to the Code of Conduct for a supportive and respectful community.

sparrow
Sparrow is an innovative open-source solution for efficient data extraction and processing from various documents and images. It seamlessly handles forms, invoices, receipts, and other unstructured data sources. Sparrow stands out with its modular architecture, offering independent services and pipelines all optimized for robust performance. One of the critical functionalities of Sparrow - pluggable architecture. You can easily integrate and run data extraction pipelines using tools and frameworks like LlamaIndex, Haystack, or Unstructured. Sparrow enables local LLM data extraction pipelines through Ollama or Apple MLX. With Sparrow solution you get API, which helps to process and transform your data into structured output, ready to be integrated with custom workflows. Sparrow Agents - with Sparrow you can build independent LLM agents, and use API to invoke them from your system. **List of available agents:** * **llamaindex** - RAG pipeline with LlamaIndex for PDF processing * **vllamaindex** - RAG pipeline with LLamaIndex multimodal for image processing * **vprocessor** - RAG pipeline with OCR and LlamaIndex for image processing * **haystack** - RAG pipeline with Haystack for PDF processing * **fcall** - Function call pipeline * **unstructured-light** - RAG pipeline with Unstructured and LangChain, supports PDF and image processing * **unstructured** - RAG pipeline with Weaviate vector DB query, Unstructured and LangChain, supports PDF and image processing * **instructor** - RAG pipeline with Unstructured and Instructor libraries, supports PDF and image processing. Works great for JSON response generation

vlmrun-hub
VLMRun Hub is a versatile tool for managing and running virtual machines in a centralized manner. It provides a user-friendly interface to easily create, start, stop, and monitor virtual machines across multiple hosts. With VLMRun Hub, users can efficiently manage their virtualized environments and streamline their workflow. The tool offers flexibility and scalability, making it suitable for both small-scale personal projects and large-scale enterprise deployments.

AICentral
AI Central is a powerful tool designed to take control of your AI services with minimal overhead. It is built on Asp.Net Core and dotnet 8, offering fast web-server performance. The tool enables advanced Azure APIm scenarios, PII stripping logging to Cosmos DB, token metrics through Open Telemetry, and intelligent routing features. AI Central supports various endpoint selection strategies, proxying asynchronous requests, custom OAuth2 authorization, circuit breakers, rate limiting, and extensibility through plugins. It provides an extensibility model for easy plugin development and offers enriched telemetry and logging capabilities for monitoring and insights.

pipecat-flows
Pipecat Flows is a framework designed for building structured conversations in AI applications. It allows users to create both predefined conversation paths and dynamically generated flows, handling state management and LLM interactions. The framework includes a Python module for building conversation flows and a visual editor for designing and exporting flow configurations. Pipecat Flows is suitable for scenarios such as customer service scripts, intake forms, personalized experiences, and complex decision trees.
For similar tasks

mcphub.nvim
MCPHub.nvim is a powerful Neovim plugin that integrates MCP (Model Context Protocol) servers into your workflow. It offers a centralized config file for managing servers and tools, with an intuitive UI for testing resources. Ideal for LLM integration, it provides programmatic API access and interactive testing through the `:MCPHub` command.


paddler
Paddler is an open-source load balancer and reverse proxy designed specifically for optimizing servers running llama.cpp. It overcomes typical load balancing challenges by maintaining a stateful load balancer that is aware of each server's available slots, ensuring efficient request distribution. Paddler also supports dynamic addition or removal of servers, enabling integration with autoscaling tools.
For similar jobs

AirGo
AirGo is a front and rear end separation, multi user, multi protocol proxy service management system, simple and easy to use. It supports vless, vmess, shadowsocks, and hysteria2.

mosec
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API. * **Highly performant** : web layer and task coordination built with Rust 🦀, which offers blazing speed in addition to efficient CPU utilization powered by async I/O * **Ease of use** : user interface purely in Python 🐍, by which users can serve their models in an ML framework-agnostic manner using the same code as they do for offline testing * **Dynamic batching** : aggregate requests from different users for batched inference and distribute results back * **Pipelined stages** : spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads * **Cloud friendly** : designed to run in the cloud, with the model warmup, graceful shutdown, and Prometheus monitoring metrics, easily managed by Kubernetes or any container orchestration systems * **Do one thing well** : focus on the online serving part, users can pay attention to the model optimization and business logic

llm-code-interpreter
The 'llm-code-interpreter' repository is a deprecated plugin that provides a code interpreter on steroids for ChatGPT by E2B. It gives ChatGPT access to a sandboxed cloud environment with capabilities like running any code, accessing Linux OS, installing programs, using filesystem, running processes, and accessing the internet. The plugin exposes commands to run shell commands, read files, and write files, enabling various possibilities such as running different languages, installing programs, starting servers, deploying websites, and more. It is powered by the E2B API and is designed for agents to freely experiment within a sandboxed environment.

pezzo
Pezzo is a fully cloud-native and open-source LLMOps platform that allows users to observe and monitor AI operations, troubleshoot issues, save costs and latency, collaborate, manage prompts, and deliver AI changes instantly. It supports various clients for prompt management, observability, and caching. Users can run the full Pezzo stack locally using Docker Compose, with prerequisites including Node.js 18+, Docker, and a GraphQL Language Feature Support VSCode Extension. Contributions are welcome, and the source code is available under the Apache 2.0 License.

learn-generative-ai
Learn Cloud Applied Generative AI Engineering (GenEng) is a course focusing on the application of generative AI technologies in various industries. The course covers topics such as the economic impact of generative AI, the role of developers in adopting and integrating generative AI technologies, and the future trends in generative AI. Students will learn about tools like OpenAI API, LangChain, and Pinecone, and how to build and deploy Large Language Models (LLMs) for different applications. The course also explores the convergence of generative AI with Web 3.0 and its potential implications for decentralized intelligence.

gcloud-aio
This repository contains shared codebase for two projects: gcloud-aio and gcloud-rest. gcloud-aio is built for Python 3's asyncio, while gcloud-rest is a threadsafe requests-based implementation. It provides clients for Google Cloud services like Auth, BigQuery, Datastore, KMS, PubSub, Storage, and Task Queue. Users can install the library using pip and refer to the documentation for usage details. Developers can contribute to the project by following the contribution guide.

fluid
Fluid is an open source Kubernetes-native Distributed Dataset Orchestrator and Accelerator for data-intensive applications, such as big data and AI applications. It implements dataset abstraction, scalable cache runtime, automated data operations, elasticity and scheduling, and is runtime platform agnostic. Key concepts include Dataset and Runtime. Prerequisites include Kubernetes version > 1.16, Golang 1.18+, and Helm 3. The tool offers features like accelerating remote file accessing, machine learning, accelerating PVC, preloading dataset, and on-the-fly dataset cache scaling. Contributions are welcomed, and the project is under the Apache 2.0 license with a vendor-neutral approach.

aiges
AIGES is a core component of the Athena Serving Framework, designed as a universal encapsulation tool for AI developers to deploy AI algorithm models and engines quickly. By integrating AIGES, you can deploy AI algorithm models and engines rapidly and host them on the Athena Serving Framework, utilizing supporting auxiliary systems for networking, distribution strategies, data processing, etc. The Athena Serving Framework aims to accelerate the cloud service of AI algorithm models and engines, providing multiple guarantees for cloud service stability through cloud-native architecture. You can efficiently and securely deploy, upgrade, scale, operate, and monitor models and engines without focusing on underlying infrastructure and service-related development, governance, and operations.