
AICentral
An AI Control Centre for monitoring, authenticating, and providing resilient access to multiple Open AI services.
Stars: 76

AI Central is a powerful tool designed to take control of your AI services with minimal overhead. It is built on Asp.Net Core and dotnet 8, offering fast web-server performance. The tool enables advanced Azure APIm scenarios, PII stripping logging to Cosmos DB, token metrics through Open Telemetry, and intelligent routing features. AI Central supports various endpoint selection strategies, proxying asynchronous requests, custom OAuth2 authorization, circuit breakers, rate limiting, and extensibility through plugins. It provides an extensibility model for easy plugin development and offers enriched telemetry and logging capabilities for monitoring and insights.
README:
- Minimal overhead - written on Asp.Net Core, on dotnet 8. One of the fastest web-servers in the business.
- Enable advanced Azure APIm scenarios such as passing a Subscription Key, and a JWT from libraries like PromptFlow that don't support that out-of-the-box.
- PII Stripping logging to Cosmos DB
- Powered by
graemefoster/aicentral.logging.piistripping
- Powered by
- Lightweight out-the-box token metrics surfaced through Open Telemetry
- Does not buffer and block streaming
- Use for PTU Chargeback scenarios
- Gain quick insights into who's using what, how much, and how often
- Standard Open Telemetry format to surface Dashboards in you monitoring solution of choice
- Prompt and usage logging to Azure Monitor
- Works for streaming endpoints as-well as non-streaming
- Intelligent Routing
- Endpoint Selector that favours endpoints reporting higher available capacity
- Random endpoint selector
- Prioritised endpoint selector with fallback
- Lowest Latency endpoint selector
- Can proxy asynchronous requests such as Azure OpenAI DALLE2 Image Generation across fleets of servers
- Custom consumer OAuth2 authorisation
- Can mint JWT time-bound and consumer-bound JWT tokens to make it easy to run events like Hackathons without blowing your budget
- Circuit breakers and backoff-retry over downstream AI services
- Local token rate limiting
- By consumer / by endpoint
- By number of tokens (including streaming by estimated token count)
- Local request rate limiting
- By consumer / by endpoint
- Bulkhead support for buffering requests to backend
- Distributed token rate limiting (using Redis)
- Powered by an extension
graemefoster/aicentral.ratelimiting.distributedredis
- Powered by an extension
- AI Search Vectorization endpoint
- Powered by an extension
graemefoster/aicentral.azureaisearchvectorizer
- Powered by an extension
Extensibility model makes it easy to build your own plugins
To make it easy to get up and running, we are creating QuickStart configurations. Simply pull the docker container, set a few environment variables, and you're away.
Quickstart | Features |
---|---|
APImProxyWithCosmosLogging | Run in-front of Azure APIm AI Gateway for easy PromptFlow and PII stripped logging. |
See Configuration for more details.
The Azure OpenAI SDK retries by default. As AI Central does this for you you can turn it off in the client by passing
new Azure.AI.OpenAI.OpenAIClientOptions() { RetryPolicy = new RetryPolicy(0) }
when you create an OpenAIClient
-
Install Azure CLI if you haven't done so already.
-
Install the Bicep CLI by running the following command in your terminal:
az bicep install
- Compile your Bicep file to an ARM template with the following command:
az bicep build --file ./infra/main.bicep
This will create a file named main.json
in the same directory as your main.bicep
file.
- Deploy the generated ARM template using the Azure CLI. You'll need to login to your Azure account and select the subscription where you want to deploy the resources:
az login
az account set --subscription "your-subscription-id"
az deployment sub create --template-file ./infra/main.json --location "your-location"
Replace "your-subscription-id"
with your actual Azure subscription ID and "your-location"
with the location where you want to deploy the resources (e.g., "westus2").
To test deployment retrieve url for the webapp and update the following curl
command:
curl -X POST \
-H "Content-Type: application/json" \
-H "api-key: {your-customer-key}" \
-d '{
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "what is .net core"}
]
}' \
"https://{your-web-url}/openai/deployments/Gpt35Turbo0613/chat/completions?api-version=2024-02-01"
To test if everything works by running some code of your choice, e.g., this code with OpenAI Python SDK:
import json
import httpx
from openai import AzureOpenAI
api_key = "<your-customer-key>"
def event_hook(req: httpx.Request) -> None:
print(json.dumps(dict(req.headers), indent=2))
client = AzureOpenAI(
azure_endpoint="https://app-[a]-[b].azurewebsites.net", #if you deployed to Azure Web App app-[a]-[b].azurewebsites.net
api_key=api_key,
api_version="2023-05-15",
http_client=httpx.Client(event_hooks={"request": [event_hook]})
)
response = client.chat.completions.create(
model="Gpt35Turbo0613",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the first letter of the alphabet?"}
]
)
print(response)
Note: delete create resources az deployment group list --resource-group "your-resource-group-name" --query "[].{Name:name, Timestamp:properties.timestamp, State:properties.provisioningState}" --output table
This sample produces a AI-Central proxy that
- Listens on a hostname of your choosing
- Proxies directly through to a back-end Open AI server
- Can be accessed using standard SDKs
# Run container in Docker, referencing a local configuration file
docker run -p 8080:8080 -v .\appsettings.Development.json:/app/appsettings.Development.json -e ASPNETCORE_ENVIRONMENT=Development graemefoster/aicentral:latest
#Create new project and bootstrap the AICentral nuget package
dotnet new web -o MyAICentral
cd MyAICentral
dotnet add package AICentral
#dotnet add package AICentral.Logging.AzureMonitor
//Minimal API to configure AI Central
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddAICentral(builder.Configuration);
app.UseAICentral(
builder.Configuration,
//if using logging extension
additionalComponentAssemblies: [ typeof(AzureMonitorLoggerFactory).Assembly ]
);
var app = builder.Build();
app.Run();
{
"AICentral": {
"Endpoints": [
{
"Type": "AzureOpenAIEndpoint",
"Name": "openai-1",
"Properties": {
"LanguageEndpoint": "https://<my-ai>.openai.azure.com",
"AuthenticationType": "Entra"
}
}
],
"AuthProviders": [
{
"Type": "Entra",
"Name": "aad-role-auth",
"Properties": {
"Entra": {
"ClientId": "<my-client-id>",
"TenantId": "<my-tenant-id>",
"Instance": "https://login.microsoftonline.com/"
},
"Requirements" : {
"Roles": ["required-roles"]
}
}
}
],
"EndpointSelectors": [
{
"Type": "SingleEndpoint",
"Name": "default",
"Properties": {
"Endpoint": "openai-1"
}
}
],
"Pipelines": [
{
"Name": "AzureOpenAIPipeline",
"Host": "mypipeline.mydomain.com",
"AuthProvider": "aad-role-auth",
"EndpointSelector": "default"
}
]
}
}
Out of the box AI Central emits Open Telemetry metrics with the following dimensions:
- Consumer
- Endpoint
- Pipeline
- Prompt Tokens
- Response Tokens including streaming
Allowing insightful dashboards to be built using your monitoring tool of choice.
AI Central also allows fine-grained logging. We ship an extension that logs to Azure Monitor, but it's easy to build your own.
See advanced-otel for dashboard inspiration!
This pipeline will:
- Present an Azure OpenAI, and an Open AI downstream as a single upstream endpoint
- maps the incoming deployment Name "GPT35Turbo0613" to the downstream Azure OpenAI deployment "MyGptModel"
- maps incoming Azure OpenAI deployments to Open AI models
- Present it as an Azure OpenAI style endpoint
- Protect the front-end by requiring an AAD token issued for your own AAD application
- Put a local Asp.Net core rate-limiting policy over the endpoint
- Add logging to Azure monitor
- Logs quota, client caller information, and in this case the Prompt but not the response.
{
"AICentral": {
"Endpoints": [
{
"Type": "AzureOpenAIEndpoint",
"Name": "openai-priority",
"Properties": {
"LanguageEndpoint": "https://<my-ai>.openai.azure.com",
"AuthenticationType": "Entra|EntraPassThrough|ApiKey",
"ModelMappings": {
"Gpt35Turbo0613": "MyGptModel"
}
}
},
{
"Type": "OpenAIEndpoint",
"Name": "openai-fallback",
"Properties": {
"LanguageEndpoint": "https://api.openai.com",
"ModelMappings": {
"Gpt35Turbo0613": "gpt-3.5-turbo",
"Ada002Embedding": "text-embedding-ada-002"
},
"ApiKey": "<my-api-key>",
"Organization": "<optional-organisation>"
}
}
],
"AuthProviders": [
{
"Type": "Entra",
"Name": "simple-aad",
"Properties": {
"Entra": {
"ClientId": "<my-client-id>",
"TenantId": "<my-tenant-id>",
"Instance": "https://login.microsoftonline.com/",
"Audience": "<custom-audience>"
},
"Requirements" : {
"Roles": ["required-roles"]
}
}
}
],
"EndpointSelectors": [
{
"Type": "Prioritised",
"Name": "my-endpoint-selector",
"Properties": {
"PriorityEndpoints": ["openai-1"],
"FallbackEndpoints": ["openai-fallback"]
}
}
],
"GenericSteps": [
{
"Type": "AspNetCoreFixedWindowRateLimiting",
"Name": "window-rate-limiter",
"Properties": {
"LimitType": "PerConsumer|PerAICentralEndpoint",
"MetricType": "Requests",
"Options": {
"Window": "00:00:10",
"PermitLimit": 100
}
}
},
{
"Type": "AzureMonitorLogger",
"Name": "azure-monitor-logger",
"Properties": {
"WorkspaceId": "<workspace-id>",
"Key": "<key>",
"LogPrompt": true,
"LogResponse": false,
"LogClient": true
}
}
],
"Pipelines": [
{
"Name": "MyPipeline",
"Host": "prioritypipeline.mydomain.com",
"EndpointSelector": "my-endpoint-selector",
"AuthProvider": "simple-aad",
"Steps": [
"window-rate-limiter",
"azure-monitor-logger"
],
"OpenTelemetryConfig": {
"AddClientNameTag": true,
"Transmit": true
}
}
]
}
}
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for AICentral
Similar Open Source Tools

AICentral
AI Central is a powerful tool designed to take control of your AI services with minimal overhead. It is built on Asp.Net Core and dotnet 8, offering fast web-server performance. The tool enables advanced Azure APIm scenarios, PII stripping logging to Cosmos DB, token metrics through Open Telemetry, and intelligent routing features. AI Central supports various endpoint selection strategies, proxying asynchronous requests, custom OAuth2 authorization, circuit breakers, rate limiting, and extensibility through plugins. It provides an extensibility model for easy plugin development and offers enriched telemetry and logging capabilities for monitoring and insights.

firecrawl
Firecrawl is an API service that empowers AI applications with clean data from any website. It features advanced scraping, crawling, and data extraction capabilities. The repository is still in development, integrating custom modules into the mono repo. Users can run it locally but it's not fully ready for self-hosted deployment yet. Firecrawl offers powerful capabilities like scraping, crawling, mapping, searching, and extracting structured data from single pages, multiple pages, or entire websites with AI. It supports various formats, actions, and batch scraping. The tool is designed to handle proxies, anti-bot mechanisms, dynamic content, media parsing, change tracking, and more. Firecrawl is available as an open-source project under the AGPL-3.0 license, with additional features offered in the cloud version.

firecrawl
Firecrawl is an API service that takes a URL, crawls it, and converts it into clean markdown. It crawls all accessible subpages and provides clean markdown for each, without requiring a sitemap. The API is easy to use and can be self-hosted. It also integrates with Langchain and Llama Index. The Python SDK makes it easy to crawl and scrape websites in Python code.

lego-ai-parser
Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements. It is built on top of FastAPI, ready to set up as a server, and make calls from any language. It supports preset parsers for Google Local Results, Amazon Listings, Etsy Listings, Wayfair Listings, BestBuy Listings, Costco Listings, Macy's Listings, and Nordstrom Listings. Users can also design custom parsers by providing prompts, examples, and details about the OpenAI model under the classifier key.

008
008 is an open-source event-driven AI powered WebRTC Softphone compatible with macOS, Windows, and Linux. It is also accessible on the web. The name '008' or 'agent 008' reflects our ambition: beyond crafting the premier Open Source Softphone, we aim to introduce a programmable, event-driven AI agent. This agent utilizes embedded artificial intelligence models operating directly on the softphone, ensuring efficiency and reduced operational costs.

jupyter-mcp-server
Jupyter MCP Server is a Model Context Protocol (MCP) server implementation that enables real-time interaction with Jupyter Notebooks. It allows AI to edit, document, and execute code for data analysis and visualization. The server offers features like real-time control, smart execution, and MCP compatibility. Users can use tools such as insert_execute_code_cell, append_markdown_cell, get_notebook_info, and read_cell for advanced interactions with Jupyter notebooks.

manga-image-translator
Translate texts in manga/images. Some manga/images will never be translated, therefore this project is born. * Image/Manga Translator * Samples * Online Demo * Disclaimer * Installation * Pip/venv * Poetry * Additional instructions for **Windows** * Docker * Hosting the web server * Using as CLI * Setting Translation Secrets * Using with Nvidia GPU * Building locally * Usage * Batch mode (default) * Demo mode * Web Mode * Api Mode * Related Projects * Docs * Recommended Modules * Tips to improve translation quality * Options * Language Code Reference * Translators Reference * GPT Config Reference * Using Gimp for rendering * Api Documentation * Synchronous mode * Asynchronous mode * Manual translation * Next steps * Support Us * Thanks To All Our Contributors :

python-utcp
The Universal Tool Calling Protocol (UTCP) is a secure and scalable standard for defining and interacting with tools across various communication protocols. UTCP emphasizes scalability, extensibility, interoperability, and ease of use. It offers a modular core with a plugin-based architecture, making it extensible, testable, and easy to package. The repository contains the complete UTCP Python implementation with core components and protocol-specific plugins for HTTP, CLI, Model Context Protocol, file-based tools, and more.

jambo
Jambo is a Python package that automatically converts JSON Schema definitions into Pydantic models. It streamlines schema validation and enforces type safety using Pydantic's validation features. The tool supports various JSON Schema features like strings, integers, floats, booleans, arrays, nested objects, and more. It enforces constraints such as minLength, maxLength, pattern, minimum, maximum, uniqueItems, and provides a zero-config approach for generating models. Jambo is designed to simplify the process of dynamically generating Pydantic models for AI frameworks.

firecrawl-mcp-server
Firecrawl MCP Server is a Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities. It supports features like scrape, crawl, search, extract, and batch scrape. It provides web scraping with JS rendering, URL discovery, web search with content extraction, automatic retries with exponential backoff, credit usage monitoring, comprehensive logging system, support for cloud and self-hosted FireCrawl instances, mobile/desktop viewport support, and smart content filtering with tag inclusion/exclusion. The server includes configurable parameters for retry behavior and credit usage monitoring, rate limiting and batch processing capabilities, and tools for scraping, batch scraping, checking batch status, searching, crawling, and extracting structured information from web pages.

crush
Crush is a versatile tool designed to enhance coding workflows in your terminal. It offers support for multiple LLMs, allows for flexible switching between models, and enables session-based work management. Crush is extensible through MCPs and works across various operating systems. It can be installed using package managers like Homebrew and NPM, or downloaded directly. Crush supports various APIs like Anthropic, OpenAI, Groq, and Google Gemini, and allows for customization through environment variables. The tool can be configured locally or globally, and supports LSPs for additional context. Crush also provides options for ignoring files, allowing tools, and configuring local models. It respects `.gitignore` files and offers logging capabilities for troubleshooting and debugging.

openmacro
Openmacro is a multimodal personal agent that allows users to run code locally. It acts as a personal agent capable of completing and automating tasks autonomously via self-prompting. The tool provides a CLI natural-language interface for completing and automating tasks, analyzing and plotting data, browsing the web, and manipulating files. Currently, it supports API keys for models powered by SambaNova, with plans to add support for other hosts like OpenAI and Anthropic in future versions.

mcp-hub
MCP Hub is a centralized manager for Model Context Protocol (MCP) servers, offering dynamic server management and monitoring, REST API for tool execution and resource access, MCP Server marketplace integration, real-time server status tracking, client connection management, and process lifecycle handling. It acts as a central management server connecting to and managing multiple MCP servers, providing unified API endpoints for client access, handling server lifecycle and health monitoring, and routing requests between clients and MCP servers.

functionary
Functionary is a language model that interprets and executes functions/plugins. It determines when to execute functions, whether in parallel or serially, and understands their outputs. Function definitions are given as JSON Schema Objects, similar to OpenAI GPT function calls. It offers documentation and examples on functionary.meetkai.com. The newest model, meetkai/functionary-medium-v3.1, is ranked 2nd in the Berkeley Function-Calling Leaderboard. Functionary supports models with different context lengths and capabilities for function calling and code interpretation. It also provides grammar sampling for accurate function and parameter names. Users can deploy Functionary models serverlessly using Modal.com.

pipecat-flows
Pipecat Flows is a framework designed for building structured conversations in AI applications. It allows users to create both predefined conversation paths and dynamically generated flows, handling state management and LLM interactions. The framework includes a Python module for building conversation flows and a visual editor for designing and exporting flow configurations. Pipecat Flows is suitable for scenarios such as customer service scripts, intake forms, personalized experiences, and complex decision trees.

chatgpt-exporter
A script to export the chat history of ChatGPT. Supports exporting to text, HTML, Markdown, PNG, and JSON formats. Also allows for exporting multiple conversations at once.
For similar tasks

AICentral
AI Central is a powerful tool designed to take control of your AI services with minimal overhead. It is built on Asp.Net Core and dotnet 8, offering fast web-server performance. The tool enables advanced Azure APIm scenarios, PII stripping logging to Cosmos DB, token metrics through Open Telemetry, and intelligent routing features. AI Central supports various endpoint selection strategies, proxying asynchronous requests, custom OAuth2 authorization, circuit breakers, rate limiting, and extensibility through plugins. It provides an extensibility model for easy plugin development and offers enriched telemetry and logging capabilities for monitoring and insights.

algernon
Algernon is a web server with built-in support for QUIC, HTTP/2, Lua, Teal, Markdown, Pongo2, HyperApp, Amber, Sass(SCSS), GCSS, JSX, Ollama (LLMs), BoltDB, Redis, PostgreSQL, MariaDB/MySQL, MSSQL, rate limiting, graceful shutdown, plugins, users, and permissions. It is a small self-contained executable that supports various technologies and features for web development.

cloudflare-rag
This repository provides a fullstack example of building a Retrieval Augmented Generation (RAG) app with Cloudflare. It utilizes Cloudflare Workers, Pages, D1, KV, R2, AI Gateway, and Workers AI. The app features streaming interactions to the UI, hybrid RAG with Full-Text Search and Vector Search, switchable providers using AI Gateway, per-IP rate limiting with Cloudflare's KV, OCR within Cloudflare Worker, and Smart Placement for workload optimization. The development setup requires Node, pnpm, and wrangler CLI, along with setting up necessary primitives and API keys. Deployment involves setting up secrets and deploying the app to Cloudflare Pages. The project implements a Hybrid Search RAG approach combining Full Text Search against D1 and Hybrid Search with embeddings against Vectorize to enhance context for the LLM.

k8sgateway
K8sGateway is a feature-rich, fast, and flexible Kubernetes-native API gateway built on Envoy proxy and Kubernetes Gateway API. It excels in function-level routing, supports legacy apps, microservices, and serverless. It offers robust discovery capabilities, seamless integration with open-source projects, and supports hybrid applications with various technologies, architectures, protocols, and clouds.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.