
wingman
Inference Hub for AI at Scale
Stars: 63

The LLM Platform, also known as Inference Hub, is an open-source tool designed to simplify the development and deployment of large language model applications at scale. It provides a unified framework for integrating and managing multiple LLM vendors, models, and related services through a flexible approach. The platform supports various LLM providers, document processing, RAG, advanced AI workflows, infrastructure operations, and flexible configuration using YAML files. Its modular and extensible architecture allows developers to plug in different providers and services as needed. Key components include completers, embedders, renderers, synthesizers, transcribers, document processors, segmenters, retrievers, summarizers, translators, AI workflows, tools, and infrastructure components. Use cases range from enterprise AI applications to scalable LLM deployment and custom AI pipelines. Integrations with LLM providers like OpenAI, Azure OpenAI, Anthropic, Google Gemini, AWS Bedrock, Groq, Mistral AI, xAI, Hugging Face, and more are supported.
README:
The LLM Platform or Inference Hub is an open-source product designed to simplify the development and deployment of large language model (LLM) applications at scale. It provides a unified framework that allows developers to integrate and manage multiple LLM vendors, models, and related services through a standardized but highly flexible approach.
The platform integrates with a wide range of LLM providers:
Chat/Completion Models:
- OpenAI Platform and Azure OpenAI Service (GPT models)
- Anthropic (Claude models)
- Google Gemini
- AWS Bedrock
- Groq
- Mistral AI
- xAI
- Hugging Face
- Local deployments: Ollama, LLAMA.CPP, Mistral.RS
- Custom models via gRPC plugins
Embedding Models:
- OpenAI, Azure OpenAI, Jina, Hugging Face, Google Gemini
- Local: Ollama, LLAMA.CPP
- Custom embedders via gRPC
Media Processing:
- Image generation: OpenAI DALL-E, Replicate
- Speech-to-text: OpenAI Whisper, Groq Whisper, WHISPER.CPP
- Text-to-speech: OpenAI TTS
- Reranking: Jina
Document Extractors:
- Apache Tika for various document formats
- Unstructured.io for advanced document parsing
- Azure Document Intelligence
- Jina Reader for web content
- Exa and Tavily for web search and extraction
- Text extraction from plain files
- Custom extractors via gRPC
Text Segmentation:
- Jina segmenter for semantic chunking
- Text-based chunking with configurable sizes
- Unstructured.io segmentation
- Custom segmenters via gRPC
Information Retrieval:
- Web search: DuckDuckGo, Exa, Tavily
- Custom retrievers via gRPC plugins
Chains & Agents:
- Agent/Assistant chains with tool calling capabilities
- Custom conversation flows
- Multi-step reasoning workflows
- Tool integration and function calling
Tools & Function Calling:
- Built-in tools: search, extract, retrieve, render, synthesize, translate
- Model Context Protocol (MCP) support: full server and client implementation
- Connect to external MCP servers as tool providers
- Built-in MCP server exposing platform capabilities
- Multiple transport methods (HTTP streaming, SSE, command execution)
- Custom tools via gRPC plugins
Additional Capabilities:
- Text summarization (via chat models)
- Language translation
- Content rendering and formatting
Routing & Load Balancing:
- Round-robin load balancer for distributing requests
- Model fallback strategies
- Request routing across multiple providers
Rate Limiting & Control:
- Per-provider and per-model rate limiting
- Request throttling and queuing
- Resource usage controls
Authentication & Security:
- Static token authentication
- OpenID Connect (OIDC) integration
- Secure credential management
API Compatibility:
- OpenAI-compatible API endpoints
- Custom API configurations
- Multiple API versions support
Observability & Monitoring:
- Full OpenTelemetry integration
- Request tracing across all components
- Comprehensive metrics and logging
- Performance monitoring and debugging
Developers can define providers, models, credentials, document processing pipelines, tools, and advanced AI workflows using YAML configuration files. This approach streamlines integration and makes it easy to manage complex AI applications.
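For example, the provider, retriever, tool, and chain examples shown later in this README can be combined into a single configuration file. The snippet below is a minimal sketch assembled from those examples; the file name and the choice of model are illustrative, not prescribed by the platform.
# config.yaml (illustrative; combines sections documented below)
providers:
  - type: openai
    token: ${OPENAI_API_KEY}
    models:
      - gpt-4o

retrievers:
  web:
    type: duckduckgo

tools:
  search:
    type: search
    retriever: web

chains:
  assistant:
    type: agent
    model: gpt-4o
    tools:
      - search
    messages:
      - role: system
        content: "You are a helpful AI assistant."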
The architecture is designed to be modular and extensible, allowing developers to plug in different providers and services as needed. It consists of key components:
Core Providers:
- Completers: Chat/completion models for text generation and reasoning
- Embedders: Vector embedding models for semantic understanding
- Renderers: Image generation and visual content creation
- Synthesizers: Text-to-speech and audio generation
- Transcribers: Speech-to-text and audio processing
- Rerankers: Result ranking and relevance scoring
Document & Data Processing:
- Extractors: Document parsing and content extraction from various formats
- Segmenters: Text chunking and semantic segmentation for RAG
- Retrievers: Web search and information retrieval
- Summarizers: Content compression and summarization
- Translators: Multi-language text translation
AI Workflows & Tools:
- Chains: Multi-step AI workflows and agent-based reasoning
- Tools: Function calling, web search, document processing, and custom capabilities
- APIs: Multiple API formats and compatibility layers
Infrastructure:
- Routers: Load balancing and request distribution
- Rate Limiters: Resource control and throttling
- Authorizers: Authentication and access control
- Observability: OpenTelemetry tracing and monitoring
Use Cases:
- Enterprise AI Applications: Unified platform for multiple AI services and models
- RAG (Retrieval-Augmented Generation): Document processing, semantic search, and knowledge retrieval
- AI Agents & Workflows: Multi-step reasoning, tool integration, and autonomous task execution
- Scalable LLM Deployment: High-volume applications with load balancing and failover
- Multi-Modal AI: Combining text, image, and audio processing capabilities
- Custom AI Pipelines: Flexible workflows using custom tools and chains
Configuration Examples:
OpenAI Platform: https://platform.openai.com/docs/api-reference
providers:
- type: openai
token: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
models:
- gpt-4o
- gpt-4o-mini
- text-embedding-3-small
- text-embedding-3-large
- whisper-1
- dall-e-3
- tts-1
- tts-1-hd
Azure OpenAI Service: https://azure.microsoft.com/en-us/products/ai-services/openai-service
providers:
- type: openai
url: https://xxxxxxxx.openai.azure.com
token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
models:
# https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models
#
# {alias}:
# - id: {azure oai deployment name}
gpt-3.5-turbo:
id: gpt-35-turbo-16k
gpt-4:
id: gpt-4-32k
text-embedding-ada-002:
id: text-embedding-ada-002
providers:
- type: anthropic
token: sk-ant-apixx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# https://docs.anthropic.com/en/docs/models-overview
#
# {alias}:
# - id: {anthropic api model name}
models:
claude-3.5-sonnet:
id: claude-3-5-sonnet-20240620
providers:
- type: gemini
token: ${GOOGLE_API_KEY}
# https://ai.google.dev/gemini-api/docs/models/gemini
#
# {alias}:
# - id: {gemini api model name}
models:
gemini-1.5-pro:
id: gemini-1.5-pro-latest
gemini-1.5-flash:
id: gemini-1.5-flash-latest
providers:
- type: bedrock
# AWS credentials configured via environment or IAM roles
models:
claude-3-sonnet:
id: anthropic.claude-3-sonnet-20240229-v1:0
providers:
- type: groq
token: ${GROQ_API_KEY}
# https://console.groq.com/docs/models
#
# {alias}:
# - id: {groq api model name}
models:
groq-llama-3-8b:
id: llama3-8b-8192
groq-whisper-1:
id: whisper-large-v3
providers:
- type: mistral
token: ${MISTRAL_API_KEY}
# https://docs.mistral.ai/getting-started/models/
#
# {alias}:
# - id: {mistral api model name}
models:
mistral-large:
id: mistral-large-latest
providers:
- type: xai
token: ${XAI_API_KEY}
models:
grok-beta:
id: grok-beta
providers:
- type: replicate
token: ${REPLICATE_API_KEY}
#
# {alias}:
# - id: {replicate model name}
models:
replicate-flux-pro:
id: black-forest-labs/flux-pro
$ ollama serve
$ ollama run mistral
providers:
- type: ollama
url: http://localhost:11434
# https://ollama.com/library
#
# {alias}:
# - id: {ollama model name with optional version}
models:
mistral-7b-instruct:
id: mistral:latest
LLAMA.CPP server: https://github.com/ggerganov/llama.cpp/tree/master/examples/server
$ llama-server --port 9081 --log-disable --model ./models/mistral-7b-instruct-v0.2.Q4_K_M.gguf
providers:
- type: llama
url: http://localhost:9081
models:
- mistral-7b-instruct
Mistral.RS: https://github.com/EricLBuehler/mistral.rs
$ mistralrs-server --port 1234 --isq Q4K plain -m meta-llama/Meta-Llama-3.1-8B-Instruct -a llama
providers:
- type: mistralrs
url: http://localhost:1234
models:
mistralrs-llama-3.1-8b:
id: llama
WHISPER.CPP server: https://github.com/ggerganov/whisper.cpp/tree/master/examples/server
$ whisper-server --port 9083 --convert --model ./models/whisper-large-v3-turbo.bin
providers:
- type: whisper
url: http://localhost:9083
models:
- whisper
providers:
- type: huggingface
token: hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
models:
mistral-7B-instruct:
id: mistralai/Mistral-7B-Instruct-v0.1
huggingface-minilm-l6-2:
id: sentence-transformers/all-MiniLM-L6-v2
routers:
llama-lb:
type: roundrobin
models:
- llama-3-8b
- groq-llama-3-8b
- huggingface-llama-3-8b
retrievers:
web:
type: duckduckgo
retrievers:
exa:
type: exa
token: ${EXA_API_KEY}
retrievers:
tavily:
type: tavily
token: ${TAVILY_API_KEY}
retrievers:
custom:
type: custom
url: http://localhost:8080
# using Docker
docker run -it --rm -p 9998:9998 apache/tika:3.0.0.0-BETA2-full
extractors:
tika:
type: tika
url: http://localhost:9998
chunkSize: 4000
chunkOverlap: 200
docker run -it --rm -p 9085:8000 quay.io/unstructured-io/unstructured-api:0.0.80 --port 8000 --host 0.0.0.0
extractors:
unstructured:
type: unstructured
url: http://localhost:9085/general/v0/general
extractors:
azure:
type: azure
url: https://YOUR_INSTANCE.cognitiveservices.azure.com
token: ${AZURE_API_KEY}
extractors:
jina:
type: jina
token: ${JINA_API_KEY}
extractors:
exa:
type: exa
token: ${EXA_API_KEY}
tavily:
type: tavily
token: ${TAVILY_API_KEY}
extractors:
text:
type: text
extractors:
custom:
type: custom
url: http://localhost:8080
segmenters:
jina:
type: jina
token: ${JINA_API_KEY}
segmenters:
text:
type: text
chunkSize: 1000
chunkOverlap: 200
segmenters:
unstructured:
type: unstructured
url: http://localhost:9085/general/v0/general
segmenters:
custom:
type: custom
url: http://localhost:8080
chains:
assistant:
type: agent
model: gpt-4o
tools:
- search
- extract
messages:
- role: system
content: "You are a helpful AI assistant."
The platform provides comprehensive support for the Model Context Protocol (MCP), enabling integration with MCP-compatible tools and services.
MCP Server Support:
- Built-in MCP server that exposes platform tools to MCP clients
- Automatic tool discovery and schema generation
- Multiple transport methods (HTTP streaming, SSE, command-line)
MCP Client Support:
- Connect to external MCP servers as tool providers
- Support for various MCP transport methods
- Automatic tool registration and execution
MCP Tool Configuration:
tools:
# MCP server via HTTP streaming
mcp-streamable:
type: mcp
url: http://localhost:8080/mcp
# MCP server via Server-Sent Events
mcp-sse:
type: mcp
url: http://localhost:8080/sse
vars:
api-key: ${API_KEY}
# MCP server via command execution
mcp-command:
type: mcp
command: /path/to/mcp-server
args:
- --config
- /path/to/config.json
vars:
ENV_VAR: value
Built-in MCP Server:
The platform automatically exposes its tools via the MCP protocol at the /mcp endpoint, allowing other MCP clients to discover and use platform capabilities.
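In principle, any MCP client, including another instance of the platform, could register this built-in server as a tool provider using the same MCP tool configuration shown above. The sketch below assumes the platform is reachable on localhost port 8080; the tool name and address are illustrative.
tools:
  # hypothetical entry: consume another Inference Hub instance's built-in MCP server
  platform-mcp:
    type: mcp
    url: http://localhost:8080/mcp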
tools:
search:
type: search
retriever: web
extract:
type: extract
extractor: tika
translate:
type: translate
translator: default
render:
type: render
renderer: dalle-3
synthesize:
type: synthesize
synthesizer: tts-1
tools:
custom-tool:
type: custom
url: http://localhost:8080
authorizers:
- type: static
tokens:
- "your-secret-token"
authorizers:
- type: oidc
url: https://your-oidc-provider.com
audience: your-audience
routers:
llama-lb:
type: roundrobin
models:
- llama-3-8b
- groq-llama-3-8b
- huggingface-llama-3-8b
Add rate limiting to any provider:
providers:
- type: openai
token: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
limit: 10 # requests per second
models:
gpt-4o:
limit: 5 # override for specific model
Summarization is automatically available for any chat model:
# Use any completer model for summarization
# The platform automatically adapts chat models for summarization tasks
translators:
default:
type: default
# Uses configured chat models for translation
Alternative AI tools for wingman
Similar Open Source Tools


rag-security-scanner
RAG/LLM Security Scanner is a professional security testing tool designed for Retrieval-Augmented Generation (RAG) systems and LLM applications. It identifies critical vulnerabilities in AI-powered applications such as chatbots, virtual assistants, and knowledge retrieval systems. The tool offers features like prompt injection detection, data leakage assessment, function abuse testing, context manipulation identification, professional reporting with JSON/HTML formats, and easy integration with OpenAI, HuggingFace, and custom RAG systems.

one
ONE is a modern web and AI agent development toolkit that empowers developers to build AI-powered applications with high performance, beautiful UI, AI integration, responsive design, type safety, and great developer experience. It is perfect for building modern web applications, from simple landing pages to complex AI-powered platforms.

buster
Buster is a modern analytics platform designed with AI in mind, focusing on self-serve experiences powered by Large Language Models. It addresses pain points in existing tools by advocating for AI-centric app development, cost-effective data warehousing, improved CI/CD processes, and empowering data teams to create powerful, user-friendly data experiences. The platform aims to revolutionize AI analytics by enabling data teams to build deep integrations and own their entire analytics stack.

scabench
ScaBench is a comprehensive framework designed for evaluating security analysis tools and AI agents on real-world smart contract vulnerabilities. It provides curated datasets from recent audits and official tooling for consistent evaluation. The tool includes features such as curated datasets from Code4rena, Cantina, and Sherlock audits, a baseline runner for security analysis, a scoring tool for evaluating findings, a report generator for HTML reports with visualizations, and pipeline automation for complete workflow execution. Users can access curated datasets, generate new datasets, download project source code, run security analysis using LLMs, and evaluate tool findings against benchmarks using LLM matching. The tool enforces strict matching policies to ensure accurate evaluation results.

llm-context.py
LLM Context is a tool designed to assist developers in quickly injecting relevant content from code/text projects into Large Language Model chat interfaces. It leverages `.gitignore` patterns for smart file selection and offers a streamlined clipboard workflow using the command line. The tool also provides direct integration with Large Language Models through the Model Context Protocol (MCP). LLM Context is optimized for code repositories and collections of text/markdown/html documents, making it suitable for developers working on projects that fit within an LLM's context window. The tool is under active development and aims to enhance AI-assisted development workflows by harnessing the power of Large Language Models.

VimLM
VimLM is an AI-powered coding assistant for Vim that integrates AI for code generation, refactoring, and documentation directly into your Vim workflow. It offers native Vim integration with split-window responses and intuitive keybindings, offline first execution with MLX-compatible models, contextual awareness with seamless integration with codebase and external resources, conversational workflow for iterating on responses, project scaffolding for generating and deploying code blocks, and extensibility for creating custom LLM workflows with command chains.

photo-ai
100xPhoto is a powerful AI image platform that enables users to generate stunning images and train custom AI models. It provides an intuitive interface for creating unique AI-generated artwork and training personalized models on image datasets. The platform is built with cutting-edge technology and offers robust capabilities for AI image generation and model training.

arkflow
ArkFlow is a high-performance Rust stream processing engine that seamlessly integrates AI capabilities, providing powerful real-time data processing and intelligent analysis. It supports multiple input/output sources and processors, enabling easy loading and execution of machine learning models for streaming data and inference, anomaly detection, and complex event processing. The tool is built on Rust and Tokio async runtime, offering excellent performance and low latency. It features built-in SQL queries, Python script, JSON processing, Protobuf encoding/decoding, and batch processing capabilities. ArkFlow is extensible with a modular design, making it easy to extend with new components.

zcf
ZCF (Zero-Config Claude-Code Flow) is a tool that provides zero-configuration, one-click setup for Claude Code with bilingual support, intelligent agent system, and personalized AI assistant. It offers an interactive menu for easy operations and direct commands for quick execution. The tool supports bilingual operation with automatic language switching and customizable AI output styles. ZCF also includes features like BMad Workflow for enterprise-grade workflow system, Spec Workflow for structured feature development, CCR (Claude Code Router) support for proxy routing, and CCometixLine for real-time usage tracking. It provides smart installation, complete configuration management, and core features like professional agents, command system, and smart configuration. ZCF is cross-platform compatible, supports Windows and Termux environments, and includes security features like dangerous operation confirmation mechanism.

VASA-1-hack
VASA-1-hack is a repository containing the VASA implementation separated from EMOPortraits, with all components properly configured for standalone training. It provides detailed setup instructions, prerequisites, project structure, configuration details, running training modes, troubleshooting tips, monitoring training progress, development information, and acknowledgments. The repository aims to facilitate training volumetric avatar models with configurable parameters and logging levels for efficient debugging and testing.

hub
Hub is an open-source, high-performance LLM gateway written in Rust. It serves as a smart proxy for LLM applications, centralizing control and tracing of all LLM calls and traces. Built for efficiency, it provides a single API to connect to any LLM provider. The tool is designed to be fast, efficient, and completely open-source under the Apache 2.0 license.

tunacode
TunaCode CLI is an AI-powered coding assistant that provides a command-line interface for developers to enhance their coding experience. It offers features like model selection, parallel execution for faster file operations, and various commands for code management. The tool aims to improve coding efficiency and provide a seamless coding environment for developers.

auto-engineer
Auto Engineer is a tool designed to automate the Software Development Life Cycle (SDLC) by building production-grade applications with a combination of human and AI agents. It offers a plugin-based architecture that allows users to install only the necessary functionality for their projects. The tool guides users through key stages including Flow Modeling, IA Generation, Deterministic Scaffolding, AI Coding & Testing Loop, and Comprehensive Quality Checks. Auto Engineer follows a command/event-driven architecture and provides a modular plugin system for specific functionalities. It supports TypeScript with strict typing throughout and includes a built-in message bus server with a web dashboard for monitoring commands and events.

R2R
R2R (RAG to Riches) is a fast and efficient framework for serving high-quality Retrieval-Augmented Generation (RAG) to end users. The framework is designed with customizable pipelines and a feature-rich FastAPI implementation, enabling developers to quickly deploy and scale RAG-based applications. R2R was conceived to bridge the gap between local LLM experimentation and scalable production solutions; R2R is to LangChain/LlamaIndex what NextJS is to React. A JavaScript client for R2R deployments is also available. Key features include instantly launching production-ready RAG pipelines with streaming capabilities, tailoring pipelines with intuitive configuration files, extending pipelines with custom code integrations, autoscaling in the cloud using SciPhi, and benefiting from a framework developed by the open-source community to simplify RAG deployment.

MassGen
MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents who work in parallel, observe each other's progress, and refine their approaches to converge on the best solution to deliver a comprehensive and high-quality result. The system operates through an architecture designed for seamless multi-agent collaboration, with key features including cross-model/agent synergy, parallel processing, intelligence sharing, consensus building, and live visualization. Users can install the system, configure API settings, and run MassGen for various tasks such as question answering, creative writing, research, development & coding tasks, and web automation & browser tasks. The roadmap includes plans for advanced agent collaboration, expanded model, tool & agent integration, improved performance & scalability, enhanced developer experience, and a web interface.
For similar tasks

comfyui_LLM_party
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.

n8n
n8n is a workflow automation platform that combines the flexibility of code with the speed of no-code. It offers 400+ integrations, native AI capabilities, and a fair-code license, empowering users to create powerful automations while maintaining control over data and deployments. With features like code customization, AI agent workflows, self-hosting options, enterprise-ready functionalities, and an active community, n8n provides a comprehensive solution for technical teams seeking efficient workflow automation.

xtuner
XTuner is an efficient, flexible, and full-featured toolkit for fine-tuning large models. It supports various LLMs (InternLM, Mixtral-8x7B, Llama 2, ChatGLM, Qwen, Baichuan, ...), VLMs (LLaVA), and various training algorithms (QLoRA, LoRA, full-parameter fine-tune). XTuner also provides tools for chatting with pretrained / fine-tuned LLMs and deploying fine-tuned LLMs with any other framework, such as LMDeploy.

llm-hosting-container
The LLM Hosting Container repository provides Dockerfile and associated resources for building and hosting containers for large language models, specifically the HuggingFace Text Generation Inference (TGI) container. This tool allows users to easily deploy and manage large language models in a containerized environment, enabling efficient inference and deployment of language-based applications.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use case. Some use cases for BricksLLM: set LLM usage limits for users on different pricing tiers; track LLM usage on a per-user and per-organization basis; block or redact requests containing PII; improve LLM reliability with failovers, retries, and caching; and distribute API keys with rate limits and cost limits for internal development/production use cases or for students.

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.