wingman

Inference Hub for AI at Scale

The LLM Platform, also known as Inference Hub, is an open-source tool that simplifies the development and deployment of large language model applications at scale. It provides a unified framework for integrating and managing multiple LLM vendors, models, and related services. The platform supports a wide range of LLM providers, document processing, RAG, advanced AI workflows, infrastructure operations, and YAML-based configuration. Its modular, extensible architecture lets developers plug in providers and services as needed; key components include completers, embedders, renderers, synthesizers, transcribers, document processors, segmenters, retrievers, summarizers, translators, AI workflows, tools, and infrastructure components. Use cases range from enterprise AI applications to scalable LLM deployment and custom AI pipelines. Supported LLM providers include OpenAI, Azure OpenAI, Anthropic, Google Gemini, AWS Bedrock, Groq, Mistral AI, xAI, Hugging Face, and more.

README:

LLM Platform

The LLM Platform, also known as Inference Hub, is an open-source product designed to simplify the development and deployment of large language model (LLM) applications at scale. It provides a unified framework that allows developers to integrate and manage multiple LLM vendors, models, and related services through a standardized yet highly flexible approach.

Key Features

Multi-Provider Support

The platform integrates with a wide range of LLM providers:

Chat/Completion Models:

  • OpenAI Platform and Azure OpenAI Service (GPT models)
  • Anthropic (Claude models)
  • Google Gemini
  • AWS Bedrock
  • Groq
  • Mistral AI
  • xAI
  • Hugging Face
  • Local deployments: Ollama, LLAMA.CPP, Mistral.RS
  • Custom models via gRPC plugins

Embedding Models:

  • OpenAI, Azure OpenAI, Jina, Hugging Face, Google Gemini
  • Local: Ollama, LLAMA.CPP
  • Custom embedders via gRPC

Media Processing:

  • Image generation: OpenAI DALL-E, Replicate
  • Speech-to-text: OpenAI Whisper, Groq Whisper, WHISPER.CPP
  • Text-to-speech: OpenAI TTS
  • Reranking: Jina

Document Processing & RAG

Document Extractors:

  • Apache Tika for various document formats
  • Unstructured.io for advanced document parsing
  • Azure Document Intelligence
  • Jina Reader for web content
  • Exa and Tavily for web search and extraction
  • Text extraction from plain files
  • Custom extractors via gRPC

Text Segmentation:

  • Jina segmenter for semantic chunking
  • Text-based chunking with configurable sizes
  • Unstructured.io segmentation
  • Custom segmenters via gRPC

Information Retrieval:

  • Web search: DuckDuckGo, Exa, Tavily
  • Custom retrievers via gRPC plugins

Advanced AI Workflows

Chains & Agents:

  • Agent/Assistant chains with tool calling capabilities
  • Custom conversation flows
  • Multi-step reasoning workflows
  • Tool integration and function calling

Tools & Function Calling:

  • Built-in tools: search, extract, retrieve, render, synthesize, translate
  • Model Context Protocol (MCP) support: Full server and client implementation
    • Connect to external MCP servers as tool providers
    • Built-in MCP server exposing platform capabilities
    • Multiple transport methods (HTTP streaming, SSE, command execution)
  • Custom tools via gRPC plugins

Additional Capabilities:

  • Text summarization (via chat models)
  • Language translation
  • Content rendering and formatting

Infrastructure & Operations

Routing & Load Balancing:

  • Round-robin load balancer for distributing requests
  • Model fallback strategies
  • Request routing across multiple providers

Rate Limiting & Control:

  • Per-provider and per-model rate limiting
  • Request throttling and queuing
  • Resource usage controls

Authentication & Security:

  • Static token authentication
  • OpenID Connect (OIDC) integration
  • Secure credential management

API Compatibility:

  • OpenAI-compatible API endpoints
  • Custom API configurations
  • Support for multiple API versions
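
Because the endpoints are OpenAI-compatible, existing OpenAI SDKs and plain HTTP clients can talk to the platform unchanged. A minimal sketch, assuming a local deployment listening on port 8080 (host, port, path, and auth token depend on your setup) and a configured model alias gpt-4o:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-secret-token" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'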

Observability & Monitoring:

  • Full OpenTelemetry integration
  • Request tracing across all components
  • Comprehensive metrics and logging
  • Performance monitoring and debugging
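
As a sketch of wiring up tracing: OpenTelemetry-based services are commonly pointed at a collector through the standard OTLP exporter environment variables. Whether the platform reads these exact variables depends on its SDK configuration, so treat this as an assumption:

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=llm-platform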

Flexible Configuration

Developers can define providers, models, credentials, document processing pipelines, tools, and advanced AI workflows using YAML configuration files. This approach streamlines integration and makes it easy to manage complex AI applications.
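
For example, a single configuration file can wire together a provider, a retriever, a tool, and an agent chain. A minimal end-to-end sketch assembled from the snippets later in this README (adjust tokens and model names to your environment):

providers:
  - type: openai
    token: ${OPENAI_API_KEY}

    models:
      - gpt-4o

retrievers:
  web:
    type: duckduckgo

tools:
  search:
    type: search
    retriever: web

chains:
  assistant:
    type: agent
    model: gpt-4o
    tools:
      - search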

Architecture

The architecture is designed to be modular and extensible, allowing developers to plug in different providers and services as needed. It consists of the following key components:

Core Providers:

  • Completers: Chat/completion models for text generation and reasoning
  • Embedders: Vector embedding models for semantic understanding
  • Renderers: Image generation and visual content creation
  • Synthesizers: Text-to-speech and audio generation
  • Transcribers: Speech-to-text and audio processing
  • Rerankers: Result ranking and relevance scoring

Document & Data Processing:

  • Extractors: Document parsing and content extraction from various formats
  • Segmenters: Text chunking and semantic segmentation for RAG
  • Retrievers: Web search and information retrieval
  • Summarizers: Content compression and summarization
  • Translators: Multi-language text translation

AI Workflows & Tools:

  • Chains: Multi-step AI workflows and agent-based reasoning
  • Tools: Function calling, web search, document processing, and custom capabilities
  • APIs: Multiple API formats and compatibility layers

Infrastructure:

  • Routers: Load balancing and request distribution
  • Rate Limiters: Resource control and throttling
  • Authorizers: Authentication and access control
  • Observability: OpenTelemetry tracing and monitoring

Use Cases

  • Enterprise AI Applications: Unified platform for multiple AI services and models
  • RAG (Retrieval-Augmented Generation): Document processing, semantic search, and knowledge retrieval
  • AI Agents & Workflows: Multi-step reasoning, tool integration, and autonomous task execution
  • Scalable LLM Deployment: High-volume applications with load balancing and failover
  • Multi-Modal AI: Combining text, image, and audio processing capabilities
  • Custom AI Pipelines: Flexible workflows using custom tools and chains

Integrations & Configuration

LLM Providers

OpenAI Platform

https://platform.openai.com/docs/api-reference

providers:
  - type: openai
    token: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    models:
      - gpt-4o
      - gpt-4o-mini
      - text-embedding-3-small
      - text-embedding-3-large
      - whisper-1
      - dall-e-3
      - tts-1
      - tts-1-hd

Azure OpenAI Service

https://azure.microsoft.com/en-us/products/ai-services/openai-service

providers:
  - type: openai
    url: https://xxxxxxxx.openai.azure.com
    token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    models:
      # https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models
      #
      # {alias}:
      #   - id: {azure oai deployment name}

      gpt-3.5-turbo:
        id: gpt-35-turbo-16k

      gpt-4:
        id: gpt-4-32k
        
      text-embedding-ada-002:
        id: text-embedding-ada-002

Anthropic

https://www.anthropic.com/api

providers:
  - type: anthropic
    token: sk-ant-apixx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    # https://docs.anthropic.com/en/docs/models-overview
    #
    # {alias}:
    #   - id: {anthropic api model name}
    models:
      claude-3.5-sonnet:
        id: claude-3-5-sonnet-20240620

Google Gemini

providers:
  - type: gemini
    token: ${GOOGLE_API_KEY}

    # https://ai.google.dev/gemini-api/docs/models/gemini
    #
    # {alias}:
    #   - id: {gemini api model name}
    models:
      gemini-1.5-pro:
        id: gemini-1.5-pro-latest
      
      gemini-1.5-flash:
        id: gemini-1.5-flash-latest

AWS Bedrock

providers:
  - type: bedrock
    # AWS credentials configured via environment or IAM roles

    models:
      claude-3-sonnet:
        id: anthropic.claude-3-sonnet-20240229-v1:0

Groq

providers:
  - type: groq
    token: ${GROQ_API_KEY}

    # https://console.groq.com/docs/models
    #
    # {alias}:
    #   - id: {groq api model name}
    models:
      groq-llama-3-8b:
        id: llama3-8b-8192

      groq-whisper-1:
        id: whisper-large-v3

Mistral AI

providers:
  - type: mistral
    token: ${MISTRAL_API_KEY}

    # https://docs.mistral.ai/getting-started/models/
    #
    # {alias}:
    #   - id: {mistral api model name}
    models:
      mistral-large:
        id: mistral-large-latest

xAI

providers:
  - type: xai
    token: ${XAI_API_KEY}

    models:
      grok-beta:
        id: grok-beta

Replicate

https://replicate.com/

providers:
  - type: replicate
    token: ${REPLICATE_API_KEY}
    #
    # {alias}:
    #   - id: {replicate model name}
    models:
      replicate-flux-pro:
        id: black-forest-labs/flux-pro

Ollama

https://ollama.ai

$ ollama serve
$ ollama run mistral
providers:
  - type: ollama
    url: http://localhost:11434

    # https://ollama.com/library
    #
    # {alias}:
    #   - id: {ollama model name with optional version}
    models:
      mistral-7b-instruct:
        id: mistral:latest

LLAMA.CPP

https://github.com/ggerganov/llama.cpp/tree/master/examples/server

# using taskfile.dev
$ task llama:server

# LLAMA.CPP Server
$ llama-server --port 9081 --log-disable --model ./models/mistral-7b-instruct-v0.2.Q4_K_M.gguf
providers:
  - type: llama
    url: http://localhost:9081

    models:
      - mistral-7b-instruct

Mistral.RS

https://github.com/EricLBuehler/mistral.rs

$ mistralrs-server --port 1234 --isq Q4K plain -m meta-llama/Meta-Llama-3.1-8B-Instruct -a llama
providers:
  - type: mistralrs
    url: http://localhost:1234

    models:
      mistralrs-llama-3.1-8b:
        id: llama
        

WHISPER.CPP

https://github.com/ggerganov/whisper.cpp/tree/master/examples/server

# using taskfile.dev
$ task whisper:server

# WHISPER.CPP Server
$ whisper-server --port 9083 --convert --model ./models/whisper-large-v3-turbo.bin
providers:
  - type: whisper
    url: http://localhost:9083

    models:
      - whisper

Hugging Face

https://huggingface.co/

providers:
  - type: huggingface
    token: hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    
    models:
      mistral-7B-instruct:
        id: mistralai/Mistral-7B-Instruct-v0.1
      
      huggingface-minilm-l6-2:
        id: sentence-transformers/all-MiniLM-L6-v2

Routers

Round-robin Load Balancer

routers:
  llama-lb:
    type: roundrobin
    models:
      - llama-3-8b
      - groq-llama-3-8b
      - huggingface-llama-3-8b
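
The router name (llama-lb) should then be usable wherever a model name is expected, with requests distributed across the three listed aliases in turn; each alias must be defined in the provider configuration.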

Information Retrieval / Web Search

DuckDuckGo

retrievers:
  web:
    type: duckduckgo

Exa

https://exa.ai

retrievers:
  exa:
    type: exa
    token: ${EXA_API_KEY}

Tavily

https://tavily.com

retrievers:
  tavily:
    type: tavily
    token: ${TAVILY_API_KEY}

Custom Retriever

retrievers:
  custom:
    type: custom
    url: http://localhost:8080

Document Extraction

Tika

# using Docker
docker run -it --rm -p 9998:9998 apache/tika:3.0.0.0-BETA2-full
extractors:  
  tika:
    type: tika
    url: http://localhost:9998
    chunkSize: 4000
    chunkOverlap: 200

Unstructured

https://unstructured.io

# using taskfile.dev
task unstructured:server

# using Docker
docker run -it --rm -p 9085:8000 quay.io/unstructured-io/unstructured-api:0.0.80 --port 8000 --host 0.0.0.0
extractors:
  unstructured:
    type: unstructured
    url: http://localhost:9085/general/v0/general

Azure Document Intelligence

extractors:
  azure:
    type: azure
    url: https://YOUR_INSTANCE.cognitiveservices.azure.com
    token: ${AZURE_API_KEY}

Jina Reader

extractors:
  jina:
    type: jina
    token: ${JINA_API_KEY}

Exa / Tavily Web Extraction

extractors:
  exa:
    type: exa
    token: ${EXA_API_KEY}

  tavily:
    type: tavily
    token: ${TAVILY_API_KEY}

Text Extractor

extractors:
  text:
    type: text

Custom Extractor

extractors:
  custom:
    type: custom
    url: http://localhost:8080

Text Segmentation

Jina Segmenter

segmenters:
  jina:
    type: jina
    token: ${JINA_API_KEY}

Text Segmenter

segmenters:
  text:
    type: text
    chunkSize: 1000
    chunkOverlap: 200

Unstructured Segmenter

segmenters:
  unstructured:
    type: unstructured
    url: http://localhost:9085/general/v0/general

Custom Segmenter

segmenters:
  custom:
    type: custom
    url: http://localhost:8080

AI Agents & Chains

Agent/Assistant Chain

chains:
  assistant:
    type: agent
    model: gpt-4o
    tools:
      - search
      - extract
    messages:
      - role: system
        content: "You are a helpful AI assistant."

Tools & Function Calling

Model Context Protocol (MCP)

The platform provides comprehensive support for the Model Context Protocol (MCP), enabling integration with MCP-compatible tools and services.

MCP Server Support:

  • Built-in MCP server that exposes platform tools to MCP clients
  • Automatic tool discovery and schema generation
  • Multiple transport methods (HTTP streaming, SSE, command-line)

MCP Client Support:

  • Connect to external MCP servers as tool providers
  • Support for various MCP transport methods
  • Automatic tool registration and execution

MCP Tool Configuration:

tools:
  # MCP server via HTTP streaming
  mcp-streamable:
    type: mcp
    url: http://localhost:8080/mcp

  # MCP server via Server-Sent Events
  mcp-sse:
    type: mcp
    url: http://localhost:8080/sse
    vars:
      api-key: ${API_KEY}

  # MCP server via command execution
  mcp-command:
    type: mcp
    command: /path/to/mcp-server
    args:
      - --config
      - /path/to/config.json
    vars:
      ENV_VAR: value

Built-in MCP Server:

The platform automatically exposes its tools via the MCP protocol at the /mcp endpoint, allowing other MCP clients to discover and use platform capabilities.
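
For a rough sense of the wire format: with the streamable HTTP transport, MCP requests are JSON-RPC messages POSTed to the endpoint. A real client first performs an initialize handshake and session negotiation; the sketch below only illustrates the shape of a tools/list call (headers and path may differ in your deployment):

curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'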

Built-in Tools

tools:
  search:
    type: search
    retriever: web

  extract:
    type: extract
    extractor: tika

  translate:
    type: translate
    translator: default

  render:
    type: render
    renderer: dalle-3

  synthesize:
    type: synthesize
    synthesizer: tts-1

Custom Tools

tools:
  custom-tool:
    type: custom
    url: http://localhost:8080

Authentication

Static Authentication

authorizers:
  - type: static
    tokens:
      - "your-secret-token"

OIDC Authentication

authorizers:
  - type: oidc
    url: https://your-oidc-provider.com
    audience: your-audience

Routing & Load Balancing

Round-robin Load Balancer

routers:
  llama-lb:
    type: roundrobin
    models:
      - llama-3-8b
      - groq-llama-3-8b
      - huggingface-llama-3-8b

Rate Limiting

Add rate limiting to any provider:

providers:
  - type: openai
    token: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    limit: 10  # requests per second

    models:
      gpt-4o:
        limit: 5  # override for specific model

Summarization & Translation

Automatic Summarization

Summarization is automatically available for any chat model:

# Use any completer model for summarization
# The platform automatically adapts chat models for summarization tasks

Translation

translators:
  default:
    type: default
    # Uses configured chat models for translation
