Awesome-LLMOps

🎉 An awesome & curated list of the best LLMOps tools.

Stars: 53

Awesome-LLMOps is a curated list of the best LLMOps tools, providing a comprehensive collection of frameworks and tools for building, deploying, and managing large language models (LLMs) and AI agents. The repository includes a wide range of tools for tasks such as building multimodal AI agents, fine-tuning models, orchestrating applications, evaluating models, and serving models for inference. It covers various aspects of the machine learning operations (MLOps) lifecycle, from training to deployment and observability. The tools listed in this repository cater to the needs of developers, data scientists, and machine learning engineers working with large language models and AI applications.

README:

Awesome-LLMOps

🎉 An awesome & curated list of the best LLMOps tools.

Agent

Framework

  • Agno: Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic.
  • AutoGPT: AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
  • LangGraph: Build resilient language agents as graphs.
  • MetaGPT: 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming.
  • OpenAI Agents SDK: A lightweight, powerful framework for multi-agent workflows.
  • OpenManus: No fortress, purely open ground. OpenManus is Coming.
  • PydanticAI: Agent Framework / shim to use Pydantic with LLMs.
  • Swarm: Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Tools

  • Browser Use: Make websites accessible for AI agents.
  • Mem0: The Memory layer for AI Agents.
  • OpenAI CUA: Computer Using Agent Sample App.

Alignment

  • OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT).
  • Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback.

Application Orchestration Framework

  • Dify: Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
  • Flowise: Drag & drop UI to build your customized LLM flow.
  • Haystack: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
  • Inference: Turn any computer or edge device into a command center for your computer vision projects.
  • LangChain: 🦜🔗 Build context-aware reasoning applications.
  • LlamaIndex: LlamaIndex is the leading framework for building LLM-powered agents over your data.

Chat Framework

  • FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
  • Gradio: Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
  • Jan: Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
  • Lobe Chat: 🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Plugins/Artifacts) and Thinking. One-click FREE deployment of your private ChatGPT / Claude / DeepSeek application.
  • NextChat: ✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows.
  • Open WebUI: User-friendly AI Interface (Supports Ollama, OpenAI API, ...).
  • PrivateGPT: Interact with your documents using the power of GPT, 100% privately, no data leaks.

Database

  • chroma: the AI-native open-source embedding database.
  • deeplake: Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow.
  • Faiss: A library for efficient similarity search and clustering of dense vectors.
  • milvus: Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search.
  • weaviate: Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

Evaluation

  • AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24).
  • lm-evaluation-harness: A framework for few-shot evaluation of language models.
  • LongBench: LongBench v2 and LongBench (ACL 2024).

FineTune

  • Axolotl: Go ahead and axolotl questions.
  • LLaMA-Factory: Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024).
  • LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
  • maestro: streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.
  • Swift: Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
  • torchtune: PyTorch native post-training library.
  • unsloth: Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Gateway

LLM Router

  • AI Gateway: A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.
  • LiteLLM: Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq].
  • RouteLLM: A framework for serving and evaluating LLM routers - save LLM costs without compromising quality.

API Gateway

  • Envoy AI Gateway: Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI services.
  • Higress: 🤖 AI Gateway | AI Native API Gateway.
  • kgateway: The Cloud-Native API Gateway and AI Gateway.
  • Kong: 🦍 The Cloud-Native API Gateway and AI Gateway.

Inference

Inference Engine

  • Cortex.cpp: Local AI API Platform.
  • DeepSpeed-MII: MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
  • ipex-llm: Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
  • LMDeploy: LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
  • LoRAX: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs.
  • llama.cpp: LLM inference in C/C++.
  • Llumnix: Efficient and easy multi-instance LLM serving.
  • MInference: [NeurIPS'24 Spotlight, ICLR'25] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, which reduces pre-filling latency by up to 10x on an A100 while maintaining accuracy.
  • MLC LLM: Universal LLM Deployment Engine with ML Compilation.
  • MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more.
  • Ollama: Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, and other large language models.
  • OpenVINO: OpenVINO™ is an open source toolkit for optimizing and deploying AI inference.
  • Ratchet: A cross-platform browser ML framework.
  • SGLang: SGLang is a fast serving framework for large language models and vision language models.
  • transformers.js: State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
  • Triton Inference Server: The Triton Inference Server provides an optimized cloud and edge inferencing solution.
  • Text Generation Inference: Large Language Model Text Generation Inference.
  • vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs.
  • web-llm: High-performance In-browser LLM Inference Engine.
  • zml: Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild.

Inference Platform

  • AIBrix: Cost-efficient and pluggable Infrastructure components for GenAI inference.
  • KServe: Standardized Serverless ML Inference Platform on Kubernetes.
  • KubeAI: AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
  • llmaz: ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
  • LMCache: 10x Faster Long-Context LLM By Smart KV Cache Optimizations.
  • Mooncake: Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
  • OpenLLM: Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

MLOps

  • BentoML: The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
  • Flyte: Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
  • Kubeflow: Machine Learning Toolkit for Kubernetes.
  • Metaflow: Build, Deploy and Manage AI/ML Systems.
  • MLflow: Open source platform for the machine learning lifecycle.
  • Polyaxon: MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle.
  • Ray: Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
  • Seldon-Core: An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models.
  • ZenML: ZenML 🙏: The bridge between ML and Ops. https://zenml.io.

Observation

  • OpenLLMetry: Open-source observability for your LLM application, based on OpenTelemetry.
  • Helicone: 🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
  • phoenix: AI Observability & Evaluation.
  • wandb: The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Output

Training

  • ColossalAI: Making large AI models cheaper, faster and more accessible.
  • Ludwig: Low-code framework for building custom LLMs, neural networks, and other AI models.
  • MaxText: A simple, performant and scalable Jax LLM!
  • MLX: An array framework for Apple silicon.
