awesome-LLM-resourses

🧑‍🚀 Summary of the world's best LLM resources (data processing, model training, model deployment, o1 models, MCP, small language models, vision language models).

Stars: 4648

A comprehensive repository of resources for Chinese large language models (LLMs), including data processing tools, fine-tuning frameworks, inference libraries, evaluation platforms, RAG engines, agent frameworks, books, courses, tutorials, and tips. The repository covers a wide range of tools and resources for working with LLMs, from data labeling and processing to model fine-tuning, inference, evaluation, and application development. It also includes resources for learning about LLMs through books, courses, and tutorials, as well as insights and strategies from building with LLMs.

README:

A continuously updated summary of the world's best large language model resources.

Check More Information

[Read online]

Data
Fine-Tuning
Inference
Evaluation
Usage
RAG
Agents
Search
Book
Course
Tutorial
Paper
Community
MCP
Open o1
Small Language Model
Small Vision Language Model
Tips

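Data

[!NOTE]

This section is named Data, but it does not provide specific datasets; instead, it lists methods for acquiring and processing large-scale data.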

AutoLabel: Label, clean and enrich text datasets with LLMs.
LabelLLM: The Open-Source Data Annotation Platform.
data-juicer: A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs!
OmniParser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
MinerU: MinerU is a one-stop, open-source, high-quality data extraction tool, supports PDF/webpage/e-book extraction.
PDF-Extract-Kit: A Comprehensive Toolkit for High-Quality PDF Content Extraction.
Parsera: Lightweight library for scraping web-sites with LLMs.
Sparrow: Sparrow is an innovative open-source solution for efficient data extraction and processing from various documents and images.
Docling: Get your documents ready for gen AI.
GOT-OCR2.0: OCR Model.
LLM Decontaminator: Rethinking Benchmark and Contamination for Language Models with Rephrased Samples.
DataTrove: DataTrove is a library to process, filter and deduplicate text data at a very large scale.
llm-swarm: Generate large synthetic datasets like Cosmopedia.
Distilabel: Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Common-Crawl-Pipeline-Creator: The Common Crawl Pipeline Creator.
Tabled: Detect and extract tables to markdown and csv.
Zerox: Zero shot pdf OCR with gpt-4o-mini.
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception.
TensorZero: make LLMs improve through experience.
Promptwright: Generate large synthetic data using a local LLM.
pdf-extract-api: Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models.
pdf2htmlEX: Convert PDF to HTML without losing text or format.
Extractous: Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
MegaParse: File Parser optimised for LLM Ingestion with no loss.
MarkItDown: Python tool for converting files and office documents to Markdown.
datasketch: datasketch gives you probabilistic data structures that can process and search very large amount of data super fast, with little loss of accuracy.
semhash: lightweight and flexible tool for deduplicating datasets using semantic similarity.
ReaderLM-v2: a 1.5B parameter language model that converts raw HTML into beautifully formatted markdown or JSON.
Bespoke Curator: Data Curation for Post-Training & Structured Data Extraction.
LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). Extracts signals from prompts & responses, ensuring safety & security.
Curator: Synthetic Data curation for post-training and structured data extraction.
olmOCR: A toolkit for training language models to work with PDF documents in the wild.
Easy Dataset: A powerful tool for creating fine-tuning datasets for LLM.
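
Several entries above (datasketch, semhash, DataTrove) deal with deduplicating large text collections. As a rough illustration of the idea behind probabilistic near-duplicate detection, here is a minimal MinHash sketch using only the Python standard library. It is not the API of datasketch or any other listed tool; the function names (`shingles`, `minhash_signature`, `estimated_jaccard`, `deduplicate`) and the 0.8 threshold are hypothetical choices made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_perm=64):
    """Build a MinHash signature: the minimum hash per salted hash function."""
    signature = []
    for seed in range(num_perm):
        # Each seed acts as a different hash function via the BLAKE2b salt.
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def deduplicate(docs, threshold=0.8):
    """Keep a document only if it is not too similar to any kept document."""
    kept, kept_sigs = [], []
    for doc in docs:
        sig = minhash_signature(shingles(doc))
        if all(estimated_jaccard(sig, other) < threshold for other in kept_sigs):
            kept.append(doc)
            kept_sigs.append(sig)
    return kept


if __name__ == "__main__":
    corpus = ["a b c d", "a b c d", "x y z w"]
    print(deduplicate(corpus))
```

This version compares every document against every kept signature, which is quadratic; dedicated libraries typically add an index such as locality-sensitive hashing so that lookups stay fast at scale.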

↥ back to top

Fine-Tuning

LLaMA-Factory: Unify Efficient Fine-Tuning of 100+ LLMs.
360-LLaMA-Factory: Unify Efficient Fine-Tuning of 100+ LLMs. (add Sequence Parallelism for supporting long context training)
unsloth: 2-5X faster 80% less memory LLM finetuning.
TRL: Transformer Reinforcement Learning.
Firefly: A large-model training tool that supports training dozens of large models.
Xtuner: An efficient, flexible and full-featured toolkit for fine-tuning large models.
torchtune: A Native-PyTorch Library for LLM Fine-tuning.
Swift: Use PEFT or Full-parameter to finetune 200+ LLMs or 15+ MLLMs.
AutoTrain: A new way to automatically train, evaluate and deploy state-of-the-art Machine Learning models.
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework (Support 70B+ full tuning & LoRA & Mixtral & KTO).
Ludwig: Low-code framework for building custom LLMs, neural networks, and other AI models.
mistral-finetune: A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models.
aikit: Fine-tune, build, and deploy open-source LLMs easily!
H2O-LLMStudio: H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs.
LitGPT: Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
LLMBox: A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.
PaddleNLP: Easy-to-use and powerful NLP and LLM library.
workbench-llamafactory: This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral).
TinyLLaVA Factory: A Framework of Small-scale Large Multimodal Models.
LLM-Foundry: LLM training code for Databricks foundation models.
lmms-finetune: A unified codebase for finetuning (full, lora) large multimodal models, supporting llava-1.5, qwen-vl, llava-interleave, llava-next-video, phi3-v etc.
Simplifine: Simplifine lets you invoke LLM finetuning with just one line of code using any Hugging Face dataset or model.
Transformer Lab: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
Liger-Kernel: Efficient Triton Kernels for LLM Training.
ChatLearn: A flexible and efficient training framework for large-scale alignment.
nanotron: Minimalistic large language model 3D-parallelism training.
Proxy Tuning: Tuning Language Models by Proxy.
Effective LLM Alignment: Effective LLM Alignment Toolkit.
Autotrain-advanced
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Vision-LLM Alignment: This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
finetune-Qwen2-VL: Quick Start for Fine-tuning or continue pre-train Qwen2-VL Model.
Online-RLHF: A recipe for online RLHF and online iterative DPO.
InternEvo: an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
veRL: Volcano Engine Reinforcement Learning for LLM.
Axolotl: Axolotl is designed to work with YAML config files that contain everything you need to preprocess a dataset, train or fine-tune a model, run model inference or evaluation, and much more.
Oumi: Everything you need to build state-of-the-art foundation models, end-to-end.
Kiln: The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
DeepSeek-671B-SFT-Guide: An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.
MLX-VLM: MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

↥ back to top

Inference

ollama: Get up and running with Llama 3, Mistral, Gemma, and other large language models.
Open WebUI: User-friendly WebUI for LLMs (Formerly Ollama WebUI).
Text Generation WebUI: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
Xinference: A powerful and versatile library designed to serve language, speech recognition, and multimodal models.
LangChain: Build context-aware reasoning applications.
LlamaIndex: A data framework for your LLM applications.
lobe-chat: an open-source, modern-design LLMs/AI chat framework. Supports Multi AI Providers, Multi-Modals (Vision/TTS) and plugin system.
TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
vllm: A high-throughput and memory-efficient inference and serving engine for LLMs.
LlamaChat: Chat with your favourite LLaMA models in a native macOS app.
NVIDIA ChatRTX: ChatRTX is a demo app that lets you personalize a GPT large language model (LLM) connected to your own content—docs, notes, or other data.
LM Studio: Discover, download, and run local LLMs.
chat-with-mlx: Chat with your data natively on Apple Silicon using MLX Framework.
LLM Pricing: Quickly Find the Perfect Large Language Models (LLM) API for Your Budget! Use Our Free Tool for Instant Access to the Latest Prices from Top Providers.
Open Interpreter: A natural language interface for computers.
Chat-ollama: An open source chatbot based on LLMs. It supports a wide range of language models, and knowledge base management.
chat-ui: Open source codebase powering the HuggingChat app.
MemGPT: Create LLM agents with long-term memory and custom tools.
koboldcpp: A simple one-file way to run various GGML and GGUF models with KoboldAI's UI.
LLMFarm: llama and other large language models on iOS and MacOS offline using GGML library.
enchanted: Enchanted is iOS and macOS app for chatting with private self hosted language models such as Llama2, Mistral or Vicuna using Ollama.
Flowise: Drag & drop UI to build your customized LLM flow.
Jan: Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM).
LMDeploy: LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
RouteLLM: A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
MInference: Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.
Mem0: The memory layer for Personalized AI.
SGLang: SGLang is yet another fast serving framework for large language models and vision language models.
AirLLM: AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation and pruning. And you can run 405B Llama3.1 on 8GB vram now.
LLMHub: LLMHub is a lightweight management platform designed to streamline the operation and interaction with various language models (LLMs).
YuanChat
LiteLLM: Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]
GuideLLM: GuideLLM is a powerful tool for evaluating and optimizing the deployment of large language models (LLMs).
LLM-Engines: A unified inference engine for large language models (LLMs) including open-source models (VLLM, SGLang, Together) and commercial models (OpenAI, Mistral, Claude).
OARC: ollama_agent_roll_cage (OARC) is a local python agent fusing ollama llm's with Coqui-TTS speech models, Keras classifiers, Llava vision, Whisper recognition, and more to create a unified chatbot agent for local, custom automation.
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains.
MemoryScope: MemoryScope provides LLM chatbots with powerful and flexible long-term memory capabilities, offering a framework for building such abilities.
OpenLLM: Run any open-source LLMs, such as Llama 3.1, Gemma, as OpenAI compatible API endpoint in the cloud.
Infinity: The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense embedding, sparse embedding, tensor and full-text.
optillm: an OpenAI API compatible optimizing inference proxy which implements several state-of-the-art techniques that can improve the accuracy and performance of LLMs.
LLaMA Box: LLM inference server implementation based on llama.cpp.
ZhiLight: A highly optimized inference acceleration engine for Llama and its variants.
DashInfer: DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures.
LocalAI: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required.
ktransformers: A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations.
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 14+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Chitu: High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
TokenSwift: From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation.

↥ back to top

Evaluation

lm-evaluation-harness: A framework for few-shot evaluation of language models.
opencompass: OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
llm-comparator: LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side.
EvalScope
Weave: A lightweight toolkit for tracking and evaluating LLM applications.
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures.
Evaluation guidebook: If you've ever wondered how to make sure an LLM performs well on your specific task, this guide is for you!
Ollama Benchmark: LLM Benchmark for Throughput via Ollama (Local LLMs).
VLMEvalKit: Open-source evaluation toolkit of large vision-language models (LVLMs), support ~100 VLMs, 40+ benchmarks.
AGI-Eval
EvalScope: A streamlined and customizable framework for efficient large model evaluation and performance benchmarking.
DeepEval: a simple-to-use, open-source LLM evaluation framework, for evaluating and testing large-language model systems.
Lighteval: Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends.
QwQ/eval: QwQ is the reasoning model series developed by Qwen team, Alibaba Cloud.
Evalchemy: A unified and easy-to-use toolkit for evaluating post-trained language models.
MathArena: Evaluation of LLMs on latest math competitions.
YourBench: A Dynamic Benchmark Generation Framework.

LLM API service platforms:

↥ back to top

Usage

↥ back to top

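RAG

AnythingLLM: The all-in-one AI app for any LLM with full RAG and AI Agent capabilities.
MaxKB: A knowledge-base question-answering system built on large language models. It works out of the box and can be quickly embedded into third-party business systems.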
RAGFlow: An open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Dify: An open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
FastGPT: A knowledge-based platform built on the LLM, offers out-of-the-box data processing and model invocation capabilities, allows for workflow orchestration through Flow visualization.
Langchain-Chatchat: Local knowledge-base question answering based on Langchain and large language models such as ChatGLM.
QAnything: Question and Answer based on Anything.
Quivr: A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users ! Local & Private alternative to OpenAI GPTs & ChatGPT powered by retrieval-augmented generation.
RAG-GPT: RAG-GPT, leveraging LLM and RAG technology, learns from user-customized knowledge bases to provide contextually relevant answers for a wide range of queries, ensuring rapid and accurate information retrieval.
Verba: Retrieval Augmented Generation (RAG) chatbot powered by Weaviate.
FlashRAG: A Python Toolkit for Efficient RAG Research.
GraphRAG: A modular graph-based Retrieval-Augmented Generation (RAG) system.
LightRAG: LightRAG helps developers with both building and optimizing Retriever-Agent-Generator pipelines.
GraphRAG-Ollama-UI: GraphRAG using Ollama with Gradio UI and Extra Features.
nano-GraphRAG: A simple, easy-to-hack GraphRAG implementation.
RAG Techniques: This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.
ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines.
kotaemon: An open-source clean & customizable RAG UI for chatting with your documents. Built with both end users and developers in mind.
RAGapp: The easiest way to use Agentic RAG in any enterprise.
TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text.
LightRAG: Simple and Fast Retrieval-Augmented Generation.
TEN: the Next-Gen AI-Agent Framework, the world's first truly real-time multimodal AI agent framework.
AutoRAG: RAG AutoML tool for automatically finding an optimal RAG pipeline for your data.
KAG: KAG is a knowledge-enhanced generation framework based on OpenSPG engine, which is used to build knowledge-enhanced rigorous decision-making and information retrieval knowledge services.
Fast-GraphRAG: RAG that intelligently adapts to your use case, data, and queries.
Tiny-GraphRAG
DB-GPT GraphRAG: DB-GPT GraphRAG integrates both triplet-based knowledge graphs and document structure graphs while leveraging community and document retrieval mechanisms to enhance RAG capabilities, achieving comparable performance while consuming only 50% of the tokens required by Microsoft's GraphRAG. Refer to the DB-GPT Graph RAG User Manual for details.
Chonkie: The no-nonsense RAG chunking library that's lightweight, lightning-fast, and ready to CHONK your texts.
RAGLite: RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite.
KAG: KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs.
CAG: CAG leverages the extended context windows of modern large language models (LLMs) by preloading all relevant resources into the model’s context and caching its runtime parameters.
MiniRAG: an extremely simple retrieval-augmented generation framework that enables small models to achieve good RAG performance through heterogeneous graph indexing and lightweight topology-enhanced retrieval.
XRAG: a benchmarking framework designed to evaluate the foundational components of advanced Retrieval-Augmented Generation (RAG) systems.
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation.

↥ back to top

Agents

AutoGen: AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen AIStudio
CrewAI: Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
Coze
AgentGPT: Assemble, configure, and deploy autonomous AI Agents in your browser.
XAgent: An Autonomous LLM Agent for Complex Task Solving.
MobileAgent: The Powerful Mobile Device Operation Assistant Family.
Lagent: A lightweight framework for building LLM-based agents.
Qwen-Agent: Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
LinkAI: A one-stop platform for building AI agents.
Baidu APPBuilder
agentUniverse: agentUniverse is a LLM multi-agent framework that allows developers to easily build multi-agent applications. Furthermore, through the community, they can exchange and share practices of patterns across different domains.
LazyLLM: A low-code development tool for building multi-agent LLM applications.
AgentScope: Start building LLM-empowered multi-agent applications in an easier way.
MoA: Mixture of Agents (MoA) is a novel approach that leverages the collective strengths of multiple LLMs to enhance performance, achieving state-of-the-art results.
Agently: AI Agent Application Development Framework.
OmAgent: A multimodal agent framework for solving complex tasks.
Tribe: No code tool to rapidly build and coordinate multi-agent teams.
CAMEL: First LLM multi-agent framework and an open-source community dedicated to finding the scaling law of agents.
PraisonAI: PraisonAI application combines AutoGen and CrewAI or similar frameworks into a low-code solution for building and managing multi-agent LLM systems, focusing on simplicity, customisation, and efficient human-agent collaboration.
IoA: An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through internet-like connectivity.
llama-agentic-system: Agentic components of the Llama Stack APIs.
Agent Zero: Agent Zero is not a predefined agentic framework. It is designed to be dynamic, organically growing, and learning as you use it.
Agents: An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents.
FastAgency: The fastest way to bring multi-agent workflows to production.
Swarm: Framework for building, orchestrating and deploying multi-agent systems. Managed by OpenAI Solutions team. Experimental framework.
Agent-S: an open agentic framework that uses computers like a human.
PydanticAI: Agent Framework / shim to use Pydantic with LLMs.
Agentarium: open-source framework for creating and managing simulations populated with AI-powered agents.
smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.

↥ back to top

Search

OpenSearch GPT: SearchGPT / Perplexity clone, but personalised for you.
MindSearch: An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT).
nanoPerplexityAI: The simplest open-source implementation of perplexity.ai.
curiosity: Try to build a Perplexity-like user experience.
MiniPerplx: A minimalistic AI-powered search engine that helps you find information on the internet.

↥ back to top

Book

↥ back to top

Course

LLM Resources Hub

Stanford CS224N: Natural Language Processing with Deep Learning
Andrew Ng: Generative AI for Everyone
Andrew Ng: LLM series of courses
ACL 2023 Tutorial: Retrieval-based Language Models and Applications
llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Microsoft: Generative AI for Beginners
Microsoft: State of GPT
HuggingFace NLP Course
Tsinghua NLP (Zhiyuan Liu's group): Open course on large models
Stanford CS25: Transformers United V4
Stanford CS324: Large Language Models
Princeton COS 597G (Fall 2022): Understanding Large Language Models
Johns Hopkins CS 601.471/671 NLP: Self-supervised Models
Hung-yi Lee: GenAI course
openai-cookbook: Examples and guides for using the OpenAI API.
Hands on llms: Learn about LLM, LLMOps, and vector DBS for free by designing, training, and deploying a real-time financial advisor LLM system.
University of Waterloo CS 886: Recent Advances on Foundation Models
Mistral: Getting Started with Mistral
Coursera: Prompt Engineering for ChatGPT
LangGPT: Empowering everyone to become a prompt expert!
mistralai-cookbook
Introduction to Generative AI 2024 Spring
build nanoGPT: Video+code lecture on building nanoGPT from scratch.
LLM101n: Let's build a Storyteller.
Knowledge Graphs for RAG
LLMs From Scratch (Datawhale Version)
OpenRAG
The Road to AGI
Andrej Karpathy - Neural Networks: Zero to Hero
Interactive visualization of Transformer
andysingal/llm-course
LM-class
Google Advanced: Generative AI for Developers Learning Path
Anthropic: Prompt Engineering Interactive Tutorial
LLMsBook
Large Language Model Agents
Cohere LLM University
LLMs and Transformers
Smol Vision: Recipes for shrinking, optimizing, customizing cutting edge vision models.
Multimodal RAG: Chat with Videos
LLMs Interview Note
RAG++: From POC to production: Advanced RAG course.
Weights & Biases AI Academy: Finetuning, building with LLMs, Structured outputs and more LLM courses.
Prompt Engineering & AI tutorials & Resources
Learn RAG From Scratch – Python AI Tutorial from a LangChain Engineer
LLM Evaluation: A Complete Course
HuggingFace Learn
Andrej Karpathy: Deep Dive into LLMs like ChatGPT
A popular-science introduction to LLM technology

↥ back to top

Tutorial

↥ back to top

Paper

[!NOTE] 🤝 Huggingface Daily Papers, Cool Papers, ML Papers Explained

↥ back to top

Community

↥ back to top

MCP

MCP tool aggregators:

↥ back to top

Open o1

[!NOTE]

Open technology is our eternal pursuit.

↥ back to top

Small Language Model

↥ back to top

Small Vision Language Model

↥ back to top

Tips

What We Learned from a Year of Building with LLMs (Part I)
What We Learned from a Year of Building with LLMs (Part II)
What We Learned from a Year of Building with LLMs (Part III): Strategy
An Easy Introduction to Large Language Models (LLMs)
LLMs for Text Classification: A Guide to Supervised Learning
Unsupervised Text Classification: Categorize Natural Language With LLMs
Text Classification With LLMs: A Roundup of the Best Methods
LLM Pricing
Uncensor any LLM with abliteration
Tiny LLM Universe
Zero-Chatgpt
Zero-Qwen-VL
finetune-Qwen2-VL
MPP-LLaVA
build_MiniLLM_from_scratch
Tiny LLM zh
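MiniMind: Train a small 26M-parameter GPT entirely from scratch in 3 hours; a GPU with as little as 2 GB of memory is enough for inference and training.
LLM-Travel: Dedicated to understanding, discussing, and implementing the technologies, principles, and applications related to large models.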
Knowledge distillation: Teaching LLM's with synthetic data
Part 1: Methods for adapting large language models
Part 2: To fine-tune or not to fine-tune
Part 3: How to fine-tune: Focus on effective datasets
Reader-LM: Small Language Models for Cleaning and Converting HTML to Markdown
What we learned from a year of building LLM applications
LLM training: pretraining
pytorch-llama: LLaMA 2 implemented from scratch in PyTorch.
Preference Optimization for Vision Language Models with TRL [supported models]
Fine-tuning visual language models using SFTTrainer [docs]
A Visual Guide to Mixture of Experts (MoE)
Role-Playing in Large Language Models like ChatGPT
Distributed Training Guide: Best practices & guides on how to write distributed pytorch training code.
Chat Templates
Top 20+ RAG Interview Questions
LLM-Dojo: An open-source place to learn about large models, building a model training framework with concise, readable code.
o1 isn’t a chat model (and that’s the point)
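Beam Search: a quick explanation with a code walkthrough
Diverse text generation with the transformers generate() method: parameter meanings and algorithm principles explained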
The Ultra-Scale Playbook: Training LLMs on GPU Clusters

↥ back to top

If you find this project helpful, please cite it:

@misc{wang2024llm,
      title={awesome-LLM-resourses}, 
      author={Rongsheng Wang},
      year={2024},
      publisher = {GitHub},
      journal = {GitHub repository},
      howpublished = {\url{https://github.com/WangRongsheng/awesome-LLM-resourses}},
}

Alternative AI tools for awesome-LLM-resourses

Similar Open Source Tools

awesome-LLM-resourses

github

: 4.6k

self-learn-llms

Self Learn LLMs is a repository containing resources for self-learning about Large Language Models. It includes theoretical and practical hands-on resources to facilitate learning. The repository aims to provide a clear roadmap with milestones for proper understanding of LLMs. The owner plans to refactor the repository to remove irrelevant content, organize model zoo better, and enhance the learning experience by adding contributors and hosting notes, tutorials, and open discussions.

github

: 51

nlp-llms-resources

The 'nlp-llms-resources' repository is a comprehensive resource list for Natural Language Processing (NLP) and Large Language Models (LLMs). It covers a wide range of topics including traditional NLP datasets, data acquisition, libraries for NLP, neural networks, sentiment analysis, optical character recognition, information extraction, semantics, topic modeling, multilingual NLP, domain-specific LLMs, vector databases, ethics, costing, books, courses, surveys, aggregators, newsletters, papers, conferences, and societies. The repository provides valuable information and resources for individuals interested in NLP and LLMs.

github

: 82

llm-course

The LLM course is divided into three parts: 1. 🧩 **LLM Fundamentals** covers essential knowledge about mathematics, Python, and neural networks. 2. 🧑‍🔬 **The LLM Scientist** focuses on building the best possible LLMs using the latest techniques. 3. 👷 **The LLM Engineer** focuses on creating LLM-based applications and deploying them. For an interactive version of this course, I created two **LLM assistants** that will answer questions and test your knowledge in a personalized way: * 🤗 **HuggingChat Assistant**: Free version using Mixtral-8x7B. * 🤖 **ChatGPT Assistant**: Requires a premium account. ## 📝 Notebooks A list of notebooks and articles related to large language models. ### Tools | Notebook | Description | Notebook | |----------|-------------|----------| | 🧐 LLM AutoEval | Automatically evaluate your LLMs using RunPod | ![Open In Colab](img/colab.svg) | | 🥱 LazyMergekit | Easily merge models using MergeKit in one click. | ![Open In Colab](img/colab.svg) | | 🦎 LazyAxolotl | Fine-tune models in the cloud using Axolotl in one click. | ![Open In Colab](img/colab.svg) | | ⚡ AutoQuant | Quantize LLMs in GGUF, GPTQ, EXL2, AWQ, and HQQ formats in one click. | ![Open In Colab](img/colab.svg) | | 🌳 Model Family Tree | Visualize the family tree of merged models. | ![Open In Colab](img/colab.svg) | | 🚀 ZeroSpace | Automatically create a Gradio chat interface using a free ZeroGPU. | ![Open In Colab](img/colab.svg) |

github

: 42.1k

Macaw-LLM

Macaw-LLM is a pioneering multi-modal language modeling tool that seamlessly integrates image, audio, video, and text data. It builds upon CLIP, Whisper, and LLaMA models to process and analyze multi-modal information effectively. The tool boasts features like simple and fast alignment, one-stage instruction fine-tuning, and a new multi-modal instruction dataset. It enables users to align multi-modal features efficiently, encode instructions, and generate responses across different data types.

github

: 1.6k

Controllable-RAG-Agent

This repository contains a sophisticated deterministic graph-based solution for answering complex questions using a controllable autonomous agent. The solution is designed to ensure that answers are solely based on the provided data, avoiding hallucinations. It involves various steps such as PDF loading, text preprocessing, summarization, database creation, encoding, and utilizing large language models. The algorithm follows a detailed workflow involving planning, retrieval, answering, replanning, content distillation, and performance evaluation. Heuristics and techniques implemented focus on content encoding, anonymizing questions, task breakdown, content distillation, chain of thought answering, verification, and model performance evaluation.

github

: 951

maxtext

MaxText is a high performance, highly scalable, open-source Large Language Model (LLM) written in pure Python/Jax targeting Google Cloud TPUs and GPUs for training and inference. It aims to be a launching off point for ambitious LLM projects in research and production, supporting TPUs and GPUs, models like Llama2, Mistral, and Gemma. MaxText provides specific instructions for getting started, runtime performance results, comparison to alternatives, and features like stack trace collection, ahead of time compilation for TPUs and GPUs, and automatic upload of logs to Vertex Tensorboard.

GitHub stars: 2.1k

OpenNARS-for-Applications

OpenNARS-for-Applications is an implementation of a Non-Axiomatic Reasoning System, a general-purpose reasoner that adapts under the Assumption of Insufficient Knowledge and Resources. The system combines the logic and conceptual ideas of OpenNARS, event handling and procedure learning capabilities of ANSNA and 20NAR1, and the control model from ALANN. It is written in C, offers improved reasoning performance, and has been compared with Reinforcement Learning and means-end reasoning approaches. The system has been used in real-world applications such as assisting first responders, real-time traffic surveillance, and experiments with autonomous robots. It has been developed with a pragmatic mindset focusing on effective implementation of existing theory.

GitHub stars: 93

mlcourse.ai

mlcourse.ai is an open Machine Learning course by OpenDataScience (ods.ai), led by Yury Kashnitsky (yorko). The course offers a perfect balance between theory and practice, with math formulae in lectures and practical assignments including Kaggle Inclass competitions. It is currently in a self-paced mode, guiding users through 10 weeks of content covering topics from Pandas to Gradient Boosting. The course provides articles, lectures, and assignments to enhance understanding and application of machine learning concepts.

GitHub stars: 9.9k

miniLLMFlow

Mini LLM Flow is a 100-line minimalist LLM framework designed for agents, task decomposition, RAG, etc. It aims to be the framework used by LLMs, focusing on high-level programming paradigms while stripping away low-level implementation details. It serves as a learning resource and allows LLMs to design, build, and maintain projects themselves.

GitHub stars: 52

learn-agentic-ai

Learn Agentic AI is a repository that is part of the Panaversity Certified Agentic and Robotic AI Engineer program. It covers AI-201 and AI-202 courses, providing fundamentals and advanced knowledge in Agentic AI. The repository includes video playlists, projects, and project submission guidelines for students to enhance their understanding and skills in the field of AI engineering.

GitHub stars: 3.7k

Vision-LLM-Alignment

Vision-LLM-Alignment is a repository focused on implementing alignment training for visual large language models (LLMs), including SFT training, reward model training, and PPO/DPO training. It supports various model architectures and provides datasets for training. The repository also offers benchmark results and installation instructions for users.

GitHub stars: 63

babilong

BABILong is a generative benchmark designed to evaluate the performance of NLP models in processing long documents with distributed facts. It consists of 20 tasks that simulate interactions between characters and objects in various locations, requiring models to distinguish important information from irrelevant details. The tasks vary in complexity and reasoning aspects, with test samples potentially containing millions of tokens. The benchmark aims to challenge and assess the capabilities of Large Language Models (LLMs) in handling complex, long-context information.

GitHub stars: 125
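
The idea of facts distributed through a long document can be illustrated with a small sketch. This is not BABILong's own generator: the function name `build_sample`, the example facts, and the filler sentences are all made up here for illustration, and the real benchmark draws on far longer background text.

```python
import random


def build_sample(facts, filler_sentences, total_sentences, seed=0):
    """Scatter `facts` (kept in their original order) among filler sentences.

    Hypothetical helper for illustration only, not part of BABILong.
    Returns the assembled context and the sentence positions of the facts.
    """
    if total_sentences < len(facts):
        raise ValueError("total_sentences must be at least len(facts)")
    rng = random.Random(seed)
    # Pick which output positions hold facts; sorting preserves fact order.
    fact_positions = sorted(rng.sample(range(total_sentences), len(facts)))
    position_set = set(fact_positions)
    fact_iter = iter(facts)
    sentences = []
    for i in range(total_sentences):
        if i in position_set:
            sentences.append(next(fact_iter))
        else:
            sentences.append(rng.choice(filler_sentences))
    return " ".join(sentences), fact_positions


# Made-up supporting facts and irrelevant filler text.
facts = [
    "Mary went to the kitchen.",
    "Mary picked up the apple.",
    "Mary moved to the garden.",
]
filler = [
    "The weather was mild that year.",
    "Trains ran late on Sundays.",
    "The report was filed in March.",
]

context, positions = build_sample(facts, filler, total_sentences=50)
question = "Where is the apple?"
expected_answer = "garden"
```

A model under test would receive `context` followed by `question`; raising `total_sentences` lengthens the context while the number of relevant facts stays fixed.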

LazyLLM

LazyLLM is a low-code development tool for building complex AI applications with multiple agents. It assists developers in building AI applications at a low cost and continuously optimizing their performance. The tool provides a convenient workflow for application development and offers standard processes and tools for various stages of application development. Users can quickly prototype applications with LazyLLM, analyze bad cases with scenario task data, and iteratively optimize key components to enhance the overall application performance. LazyLLM aims to simplify the AI application development process and provide flexibility for both beginners and experts to create high-quality applications.

GitHub stars: 3.7k

DataDreamer

DataDreamer is a powerful open-source Python library designed for prompting, synthetic data generation, and training workflows. It is simple, efficient, and research-grade, allowing users to create prompting workflows, generate synthetic datasets, and train models with ease. The library is built for researchers, by researchers, focusing on correctness, best practices, and reproducibility. It offers features like aggressive caching, resumability, support for bleeding-edge techniques, and easy sharing of datasets and models. DataDreamer enables users to run multi-step prompting workflows, generate synthetic datasets for various tasks, and train models by aligning, fine-tuning, instruction-tuning, and distilling them using existing or synthetic data.

GitHub stars: 897

oreilly-retrieval-augmented-gen-ai

This repository focuses on Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs). It provides code and resources to augment LLMs with real-time data for dynamic, context-aware applications. The content covers topics such as semantic search, fine-tuning embeddings, building RAG chatbots, evaluating LLMs, and using knowledge graphs in RAG. Prerequisites include Python skills, knowledge of machine learning and LLMs, and introductory experience with NLP and AI models.

GitHub stars: 164

For similar tasks

agentcloud

AgentCloud is an open-source platform that enables companies to build and deploy private LLM chat apps, empowering teams to securely interact with their data. It comprises three main components: Agent Backend, Webapp, and Vector Proxy. To run this project locally, clone the repository, install Docker, and start the services. The project is licensed under the GNU Affero General Public License, version 3 only. Contributions and feedback are welcome from the community.

GitHub stars: 583

zep-python

Zep is an open-source platform for building and deploying large language model (LLM) applications. It provides a suite of tools and services that make it easy to integrate LLMs into your applications, including chat history memory, embedding, vector search, and data enrichment. Zep is designed to be scalable, reliable, and easy to use, making it a great choice for developers who want to build LLM-powered applications quickly and easily.

GitHub stars: 60

lollms

LoLLMs Server is a text generation server based on large language models. It provides a Flask-based API for generating text using various pre-trained language models. This server is designed to be easy to install and use, allowing developers to integrate powerful text generation capabilities into their applications.

GitHub stars: 287

LlamaIndexTS

LlamaIndex.TS is a data framework for your LLM application. Use your own data with large language models (LLMs) such as OpenAI's ChatGPT in TypeScript and JavaScript.

GitHub stars: 2.5k

semantic-kernel

Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Semantic Kernel achieves this by allowing you to define plugins that can be chained together in just a few lines of code. What makes Semantic Kernel _special_, however, is its ability to _automatically_ orchestrate plugins with AI. With Semantic Kernel planners, you can ask an LLM to generate a plan that achieves a user's unique goal. Afterwards, Semantic Kernel will execute the plan for the user.

GitHub stars: 27.2k

botpress

Botpress is a platform for building next-generation chatbots and assistants powered by OpenAI. It provides a range of tools and integrations to help developers quickly and easily create and deploy chatbots for various use cases.

GitHub stars: 14.6k

BotSharp

BotSharp is an open-source machine learning framework for building AI bot platforms. It provides a comprehensive set of tools and components for developing and deploying intelligent virtual assistants. BotSharp is designed to be modular and extensible, allowing developers to easily integrate it with their existing systems and applications. With BotSharp, you can quickly and easily create AI-powered chatbots, virtual assistants, and other conversational AI applications.

GitHub stars: 2.6k

qdrant

Qdrant is a vector similarity search engine and vector database. It is written in Rust, which makes it fast and reliable even under high load. Qdrant can be used for a variety of applications, including:

* Semantic search
* Image search
* Product recommendations
* Chatbots
* Anomaly detection

Qdrant offers a variety of features, including:

* Payload storage and filtering
* Hybrid search with sparse vectors
* Vector quantization and on-disk storage
* Distributed deployment
* Highlighted features such as query planning, payload indexes, SIMD hardware acceleration, async I/O, and write-ahead logging

Qdrant is available as a fully managed cloud service or as open-source software that can be deployed on-premises.

GitHub stars: 28.8k
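
To make "vector similarity search with payload filtering" concrete, here is a minimal brute-force sketch in plain Python. It does not use Qdrant or its client API: the `search` function, the point layout (`id`, `vector`, `payload`), and the sample data are assumptions made for illustration, and a real engine would use an index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(points, query_vector, top_k=3, payload_filter=None):
    """Return up to top_k (id, score) pairs, best match first.

    Hypothetical helper: points whose payload does not match every
    key/value pair in `payload_filter` are skipped before scoring.
    """
    candidates = []
    for point in points:
        payload = point["payload"]
        if payload_filter and any(
            payload.get(key) != value for key, value in payload_filter.items()
        ):
            continue
        score = cosine_similarity(point["vector"], query_vector)
        candidates.append((point["id"], score))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_k]


# Made-up points: a small vector plus a metadata payload each.
points = [
    {"id": 1, "vector": [0.9, 0.1, 0.0], "payload": {"category": "shoes"}},
    {"id": 2, "vector": [0.8, 0.2, 0.1], "payload": {"category": "hats"}},
    {"id": 3, "vector": [0.0, 1.0, 0.0], "payload": {"category": "shoes"}},
]

results = search(
    points, [1.0, 0.0, 0.0], top_k=2, payload_filter={"category": "shoes"}
)
```

With the filter applied, point 2 is excluded even though its vector is close to the query, which is the behavior payload filtering is meant to provide.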

For similar jobs

Awesome-LLM-RAG-Application

Awesome-LLM-RAG-Application is a repository that provides resources and information about applications based on Large Language Models (LLM) with Retrieval-Augmented Generation (RAG) pattern. It includes a survey paper, GitHub repo, and guides on advanced RAG techniques. The repository covers various aspects of RAG, including academic papers, evaluation benchmarks, downstream tasks, tools, and technologies. It also explores different frameworks, preprocessing tools, routing mechanisms, evaluation frameworks, embeddings, security guardrails, prompting tools, SQL enhancements, LLM deployment, observability tools, and more. The repository aims to offer comprehensive knowledge on RAG for readers interested in exploring and implementing LLM-based systems and products.

GitHub stars: 1.5k

ChatGPT-On-CS

ChatGPT-On-CS is an intelligent chatbot tool based on large models, supporting various platforms like WeChat, Taobao, Bilibili, Douyin, Weibo, and more. It can handle text, voice, and image inputs, access external resources through plugins, and customize enterprise AI applications based on proprietary knowledge bases. Users can set custom replies, use the ChatGPT interface for intelligent responses, send images and binary files, and create personalized chatbots from knowledge base files. Each supported platform has its own plugin system for accessing external resources.

GitHub stars: 2.2k

call-gpt

Call GPT is a voice application that uses Deepgram for speech-to-text, ElevenLabs for text-to-speech, and OpenAI for GPT prompt completion. It allows users to chat with ChatGPT on the phone, providing better transcription, understanding, and speaking capabilities than traditional IVR systems. The app returns responses with low latency, allows user interruptions, maintains chat history, and enables GPT to call external tools. It coordinates data flow between Deepgram, OpenAI, ElevenLabs, and Twilio Media Streams.

GitHub stars: 127

awesome-LLM-resourses

GitHub stars: 4.6k

tappas

Hailo TAPPAS is a set of full application examples that implement pipeline elements and pre-trained AI tasks. It demonstrates Hailo's system integration scenarios on predefined systems, aiming to accelerate time to market, simplify integration with Hailo's runtime SW stack, and provide a starting point for customers to fine-tune their applications. The tool supports both Hailo-15 and Hailo-8, offering various example applications optimized for different common hosts. TAPPAS includes pipelines for single network, two network, and multi-stream processing, as well as high-resolution processing via tiling. It also provides example use case pipelines like License Plate Recognition and Multi-Person Multi-Camera Tracking. The tool is regularly updated with new features, bug fixes, and platform support.

GitHub stars: 122

cloudflare-rag

This repository provides a full-stack example of building a Retrieval-Augmented Generation (RAG) app with Cloudflare. It uses Cloudflare Workers, Pages, D1, KV, R2, AI Gateway, and Workers AI. The app features streaming interactions to the UI, hybrid RAG with Full-Text Search and Vector Search, switchable providers using AI Gateway, per-IP rate limiting with Cloudflare's KV, OCR within Cloudflare Worker, and Smart Placement for workload optimization. The development setup requires Node, pnpm, and the wrangler CLI, along with setting up the necessary primitives and API keys. Deployment involves setting up secrets and deploying the app to Cloudflare Pages. The project implements a Hybrid Search RAG approach combining Full Text Search against D1 and Hybrid Search with embeddings against Vectorize to enhance context for the LLM.

GitHub stars: 93
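
One common way to combine a full-text result list with a vector-search result list is reciprocal rank fusion (RRF). The sketch below is an assumption about how such a merge could look, not a description of this repository's code: the function name, the constant `k=60`, and the document IDs are illustrative, and the project itself runs on Cloudflare Workers rather than Python.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked highly by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists from two hypothetical retrievers.
full_text_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([full_text_hits, vector_hits])
```

Documents found by both retrievers (`doc_a`, `doc_c`) end up ahead of those found by only one, and the fused list can then be truncated to supply context to the LLM.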

pixeltable

Pixeltable is a Python library designed for ML Engineers and Data Scientists to focus on exploration, modeling, and app development without the need to handle data plumbing. It provides a declarative interface for working with text, images, embeddings, and video, enabling users to store, transform, index, and iterate on data within a single table interface. Pixeltable is persistent, acting as a database unlike in-memory Python libraries such as Pandas. It offers features like data storage and versioning, combined data and model lineage, indexing, orchestration of multimodal workloads, incremental updates, and automatic production-ready code generation. The tool emphasizes transparency, reproducibility, cost-saving through incremental data changes, and seamless integration with existing Python code and libraries.

GitHub stars: 805

wave-apps

Wave Apps is a directory of sample applications built on H2O Wave, allowing users to build AI apps faster. The apps cover various use cases such as explainable hotel ratings, human-in-the-loop credit risk assessment, mitigating churn risk, online shopping recommendations, and sales forecasting EDA. Users can download, modify, and integrate these sample apps into their own projects to learn about app development and AI model deployment.

GitHub stars: 145
