llm-resource
A curated collection of high-quality full-stack LLM resources
Stars: 309
llm-resource is a comprehensive collection of high-quality resources for Large Language Models (LLMs). It covers various aspects of LLMs including algorithms, training, fine-tuning, alignment, inference, data engineering, compression, evaluation, prompt engineering, AI frameworks, AI basics, AI infrastructure, AI compilers, LLM application development, LLM operations, AI systems, and practical implementations. The repository aims to gather and share valuable LLM-related resources for the community to benefit from.
README:
A Curated Collection of High-Quality Full-Stack LLM Resources
Everyone is welcome to join in and help collect more high-quality resources related to large models.
- 🐼 LLM Algorithms
- 🐘 LLM Training
- 🔥 LLM Inference
- 🌴 LLM Data Engineering
- 📡 LLM Compression
- 🐰 LLM Evaluation
- 🐘 AI Fundamentals
- 📡 AI Infrastructure
- 🐘 AI Compilers
- 🐰 AI Frameworks
- 📡 LLM Application Development
- 🐘 LLMOps
- 📡 LLM in Practice
- 📡 WeChat Official Account Article Collection
Principles:
- Transformer Explained in Detail (the most complete illustrated version)
- OpenAI ChatGPT (Part 1): Understand the Transformer in Ten Minutes
- What does the Transformer architecture look like, and what does each submodule do?
- Parameter count, compute, intermediate activations, and KV cache analysis for Transformer-based large models
- Learn how the Transformer works by coding it by hand
- Why does multi-head attention in the Transformer (BERT) reduce the dimension of each head? (see the sketch below)
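The per-head dimensionality question above comes down to a budget: splitting d_model across n_heads heads (d_head = d_model / n_heads) keeps the total projection size, and thus the compute, roughly the same as a single full-width head. A minimal PyTorch sketch of multi-head attention illustrating this (tensor and function names are illustrative, not taken from any of the linked articles):

```python
import torch
import torch.nn.functional as F

def multi_head_attention(x, w_q, w_k, w_v, n_heads):
    """Minimal multi-head self-attention (no masking, no output projection)."""
    B, T, d_model = x.shape
    d_head = d_model // n_heads  # each head is reduced to d_model / n_heads dims
    q = (x @ w_q).view(B, T, n_heads, d_head).transpose(1, 2)  # (B, h, T, d_head)
    k = (x @ w_k).view(B, T, n_heads, d_head).transpose(1, 2)
    v = (x @ w_v).view(B, T, n_heads, d_head).transpose(1, 2)
    scores = q @ k.transpose(-2, -1) / d_head**0.5             # (B, h, T, T)
    attn = F.softmax(scores, dim=-1)
    out = attn @ v                                             # (B, h, T, d_head)
    return out.transpose(1, 2).reshape(B, T, d_model)          # concatenate heads

d_model, n_heads = 64, 8
x = torch.randn(2, 10, d_model)
w = [torch.randn(d_model, d_model) for _ in range(3)]
print(multi_head_attention(x, *w, n_heads).shape)  # torch.Size([2, 10, 64])
```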
Source code:
- OpenAI ChatGPT (Part 1): Implementing the Transformer in TensorFlow
- OpenAI ChatGPT (Part 1): Understand the Transformer in Ten Minutes
- GPT (Part 1): Transformer principles and code explained
- Transformer source code explained (PyTorch version)
- To understand the Transformer architecture, this PyTorch implementation is all you need
- GPT2 source code: https://github.com/huggingface/transformers/blob/main/src/transformers/models/gpt2/modeling_gpt2.py
- GPT2 source code walkthrough: https://zhuanlan.zhihu.com/p/630970209
- nanoGPT: https://github.com/karpathy/nanoGPT/blob/master/model.py
- 7.3 In-depth analysis of the GPT-2 model: http://121.199.45.168:13013/7_3.html
- GPT (Part 3): GPT-2 principles and code explained: https://zhuanlan.zhihu.com/p/637782385
- GPT-2 parameter count analysis: https://zhuanlan.zhihu.com/p/640501114 (see the sketch below)
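The parameter-count analysis above reduces to a short calculation. A sketch that reproduces GPT-2 small's well-known ~124M figure from its published hyperparameters, following the layer layout of the HF modeling_gpt2.py linked above (fused QKV c_attn, c_proj, a 4x MLP, two LayerNorms per block, tied LM head):

```python
def gpt2_param_count(n_layer=12, d_model=768, n_vocab=50257, n_ctx=1024):
    """Rough parameter count for a GPT-2-style decoder (tied LM head)."""
    embed = n_vocab * d_model + n_ctx * d_model          # token + position embeddings
    attn  = d_model * 3 * d_model + 3 * d_model          # fused QKV projection (c_attn)
    attn += d_model * d_model + d_model                  # output projection (c_proj)
    mlp   = d_model * 4 * d_model + 4 * d_model          # up projection
    mlp  += 4 * d_model * d_model + d_model              # down projection
    ln    = 2 * (2 * d_model)                            # two LayerNorms (scale + bias)
    return embed + n_layer * (attn + mlp + ln) + 2 * d_model  # + final LayerNorm

print(f"{gpt2_param_count() / 1e6:.1f}M params")         # ~124.4M, matching GPT-2 small
print(f"fp16 weights ≈ {2 * gpt2_param_count() / 1e9:.2f} GB")  # 2 bytes per parameter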
- Fine-tuning the Mixtral-8x7B MoE model in practice: surpassing Llama2-65B
- Distributed training parallelism for large models (Part 8): MoE parallelism
- The MoE model boom may lift domestic AI chips
- An overview of model merging methods for large models
- Mixture of Experts (MoE) Explained
- Pandemonium: MoE large models explained in detail
- LLM Mixture of Experts (MoE), Part 1: Fundamentals
- LLM Mixture of Experts (MoE), Part 2: Implementation (a minimal routing sketch follows below)
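As the implementation article above describes, the core of an MoE layer is a gate that routes each token to the top-k of n experts and mixes their outputs by the renormalized gate weights. A minimal single-device sketch (no load-balancing loss, no capacity limits; class and variable names are my own):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal mixture-of-experts layer with top-k gating (no load balancing)."""
    def __init__(self, d_model, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                           nn.Linear(4 * d_model, d_model)) for _ in range(n_experts)])
        self.gate = nn.Linear(d_model, n_experts)
        self.k = k

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.gate(x)                   # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalize over the chosen k
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = (idx == i)                   # which tokens chose expert i
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
        return out

moe = TopKMoE(d_model=32)
print(moe(torch.randn(5, 32)).shape)  # torch.Size([5, 32])
```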
- https://github.com/NExT-GPT/NExT-GPT
- https://next-gpt.github.io/
- Introduction to NExT-GPT: Any-to-Any Multimodal Large Language Model
- A Survey on Multimodal Large Language Models: https://arxiv.org/pdf/2306.13549
- Efficient-Multimodal-LLMs-Survey: https://github.com/lijiannuist/Efficient-Multimodal-LLMs-Survey
- Transformer Math 101: how to calculate GPU memory consumption?
- Megatron-LM paper 3 recap: Sequence Parallelism & Selective Checkpointing
- Learning rate (warmup, decay): see the scheduler sketch below
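The scheduler sketch promised above: linear warmup to the peak learning rate followed by cosine decay, expressed with PyTorch's stock LambdaLR (the step counts and peak rate are arbitrary placeholders):

```python
import math
import torch

model = torch.nn.Linear(10, 10)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

warmup_steps, total_steps = 100, 1000

def lr_lambda(step):
    if step < warmup_steps:                            # linear warmup from 0 to peak lr
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1 + math.cos(math.pi * progress))    # cosine decay toward 0

sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)

for step in range(total_steps):
    opt.step()          # (loss computation and backward() omitted in this sketch)
    sched.step()
```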
- Loading and running very large models with HuggingFace's Accelerate library: device_map, no_split_module_classes, offload_folder, offload_state_dict
- How Accelerate runs very large models on top of PyTorch
- Ultra-fast BLOOM model inference with DeepSpeed and Accelerate
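The Accelerate articles above revolve around a small set of APIs: build the model skeleton under init_empty_weights, then let load_checkpoint_and_dispatch place shards across GPUs, CPU RAM, and disk. A hedged sketch, assuming a locally downloaded sharded checkpoint (the model choice, path, and block class name are illustrative and depend on the model you use):

```python
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM

name = "bigscience/bloom-7b1"          # illustrative model choice
weights_dir = "path/to/bloom-7b1"      # folder holding the downloaded weight shards

config = AutoConfig.from_pretrained(name)
with init_empty_weights():             # build the skeleton without allocating weights
    model = AutoModelForCausalLM.from_config(config)

model = load_checkpoint_and_dispatch(
    model,
    weights_dir,
    device_map="auto",                       # let Accelerate place each module
    no_split_module_classes=["BloomBlock"],  # never split one block across devices
    offload_folder="offload",                # spill overflow weights to disk
)
```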
- A summary of seven LLM inference serving frameworks
- Why speculative sampling accelerates LLM inference
- Inference tricks for large models: speculative decoding (a simplified sketch follows the vLLM items below)
- https://github.com/flexflow/FlexFlow/tree/inference
- TensorRT-LLM (Part 3): Architecture
- NLP (Part 18): a survey of LLM inference optimization techniques: https://zhuanlan.zhihu.com/p/642412124
- Demystifying NVIDIA's LLM inference framework TensorRT-LLM: https://zhuanlan.zhihu.com/p/680808866
- How to generate text: using different decoding methods for language generation with Transformers
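The decoding-methods article above maps directly onto generate() flags in Transformers: greedy search by default, beam search via num_beams, and top-k / nucleus sampling via do_sample. A runnable sketch with GPT-2 (the prompt and hyperparameter values are arbitrary examples):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("The future of AI is", return_tensors="pt")

greedy = model.generate(**inputs, max_new_tokens=30)               # greedy search
beam   = model.generate(**inputs, max_new_tokens=30, num_beams=5)  # beam search
sample = model.generate(**inputs, max_new_tokens=30,               # nucleus sampling
                        do_sample=True, top_p=0.95, top_k=50, temperature=0.8)

for out in (greedy, beam, sample):
    print(tok.decode(out[0], skip_special_tokens=True))
```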
KV Cache:
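In autoregressive decoding, the keys and values of already-generated tokens never change, so they can be cached and only the newest token's projections computed each step; the cache grows as roughly 2 × n_layers × n_heads × d_head × seq_len × batch × bytes per value. A toy single-head sketch of the idea (my own illustration, not any framework's implementation):

```python
import torch
import torch.nn.functional as F

def attend(q, K, V):
    scores = q @ K.transpose(-2, -1) / K.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ V

d = 16
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
K_cache = torch.empty(0, d)   # grows by one row per generated token
V_cache = torch.empty(0, d)

for step in range(5):
    x = torch.randn(1, d)                     # embedding of the newest token only
    q = x @ w_q
    K_cache = torch.cat([K_cache, x @ w_k])   # reuse all previous K/V rows
    V_cache = torch.cat([V_cache, x @ w_v])
    y = attend(q, K_cache, V_cache)           # only O(1) new projections per step
    print(step, y.shape)                      # (1, d) each step
```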
Decoding optimization:
- vLLM (Part 6): source code walkthrough, part 2 @HelloWorld
- 猛猿: Illustrated LLM inference acceleration series: vLLM source code analysis 1, overall architecture
- LLM inference 2: studying the vLLM source code @akaihaoshuai
- The vLLM inference framework source code explained (Part 1): framework overview
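The simplified speculative-decoding sketch promised above: a cheap draft model proposes k tokens autoregressively, the target model scores the whole proposal in one forward pass, and the longest agreeing prefix is accepted plus one corrected token. This greedy-verification variant skips the rejection-sampling step of the published algorithm, and both "models" here are toy stand-ins:

```python
import torch

torch.manual_seed(0)
V = 100  # toy vocabulary size

def toy_lm(weights):
    """A stand-in 'language model': greedy logits for every position of a prefix."""
    def logits(seq):  # seq: (T,) token ids -> (T, V)
        return torch.nn.functional.one_hot(seq, V).float() @ weights
    return logits

target = toy_lm(torch.randn(V, V))   # pretend: slow, accurate model
draft  = toy_lm(torch.randn(V, V))   # pretend: fast, cheap model

def speculative_step(seq, k=4):
    proposal = seq.clone()
    for _ in range(k):                            # cheap autoregressive drafting
        nxt = draft(proposal)[-1].argmax()
        proposal = torch.cat([proposal, nxt[None]])
    verified = target(proposal)[:-1].argmax(-1)   # ONE target forward pass
    n = seq.numel()
    accepted = 0
    for i in range(k):                            # accept the agreeing prefix
        if proposal[n + i] != verified[n + i - 1]:
            break
        accepted += 1
    # keep accepted draft tokens, then one corrected token from the target
    return torch.cat([proposal[:n + accepted], verified[n + accepted - 1][None]])

seq = torch.tensor([1, 2, 3])
for _ in range(3):
    seq = speculative_step(seq)
print(seq.tolist())
```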
- Awesome Model Quantization
- Efficient-LLMs-Survey
- Awesome LLM Compression
- A roundup of model conversion, compression, and acceleration tools
- AI framework deployment: model conversion
- Converting PyTorch models to TensorRT (torch2trt tutorial)
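Quantization is the compression technique the lists above return to most often. As one concrete, self-contained example, PyTorch's dynamic quantization converts Linear weights to int8 at load time with a one-liner (CPU-oriented; a toy model stands in for a real LLM here):

```python
import os
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))

# Quantize Linear weights to int8; activations stay fp32 and are quantized on the fly.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 256)
print(qmodel(x).shape)                   # same interface, smaller weights

def size_mb(m):
    torch.save(m.state_dict(), "tmp.pt")
    s = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return s

print(f"fp32: {size_mb(model):.2f} MB, int8: {size_mb(qmodel):.2f} MB")
```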
- CLiB leaderboard of Chinese LLM capabilities
- huggingface Open LLM Leaderboard
- HELM:https://github.com/stanford-crfm/helm
- HELM:https://crfm.stanford.edu/helm/latest/
- lm-evaluation-harness:https://github.com/EleutherAI/lm-evaluation-harness/
- CLEVA:http://www.lavicleva.com/#/homepage/overview
- CLEVA:https://github.com/LaVi-Lab/CLEVA/blob/main/README_zh-CN.md
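Alongside leaderboard harnesses like lm-evaluation-harness above, the most basic intrinsic metric is perplexity: the exponential of the mean next-token cross-entropy. A minimal sketch with GPT-2 (the sample sentence is arbitrary):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "Large language models are evaluated with perplexity, among other metrics."
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    # labels=ids makes the model return mean cross-entropy over next-token targets
    loss = model(ids, labels=ids).loss

print(f"perplexity = {torch.exp(loss).item():.1f}")
```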
- A key step in building data: how do you write a good prompt?
- 10 prompt engineering methods distilled from 1000+ templates to make you a prompting master!
- The principles and history of prompt engineering in one article
- Effective Prompt: 14 methods for writing high-quality prompts
- Prompt engineering and prompt construction techniques
- An introduction to prompt attacks!
- The road to AGI: the essentials of large language model (LLM) technology
- Emergent abilities of large language models: phenomena and explanations
- NLP (Part 18): a survey of LLM inference optimization techniques
- Parallel computing 3: parallel computing models
- LLM "hallucination": everything you need in one article | by HIT and Huawei
safetensors:
- What is the difference between .bin and .safetensors files?
- Safetensors: a new format for saving model weights
- github: safetensors
- huggingface: safetensors
- Safetensors: a simple, safe and faster way to store and distribute tensors.
- https://huggingface.co/docs/safetensors/index
- https://github.com/huggingface/safetensors/tree/v0.3.3
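The core of the safetensors format, per the links above: a flat, zero-copy tensor dump with no pickled Python objects, so loading files from untrusted sources is safe. A minimal round trip:

```python
import torch
from safetensors.torch import save_file, load_file

tensors = {"weight": torch.randn(4, 4), "bias": torch.zeros(4)}
save_file(tensors, "model.safetensors")      # plain tensor dump, no pickled code

loaded = load_file("model.safetensors")      # safe to load from untrusted sources
print(loaded["weight"].shape)                # torch.Size([4, 4])
```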
- Step by step: converting Llama2 raw weights to an HF model
- PyTorch source code reading series @ OpenMMLab team
- [Source code analysis] PyTorch distributed @ 罗西的思考
- PyTorch distributed (18): distributed pipeline parallelism with RPC @ 罗西的思考
- [PyTorch] model.train() and model.eval(): how they work and how to use them
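What train()/eval() actually toggles, per the article above, is module behavior such as dropout and batch-norm statistics, not gradient tracking. A tiny sketch of the difference (and of the common gotcha that eval() does not replace torch.no_grad()):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
x = torch.ones(1, 8)

model.train()                             # dropout active: outputs vary per call
print(torch.equal(model(x), model(x)))    # almost always False

model.eval()                              # dropout disabled: deterministic outputs
with torch.no_grad():                     # eval() does NOT disable autograd itself
    print(torch.equal(model(x), model(x)))  # True
```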
- Recent changes in Megatron-LM
- Understanding Megatron-LM in depth (1): fundamentals @ 简枫
- Understanding Megatron-LM in depth (2): how it works
- [Source code analysis] Model-parallel distributed training with Megatron (1): the paper & fundamentals @ 罗西的思考
- [Source code analysis] Model-parallel distributed training with Megatron (2): overall architecture
- [Reading the classics] Megatron paper and code in detail (1) @迷途小书僮
- [Reading the classics] Megatron paper and code in detail (2)
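The Megatron papers analyzed above parallelize the MLP by splitting the first weight matrix by columns and the second by rows, so each device computes independently until a single all-reduce. A single-process sketch that checks the algebra, with chunked tensors standing in for devices:

```python
import torch

torch.manual_seed(0)
x = torch.randn(2, 8)
A = torch.randn(8, 16)   # first MLP weight, split by COLUMNS across 2 "GPUs"
B = torch.randn(16, 8)   # second MLP weight, split by ROWS

A1, A2 = A.chunk(2, dim=1)   # each device holds half the columns of A
B1, B2 = B.chunk(2, dim=0)   # and the matching rows of B

# Each device computes its shard independently; one all-reduce (here: +) at the end.
y_parallel = torch.relu(x @ A1) @ B1 + torch.relu(x @ A2) @ B2
y_serial   = torch.relu(x @ A) @ B

print(torch.allclose(y_parallel, y_serial, atol=1e-5))  # True
```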
- A brief look at industry AI accelerator chips (1): Baidu Kunlun
- NVIDIA CUDA-X AI: https://www.nvidia.cn/technologies/cuda-x/
- Intel, Nvidia, and AMD: the three giants' battle over GPUs and CPUs
- Processors and AI chips: Google TPU: https://zhuanlan.zhihu.com/p/646793355
- A guide to China's domestic AI chip players
- In depth: the landscape of domestic AI chips
- Learning LLM application development hands-on
- langchain java
- An introduction to RAG, the mainstream LLM application pattern: from architecture to technical details
- Retrieval-based large language models and applications (Danqi Chen)
- Thoughts on fixing bad cases in large models
- "A Survey of LLM-driven Agents": a 45,000-character detailed read of the latest agent survey from Fudan NLP and miHoYo
- MLOps Landscape in 2023: Top Tools and Platforms
- What Constitutes A Large Language Model Application?: LLM Functionality Landscape
- AI System @吃果冻不吐果冻皮
- Large Language Models: Principles and Engineering @杨青
- Large Language Models: From Theory to Practice @张奇: https://intro-llm.github.io/
- Learning large models hands-on
- minGPT @karpathy
- llm.c @karpathy: LLM training in simple, raw C/CUDA
- LLM101n
- llama2.c: Inference Llama 2 in one file of pure C
- nanoGPT
- Baby-Llama2-Chinese
- Building a MiniLLM from 0 to 1
- gpt-fast, blog
- Awesome-Chinese-LLM
- Awesome-LLM-Survey
- Large Language Model Course
- Awesome-Quantization-Papers
- Awesome Model Quantization (GitHub)
- Awesome Transformer Attention (GitHub)
- A survey of data selection for language models
- Awesome Knowledge Distillation of LLM Papers
- Awasome-Pruning @ghimiredhikura
- Awesome-Pruning @he-y
- awesome-pruning @hrcheng1066
- Awesome-LLM-Inference
Similar Open Source Tools
LLMLanding
LLMLanding is a repository focused on practical implementation of large models, covering topics from theory to practice. It provides a structured learning path for training large models, including specific tasks like training 1B-scale models, exploring SFT, and working on specialized tasks such as code generation, NLP tasks, and domain-specific fine-tuning. The repository emphasizes a dual learning approach: quickly applying existing tools for immediate output benefits and delving into foundational concepts for long-term understanding. It offers detailed resources and pathways for in-depth learning based on individual preferences and goals, combining theory with practical application to avoid overwhelm and ensure sustained learning progress.
how-to-optim-algorithm-in-cuda
This repository documents how to optimize common algorithms based on CUDA. It includes subdirectories with code implementations for specific optimizations. The optimizations cover topics such as compiling PyTorch from source, NVIDIA's reduce optimization, OneFlow's elementwise template, fast atomic add for half data types, upsample nearest2d optimization in OneFlow, optimized indexing in PyTorch, OneFlow's softmax kernel, linear attention optimization, and more. The repository also includes learning resources related to deep learning frameworks, compilers, and optimization techniques.
awesome-chatgpt-zh
The Awesome ChatGPT Chinese Guide project aims to help Chinese users understand and use ChatGPT. It collects various free and paid ChatGPT resources, as well as methods to communicate more effectively with ChatGPT in Chinese. The repository contains a rich collection of ChatGPT tools, applications, and examples.
AHU-AI-Repository
This repository is dedicated to the learning and exchange of resources for the School of Artificial Intelligence at Anhui University. Notes will be published on this website first: https://www.aoaoaoao.cn and will be synchronized to the repository regularly. You can also contact me at [email protected].
Long-Novel-GPT
Long-Novel-GPT is a long novel generator based on large language models like GPT. It utilizes a hierarchical outline/chapter/text structure to maintain the coherence of long novels. It optimizes API calls cost through context management and continuously improves based on self or user feedback until reaching the set goal. The tool aims to continuously refine and build novel content based on user-provided initial ideas, ultimately generating long novels at the level of human writers.
higress
Higress is an open-source cloud-native API gateway built on the core of Istio and Envoy, based on Alibaba's internal practice of Envoy Gateway. It is designed for AI-native API gateway, serving AI businesses such as Tongyi Qianwen APP, Bailian Big Model API, and Machine Learning PAI platform. Higress provides capabilities to interface with LLM model vendors, AI observability, multi-model load balancing/fallback, AI token flow control, and AI caching. It offers features for AI gateway, Kubernetes Ingress gateway, microservices gateway, and security protection gateway, with advantages in production-level scalability, stream processing, extensibility, and ease of use.
AI-Drug-Discovery-Design
AI-Drug-Discovery-Design is a repository focused on Artificial Intelligence-assisted Drug Discovery and Design. It explores the use of AI technology to accelerate and optimize the drug development process. The advantages of AI in drug design include speeding up research cycles, improving accuracy through data-driven models, reducing costs by minimizing experimental redundancies, and enabling personalized drug design for specific patients or disease characteristics.
godoos
GodoOS is an efficient intranet office operating system that includes various office tools such as word/excel/ppt/pdf/internal chat/whiteboard/mind map, with native file storage support. The platform interface mimics the Windows style, making it easy to operate while maintaining low resource consumption and high performance. It automatically connects to intranet users without registration, enabling instant communication and file sharing. The flexible and highly configurable app store allows for unlimited expansion.
llm-apps-java-spring-ai
The 'LLM Applications with Java and Spring AI' repository provides samples demonstrating how to build Java applications powered by Generative AI and Large Language Models (LLMs) using Spring AI. It includes projects for question answering, chat completion models, prompts, templates, multimodality, output converters, embedding models, document ETL pipeline, function calling, image models, and audio models. The repository also lists prerequisites such as Java 21, Docker/Podman, Mistral AI API Key, OpenAI API Key, and Ollama. Users can explore various use cases and projects to leverage LLMs for text generation, vector transformation, document processing, and more.
chatwiki
ChatWiki is an open-source knowledge base AI question-answering system. It is built on large language models (LLM) and retrieval-augmented generation (RAG) technologies, providing out-of-the-box data processing, model invocation capabilities, and helping enterprises quickly build their own knowledge base AI question-answering systems. It offers exclusive AI question-answering system, easy integration of models, data preprocessing, simple user interface design, and adaptability to different business scenarios.
Korea-Startups
Korea-Startups is a repository containing a comprehensive list of major tech companies and startups in Korea. It covers a wide range of industries such as mobility, local community trading, food tech, interior design, fintech, AI, natural language processing, computer vision, robotics, legal tech, and more. The repository provides detailed information about each company's field, key services, and unique features, showcasing the diverse and innovative startup ecosystem in Korea.
awesome-llm-plaza
Awesome LLM plaza is a curated list of awesome LLM papers, projects, and resources. It is updated daily and includes resources from a variety of sources, including huggingface daily papers, twitter, github trending, paper with code, weixin, etc.
llm_note
LLM notes repository contains detailed analysis on transformer models, language model compression, inference and deployment, high-performance computing, and system optimization methods. It includes discussions on various algorithms, frameworks, and performance analysis related to large language models and high-performance computing. The repository serves as a comprehensive resource for understanding and optimizing language models and computing systems.
AIMedia
AIMedia is a fully automated AI media software that automatically fetches hot news, generates news, and publishes on various platforms. It supports hot news fetching from platforms like Douyin, NetEase News, Weibo, The Paper, China Daily, and Sohu News. Additionally, it enables AI-generated images for text-only news to enhance originality and reading experience. The tool is currently commercialized with plans to support video auto-generation for platform publishing in the future. It requires a minimum CPU of 4 cores or above, 8GB RAM, and supports Windows 10 or above. Users can deploy the tool by cloning the repository, modifying the configuration file, creating a virtual environment using Conda, and starting the web interface. Feedback and suggestions can be submitted through issues or pull requests.
MarkMap-OpenAi-ChatGpt
MarkMap-OpenAi-ChatGpt is a Vue.js-based mind map generation tool that allows users to generate mind maps by entering titles or content. The application integrates the markmap-lib and markmap-view libraries, supports visualizing mind maps, and provides functions for zooming and adapting the map to the screen. Users can also export the generated mind map in PNG, SVG, JPEG, and other formats. This project is suitable for quickly organizing ideas, study notes, project planning, etc. By simply entering content, users can get an intuitive mind map that can be continuously expanded, downloaded, and shared.
For similar tasks
aimet
AIMET is a library that provides advanced model quantization and compression techniques for trained neural network models. It provides features that have been proven to improve run-time performance of deep learning neural network models with lower compute and memory requirements and minimal impact to task accuracy. AIMET is designed to work with PyTorch, TensorFlow and ONNX models. We also host the AIMET Model Zoo - a collection of popular neural network models optimized for 8-bit inference. We also provide recipes for users to quantize floating point models using AIMET.
hqq
HQQ is a fast and accurate model quantizer that skips the need for calibration data. It's super simple to implement (just a few lines of code for the optimizer). It can crunch through quantizing the Llama2-70B model in only 4 minutes! 🚀
llmc
llmc is an off-the-shelf tool designed for compressing LLMs, leveraging state-of-the-art compression algorithms to enhance efficiency and reduce model size without compromising performance. It provides users with the ability to quantize LLMs, choose from various compression algorithms, export transformed models for further optimization, and directly infer compressed models with a shallow memory footprint. The tool supports a range of model types and quantization algorithms, with ongoing development to include pruning techniques. Users can design their configurations for quantization and evaluation, with documentation and examples planned for future updates. llmc is a valuable resource for researchers working on post-training quantization of large language models.
Awesome-Efficient-LLM
Awesome-Efficient-LLM is a curated list focusing on efficient large language models. It includes topics such as knowledge distillation, network pruning, quantization, inference acceleration, efficient MOE, efficient architecture of LLM, KV cache compression, text compression, low-rank decomposition, hardware/system, tuning, and survey. The repository provides a collection of papers and projects related to improving the efficiency of large language models through various techniques like sparsity, quantization, and compression.
TensorRT-Model-Optimizer
The NVIDIA TensorRT Model Optimizer is a library designed to quantize and compress deep learning models for optimized inference on GPUs. It offers state-of-the-art model optimization techniques including quantization and sparsity to reduce inference costs for generative AI models. Users can easily stack different optimization techniques to produce quantized checkpoints from torch or ONNX models. The quantized checkpoints are ready for deployment in inference frameworks like TensorRT-LLM or TensorRT, with planned integrations for NVIDIA NeMo and Megatron-LM. The tool also supports 8-bit quantization with Stable Diffusion for enterprise users on NVIDIA NIM. Model Optimizer is available for free on NVIDIA PyPI, and this repository serves as a platform for sharing examples, GPU-optimized recipes, and collecting community feedback.
Awesome_LLM_System-PaperList
Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of papers on LLM inference and serving.
llm-compressor
llm-compressor is an easy-to-use library for optimizing models for deployment with vllm. It provides a comprehensive set of quantization algorithms, seamless integration with Hugging Face models and repositories, and supports mixed precision, activation quantization, and sparsity. Supported algorithms include PTQ, GPTQ, SmoothQuant, and SparseGPT. Installation can be done via git clone and local pip install. Compression can be easily applied by selecting an algorithm and calling the oneshot API. The library also offers end-to-end examples for model compression. Contributions to the code, examples, integrations, and documentation are appreciated.
For similar jobs
LitServe
LitServe is a high-throughput serving engine designed for deploying AI models at scale. It generates an API endpoint for models, handles batching, streaming, and autoscaling across CPU/GPUs. LitServe is built for enterprise scale with a focus on minimal, hackable code-base without bloat. It supports various model types like LLMs, vision, time-series, and works with frameworks like PyTorch, JAX, Tensorflow, and more. The tool allows users to focus on model performance rather than serving boilerplate, providing full control and flexibility.
how-to-optim-algorithm-in-cuda
This repository documents how to optimize common algorithms based on CUDA. It includes subdirectories with code implementations for specific optimizations. The optimizations cover topics such as compiling PyTorch from source, NVIDIA's reduce optimization, OneFlow's elementwise template, fast atomic add for half data types, upsample nearest2d optimization in OneFlow, optimized indexing in PyTorch, OneFlow's softmax kernel, linear attention optimization, and more. The repository also includes learning resources related to deep learning frameworks, compilers, and optimization techniques.
aiac
AIAC is a library and command line tool to generate Infrastructure as Code (IaC) templates, configurations, utilities, queries, and more via LLM providers such as OpenAI, Amazon Bedrock, and Ollama. Users can define multiple 'backends' targeting different LLM providers and environments using a simple configuration file. The tool allows users to ask a model to generate templates for different scenarios and composes an appropriate request to the selected provider, storing the resulting code to a file and/or printing it to standard output.
ENOVA
ENOVA is an open-source service for Large Language Model (LLM) deployment, monitoring, injection, and auto-scaling. It addresses challenges in deploying stable serverless LLM services on GPU clusters with auto-scaling by deconstructing the LLM service execution process and providing configuration recommendations and performance detection. Users can build and deploy LLM with few command lines, recommend optimal computing resources, experience LLM performance, observe operating status, achieve load balancing, and more. ENOVA ensures stable operation, cost-effectiveness, efficiency, and strong scalability of LLM services.
jina
Jina is a tool that allows users to build multimodal AI services and pipelines using cloud-native technologies. It provides a Pythonic experience for serving ML models and transitioning from local deployment to advanced orchestration frameworks like Docker-Compose, Kubernetes, or Jina AI Cloud. Users can build and serve models for any data type and deep learning framework, design high-performance services with easy scaling, serve LLM models while streaming their output, integrate with Docker containers via Executor Hub, and host on CPU/GPU using Jina AI Cloud. Jina also offers advanced orchestration and scaling capabilities, a smooth transition to the cloud, and easy scalability and concurrency features for applications. Users can deploy to their own cloud or system with Kubernetes and Docker Compose integration, and even deploy to JCloud for autoscaling and monitoring.
vidur
Vidur is a high-fidelity and extensible LLM inference simulator designed for capacity planning, deployment configuration optimization, testing new research ideas, and studying system performance of models under different workloads and configurations. It supports various models and devices, offers chrome trace exports, and can be set up using mamba, venv, or conda. Users can run the simulator with various parameters and monitor metrics using wandb. Contributions are welcome, subject to a Contributor License Agreement and adherence to the Microsoft Open Source Code of Conduct.
AI-System-School
AI System School is a curated list of research in machine learning systems, focusing on ML/DL infra, LLM infra, domain-specific infra, ML/LLM conferences, and general resources. It provides resources such as data processing, training systems, video systems, autoML systems, and more. The repository aims to help users navigate the landscape of AI systems and machine learning infrastructure, offering insights into conferences, surveys, books, videos, courses, and blogs related to the field.