LightLLM



LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance. LightLLM harnesses the strengths of numerous well-regarded open-source implementations, including but not limited to FasterTransformer, TGI, vLLM, and FlashAttention.
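As a rough illustration of how a served model is typically queried, the sketch below builds a request for an HTTP text-generation endpoint and posts it to a locally running server. The endpoint path (`/generate`) and payload field names (`inputs`, `parameters`, `generated_text`) are assumptions based on common LightLLM deployments, not verified against the current API; check the official docs before relying on them.

```python
import json
from urllib import request


def build_generate_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build a JSON payload for a LightLLM-style /generate endpoint.

    Field names are assumed from typical LightLLM usage and may differ
    across versions.
    """
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "do_sample": False},
    }


def generate(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """POST a prompt to a running LightLLM server and return the text."""
    data = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = request.Request(
        f"{base_url}/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

With a server started (for example via `python -m lightllm.server.api_server --model_dir <path>`, flags per your installed version), `generate("Hello")` would return the model's completion.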

English Docs | 中文文档 (Chinese Docs) | Blogs

News

  • [2025/09] 🔥 LightLLM v1.1.0 release!
  • [2025/08] Pre$^3$ received the Outstanding Paper Award at ACL 2025.
  • [2025/05] LightLLM paper on constrained decoding accepted at ACL 2025 (Pre$^3$: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation). For a more accessible overview of the research with key insights and examples, check out our blog post: LightLLM Blog
  • [2025/04] LightLLM paper on request scheduler published in ASPLOS’25 (Past-Future Scheduler for LLM Serving under SLA Guarantees)
  • [2025/02] 🔥 LightLLM v1.0.0 release, achieving the fastest DeepSeek-R1 serving performance on a single H200 machine.

Get started

Performance

Learn more in the release blogs: v1.0.0 blog.

FAQ

Please refer to the FAQ for more information.

Projects using LightLLM

We welcome any cooperation and contribution. If your project requires LightLLM's support, please contact us via email or open a pull request.

Projects based on LightLLM or referenced LightLLM components:

Also, LightLLM's pure-Python design and token-level KV cache management make it easy to use as the basis for research projects.
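To give a feel for what token-level KV cache management means, here is a toy allocator that reserves one cache slot per generated token and recycles slots when a request finishes. This is a conceptual sketch only, not LightLLM's actual implementation; the class and method names are made up for illustration.

```python
class TokenKVCacheManager:
    """Toy token-level KV cache allocator (illustrative, not LightLLM's code).

    Each token occupies exactly one cache slot, so memory is reclaimed
    at token granularity rather than in large per-request blocks.
    """

    def __init__(self, total_slots: int):
        self.free_slots = list(range(total_slots))
        self.request_slots: dict[str, list[int]] = {}

    def allocate(self, request_id: str, num_tokens: int) -> list[int]:
        """Reserve one slot per token; raise if the cache is exhausted."""
        if num_tokens > len(self.free_slots):
            raise MemoryError("KV cache exhausted")
        slots = [self.free_slots.pop() for _ in range(num_tokens)]
        self.request_slots.setdefault(request_id, []).extend(slots)
        return slots

    def free(self, request_id: str) -> None:
        """Return all of a finished request's slots to the free pool."""
        self.free_slots.extend(self.request_slots.pop(request_id, []))
```

Because allocation is per token, a scheduler built on top of such a manager can admit new requests as soon as enough individual slots are free, which is one reason this design is convenient for serving research.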

Academic works based on, or using parts of, LightLLM:

Community

For further information and discussion, join our Discord server. We welcome you as a member and look forward to your contributions!

License

This repository is released under the Apache-2.0 license.

Acknowledgement

We learned a lot from the following projects when developing LightLLM.

Citation

We have published a number of papers on components and features of LightLLM. If you use LightLLM in your work, please consider citing the relevant paper.

Constrained decoding: accepted at ACL 2025 and awarded the Outstanding Paper Award.

@inproceedings{anonymous2025pre,
  title={Pre$^3$: Enabling Deterministic Pushdown Automata for Faster Structured {LLM} Generation},
  author={Anonymous},
  booktitle={Submitted to ACL Rolling Review - February 2025},
  year={2025},
  url={https://openreview.net/forum?id=g1aBeiyZEi},
  note={under review}
}

Request scheduler: accepted at ASPLOS '25:

@inproceedings{gong2025past,
  title={Past-Future Scheduler for LLM Serving under SLA Guarantees},
  author={Gong, Ruihao and Bai, Shihao and Wu, Siyu and Fan, Yunqian and Wang, Zaijun and Li, Xiuhong and Yang, Hailong and Liu, Xianglong},
  booktitle={Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
  pages={798--813},
  year={2025}
}
