NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.

Stars: 156


NanoLLM is a library for optimized local inference of large language models (LLMs) through HuggingFace-like APIs. It supports quantization, vision/language models, multimodal agents, speech, vector databases, and retrieval-augmented generation (RAG), with a focus on efficient on-device performance for AI applications.

README:

> [!NOTE]
> See dusty-nv.github.io/NanoLLM for docs and Jetson AI Lab for tutorials.

Latest Release: 24.7 (dustynv/nano_llm:24.7-r36.2.0)
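As a sketch of what the HuggingFace-like API looks like in practice, the snippet below loads a model and streams a completion. The model name, backend, and quantization values here are illustrative assumptions rather than settings confirmed by this listing, and running it requires the nano_llm container on a supported Jetson device.

```python
# Illustrative sketch of NanoLLM's HuggingFace-style load/generate API.
# Model id, api backend, and quantization mode are assumptions for the example;
# this requires the nano_llm package and compatible NVIDIA Jetson hardware.
from nano_llm import NanoLLM

model = NanoLLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # hypothetical HF model id
    api="mlc",                   # inference backend
    quantization="q4f16_ft",     # 4-bit weights, fp16 activations
)

# generate() can stream tokens as they are produced
for token in model.generate("Once upon a time,",
                            max_new_tokens=64,
                            streaming=True):
    print(token, end="", flush=True)
```

The `from_pretrained`/`generate` flow mirrors the transformers library, which is what makes the API familiar to HuggingFace users while the quantized backend handles the on-device optimization.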
