Awesome-local-LLM

A curated list of awesome platforms, tools, practices and resources that help run LLMs locally

Stars: 285

Awesome-local-LLM is a curated list of platforms, tools, practices, and resources that help run Large Language Models (LLMs) locally. It includes sections on inference platforms, inference engines, user interfaces, and specific models for general-purpose, coding, vision, audio, and miscellaneous tasks. The repository also covers tools for coding agents, agent frameworks, retrieval-augmented generation, computer use, browser automation, memory management, testing, evaluation, research, training, and fine-tuning. Additionally, there are tutorials on models, prompt engineering, context engineering, inference, agents, retrieval-augmented generation, and miscellaneous topics, along with a section on communities for LLM enthusiasts.

README:

Awesome local LLM

A curated list of awesome platforms, tools, practices and resources that help run LLMs locally

Inference platforms
Inference engines
User Interfaces
Large Language Models
Tools
Hardware
Tutorials
Communities

Inference platforms

LM Studio - discover, download and run local LLMs
jan - an open source alternative to ChatGPT that runs 100% offline on your computer
ChatBox - user-friendly desktop client app for AI models/LLMs
LocalAI - the free, open-source alternative to OpenAI, Claude and others
lemonade - a local LLM server with GPU and NPU Acceleration

Inference engines

ollama - get up and running with LLMs
llama.cpp - LLM inference in C/C++
vllm - a high-throughput and memory-efficient inference and serving engine for LLMs
exo - run your own AI cluster at home with everyday devices
sglang - a fast serving framework for large language models and vision language models
koboldcpp - run GGUF models easily with a KoboldAI UI
Nano-vLLM - a lightweight vLLM implementation built from scratch
gpustack - simple, scalable AI model deployment on GPU clusters
distributed-llama - connect home devices into a powerful cluster to accelerate LLM inference
mlx-lm - generate text and fine-tune large language models on Apple silicon with MLX
ik_llama.cpp - llama.cpp fork with additional SOTA quants and improved performance
vllm-gfx906 - vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
FastFlowLM - run LLMs on AMD Ryzen™ AI NPUs
llm-scaler - run LLMs on Intel Arc™ Pro B60 GPUs

User Interfaces

Open WebUI - User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Lobe Chat - an open-source, modern design AI chat framework
Text generation web UI - LLM UI with advanced features, easy setup, and multiple backend support
SillyTavern - LLM Frontend for Power Users
Page Assist - Use your locally running AI models to assist you in your web browsing

Large Language Models

Explorers, Benchmarks, Leaderboards

AI Models & API Providers Analysis - understand the AI landscape to choose the best model and provider for your use case
LLM Explorer - explore a list of open-source LLMs
Dubesor LLM Benchmark table - small-scale manual performance comparison benchmark
oobabooga benchmark - a list sorted by size (on disk) for each score

Model providers

Qwen - powered by Alibaba Cloud
Mistral AI - a pioneering French artificial intelligence startup
Tencent - a profile of a Chinese multinational technology conglomerate and holding company
Unsloth AI - focusing on making AI more accessible to everyone (GGUFs etc.)
bartowski - providing GGUF versions of popular LLMs
Beijing Academy of Artificial Intelligence - a private non-profit organization engaged in AI research and development
Open Thoughts - a team of researchers and engineers curating the best open reasoning datasets

Specific models

General purpose

Qwen3-Next - a collection of the latest generation Qwen LLMs
Gemma 3 - a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models
gpt-oss - a collection of open-weight models from OpenAI, designed for powerful reasoning, agentic tasks, and versatile developer use cases
Mistral-Small-3.2-24B-Instruct-2506 - a versatile model designed to handle a wide range of generative AI tasks, including instruction following, conversational assistance, image understanding, and function calling
Magistral-Small-2509 - a model built on Mistral Small 3.2 (2506) with added reasoning capabilities
GLM-4.5 - a collection of hybrid reasoning models designed for intelligent agents
Hunyuan - a collection of Tencent's open-source efficient LLMs designed for versatile deployment across diverse computational environments
Phi-4-mini-instruct - a lightweight open model built upon synthetic data and filtered publicly available websites
NVIDIA Nemotron - a collection of open, production-ready enterprise models trained from scratch by NVIDIA
Llama Nemotron - a collection of open, production-ready enterprise models from NVIDIA
OpenReasoning-Nemotron - a collection of models from NVIDIA, trained on 5M reasoning traces for math, code and science
Granite 3.3 - a collection of LLMs from IBM, fine-tuned for improved reasoning and instruction-following capabilities
EXAONE-4.0 - a collection of LLMs from LG AI Research, integrating non-reasoning and reasoning modes
ERNIE 4.5 - a collection of large-scale multimodal models from Baidu
Seed-OSS - a collection of LLMs developed by ByteDance's Seed Team, designed for powerful long-context, reasoning, agent and general capabilities, and versatile developer-friendly features

Coding

Qwen3-Coder - a collection of Qwen's most agentic code models to date
Devstral-Small-2507 - an agentic LLM for software engineering tasks fine-tuned from Mistral-Small-3.1
Mellum-4b-base - an LLM from JetBrains, optimized for code-related tasks
OlympicCoder-32B - a code model that achieves very strong performance on competitive coding benchmarks such as LiveCodeBench and the 2024 International Olympiad in Informatics
NextCoder - a family of code-editing LLMs developed using the Qwen2.5-Coder Instruct variants as base

Multimodal

Qwen3-Omni - a collection of natively end-to-end multilingual omni-modal foundation models from Qwen

Image

Qwen-Image - an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing
Qwen-Image-Edit-2509 - the image editing version of Qwen-Image extending the base model's unique text rendering capabilities to image editing tasks, enabling precise text editing
GLM-4.5V - a VLM based on ZhipuAI’s next-generation flagship text foundation model GLM-4.5-Air
HunyuanImage-2.1 - an efficient diffusion model for high-resolution (2K) text-to-image generation
FastVLM - a collection of VLMs with efficient vision encoding from Apple
MiniCPM-V-4_5 - a GPT-4o-level MLLM for single-image, multi-image and high-FPS video understanding on your phone
LFM2-VL - a collection of vision-language models designed for on-device deployment
ClipTagger-12b - a vision-language model (VLM) designed for video understanding at massive scale

Audio

Voxtral-Small-24B-2507 - an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance
chatterbox - first production-grade open-source TTS model
VibeVoice - a collection of frontier text-to-speech models from Microsoft
canary-1b-v2 - a multitask speech transcription and translation model from NVIDIA
parakeet-tdt-0.6b-v3 - a multilingual speech-to-text model from NVIDIA
Kitten TTS - a collection of open-source realistic text-to-speech models designed for lightweight deployment and high-quality voice synthesis

Miscellaneous

Jan-v1-4B - the first release in the Jan Family, designed for agentic reasoning and problem-solving within the Jan App
Jan-nano - a compact 4-billion parameter language model specifically designed and trained for deep research tasks
Jan-nano-128k - an enhanced version of Jan-nano featuring a native 128k context window that enables deeper, more comprehensive research without the performance degradation typically associated with context extension methods
Arch-Router-1.5B - the fastest LLM router model that aligns to subjective usage preferences
HunyuanWorld-1 - an open-source 3D world generation model
Hunyuan-GameCraft-1.0 - a novel framework for high-dynamic interactive video generation in game environments

Tools

Models

unsloth - fine-tuning & reinforcement learning for LLMs

Agent Frameworks

AutoGPT - a powerful platform that allows you to create, deploy, and manage continuous AI agents that automate complex workflows
langchain - build context-aware reasoning applications
langflow - a powerful tool for building and deploying AI-powered agents and workflows
autogen - a programming framework for agentic AI
anything-llm - the all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more
llama_index - the leading framework for building LLM-powered agents over your data
Flowise - build AI agents, visually
crewAI - a framework for orchestrating role-playing, autonomous AI agents
agno - a full-stack framework for building Multi-Agent Systems with memory, knowledge and reasoning
SuperAGI - an open-source framework to build, manage and run useful Autonomous AI Agents
camel - the first and the best multi-agent framework
openai-agents-python - a lightweight, powerful framework for multi-agent workflows
pydantic-ai - a Python agent framework designed to help you quickly, confidently, and painlessly build production grade applications and workflows with Generative AI
txtai - all-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
archgw - a high-performance proxy server that handles the low-level work in building agents, such as applying guardrails, routing prompts to the right agent, and unifying access to LLMs
ClaraVerse - privacy-first, fully local AI workspace with Ollama LLM chat, tool calling, agent builder, Stable Diffusion, and embedded n8n-style automation
ragbits - building blocks for rapid development of GenAI applications

Retrieval-Augmented Generation

graphrag - a modular graph-based RAG system
haystack - AI orchestration framework to build customizable, production-ready LLM applications, best suited for building RAG, question answering, semantic search or conversational agent chatbots
LightRAG - simple and fast RAG
vanna - an open-source Python RAG framework for SQL generation and related functionality
graphiti - build real-time knowledge graphs for AI Agents
onyx - the AI platform connected to your company's docs, apps, and people

Coding Agents

zed - a next-generation code editor designed for high-performance collaboration with humans and AI
OpenHands - a platform for software development agents powered by AI
cline - autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way
aider - AI pair programming in your terminal
tabby - an open-source GitHub Copilot alternative, set up your own LLM-powered code completion server
continue - create, share, and use custom AI code assistants with our open-source IDE extensions and hub of models, rules, prompts, docs, and other building blocks
void - an open-source Cursor alternative, use AI agents on your codebase, checkpoint and visualize changes, and bring any model or host locally
Roo-Code - a whole dev team of AI agents in your code editor
goose - an open-source, extensible AI agent that goes beyond code suggestions
opencode - an AI coding agent built for the terminal
crush - the glamourous AI coding agent for your favourite terminal
kilocode - open source AI coding assistant for planning, building, and fixing code
ProxyAI - the leading open-source AI copilot for JetBrains

Computer Use

open-interpreter - a natural language interface for computers
OmniParser - a simple screen-parsing tool for pure vision-based GUI agents
self-operating-computer - a framework to enable multimodal models to operate a computer
cua - the Docker Container for Computer-Use AI Agents
Agent-S - an open agentic framework that uses computers like a human

Browser Automation

puppeteer - a JavaScript API for Chrome and Firefox
playwright - a framework for Web Testing and Automation
Playwright MCP server - an MCP server that provides browser automation capabilities using Playwright
browser-use - make websites accessible for AI agents
firecrawl - turn entire websites into LLM-ready markdown or structured data
stagehand - the AI Browser Automation Framework

Memory Management

mem0 - universal memory layer for AI Agents
letta - the stateful agents framework with memory, reasoning, and context management
cognee - memory for AI Agents in 5 lines of code
LMCache - supercharge your LLM with the fastest KV Cache Layer
memU - an open-source memory framework for AI companions

Testing, Evaluation, and Observability

langfuse - an open-source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more
opik - debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards
openllmetry - open-source observability for your LLM application, based on OpenTelemetry
giskard - open-source evaluation and testing for AI and LLM systems
agenta - an open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place

Research

Perplexica - an open-source alternative to Perplexity AI, the AI-powered search engine
gpt-researcher - an LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations
local-deep-researcher - fully local web research and report writing assistant
SurfSense - an open-source alternative to NotebookLM / Perplexity / Glean
local-deep-research - an AI-powered research assistant for deep, iterative research
maestro - an AI-powered research application designed to streamline complex research tasks
open-notebook - an open-source implementation of Notebook LM with more flexibility and features

Training and Fine-tuning

OpenRLHF - an easy-to-use, high-performance open-source RLHF framework built on Ray, vLLM, ZeRO-3 and HuggingFace Transformers, designed to make RLHF training simple and accessible
Kiln - the easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets
augmentoolkit - train an open-source LLM on new facts

Miscellaneous

context7 - up-to-date code documentation for LLMs and AI code editors
cai - Cybersecurity AI (CAI), the framework for AI Security
speakr - a personal, self-hosted web application designed for transcribing audio recordings
presenton - an open-source AI presentation generator and API
OmniGen2 - exploration to advanced multimodal generation
4o-ghibli-at-home - a powerful, self-hosted AI photo stylizer built for performance and privacy
Observer - local open-source micro-agents that observe, log and react, all while keeping your data private and secure
mobile-use - a powerful, open-source AI agent that controls your Android or iOS device using natural language
gabber - build AI applications that can see, hear, and speak using your screens, microphones, and cameras as inputs
promptcat - a zero-dependency prompt manager/catalog/library in a single HTML file

Hardware

Alex Ziskind - tests of PCs, laptops, GPUs, etc. capable of running LLMs
Digital Spaceport - reviews of various builds designed for LLM inference
JetsonHacks - information about developing on NVIDIA Jetson Development Kits
Miyconst - tests of various types of hardware capable of running LLMs
LLM Inference VRAM & GPU Requirement Calculator - calculate how many GPUs you need to deploy LLMs
ZLUDA - CUDA on non-NVIDIA GPUs

Tutorials

Models

Let's reproduce GPT-2 (124M)

Prompt Engineering

Prompt Engineering by NirDiamant - a comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies
Prompting guide 101 - a quick-start handbook for effective prompts by Google
Prompt Engineering by Google - prompt engineering by Google
Prompt Engineering by Anthropic - prompt engineering by Anthropic
Prompt Engineering Interactive Tutorial - Prompt Engineering Interactive Tutorial by Anthropic
Real world prompting - real world prompting tutorial by Anthropic
Prompt evaluations - prompt evaluations course by Anthropic
system-prompts-and-models-of-ai-tools - a collection of system prompts extracted from AI tools
system_prompts_leaks - a collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
Prompt from Codex - the prompt used to steer the behavior of OpenAI's Codex

Context Engineering

Context-Engineering - a frontier, first-principles handbook inspired by Karpathy and 3Blue1Brown for moving beyond prompt engineering to the wider discipline of context design, orchestration, and optimization
Awesome-Context-Engineering - a comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems

Inference

vLLM Production Stack - vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Agents

GenAI Agents - tutorials and implementations for various Generative AI Agent techniques
12-Factor Agents - principles for building reliable LLM applications
Agents towards production - end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for real-world launches
LLM Agents & Ecosystem Handbook - one-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools
601 real-world gen AI use cases - 601 real-world gen AI use cases from the world's leading organizations by Google
A practical guide to building agents - a practical guide to building agents by OpenAI

Retrieval-Augmented Generation

RAG Techniques - various advanced techniques for Retrieval-Augmented Generation (RAG) systems
Controllable RAG Agent - an advanced Retrieval-Augmented Generation (RAG) solution for complex question answering that uses a sophisticated graph-based algorithm to handle the tasks
LangChain RAG Cookbook - a collection of modular RAG techniques, implemented in LangChain + Python

Miscellaneous

Self-hosted AI coding that just works

Communities

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines on how to get started.

For Tasks:

run llms locally, fine-tune large language models, build ai agents, perform inference, manage memory for ai agents

For Jobs:

ai researcher, software engineer, data scientist, machine learning engineer, developer advocate

Alternative AI tools for Awesome-local-LLM

Similar Open Source Tools

Awesome-local-LLM

GitHub stars: 285

towhee

Towhee is a cutting-edge framework designed to streamline the processing of unstructured data through the use of Large Language Model (LLM) based pipeline orchestration. It can extract insights from diverse data types like text, images, audio, and video files using generative AI and deep learning models. Towhee offers rich operators, prebuilt ETL pipelines, and a high-performance backend for efficient data processing. With a Pythonic API, users can build custom data processing pipelines easily. Towhee is suitable for tasks like sentence embedding, image embedding, video deduplication, question answering with documents, and cross-modal retrieval based on CLIP.

GitHub stars: 3.2k

SLAM-LLM

SLAM-LLM is a deep learning toolkit designed for researchers and developers to train custom multimodal large language models (MLLM) focusing on speech, language, audio, and music processing. It provides detailed recipes for training and high-performance checkpoints for inference. The toolkit supports tasks such as automatic speech recognition (ASR), text-to-speech (TTS), visual speech recognition (VSR), automated audio captioning (AAC), spatial audio understanding, and music caption (MC). SLAM-LLM features easy extension to new models and tasks, mixed precision training for faster training with less GPU memory, multi-GPU training with data and model parallelism, and flexible configuration based on Hydra and dataclass.

GitHub stars: 343

Crypto-Nft-Airdrop-Tool

Crypto-Nft-Airdrop-Tool is a Python tool designed for conducting airdrops of NFTs in the crypto space. It provides functionality for distributing NFTs to a specified audience efficiently. The tool is compatible with Windows platform and requires Python 3. Users can easily manage and execute airdrop campaigns using this tool, enhancing their engagement with the NFT community. The tool simplifies the process of distributing NFTs and ensures a seamless experience for both creators and recipients.

GitHub stars: 107

FuseAI

FuseAI is a repository that focuses on knowledge fusion of large language models. It includes FuseChat, a state-of-the-art 7B LLM on MT-Bench, and FuseLLM, which surpasses Llama-2-7B by fusing three open-source foundation LLMs. The repository provides tech reports, releases, and datasets for FuseChat and FuseLLM, showcasing their performance and advancements in the field of chat models and large language models.

GitHub stars: 77

LakeSoul

LakeSoul is a cloud-native Lakehouse framework that supports scalable metadata management, ACID transactions, efficient and flexible upsert operation, schema evolution, and unified streaming & batch processing. It supports multiple computing engines like Spark, Flink, Presto, and PyTorch, and computing modes such as batch, stream, MPP, and AI. LakeSoul scales metadata management and achieves ACID control by using PostgreSQL. It provides features like automatic compaction, table lifecycle maintenance, redundant data cleaning, and permission isolation for metadata.

GitHub stars: 3.0k

esp-ai

ESP-AI provides a complete AI conversation solution for your development board, including IAT+LLM+TTS integration solutions for ESP32 series development boards. It can be injected into projects without affecting existing ones. By providing keys from platforms like iFlytek, Jiling, and local services, you can run the services without worrying about interactions between services or between development boards and services. The project's server-side code is based on Node.js, and the hardware code is based on Arduino IDE.

GitHub stars: 734

semantic-router

The Semantic Router is an intelligent routing tool that utilizes a Mixture-of-Models (MoM) approach to direct OpenAI API requests to the most suitable models based on semantic understanding. It enhances inference accuracy by selecting models tailored to different types of tasks. The tool also automatically selects relevant tools based on the prompt to improve tool selection accuracy. Additionally, it includes features for enterprise security such as PII detection and prompt guard to protect user privacy and prevent misbehavior. The tool implements similarity caching to reduce latency. The comprehensive documentation covers setup instructions, architecture guides, and API references.

GitHub stars: 1.5k

SummaryYou

Summary You is a tool that utilizes AI to summarize YouTube videos, articles, images, and documents. Users can set the length of the summary and have the option to listen to the summaries. The tool also includes a history section, intelligent paywall detection, OLED-Dark Mode, and a user-friendly Material Design 3 style UI with dynamic color themes. It uses GPT-3.5 OpenAI/Mixtral 8x7B Groq for summarization. The backend is implemented in Python with Chaquopy, and some UI designs and codes are borrowed from Seal Material color utilities.

GitHub stars: 81

amplication

Amplication is a robust, open-source development platform designed to revolutionize the creation of scalable and secure .NET and Node.js applications. It automates backend applications development, ensuring consistency, predictability, and adherence to the highest standards with code that's built to scale. The user-friendly interface fosters seamless integration of APIs, data models, databases, authentication, and authorization. Built on a flexible, plugin-based architecture, Amplication allows effortless customization of the code and offers a diverse range of integrations. With a strong focus on collaboration, Amplication streamlines team-oriented development, making it an ideal choice for groups of all sizes, from startups to large enterprises. It enables users to concentrate on business logic while handling the heavy lifting of development. Experience the fastest way to develop .NET and Node.js applications with Amplication.

GitHub stars: 15.5k

SEED-Bench

SEED-Bench is a comprehensive benchmark for evaluating the performance of multimodal large language models (MLLMs) on a wide range of tasks that require both text and image understanding. It consists of two versions: SEED-Bench-1 and SEED-Bench-2. SEED-Bench-1 focuses on evaluating the spatial and temporal understanding of MLLMs, while SEED-Bench-2 extends the evaluation to include text and image generation tasks. Both versions of SEED-Bench provide a diverse set of tasks that cover different aspects of multimodal understanding, making it a valuable tool for researchers and practitioners working on MLLMs.

GitHub stars: 240

LiveDevAgents

LiveDevAgents is a multi-agent danmaku game engine built using CAMEL. It was developed for the 2024.12 CAMEL-AI Hackathon project. The engine allows for the creation of games in real-time through live bullet comments during streaming, enabling interaction with AI hosts. The project aims to expand and deconstruct simple ideas with a team of agents of different expertise, continuously updating and self-correcting during runtime. It also supports workforce enhancement, migration of anchor agents to a new framework, improvement of bullet comment processing logic, expansion of live control for more platforms, integration of art and music agents, and VR shared workspace for collaborative development.

GitHub stars: 67

CHATPGT-MEV-BOT-ETH

This tool is a bot that monitors the performance of MEV transactions on the Ethereum blockchain. It provides real-time data on MEV profitability, transaction volume, and network congestion. The bot can be used to identify profitable MEV opportunities and to track the performance of MEV strategies.

GitHub stars: 75

agentgateway

Agentgateway is an open source data plane optimized for agentic AI connectivity within or across any agent framework or environment. It provides drop-in security, observability, and governance for agent-to-agent and agent-to-tool communication, supporting leading interoperable protocols like Agent2Agent (A2A) and Model Context Protocol (MCP). Highly performant, security-first, multi-tenant, dynamic, and supporting legacy API transformation, agentgateway is designed to handle any scale and run anywhere with any agent framework.

GitHub stars: 961

IvyGPT

IvyGPT is a medical large language model that aims to generate the most realistic doctor consultation effects. It has been fine-tuned on high-quality medical Q&A data and trained using human feedback reinforcement learning. The project features full-process training on medical Q&A LLM, multiple fine-tuning methods support, efficient dataset creation tools, and a dataset of over 300,000 high-quality doctor-patient dialogues for training.

GitHub stars: 56

SLAM-LLM

SLAM-LLM is a deep learning toolkit for training custom multimodal large language models (MLLM) focusing on speech, language, audio, and music processing. It provides detailed recipes for training and high-performance checkpoints for inference. The toolkit supports various tasks such as automatic speech recognition (ASR), text-to-speech (TTS), visual speech recognition (VSR), automated audio captioning (AAC), spatial audio understanding, and music caption (MC). Users can easily extend to new models and tasks, utilize mixed precision training for faster training with less GPU memory, and perform multi-GPU training with data and model parallelism. Configuration is flexible based on Hydra and dataclass, allowing different configuration methods.

GitHub stars: 647

For similar tasks

AutoGPT

AutoGPT is a revolutionary tool that empowers everyone to harness the power of AI. With AutoGPT, you can effortlessly build, test, and delegate tasks to AI agents, unlocking a world of possibilities. Our mission is to provide the tools you need to focus on what truly matters: innovation and creativity.

GitHub stars: 178.7k

agent-os

The Agent OS is an experimental framework and runtime to build sophisticated, long running, and self-coding AI agents. We believe that the most important super-power of AI agents is to write and execute their own code to interact with the world. But for that to work, they need to run in a suitable environment—a place designed to be inhabited by agents. The Agent OS is designed from the ground up to function as a long-term computing substrate for these kinds of self-evolving agents.

GitHub stars: 60

chatdev

ChatDev IDE is a tool for building your AI agent. Whether it's NPCs in games or powerful agent tools, you can design what you want on this platform. It accelerates prompt engineering through JavaScript support, which allows implementing complex prompting techniques.

GitHub stars: 384

module-ballerinax-ai.agent

This library provides the functionality required to build a ReAct agent using Large Language Models (LLMs).

GitHub stars: 108

npi

NPi is an open-source platform providing Tool-use APIs to empower AI agents with the ability to take action in the virtual world. It is currently under active development, and the APIs are subject to change in future releases. NPi offers a command line tool for installation and setup, along with a GitHub app for easy access to repositories. The platform also includes a Python SDK and examples like Calendar Negotiator and Twitter Crawler. Join the NPi community on Discord to contribute to the development and explore the roadmap for future enhancements.

GitHub stars: 211

ai-agents

The 'ai-agents' repository is a collection of books and resources focused on developing AI agents, including topics such as GPT models, building AI agents from scratch, machine learning theory and practice, and basic methods and tools for data analysis. The repository provides detailed explanations and guidance for individuals interested in learning about and working with AI agents.

GitHub stars: 53

llms

The 'llms' repository is a comprehensive guide on Large Language Models (LLMs), covering topics such as language modeling, applications of LLMs, statistical language modeling, neural language models, conditional language models, evaluation methods, transformer-based language models, practical LLMs like GPT and BERT, prompt engineering, fine-tuning LLMs, retrieval augmented generation, AI agents, and LLMs for computer vision. The repository provides detailed explanations, examples, and tools for working with LLMs.

github: 266

ai-app

The 'ai-app' repository is a comprehensive collection of tools and resources related to artificial intelligence, focusing on topics such as server environment setup, PyCharm and Anaconda installation, large model deployment and training, Transformer principles, RAG technology, vector databases, AI image, voice, and music generation, and AI Agent frameworks. It also includes practical guides and tutorials on implementing various AI applications. The repository serves as a valuable resource for individuals interested in exploring different aspects of AI technology.

github: 103

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models, covering everything from images to sound.

github: 94

kaito

Kaito is an operator that automates AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, removes the need to tune deployment parameters for specific GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Kaito largely simplifies the workflow of onboarding large AI inference models in Kubernetes.

github: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI red teaming tasks so that operators can focus on more complicated and time-consuming work, and it can identify security harms such as misuse (e.g., malware generation, jailbreaking) and privacy harms (e.g., identity theft). The goal is to give researchers a baseline of how well their model and entire inference pipeline are doing against different harm categories, and to let them compare that baseline to future iterations of their model. This gives them empirical data on how well their model is doing today and lets them detect any degradation of performance in later versions.

github: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
* Self-contained, with no need for a DBMS or cloud service.
* OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
* Supports consumer-grade GPUs.

github: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github: 675