
Awesome-local-LLM
A curated list of awesome platforms, tools, practices and resources that help run LLMs locally
Stars: 259

Awesome-local-LLM is a curated list of platforms, tools, practices, and resources that help run Large Language Models (LLMs) locally. It includes sections on inference platforms, engines, user interfaces, specific models for general purpose, coding, vision, audio, and miscellaneous tasks. The repository also covers tools for coding agents, agent frameworks, retrieval-augmented generation, computer use, browser automation, memory management, testing, evaluation, research, training, and fine-tuning. Additionally, there are tutorials on models, prompt engineering, context engineering, inference, agents, retrieval-augmented generation, and miscellaneous topics, along with a section on communities for LLM enthusiasts.
README:
A curated list of awesome platforms, tools, practices and resources that help run LLMs locally
- Inference platforms
- Inference engines
- User Interfaces
- Large Language Models
- Tools
- Hardware
- Tutorials
- Communities
Inference platforms:
- LM Studio - discover, download and run local LLMs
- jan - an open-source alternative to ChatGPT that runs 100% offline on your computer
- ChatBox - a user-friendly desktop client app for AI models/LLMs
- LocalAI - the free, open-source alternative to OpenAI, Claude and others
- lemonade - a local LLM server with GPU and NPU acceleration
- ollama - get up and running with LLMs
Inference engines:
- llama.cpp - LLM inference in C/C++
- ik_llama.cpp - a llama.cpp fork with additional SOTA quants and improved performance
- koboldcpp - run GGUF models easily with a KoboldAI UI
- vllm - a high-throughput and memory-efficient inference and serving engine for LLMs
- Nano-vLLM - a lightweight vLLM implementation built from scratch
- vllm-gfx906 - vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
- mlx-lm - generate text and fine-tune large language models on Apple silicon with MLX
- FastFlowLM - run LLMs on AMD Ryzen™ AI NPUs
- exo - run your own AI cluster at home with everyday devices
- gpustack - simple, scalable AI model deployment on GPU clusters
- sglang - a fast serving framework for large language models and vision language models
- distributed-llama - connect home devices into a powerful cluster to accelerate LLM inference
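Most of the platforms and engines above (llama.cpp's server, vLLM, sglang, Ollama, LM Studio) expose an OpenAI-compatible HTTP API, so a single client works across all of them. A minimal sketch of building such a request; the model name and endpoint below are placeholders, not defaults of any particular server:

```python
import json

def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-style /v1/chat/completions payload.

    The same payload shape works against llama.cpp's server, vLLM,
    sglang, Ollama's OpenAI-compatible endpoint, or LM Studio.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
        "stream": stream,
    }

# Placeholder model name; use whatever your local server reports.
payload = build_chat_request("qwen3:8b", "Why is the sky blue?")
body = json.dumps(payload).encode()
# POST `body` to e.g. http://localhost:8080/v1/chat/completions
```

Switching backends then means changing only the base URL and model name, which is what makes these servers interchangeable from the client's point of view.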
User Interfaces:
- Open WebUI - a user-friendly AI interface (supports Ollama, OpenAI API, ...)
- Lobe Chat - an open-source, modern-design AI chat framework
- Text generation web UI - an LLM UI with advanced features, easy setup, and multiple backend support
- SillyTavern - an LLM frontend for power users
- Page Assist - use your locally running AI models to assist you in your web browsing
Large Language Models:
- AI Models & API Providers Analysis - understand the AI landscape to choose the best model and provider for your use case
- LLM Explorer - explore a list of open-source LLM models
- Dubesor LLM Benchmark table - a small-scale manual performance comparison benchmark
- oobabooga benchmark - a list sorted by size (on disk) for each score
- Qwen - powered by Alibaba Cloud
- Mistral AI - a pioneering French artificial intelligence startup
- Tencent - a profile of a Chinese multinational technology conglomerate and holding company
- Unsloth AI - focused on making AI more accessible to everyone (GGUFs etc.)
- bartowski - provides GGUF versions of popular LLMs
- Beijing Academy of Artificial Intelligence - a private non-profit organization engaged in AI research and development
- Open Thoughts - a team of researchers and engineers curating the best open reasoning datasets
- Qwen3 - a collection of the latest generation of Qwen LLMs
- Gemma 3 - a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models
- gpt-oss - a collection of open-weight models from OpenAI, designed for powerful reasoning, agentic tasks, and versatile developer use cases
- Mistral-Small-3.2-24B-Instruct-2506 - a versatile model designed to handle a wide range of generative AI tasks, including instruction following, conversational assistance, image understanding, and function calling
- Magistral-Small-2507 - Mistral Small 3.1 (2503) with added reasoning capabilities
- GLM-4.5 - a collection of hybrid reasoning models designed for intelligent agents
- Hunyuan - a collection of Tencent's open-source efficient LLMs designed for versatile deployment across diverse computational environments
- Phi-4-mini-instruct - a lightweight open model built upon synthetic data and filtered publicly available websites
- NVIDIA Nemotron - a collection of open, production-ready enterprise models trained from scratch by NVIDIA
- Llama Nemotron - a collection of open, production-ready enterprise models from NVIDIA
- OpenReasoning-Nemotron - a collection of models from NVIDIA, trained on 5M reasoning traces for math, code and science
- Granite 3.3 - a collection of LLMs from IBM, fine-tuned for improved reasoning and instruction-following capabilities
- EXAONE-4.0 - a collection of LLMs from LG AI Research, integrating non-reasoning and reasoning modes
- ERNIE 4.5 - a collection of large-scale multimodal models from Baidu
- Seed-OSS - a collection of LLMs developed by ByteDance's Seed Team, designed for powerful long-context, reasoning, agent and general capabilities, and versatile developer-friendly features
- Qwen3-Coder - a collection of Qwen's most agentic code models to date
- Devstral-Small-2507 - an agentic LLM for software engineering tasks, fine-tuned from Mistral-Small-3.1
- Mellum-4b-base - an LLM from JetBrains, optimized for code-related tasks
- OlympicCoder-32B - a code model that achieves very strong performance on competitive coding benchmarks such as LiveCodeBench and the 2024 International Olympiad in Informatics
- NextCoder - a family of code-editing LLMs developed using the Qwen2.5-Coder Instruct variants as base
- Qwen-Image - an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing
- Qwen-Image-Edit - the image editing version of Qwen-Image, extending the base model's unique text rendering capabilities to image editing tasks, enabling precise text editing
- GLM-4.5V - a VLM based on ZhipuAI's next-generation flagship text foundation model GLM-4.5-Air
- FastVLM - a collection of VLMs with efficient vision encoding from Apple
- MiniCPM-V-4_5 - a GPT-4o-level MLLM for single-image, multi-image and high-FPS video understanding on your phone
- LFM2-VL - a collection of vision-language models designed for on-device deployment
- ClipTagger-12b - a vision-language model (VLM) designed for video understanding at massive scale
- Voxtral-Small-24B-2507 - an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance
- chatterbox - the first production-grade open-source TTS model
- canary-1b-v2 - a multitask speech transcription and translation model from NVIDIA
- parakeet-tdt-0.6b-v3 - a multilingual speech-to-text model from NVIDIA
- Kitten TTS - a collection of open-source realistic text-to-speech models designed for lightweight deployment and high-quality voice synthesis
- Jan-v1-4B - the first release in the Jan family, designed for agentic reasoning and problem-solving within the Jan App
- Jan-nano - a compact 4-billion-parameter language model specifically designed and trained for deep research tasks
- Jan-nano-128k - an enhanced version of Jan-nano with a native 128k context window that enables deeper, more comprehensive research without the performance degradation typically associated with context-extension methods
- Arch-Router-1.5B - the fastest LLM router model, aligning to subjective usage preferences
- HunyuanWorld-1 - an open-source 3D world generation model
- Hunyuan-GameCraft-1.0 - a novel framework for high-dynamic interactive video generation in game environments
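Many of the models above are distributed as quantized GGUF files (see the Unsloth AI and bartowski entries). The core idea can be sketched as symmetric block quantization: store small integers plus one scale per block. Real GGUF quant types (Q4_K and friends) are considerably more elaborate, with sub-blocks and per-block minima, so this is only the underlying principle:

```python
def quantize_q8(block: list[float]) -> tuple[list[int], float]:
    """Symmetric 8-bit quantization of one weight block:
    each float becomes an int8-range integer times a shared scale."""
    scale = max(abs(x) for x in block) / 127 or 1.0  # avoid div-by-zero for all-zero blocks
    q = [round(x / scale) for x in block]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07]
q, s = quantize_q8(weights)
restored = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The reconstruction error is bounded by half a quantization step, which is why 8-bit weights are nearly lossless while 4-bit and below trade accuracy for memory.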
Tools:
- zed - a next-generation code editor designed for high-performance collaboration with humans and AI
- OpenHands - a platform for software development agents powered by AI
- cline - an autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way
- aider - AI pair programming in your terminal
- tabby - an open-source GitHub Copilot alternative; set up your own LLM-powered code completion server
- continue - create, share, and use custom AI code assistants with our open-source IDE extensions and hub of models, rules, prompts, docs, and other building blocks
- void - an open-source Cursor alternative; use AI agents on your codebase, checkpoint and visualize changes, and bring any model or host locally
- Roo-Code - a whole dev team of AI agents in your code editor
- goose - an open-source, extensible AI agent that goes beyond code suggestions
- opencode - an AI coding agent built for the terminal
- crush - the glamourous AI coding agent for your favourite terminal
- kilocode - an open-source AI coding assistant for planning, building, and fixing code
- ProxyAI - the leading open-source AI copilot for JetBrains
- AutoGPT - a powerful platform that allows you to create, deploy, and manage continuous AI agents that automate complex workflows
- langchain - build context-aware reasoning applications
- langflow - a powerful tool for building and deploying AI-powered agents and workflows
- autogen - a programming framework for agentic AI
- anything-llm - the all-in-one desktop & Docker AI application with built-in RAG, AI agents, a no-code agent builder, MCP compatibility, and more
- llama_index - the leading framework for building LLM-powered agents over your data
- Flowise - build AI agents, visually
- crewAI - a framework for orchestrating role-playing, autonomous AI agents
- agno - a full-stack framework for building multi-agent systems with memory, knowledge and reasoning
- SuperAGI - an open-source framework to build, manage and run useful autonomous AI agents
- camel - the first and the best multi-agent framework
- openai-agents-python - a lightweight, powerful framework for multi-agent workflows
- txtai - an all-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
- archgw - a high-performance proxy server that handles the low-level work in building agents: applying guardrails, routing prompts to the right agent, unifying access to LLMs, etc.
- ClaraVerse - a privacy-first, fully local AI workspace with Ollama LLM chat, tool calling, an agent builder, Stable Diffusion, and embedded n8n-style automation
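The agent frameworks above mostly wrap the same core loop: the model either answers directly or emits a structured tool call, which the runtime executes and feeds back as an observation. A toy, framework-free sketch; the JSON convention and the `calculator` tool are illustrative, not any particular framework's format:

```python
import json
import re

def calculator(expression: str) -> str:
    """Hypothetical tool: evaluate simple arithmetic only."""
    if not re.fullmatch(r"[0-9+\-*/. ()]+", expression):
        return "error: unsupported expression"
    return str(eval(expression))  # safe here: input restricted to arithmetic chars

TOOLS = {"calculator": calculator}

def run_agent_step(model_reply: str) -> str:
    """One step of a ReAct-style loop: if the model emitted a JSON tool
    call, execute it and return the observation; otherwise the reply is
    the final answer."""
    try:
        call = json.loads(model_reply)
    except json.JSONDecodeError:
        return model_reply
    return TOOLS[call["tool"]](call["arguments"])
```

Frameworks like langchain, crewAI, or openai-agents-python add the parts this sketch omits: tool schemas sent to the model, retries, multi-agent routing, and state between steps.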
- ragbits - building blocks for rapid development of GenAI applications
- graphrag - a modular graph-based RAG system
- haystack - an AI orchestration framework to build customizable, production-ready LLM applications, best suited for building RAG, question answering, semantic search or conversational agent chatbots
- LightRAG - simple and fast RAG
- graphiti - build real-time knowledge graphs for AI agents
- vanna - an open-source Python RAG framework for SQL generation and related functionality
- open-interpreter - a natural language interface for computers
- OmniParser - a simple screen parsing tool towards a pure vision-based GUI agent
- self-operating-computer - a framework to enable multimodal models to operate a computer
- cua - the Docker container for computer-use AI agents
- Agent-S - an open agentic framework that uses computers like a human
- puppeteer - a JavaScript API for Chrome and Firefox
- playwright - a framework for web testing and automation
- Playwright MCP server - an MCP server that provides browser automation capabilities using Playwright
- browser-use - make websites accessible for AI agents
- firecrawl - turn entire websites into LLM-ready markdown or structured data
- stagehand - the AI browser automation framework
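Tools like firecrawl turn web pages into LLM-ready markdown. The basic transformation can be sketched with the standard library's HTML parser; production crawlers additionally render JavaScript, strip boilerplate, and follow links:

```python
from html.parser import HTMLParser

class MarkdownExtractor(HTMLParser):
    """Tiny HTML -> markdown sketch: keep headings and paragraphs,
    drop everything else."""
    def __init__(self):
        super().__init__()
        self.out: list[str] = []
        self._tag = ""

    def handle_starttag(self, tag, attrs):
        self._tag = tag

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "h1":
            self.out.append(f"# {text}")
        elif self._tag == "h2":
            self.out.append(f"## {text}")
        elif self._tag == "p":
            self.out.append(text)

md = MarkdownExtractor()
md.feed("<h1>Local LLMs</h1><p>Run models offline.</p><h2>Why</h2><p>Privacy.</p>")
markdown = "\n\n".join(md.out)
```

Markdown is a good target format because headings and paragraphs survive, while the tag soup that wastes context-window tokens does not.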
- mem0 - a universal memory layer for AI agents
- letta - the stateful agents framework with memory, reasoning, and context management
- cognee - memory for AI agents in 5 lines of code
- LMCache - supercharge your LLM with the fastest KV cache layer
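Memory layers like mem0 and letta boil down to: store facts as they appear, then retrieve the most relevant ones into the prompt on each turn. A toy keyword-overlap version; real systems rank with embeddings, recency, and importance scores:

```python
class MemoryStore:
    """Toy long-term memory with keyword recall over stored facts."""
    def __init__(self):
        self.facts: list[str] = []

    def add(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str, k: int = 2) -> list[str]:
        # Rank stored facts by word overlap with the query.
        words = set(query.lower().split())
        ranked = sorted(self.facts,
                        key=lambda f: len(words & set(f.lower().split())),
                        reverse=True)
        return ranked[:k]

mem = MemoryStore()
mem.add("user prefers concise answers")
mem.add("user's favourite language is Rust")
mem.add("meeting notes from 2024-01-05")
hits = mem.recall("what language does the user prefer")
```

The recalled facts would be prepended to the next prompt, which is how an agent appears to "remember" across sessions without retraining.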
- langfuse - an open-source LLM engineering platform: LLM observability, metrics, evals, prompt management, playground, datasets; integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more
- opik - debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards
- openllmetry - open-source observability for your LLM application, based on OpenTelemetry
- giskard - open-source evaluation & testing for AI & LLM systems
- agenta - an open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place
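Observability platforms like langfuse and opik work by wrapping each LLM call in a traced span that records timing and metadata. A minimal decorator-style sketch; the span field names here are illustrative, not any platform's schema:

```python
import time

def traced(name, fn, log):
    """Wrap a callable so each invocation appends a span record
    (name, latency, output size) to `log`."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        log.append({
            "span": name,
            "latency_ms": round((time.perf_counter() - start) * 1000, 2),
            "output_chars": len(str(result)),
        })
        return result
    return wrapper

log: list[dict] = []
# Stand-in for a real LLM call, so the sketch runs offline.
fake_llm = traced("chat_completion", lambda prompt: f"echo: {prompt}", log)
fake_llm("hello")
```

Real platforms ship these spans to a backend (often via OpenTelemetry, as openllmetry does) and attach token counts, costs, and prompt/response bodies for later evaluation.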
- Perplexica - an open-source alternative to Perplexity AI, the AI-powered search engine
- gpt-researcher - an LLM-based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations
- local-deep-researcher - a fully local web research and report writing assistant
- SurfSense - an open-source alternative to NotebookLM / Perplexity / Glean
- local-deep-research - an AI-powered research assistant for deep, iterative research
- maestro - an AI-powered research application designed to streamline complex research tasks
- open-notebook - an open-source implementation of Notebook LM with more flexibility and features
- OpenRLHF - an easy-to-use, high-performance open-source RLHF framework built on Ray, vLLM, ZeRO-3 and HuggingFace Transformers, designed to make RLHF training simple and accessible
- Kiln - the easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets
- augmentoolkit - train an open-source LLM on new facts
- context7 - up-to-date code documentation for LLMs and AI code editors
- cai - Cybersecurity AI (CAI), the framework for AI security
- speakr - a personal, self-hosted web application designed for transcribing audio recordings
- presenton - an open-source AI presentation generator and API
- OmniGen2 - an exploration of advanced multimodal generation
- 4o-ghibli-at-home - a powerful, self-hosted AI photo stylizer built for performance and privacy
- Observer - local open-source micro-agents that observe, log and react, all while keeping your data private and secure
- mobile-use - a powerful, open-source AI agent that controls your Android or iOS device using natural language
- gabber - build AI applications that can see, hear, and speak using your screens, microphones, and cameras as inputs
- promptcat - a zero-dependency prompt manager/catalog/library in a single HTML file
Hardware:
- Alex Ziskind - tests of PCs, laptops, GPUs etc. capable of running LLMs
- Digital Spaceport - reviews of various builds designed for LLM inference
- JetsonHacks - information about developing on NVIDIA Jetson Development Kits
- Miyconst - tests of various types of hardware capable of running LLMs
- LLM Inference VRAM & GPU Requirement Calculator - calculate how many GPUs you need to deploy LLMs
- ZLUDA - CUDA on non-NVIDIA GPUs
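A VRAM calculator like the one listed above reduces to simple arithmetic: weight bytes (parameters × bits per weight) plus the KV cache, times an overhead factor for activations and framework buffers. A rough sketch with illustrative defaults; layer counts, head counts, and head dimensions vary per model, so treat the numbers as ballpark only:

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     context: int = 8192, n_layers: int = 32,
                     n_kv_heads: int = 8, head_dim: int = 128,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: model weights + fp16 KV cache, times a fudge factor."""
    weights = params_b * 1e9 * bits_per_weight / 8            # bytes
    # fp16 KV cache: 2 tensors (K and V) * 2 bytes * layers * kv_heads * head_dim * tokens
    kv_cache = 2 * 2 * n_layers * n_kv_heads * head_dim * context
    return (weights + kv_cache) * overhead / 1e9

# e.g. an 8B-parameter model at 4-bit quantization lands around 6 GB
estimate = estimate_vram_gb(8, 4)
```

This is why an 8B model that needs a 24 GB card at fp16 fits comfortably on an 8 GB consumer GPU once quantized to 4 bits, and why long contexts can dominate memory use on small models.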
Tutorials:
- Prompt Engineering by NirDiamant - a comprehensive collection of tutorials and implementations for prompt engineering techniques, ranging from fundamental concepts to advanced strategies
- Prompting guide 101 - a quick-start handbook for effective prompts, by Google
- Prompt Engineering by Google - prompt engineering by Google
- Prompt Engineering by Anthropic - prompt engineering by Anthropic
- Prompt Engineering Interactive Tutorial - a prompt engineering interactive tutorial by Anthropic
- Real world prompting - a real-world prompting tutorial by Anthropic
- Prompt evaluations - a prompt evaluations course by Anthropic
- system-prompts-and-models-of-ai-tools - a collection of system prompts extracted from AI tools
- system_prompts_leaks - a collection of extracted system prompts from popular chatbots like ChatGPT, Claude & Gemini
- Prompt from Codex - the prompt used to steer the behavior of OpenAI's Codex
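A recurring pattern in the prompt engineering guides above is the few-shot prompt: an instruction, then worked input/output pairs, then the new input for the model to complete. A small helper sketch; the labels and layout are one common convention, not the only one:

```python
def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt: instruction, worked examples, then the query."""
    lines = [task, ""]
    for inp, out in examples:
        lines += [f"Input: {inp}", f"Output: {out}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I love this GPU", "positive"), ("The drivers keep crashing", "negative")],
    "Inference is blazing fast",
)
```

Ending on a bare "Output:" matters: it cues the model to complete the pattern rather than comment on it, which is the mechanism few-shot prompting relies on.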
- Context-Engineering - a frontier, first-principles handbook inspired by Karpathy and 3Blue1Brown for moving beyond prompt engineering to the wider discipline of context design, orchestration, and optimization
- Awesome-Context-Engineering - a comprehensive survey on context engineering: from prompt engineering to production-grade AI systems
- vLLM Production Stack - vLLM's reference system for K8s-native, cluster-wide deployment with community-driven performance optimization
- GenAI Agents - tutorials and implementations for various generative AI agent techniques
- 12-Factor Agents - principles for building reliable LLM applications
- Agents towards production - end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for real-world launches
- 601 real-world gen AI use cases - 601 real-world gen AI use cases from the world's leading organizations, by Google
- A practical guide to building agents - a practical guide to building agents by OpenAI
- RAG Techniques - various advanced techniques for Retrieval-Augmented Generation (RAG) systems
- Controllable RAG Agent - an advanced Retrieval-Augmented Generation (RAG) solution for complex question answering that uses a sophisticated graph-based algorithm to handle the tasks
- LangChain RAG Cookbook - a collection of modular RAG techniques, implemented in LangChain + Python
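The RAG tutorials above all share the same skeleton: embed the query, retrieve the nearest documents, and place them in the prompt as context. A dependency-free sketch using bag-of-words cosine similarity as a stand-in for a real embedding model:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (real systems use a neural embedder)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "llama.cpp runs GGUF models on CPUs and GPUs",
    "vLLM is a high-throughput serving engine",
    "SillyTavern is an LLM frontend for power users",
]
query = "high throughput serving engine"
# Retrieve the most similar document, then ground the prompt in it.
best = max(docs, key=lambda d: cosine(embed(query), embed(d)))
prompt = f"Answer using this context:\n{best}\n\nQuestion: {query}"
```

Swapping the toy `embed` for a real embedding model plus a vector store gives the basic pipeline that frameworks like llama_index, haystack, and LightRAG industrialize with chunking, reranking, and graph structure.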
We welcome contributions! Please see CONTRIBUTING.md for guidelines on how to get started.