
Awesome-LLM-Resources-List
A Curated Collection of LLM resources (work in progress).
Stars: 105

Awesome LLM Resources is a curated collection of resources for Large Language Models (LLMs). It covers serverless hosting, off-the-shelf model APIs, local inference, LLM serving frameworks, open-source LLM web chat UIs, GPU rental for fine-tuning, no-code fine-tuning UIs, fine-tuning frameworks, open-source agentic/AI workflow frameworks, AI agents, co-pilots, voice APIs, open-source TTS models, open-source RAG frameworks, research papers on chain-of-thought (CoT) prompting, CoT implementations, CoT fine-tuned models and datasets, and more.
Updated: 12th of March 2025
Platform/Tool | Rel. | Scale Down | OS | GH | Start | GPU Machine | One-Click | Dev Exp. | Free-Tier |
---|---|---|---|---|---|---|---|---|---|
Beam.Cloud | 2021 | > 1 min | Helpers | β | β | π | π 15h | ||
Baseten | 2019 | > 15 min | π΄ | Guide | β | π‘ | π | $30 | |
Modal | 2021 | < 1 min | π΄ | Helpers | β | β | π | $30/m | |
HF Endpoints | 2023 | > 15 min | π΄ | None Needed | β | β | π | β | |
Replicate | 2019 | < 1 min | π΄ | Guide | β | π‘ | π€· | β | |
Sagemaker (Serverless) | 2017 | N/A | π΄ | N/A | π΅ | β | β | 300,000s | |
Lambda w/ EFS (AWS) | 2014 | < 1 min | π΄ | Guide | π‘ | β | β | β | |
RunPod Serverless | 2022 | > 30s | π΄ | N/A | π‘ | β | π€· | β | |
BentoML | 2019 | > 5 min | Gallery | π‘ | π‘ | π | π $10 |
It goes without saying that these platforms can usually do more than LLM serving.
Platform/Tool | Released | GitHub |
---|---|---|
Together.ai | N/A | 🔴 |
Fireworks.ai | N/A | |
Replicate | 2019 | |
Groq | N/A | |
DeepInfra | N/A | |
Bedrock | N/A | |
Lepton | N/A | |
Fal.ai | N/A | |
VertexAI | N/A |
Framework | Browser Chat 🖥️ | Organization | Open Source | GitHub |
---|---|---|---|---|
Llama.cpp | β | ggerganov | ||
Ollama | β | Ollama | ||
gpt4all | β | Nomic.ai | ||
LMStudio | β | LMStudio AI | π΄ | |
OpenLLM | β | BentoML |
Framework | Open Source | GitHub |
---|---|---|
vLLM | ||
OpenLLM | ||
TGI (Text Generation Inference) | ||
TensorRT LLM | ||
Ray Serve | ||
LMDeploy | ||
Ollama | ||
MLC-LLM |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
Text Generation WebUI | oobabooga | A Gradio web UI for Large Language Models. | ||
Jan AI | Jan HQ | An open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM). | ||
AnythingLLM | Mintplex Labs | The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more. | ||
Superagent | Superagent AI | Allows developers to add powerful AI assistants to their applications using LLMs and RAG. | ||
Bionic-GPT | Bionic GPT | A ChatGPT replacement offering generative AI advantages while maintaining strict data confidentiality. | ||
Open WebUI | Open WebUI | A user-friendly web interface for interacting with Large Language Models (LLMs). | ||
Xyne | xynehq | A sleek, minimal web chat interface for interacting with Large Language Models. | ||
Assistant UI | assistant-ui | An open-source ChatGPT-like interface with a clean and responsive design. | ||
Scira | zaidmukaddam | An AI-powered search interface that leverages LLMs for intelligent search results. | ||
Onyx | onyx-dot-app | A customizable and extendable web chat UI for interacting with large language models. | ||
NextChat | ChatGPTNextWeb | A Next.js-based, open-source ChatGPT clone for seamless web interaction. |
Platform | Templates | Beginner Friendly | GitHub |
---|---|---|---|
Brev.dev | Fine-tuning | β | |
Modal | Fine-tuning | β | |
Hyperbolic AI | None | β | |
RunPod | None | β | |
Paperspace | Fine-tuning | β | |
Colab | Small models only | β |
Tool | Beginner Friendly | Open Source | GitHub |
---|---|---|---|
Together.ai | β | β | N/A |
Hugging Face AutoTrain | β | β | |
AutoML | β | β | |
LLaMA-Factory | β | β | |
H2O LLM Studio | β | β |
Framework | Open Source | GitHub |
---|---|---|
Axolotl | ||
Unsloth |
Framework | Open Source | Beginner Friendly | Released | GitHub |
---|---|---|---|---|
LangChain | β | 2022 | ||
LangGraph | β | 2023 | ||
LlamaIndex | β | 2023 | ||
Langroid | β | 2023 | ||
Flowise | β | 2023 | ||
Swarms | β | 2023 | ||
CrewAI | β | 2023 | ||
Autogen | β | 2023 | ||
AutoChain | β | 2023 | ||
SuperAGI | β | 2023 | ||
AILegion | β | 2023 | ||
MemGPT (Letta) | β | 2023 | ||
uAgents | β | 2023 | ||
Phidata | β | 2023 | ||
AGiXT | β | 2023 | ||
Agno | β | 2023 | ||
Dify | β | 2024 | ||
TaskingAI | β | 2024 | ||
Bee Agent Framework | β | 2024 | ||
Swarms | β | 2024 | ||
IoA | β | 2024 | ||
Atomic Agents | β | 2024 | ||
Upsonic | β | 2024 | ||
Parlant | β | 2024 | ||
Rig | β | 2024 | ||
smolagents | β | 2023 | ||
eliza | β | 2024 |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
Rivet | Ironclad | A visual builder to design and deploy AI agent workflows. | ||
PySpur | PySpur-Dev | A tool to build and visualize AI agents seamlessly. | ||
Flowise | FlowiseAI | A no-code, visual platform for designing AI agent workflows. |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
browser-use | browser-use | Integrates browser functionalities into agentic workflows. | ||
code2prompt | mufeedvh | A CLI tool that converts a codebase into a single LLM prompt. | ||
note-gen | codexu | Automatically generates notes and documentation from your code. | ||
refly | refly-ai | Automates code refactoring and prompt generation tasks. | ||
potpie | potpie-ai | A toolkit for prototyping and building AI agent pipelines. | ||
AgentStack | AgentOps-AI | A comprehensive stack for constructing and deploying AI agents. | ||
browser | lightpanda-io | A headless browser designed for AI agents and automation. | ||
Memary | kingjulio8238 | A memory module for retaining context in agent workflows. | ||
open-canvas | langchain-ai | A visual interface for designing agent workflows with LangChain. | ||
agent-service-toolkit | JoshuaC215 | A toolkit for building and deploying agent-based services. |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
Leon | leon-ai | An open-source personal assistant and automation platform powered by AI. | ||
Khoj | khoj-ai | A virtual brain for organizing and retrieving your knowledge using AI. |
Framework | Organization | Open Source | Released | GitHub |
---|---|---|---|---|
GPT Engineer | GPT Engineer Org | 2023 | ||
XAgent | OpenBMB | 2023 | ||
Bolt.new | StackBlitz | 2023 | ||
Goose | Block | 2023 | ||
Jobs Applier AI Agent | feder-cr | 2023 | ||
AI Hedge Fund | virattt | 2023 | ||
FinRobot | AI4Finance Foundation | 2024 | ||
STORM | Stanford OVAL | 2024 | ||
Multion | MULTI-ON | 🔴 | N/A | |
Minion | Minion AI | 🔴 | N/A |
Framework | Open Source | GitHub |
---|---|---|
Aider | ||
Cursor | ||
Continue |
Framework | Open Source | GitHub |
---|---|---|
VAPI.ai | 🔴 | |
Bland.ai | 🔴 | N/A |
CallAnnie | 🔴 | N/A |
RealtimeTTS | ||
RealtimeSTT | ||
Coqui TTS |
Model | License | Stars/Likes | Downloads (Last Month) | Repository |
---|---|---|---|---|
Kokoro-82M | Apache 2.0 | 3.16k (HF) | 557,392 | Hugging Face |
Zonos-v0.1-transformer | Apache 2.0 | 249 (HF) | 24,240 | Hugging Face |
XTTS-v2 | Non-Commercial | 368 (HF) | 2,545,850 | Hugging Face |
ChatTTS | AGPL-3.0 | N/A | N/A | GitHub |
MeloTTS | MIT | N/A | N/A | GitHub |
For more TTS models and rankings, check out the TTS Leaderboard.
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
Eino | CloudWeGo | A lightweight LLM application framework for scalable AI solutions. | ||
Conversation Knowledge Mining Solution Accelerator | Microsoft | A solution accelerator for integrating conversation intelligence and knowledge mining using LLMs. | ||
Olmocr | AllenAI | An OCR framework optimized for integration with language models. | ||
PDFMathTranslate | Byaidu | A tool for converting and translating mathematical content in PDFs using LLMs. | ||
Podcastfy | souzatharsis | A tool to generate podcasts from written content using LLMs. | ||
Pandas AI | sinaptik-ai | Brings LLM-powered analytics to pandas dataframes. | ||
Ramalama | containers | An LLM application framework for containerized deployment of AI solutions. | ||
Robyn | facebookexperimental | An experimental, open-source Marketing Mix Modeling (MMM) package from Meta. | ||
ExtractThinker | enoch3712 | A tool for extracting and synthesizing insights from textual data using LLMs. |
Framework | Organization | Open Source | Released | GitHub |
---|---|---|---|---|
Haystack | deepset.ai | 2023 | ||
RAGflow | Infiniflow | 2024 | ||
txtai | Neuml | 2022 | ||
LLM App | Pathway | 2023 | ||
Cognita | Truefoundry | 2024 | ||
R2R | SciPhi AI | 2024 | ||
Raptor | Parth Sarthi | 2024 | ||
LightRAG | HKUDS | 2023 | ||
PIKE-RAG | Microsoft | 2024 | ||
KAG | OpenSPG | 2024 | ||
MemoRAG | qhjqhj00 | 2023 |
See RAG_Techniques if you get stuck (not always needed)
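The frameworks in the table above all build on the same basic step: retrieve the passages most relevant to a question, then place them in the prompt. The sketch below illustrates only that step with a bag-of-words cosine similarity from the standard library. It is not how any listed framework is implemented; the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the sample `DOCS`, and the prompt wording are all assumptions made for illustration. Real systems typically use dense embeddings and a vector index instead.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (hypothetical helper)."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the question in the retrieved passages."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Illustrative corpus only; the one-line descriptions are assumptions.
DOCS = [
    "vLLM is a framework for serving large language models.",
    "Kokoro-82M is an open-source text-to-speech model.",
    "Axolotl is a framework for fine-tuning language models.",
]

if __name__ == "__main__":
    print(build_prompt("Which framework is for fine-tuning?", DOCS))
```

The resulting prompt string would then be sent to whichever model or API you use; that call is left out because it depends on the provider.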
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
magic-resume | JOYCEQL | An AI-powered tool for generating resumes. | ||
VideoCaptioner | WEIFENG2333 | An AI tool for automatically generating video captions. | ||
DeepSeekAI | DeepLifeStudio | Browser extension for invoking the DeepSeek AI large model. | ||
logocreator | Nutlope | A tool for creating logos using AI. | ||
blinkshot | Nutlope | A real-time AI image generator. | ||
pollinations | pollinations | A tool for generating creative images and artwork using AI. | ||
PromptWizard | microsoft | A tool to generate, manage, and optimize prompts for AI models. | ||
Open-Interface | AmberSahdev | Control Any Computer Using LLMs. | ||
wut | shobrook | LLM for the terminal |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
transformerlab-app | transformerlab | An application for training and optimizing transformer models. | ||
fluxgym | cocktailpeanut | A simple web UI for training FLUX LoRA models with low VRAM requirements. | ||
AutoGPTQ | AutoGPTQ | An LLM quantization package based on the GPTQ algorithm. |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
WALDO | stephansturges | An AI model for visual reasoning and object detection. | ||
Janus | deepseek-ai | A multi-modal AI model for advanced data processing. | ||
ModernBERT | AnswerDotAI | A modernized version of BERT for natural language processing tasks. | ||
Magma | microsoft | A scalable AI model for large-scale data analysis. | ||
Cosmos-Nemotron | NVlabs | An AI model for advanced image and video processing. | ||
Paints-UNDO | lllyasviel | An interactive AI model for image generation and editing. |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
helicone | Helicone | A platform for monitoring and analyzing AI model performance. | ||
langwatch | langwatch | A tool for monitoring outputs and performance of language models. | ||
shortest | antiwork | An AI-powered framework for writing end-to-end tests in natural language. | ||
deepeval | confident-ai | A framework for deep evaluation of AI models. |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
gpustack | gpustack | A toolkit for managing GPU infrastructure for AI workloads. | ||
harbor | av | A repository for containerized AI infrastructure management. |
Publication Date | Title | Authors | Organization | Technique |
---|---|---|---|---|
January 28, 2022 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jason Wei, et al. | Google Research | CoT Prompting |
March 21, 2022 | Self-Consistency Improves Chain of Thought Reasoning in Language Models | Xuezhi Wang et al. | Google Research | CoT with Self-Consistency |
May 21, 2022 | Least-to-Most Prompting Enables Complex Reasoning in Large Language Models | Denny Zhou et al. | Google Research | Least-to-Most Prompting |
May 24, 2022 | Large Language Models are Zero-Shot Reasoners | Takeshi Kojima, et al. | The University of Tokyo, Google Research | Zero-shot-CoT |
October 6, 2022 | ReAct: Synergizing Reasoning and Acting in Language Models | Shunyu Yao et al. | Princeton University | ReAct |
April 11, 2023 | Teaching Large Language Models to Self-Debug | Xinyun Chen, et al. | Google DeepMind, UC Berkeley | Self-Debugging |
May 6, 2023 | Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | Lei Wang et al. | Singapore Management University and collaborators | Plan-and-Solve Prompting |
May 31, 2023 | Let's Verify Step by Step | Hunter Lightman, et al. | OpenAI | Verification for CoT |
October 3, 2023 | Large Language Models Cannot Self-Correct Reasoning Yet | Jie Huang, et al. | Google DeepMind, University of Illinois Urbana-Champaign | Self-Correction in LLMs |
November 2023 | Universal Self-Consistency for Large Language Model Generation | Xinyun Chen, Renat Aksitov, Uri Alon, Jie Ren, Kefan Xiao, Pengcheng Yin, Sushant Prakash, Charles Sutton, Xuezhi Wang, Denny Zhou | DeepMind | Universal Self-Consistency |
May 17, 2023 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | Shunyu Yao, et al. | Princeton University, DeepMind | Tree-of-Thought |
February 15, 2024 | Chain-of-Thought Reasoning Without Prompting | Xuezhi Wang, Denny Zhou | DeepMind | Chain-of-Thought Decoding |
March 21, 2024 | ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting | Xiaoxue Cheng et al. | Renmin University of China | CoTGenius |
June 2024 | Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models | Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman, Haohan Wang, Yu-Xiong Wang | University of Illinois Urbana-Champaign | Language Agent Tree Search (LATS) |
May 2024 | Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | Yuxi Xie, et al. | National University of Singapore, DeepMind | MCTS |
September 18, 2024 | To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning | Zayne Sprague, et al. | The University of Texas at Austin, Johns Hopkins University, Princeton University | Meta-analysis of CoT |
September 25, 2024 | Chain-of-Thoughtlessness? An Analysis of CoT in Planning | Kaya Stechly, et al. | Arizona State University | Analysis of CoT in Planning |
October 18, 2024 | Supervised Chain of Thought | Xiang Zhang, Dujian Ding | University of British Columbia | Supervised Chain of Thought |
October 24, 2024 | A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration | Yingqian Cui, et al. | Amazon, Michigan State University | Theoretical Analysis of CoT |
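To make one of the techniques in the table concrete, the sketch below shows the voting idea behind self-consistency: sample several chain-of-thought completions for the same question, extract each final answer, and keep the most frequent one. It is a minimal illustration, not the paper's code. The `sample` callable stands in for a real model call, the "The answer is X." output format is an assumption, and the canned completions are made up for the demo.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Pull the final number from text containing 'answer is X' (assumed format)."""
    match = re.search(r"answer is\s+(-?\d+(?:\.\d+)?)", completion, flags=re.IGNORECASE)
    return match.group(1) if match else None


def self_consistency(sample: Callable[[], str], n_samples: int = 5) -> Optional[str]:
    """Sample n reasoning paths and return the majority final answer.

    `sample` is a hypothetical stand-in for one stochastic model call
    (for example, a chat completion with temperature > 0).
    """
    answers = [extract_answer(sample()) for _ in range(n_samples)]
    answers = [a for a in answers if a is not None]  # drop unparseable paths
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Canned completions used in place of a model; purely illustrative.
    canned = iter([
        "16 - 3 - 4 = 9 eggs left, 9 * 2 = 18. The answer is 18.",
        "16 - 3 = 13, 13 * 2 = 26. The answer is 26.",
        "She sells 9 eggs at $2 each. The answer is 18.",
    ])
    print(self_consistency(lambda: next(canned), n_samples=3))
```

Voting over final answers only works when answers can be compared exactly, which is why the extraction step normalizes each completion to a short string first.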
Implementation | Link | Author | GitHub Stars | GitHub Followers |
---|---|---|---|---|
CoT | chain-of-thought-hub | Franx Yao | ||
CoT | optillm | Codelion | ||
CoT | auto-cot | Amazon Science | ||
CoT | g1 | BKlieger Groq | ||
Decoding CoT | optillm/cot_decoding.py | Codelion | ||
Tree of Thoughts | tree-of-thought-llm | Princeton NLP | ||
Tree of Thoughts | tree-of-thoughts | Kye Gomez | ||
Tree of Thoughts | saplings | Shobrook | ||
MCTS | optillm/mcts.py | Codelion | ||
Graph of Thoughts | graph-of-thoughts | SPCL | ||
Other | CPO | SAIL SG | ||
Other | Everything-of-Thoughts-XoT | Microsoft |
Model Name | Author | Size |
---|---|---|
CoT-T5-3B | KAIST AI | 3B |
CoT-T5-11B | KAIST AI | 11B |
Llama-3.2V-11B-cot | Xkev | 11B |
Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3 | Lyte | 8B |
Dataset Name | Author | Data Size | Likes |
---|---|---|---|
chain-of-thought-sharegpt | Isaiah Bjork | 7.14k rows | 8 |
CoT-Collection | KAIST AI | 1.84 million rows | 122 |
Reasoner-1o1-v0.3-HQ | Lyte | 370 rows | 7 |
OpenLongCoT-Pretrain | qq8933 | 103k rows | 86 |
Tool | Organization | Description | Open Source | GitHub |
---|---|---|---|---|
awesome-cursorrules | PatrickJS | A curated list of resources and guides on cursorrules. | ||
ai-engineering-hub | patchy631 | A hub of AI engineering learning resources, tutorials, and best practices. | ||
GenAI_Agents | NirDiamant | Resources and examples for building Generative AI Agents. | ||
learn-agentic-ai | panaversity | Learning materials for understanding and building agentic AI. | ||
awesome-generative-ai | steven2358 | A curated list of generative AI resources and projects. | ||
awesome-mcp-servers | punkpeye | A curated collection of awesome MCP servers resources. | ||
GenAI-Showcase | mongodb-developer | A showcase of innovative Generative AI projects. | ||
well-architected-iac-analyzer | aws-samples | A tool to analyze and ensure well-architected Infrastructure as Code practices. | ||
llama-cookbook | meta-llama | A collection of recipes and guides for working with LLaMA models. | ||
optillm | codelion | Resources for optimizing LLM usage and performance. | ||
cursor.directory | pontusab | A directory of tools and resources related to cursor-based workflows. | ||
GenAI_Agents | NirDiamant | A curated collection of generative AI agents and related tools. |
Alternative AI tools for Awesome-LLM-Resources-List
Similar Open Source Tools

cgft-llm
The cgft-llm repository is a collection of video tutorials and documentation for implementing large models. It provides guidance on topics such as fine-tuning llama3 with llama-factory, lightweight deployment and quantization using llama.cpp, speech generation with ChatTTS, introduction to Ollama for large model deployment, deployment tools for vllm and paged attention, and implementing RAG with llama-index. Users can find detailed code documentation and video tutorials for each project in the repository.

dora
Dataflow-oriented robotic application (dora-rs) is a framework that makes creating robotic applications fast and simple. Building a robotic application can be summed up as bringing together hardware, algorithms, and AI models and making them communicate with each other. At dora-rs, we try to make integration of hardware and software easy by supporting Python, C, C++, and ROS2, and to keep communication latency low by using zero-copy Arrow messages. dora-rs is still experimental and you might experience bugs, but we're working very hard to make it as stable as possible.

Awesome-LLM-Tabular
This repository is a curated list of research papers that explore the integration of Large Language Model (LLM) technology with tabular data. It aims to provide a comprehensive resource for researchers and practitioners interested in this emerging field. The repository includes papers on a wide range of topics, including table-to-text generation, table question answering, and tabular data classification. It also includes a section on related datasets and resources.

fastapi
智元 Fast API is a one-stop API management system that unifies various LLM APIs in format, standards, and management, aiming for the best possible functionality, performance, and user experience. It supports models from providers such as OpenAI, Azure, Baidu, Keda Xunfei, Alibaba Cloud, Zhifu AI, Google, DeepSeek, 360 Brain, and Midjourney. The project provides user and admin portals for preview and supports cluster, multi-site, and cross-zone deployment. It also offers Docker deployment, a public API site for registration, and screenshots of the admin and user portals. The API interface is similar to OpenAI's, and the project is open source, with repositories for the API, web, admin, and SDK components on GitHub and Gitee.

PaddleNLP
PaddleNLP is an easy-to-use and high-performance NLP library. It aggregates high-quality pre-trained models in the industry and provides out-of-the-box development experience, covering a model library for multiple NLP scenarios with industry practice examples to meet developers' flexible customization needs.

EmoLLM
EmoLLM is a series of large language models for mental health counseling, instruction fine-tuned from `LLM`s, that cover the **understand users, support users, help users** counseling chain.

widgets
Widgets is an open-source front end for desktop components. The project is still under active development. The desktop client can be downloaded from either https://www.microsoft.com/store/productId/9NPR50GQ7T53 or https://widgetjs.cn. To run from source, clone the code, install the dependencies in the project directory with `pnpm install`, and start it with `pnpm serve`.

JiwuChat
JiwuChat is a lightweight multi-platform chat application built on Tauri2 and Nuxt3, with various real-time messaging features, AI group chat bots (such as 'iFlytek Spark', 'KimiAI' etc.), WebRTC audio-video calling, screen sharing, and AI shopping functions. It supports seamless cross-device communication, covering text, images, files, and voice messages, also supporting group chats and customizable settings. It provides light/dark mode for efficient social networking.

AstrBot
AstrBot is a powerful and versatile tool that leverages the capabilities of large language models (LLMs) like GPT-3, GPT-3.5, and GPT-4 to enhance communication and automate tasks. It seamlessly integrates with popular messaging platforms such as QQ, QQ Channel, and Telegram, enabling users to harness the power of AI within their daily conversations and workflows.

MedicalGPT
MedicalGPT trains a medical GPT model using a ChatGPT-style training pipeline, implementing pretraining, supervised fine-tuning, RLHF (reward modeling and reinforcement learning), and DPO (Direct Preference Optimization).

llm-book
The 'llm-book' repository is dedicated to the introduction of large-scale language models, focusing on natural language processing tasks. The code is designed to run on Google Colaboratory and utilizes datasets and models available on the Hugging Face Hub. Note that as of July 28, 2023, there are issues with the MARC-ja dataset links, but an alternative notebook using the WRIME Japanese sentiment analysis dataset has been added. The repository covers various chapters on topics such as Transformers, fine-tuning language models, entity recognition, summarization, document embedding, question answering, and more.

awesome-ai-painting
This repository, named 'awesome-ai-painting', is a comprehensive collection of resources related to AI painting. It is curated by a user named 秋风, who is an AI painting enthusiast with a background in the AIGC industry. The repository aims to help more people learn AI painting and also documents the user's goal of creating 100 AI products, with current progress at 4/100. The repository includes information on various AI painting products, tutorials, tools, and models, providing a valuable resource for individuals interested in AI painting and related technologies.
For similar tasks

ai-on-gke
This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE) and helps you run optimized AI/ML workloads with GKE's platform orchestration capabilities. A robust AI/ML platform considers the following layers: infrastructure orchestration that supports GPUs and TPUs for training and serving workloads at scale; flexible integration with distributed computing and data processing frameworks; and support for multiple teams on the same infrastructure to maximize resource utilization.

ray
Ray is a unified framework for scaling AI and Python applications. It consists of a core distributed runtime and a set of AI libraries for simplifying ML compute, including Data, Train, Tune, RLlib, and Serve. Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing ecosystem of community integrations. With Ray, you can seamlessly scale the same code from a laptop to a cluster, making it easy to meet the compute-intensive demands of modern ML workloads.

labelbox-python
Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.

djl
Deep Java Library (DJL) is an open-source, high-level, engine-agnostic Java framework for deep learning. It is designed to be easy to get started with and simple to use for Java developers. DJL provides a native Java development experience and allows users to integrate machine learning and deep learning models with their Java applications. The framework is deep learning engine agnostic, enabling users to switch engines at any point for optimal performance. DJL's ergonomic API interface guides users with best practices to accomplish deep learning tasks, such as running inference and training neural networks.

mlflow
MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud). MLflow's current components are:
* `MLflow Tracking

tt-metal
TT-NN is a python & C++ Neural Network OP library. It provides a low-level programming model, TT-Metalium, enabling kernel development for Tenstorrent hardware.

burn
Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models, from images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. Its key features: it is self-contained, with no need for a DBMS or cloud service; it exposes an OpenAPI interface that is easy to integrate with existing infrastructure (e.g., a Cloud IDE); and it supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.