
AgentsMeetRL
An Awesome List of Agentic Models Trained with Reinforcement Learning
Stars: 461

AgentsMeetRL is an awesome list that summarizes open-source repositories for training LLM Agents using reinforcement learning. The criteria for identifying an agent project are multi-turn interactions or tool use. The project is based on code analysis from open-source repositories using GitHub Copilot Agent. The focus is on reinforcement learning frameworks, RL algorithms, rewards, and environments that projects depend on, for everyone's reference on technical choices.
README:
AgentsMeetRL is an awesome list that summarizes open-source repositories for training LLM Agents using reinforcement learning:
- 🤖 A project counts as an agent project if it has at least one of the following: multi-turn interactions or tool use (so Tool-Integrated Reasoning, or TIR, projects are included in this repo).
- ⚠️ This project is based on code analysis of open-source repositories using GitHub Copilot Agent, which may contain unfaithful cases. Although the entries are manually reviewed, omissions may remain. If you find any errors, please let us know through issues or PRs; we warmly welcome them!
- 🚀 We focus in particular on the reinforcement learning frameworks, RL algorithms, rewards, and environments that each project depends on, as a reference for how these open-source projects make their technical choices. See "Click to view technical details" under each table.
- 🤗 Feel free to submit your own projects anytime - we welcome contributions!
Enumerations:
- Reward Type:
  - External Verifier: e.g., a compiler or math solver
  - Rule-Based: e.g., a LaTeX parser with exact match scoring
  - Model-Based: e.g., a trained verifier LLM or reward LLM
  - Custom
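To make the reward-type labels used in the tables concrete, here is a minimal Python sketch of one reward function per category. Everything in it is illustrative and not taken from any listed project: the function names, the `\boxed{...}` answer convention, and the use of Python's built-in `compile` as a stand-in for an external compiler are all assumptions.

```python
import re
from typing import Callable


def rule_based_reward(response: str, reference: str) -> float:
    """Rule-based: exact match on the last \\boxed{...} answer.

    The \\boxed{} convention is an assumption for this sketch; a real
    project would likely use a fuller LaTeX parser.
    """
    answers = re.findall(r"\\boxed\{([^{}]*)\}", response)
    if not answers:
        return 0.0
    return 1.0 if answers[-1].strip() == reference.strip() else 0.0


def external_verifier_reward(code: str) -> float:
    """External verifier: reward code that an outside checker accepts.

    Python's built-in compile() stands in for the compiler here, so this
    only checks syntax and never executes the candidate code.
    """
    try:
        compile(code, "<candidate>", "exec")
    except (SyntaxError, ValueError):
        return 0.0
    return 1.0


def model_based_reward(response: str, scorer: Callable[[str], float]) -> float:
    """Model-based: delegate to a scorer and clamp its score to [0, 1].

    `scorer` is a hypothetical callable wrapping a verifier or reward LLM.
    """
    return max(0.0, min(1.0, scorer(response)))
```

A "Custom" reward would typically combine these, for example by averaging `rule_based_reward` and `model_based_reward` for the same response. Entries such as "Model/Rule" or "All" in the tables indicate that a project supports several of these categories.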
Github Repo | Date | Org | Paper Link |
---|---|---|---|
siiRL | 2025.7 | Shanghai Innovation Institute | Paper | |
slime | 2025.6 | Tsinghua University (THUDM) | blog | |
agent-lightning | 2025.6 | Microsoft Research | Paper | |
AReaL | 2025.6 | AntGroup/Tsinghua | Paper | |
ROLL | 2025.6 | Alibaba | Paper | |
MARTI | 2025.5 | Tsinghua | -- | |
RL2 | 2025.4 | Accio | -- | |
verifiers | 2025.3 | Individual | -- | |
oat | 2024.11 | NUS/Sea AI | Paper | |
veRL | 2024.10 | ByteDance | Paper | |
OpenRLHF | 2023.7 | OpenRLHF | Paper | |
trl | 2019.11 | HuggingFace | -- |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
siiRL | PPO/GRPO/CPGD/MARFT | Multi | Both | Multi | LLM/VLM/LLM-MAS PostTraining | Model/Rule | Planned |
slime | GRPO/GSPO/REINFORCE++ | Single | Both | Both | Math/Code | External Verifier | Yes |
agent-lightning | PPO/Custom/Automatic Prompt Optimization | Multi | Outcome | Multi | Calculator/SQL | Model/External/Rule | Yes |
AReaL | PPO | Both | Outcome | Both | Math/Code | External | Yes |
ROLL | PPO/GRPO/Reinforce++/TOPR/RAFT++ | Multi | Both | Multi | Math/QA/Code/Alignment | All | Yes |
MARTI | PPO/GRPO/REINFORCE++/TTRL | Multi | Both | Multi | Math | All | Yes |
RL2 | Dr. GRPO/PPO/DPO | Single | Both | Both | QA/Dialogue | Rule/Model/External | Yes |
verifiers | GRPO | Multi | Outcome | Both | Reasoning/Math/Code | All | Code |
oat | PPO/GRPO | Single | Outcome | Multi | Math/Alignment | External | No |
veRL | PPO/GRPO | Single | Outcome | Both | Math/QA/Reasoning/Search | All | Yes |
OpenRLHF | PPO/REINFORCE++/GRPO/DPO/IPO/KTO/RLOO | Multi | Both | Both | Dialogue/Chat/Completion | Rule/Model/External | Yes |
trl | PPO/GRPO/DPO | Single | Both | Single | QA | Custom | No |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
AgentGym-RL | 2025.9 | Fudan University | Paper | veRL | |
Agent_Foundation_Models | 2025.8 | OPPO Personal AI Lab | Paper | veRL | |
SPA-RL-Agent | 2025.5 | PolyU | Paper | TRL | |
verl-agent | 2025.5 | NTU/Skywork | Paper | veRL |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
AgentGym-RL | PPO/GRPO/RLOO/REINFORCE++ | Single | Outcome | Multi | Web/Search/Game/Embodied/Science | Rule/Model/External | Yes (Web, Search, Env APIs) |
Agent_Foundation_Models | DAPO/PPO | Single | Outcome | Single | QA/Code/Math | Rule/External | Yes |
SPA-RL-Agent | PPO | Single | Process | Multi | Navigation/Web/TextGame | Model | No |
verl-agent | PPO/GRPO/GiGPO/DAPO/RLOO/REINFORCE++ | Multi | Both | Multi | Phone Use/Math/Code/Web/TextGame | All | Yes |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
ASearcher | 2025.8 | Ant Research RL Lab, Tsinghua University & UW | Paper | RealHF/AReaL | |
Kimi-Researcher | 2025.6 | Moonshot AI | blog | Custom | |
TTI | 2025.6 | CMU | Paper | Custom | |
R-Search | 2025.6 | Individual | -- | veRL | |
R1-Searcher-plus | 2025.5 | RUC | Paper | Custom | |
StepSearch | 2025.5 | SenseTime | Paper | veRL | |
AutoRefine | 2025.5 | USTC | Paper | veRL | |
ZeroSearch | 2025.5 | Alibaba | Paper | veRL | |
WebThinker | 2025.4 | RUC | Paper | Custom | |
DeepResearcher | 2025.4 | SJTU | Paper | veRL | |
Search-R1 | 2025.3 | UIUC/Google | paper1, paper2 | veRL | |
R1-Searcher | 2025.3 | RUC | Paper | OpenRLHF | |
C-3PO | 2025.2 | Alibaba | Paper | OpenRLHF | |
WebAgent | 2025.1 | Alibaba | paper1, paper2 | LLaMA-Factory |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
ASearcher | PPO/GRPO + Decoupled PPO | Single | Outcome | Multi | Math/Code/SearchQA | External/Rule | Yes |
Kimi-Researcher | REINFORCE | Single | Outcome | Multi | Research | Outcome | Search, Browse, Coding |
TTI | REINFORCE/BC | Single | Outcome | Multi | Web | External | Web Browsing |
R-Search | PPO/GRPO | Single | Both | Multi | QA/Search | All | Yes |
R1-Searcher-plus | Custom | Single | Outcome | Multi | Search | Model | Search |
StepSearch | PPO | Single | Process | Multi | QA | Model | Search |
AutoRefine | PPO/GRPO | Multi | Both | Multi | RAG QA | Rule | Search |
ZeroSearch | PPO/GRPO/REINFORCE | Single | Outcome | Multi | QA/Search | Rule | Yes |
WebThinker | DPO | Single | Outcome | Multi | Reasoning/QA/Research | Model/External | Web Browsing |
DeepResearcher | PPO/GRPO | Multi | Outcome | Multi | Research | All | Yes |
Search-R1 | PPO/GRPO | Single | Outcome | Multi | Search | All | Search |
R1-Searcher | PPO/DPO | Single | Both | Multi | Search | All | Yes |
C-3PO | PPO | Multi | Outcome | Multi | Search | Model | Yes |
WebAgent | DAPO | Multi | Process | Multi | Web | Model | Yes |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
Grounding-R1 | 2025.6 | Salesforce | blog | trl | |
AgentCPM-GUI | 2025.6 | OpenBMB/Tsinghua/RUC | Paper | Huggingface | |
ARPO | 2025.5 | CUHK/HKUST | Paper | veRL | |
GUI-G1 | 2025.5 | RUC | Paper | TRL | |
GUI-R1 | 2025.4 | CAS/NUS | Paper | veRL | |
UI-R1 | 2025.3 | vivo/CUHK | Paper | TRL |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
Grounding-R1 | GRPO | Single | Outcome | Multi | GUI Grounding | Model | Yes |
AgentCPM-GUI | GRPO | Single | Outcome | Multi | Mobile GUI | Model | Yes |
ARPO | GRPO | Single | Outcome | Multi | GUI | External | Computer Use |
GUI-G1 | GRPO | Single | Outcome | Single | GUI | Rule/External | No |
GUI-R1 | GRPO | Single | Outcome | Multi | GUI | Rule | No |
UI-R1 | GRPO | Single | Process | Both | GUI | Rule | Computer/Phone Use |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
MiroRL | 2025.8 | MiroMindAI | HF Repo | veRL | |
verl-tool | 2025.6 | TIGER-Lab | X | veRL | |
Multi-Turn-RL-Agent | 2025.5 | University of Minnesota | Paper | Custom | |
Tool-N1 | 2025.5 | NVIDIA | Paper | veRL | |
Tool-Star | 2025.5 | RUC | Paper | LLaMA-Factory | |
RL-Factory | 2025.5 | Simple-Efficient | model | veRL | |
ReTool | 2025.4 | ByteDance | Paper | veRL | |
AWorld | 2025.3 | Ant Group (inclusionAI) | Paper | veRL | |
Agent-R1 | 2025.3 | USTC | -- | veRL | |
ReCall | 2025.3 | BaiChuan | Paper | veRL |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
MiroRL | GRPO | Single | Both | Multi | Reasoning/Planning/ToolUse | Rule-based | MCP |
verl-tool | PPO/GRPO | Single | Both | Both | Math/Code | Rule/External | Yes |
Multi-Turn-RL-Agent | GRPO | Single | Both | Multi | Tool-use/Math | Rule/External | Yes |
Tool-N1 | PPO | Single | Outcome | Multi | Math/Dialogue | All | Yes |
Tool-Star | PPO/DPO/ORPO/SimPO/KTO | Single | Outcome | Multi | Multi-modal/Tool Use/Dialogue | Model/External | Yes |
RL-Factory | GRPO | Multi | Both | Multi | Tool-use/NL2SQL | All | MCP |
ReTool | PPO | Single | Outcome | Multi | Math | External | Code |
AWorld | GRPO | Both | Outcome | Multi | Search/Web/Code | External/Rule | Yes |
Agent-R1 | PPO/GRPO | Single | Both | Multi | Tool-use/QA | Model | Yes |
ReCall | PPO/GRPO/RLOO/REINFORCE++/ReMax | Single | Outcome | Multi | Tool-use/Math/QA | All | Yes |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
ARIA | 2025.6 | Fudan University | Paper | Custom | |
AMPO | 2025.5 | Tongyi Lab, Alibaba | Paper | veRL | |
Trinity-RFT | 2025.5 | Alibaba | Paper | veRL | |
VAGEN | 2025.3 | RAGEN-AI | Paper | veRL | |
ART | 2025.3 | OpenPipe | Paper | TRL | |
OpenManus-RL | 2025.3 | UIUC/MetaGPT | -- | Custom | |
RAGEN | 2025.1 | RAGEN-AI | Paper | veRL |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
ARIA | REINFORCE | Both | Process | Multi | Negotiation/Bargaining | Other | No |
AMPO | BC/AMPO(GRPO improvement) | Multi | Outcome | Multi | Social Interaction | Model-based | No |
Trinity-RFT | PPO/GRPO | Single | Outcome | Both | Math/TextGame/Web | All | Yes |
VAGEN | PPO/GRPO | Single | Both | Multi | TextGame/Navigation | All | Yes |
ART | GRPO | Multi | Both | Multi | TextGame | All | Yes |
OpenManus-RL | PPO/DPO/GRPO | Multi | Outcome | Multi | TextGame | All | Yes |
RAGEN | PPO/GRPO | Single | Both | Multi | TextGame | All | Yes |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
MedAgentGym | 2025.6 | Emory/Georgia Tech | Paper | Huggingface | |
CURE | 2025.6 | University of Chicago, Princeton/ByteDance | Paper | Huggingface | |
MASLab | 2025.5 | MASWorks | Paper | Custom | |
Time-R1 | 2025.5 | UIUC | Paper | veRL | |
ML-Agent | 2025.5 | MASWorks | Paper | Custom | |
SkyRL | 2025.4 | NovaSky | -- | veRL | |
digitalhuman | 2025.4 | Tencent | Paper | veRL | |
sweet_rl | 2025.3 | Meta/UCB | Paper | OpenRLHF | |
rllm | 2025.1 | Berkeley Sky Computing Lab, BAIR / Together AI | Notion Blog | veRL | |
open-r1 | 2025.1 | HuggingFace | -- | TRL |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
MedAgentGym | SFT/DPO/PPO/GRPO | Single | Outcome | Multi | Medical/Code | External | Yes |
CURE | PPO | Single | Outcome | Single | Code | External | No |
MASLab | NO RL | Multi | Outcome | Multi | Code/Math/Reasoning | External | Yes |
Time-R1 | PPO/GRPO/DPO | Multi | Outcome | Multi | Temporal | All | Code |
ML-Agent | Custom | Single | Process | Multi | Code | All | Yes |
SkyRL | PPO/GRPO | Single | Outcome | Multi | Math/Code | All | Code |
digitalhuman | PPO/GRPO/ReMax/RLOO | Multi | Outcome | Multi | Empathy/Math/Code/MultimodalQA | Rule/Model/External | Yes |
sweet_rl | DPO | Multi | Process | Multi | Design/Code | Model | Web Browsing |
rllm | PPO/GRPO | Single | Outcome | Multi | Code Edit | External | Yes |
open-r1 | GRPO | Single | Outcome | Single | Math/Code | All | Yes |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
ARPO | 2025.7 | RUC, Kuaishou | Paper | veRL | |
terminal-bench-rl | 2025.7 | Individual (Danau5tin) | N/A | rLLM | |
MOTIF | 2025.6 | University of Maryland | Paper | trl | |
cmriat/l0 | 2025.6 | CMRIAT | Paper | veRL | |
agent-distillation | 2025.5 | KAIST | Paper | Custom | |
VDeepEyes | 2025.5 | Xiaohongshu/XJTU | Paper | veRL | |
EasyR1 | 2025.4 | Individual | repo1/paper2 | veRL | |
AutoCoA | 2025.3 | BJTU | Paper | veRL | |
ToRL | 2025.3 | SJTU | Paper | veRL | |
ReMA | 2025.3 | SJTU, UCL | Paper | veRL | |
Agentic-Reasoning | 2025.2 | Oxford | Paper | Custom | |
SimpleTIR | 2025.2 | NTU, Bytedance | Notion Blog | veRL | |
openrlhf_async_pipline | 2024.5 | OpenRLHF | Paper | OpenRLHF |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
ARPO | GRPO | Single | Outcome | Multi | Math/Coding | Model/Rule | Yes |
terminal-bench-rl | GRPO | Single | Outcome | Multi | Coding/Terminal | Model+External Verifier | Yes |
MOTIF | GRPO | Single | Outcome | Multi | QA | Rule | No |
cmriat/l0 | PPO | Multi | Process | Multi | QA | All | Yes |
agent-distillation | PPO | Single | Process | Multi | QA/Math | External | Yes |
VDeepEyes | PPO/GRPO | Multi | Process | Multi | VQA | All | Yes |
EasyR1 | GRPO | Single | Process | Multi | Vision-Language | Model | Yes |
AutoCoA | GRPO | Multi | Outcome | Multi | Reasoning/Math/QA | All | Yes |
ToRL | GRPO | Single | Outcome | Single | Math | Rule/External | Yes |
ReMA | PPO | Multi | Outcome | Multi | Math | Rule | No |
Agentic-Reasoning | Custom | Single | Process | Multi | QA/Math | External | Web Browsing |
SimpleTIR | PPO/GRPO (with extensions) | Single | Outcome | Multi | Math, Coding | All | Yes |
openrlhf_async_pipline | PPO/REINFORCE++/DPO/RLOO | Single | Outcome | Multi | Dialogue/Reasoning/QA | All | No |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
MEM1 | 2025.7 | MIT | Paper | veRL (based on Search-R1) | |
MemAgent | 2025.6 | Bytedance, Tsinghua-SIA | Paper | veRL |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
MEM1 | PPO/GRPO | Single | Outcome | Multi | WebShop/GSM8K/QA | Rule/Model | Yes |
MemAgent | PPO, GRPO, DPO | Multi | Outcome | Multi | Long-context QA | Rule/Model/External | Yes |
Github Repo | Date | Org | Paper Link | RL Framework |
---|---|---|---|---|
MMedAgent-RL | 2025.8 | Unknown | paper | Unknown | |
DoctorAgent-RL | 2025.5 | UCAS/CAS/USTC | Paper | RAGEN | |
Biomni | 2025.3 | Stanford University (SNAP) | Paper | Custom |
📋 Click to view technical details
Github Repo | RL Algorithm | Single/Multi Agent | Outcome/Process Reward | Single/Multi Turn | Task | Reward Type | Tool usage |
---|---|---|---|---|---|---|---|
MMedAgent-RL | Unknown | Multi | Unknown | Unknown | Unknown | Unknown | Unknown |
DoctorAgent-RL | GRPO | Multi | Both | Multi | Consultation/Diagnosis | Model/Rule | No |
Biomni | TBD | Single | TBD | Single | scRNAseq/CRISPR/ADMET/Knowledge | TBD | Yes |
Github Repo | Date | Org | Task |
---|---|---|---|
Mind2Web-2 | 2025.6 | Ohio State University | Web | |
gem | 2025.5 | Sea AI Lab | Math/Code/Game/QA | |
MLE-Dojo | 2025.5 | GIT, Stanford | MLE | |
atropos | 2025.4 | Nous Research | Game/Code/Tool | |
InternBootcamp | 2025.4 | InternBootcamp | Coding/QA/Game | |
loong | 2025.3 | CAMEL-AI.org | RLVR | |
reasoning-gym | 2025.1 | open-thought | Math/Game | |
llmgym | 2025.1 | tensorzero | TextGame/Tool | |
debug-gym | 2024.11 | Microsoft Research | Debugging/Game/Code | |
gym-llm | 2024.8 | Rodrigo Sánchez Molina | Control/Game | |
AgentGym | 2024.6 | Fudan | Web/Game | |
tau-bench | 2024.6 | Sierra | Tool | |
appworld | 2024.6 | Stony Brook University | Phone Use | |
android_world | 2024.5 | Google Research | Phone Use | |
TheAgentCompany | 2024.3 | CMU, Duke | Coding | |
LlamaGym | 2024.3 | Rohan Pandey | Game | |
visualwebarena | 2024.1 | CMU | Web | |
LMRL-Gym | 2023.12 | UC Berkeley | Game | |
OSWorld | 2023.10 | HKU, CMU, Salesforce, Waterloo | Computer Use | |
webarena | 2023.7 | CMU | Web | |
AgentBench | 2023.7 | Tsinghua University | Game/Web/QA/Tool | |
WebShop | 2022.7 | Princeton-NLP | Web | |
ScienceWorld | 2022.3 | AllenAI | TextGame/ScienceQA | |
alfworld | 2020.10 | Microsoft, CMU, UW | Embodied | |
factorio-learning-environment | 2021.6 | JackHopkins | Game | |
jericho | 2018.10 | Microsoft, GIT | TextGame | |
TextWorld | 2018.6 | Microsoft Research | TextGame |
- JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning
- Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning
- Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
- Acting Less is Reasoning More! Teaching Model to Act Efficiently
- Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
- ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
- Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
- MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for Agentic Tool Use
- Understanding Tool-Integrated Reasoning
- Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
- Encouraging Good Processes Without the Need for Good Answers: Reinforcement Learning for LLM Agent Planning
- SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
If you find this repository useful, please consider citing it:
@misc{agentsMeetRL,
title={When LLM Agents Meet Reinforcement Learning: A Comprehensive Survey},
author={AgentsMeetRL Contributors},
year={2025},
url={https://github.com/thinkwee/agentsMeetRL}
}
Made with ❤️ by the AgentsMeetRL community