Awesome-Agent-Papers

[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges

Stars: 98

🤖 Comprehensive LLM Agent Research Collection

[Figure: LLM Agent Research Overview]

🌟 Overview

This repository contains a comprehensive collection of research papers on Large Language Model (LLM) agents. We organize papers across key categories including agent construction, collaboration mechanisms, evolution, tools, security, benchmarks, and applications.

Our taxonomy provides a structured framework for understanding the rapidly evolving field of LLM agents, from architectural foundations to practical implementations. The repository bridges fragmented research threads by highlighting connections between agent design principles and emergent behaviors.

📄 Read our survey paper (arXiv:2503.21460)

📊 Statistics & Trends

Our survey covers the rapidly evolving field of LLM agents, where research publications have increased significantly since 2023.

[Figures: Research Paper Titles Word Cloud; Distribution of Surveyed Papers]

🔍 Key Categories

  • Agent Construction: Methodologies and architectures for building LLM agents
  • Agent Collaboration: Frameworks for multi-agent interaction and cooperation
  • Agent Evolution: Self-improvement and learning capabilities of agents
  • Tools: Integration of external tools and APIs with LLM agents
  • Security: Security concerns and protections for LLM agent systems
  • Benchmarks: Evaluation frameworks and datasets for testing agent capabilities
  • Applications: Real-world implementations and use cases

📚 Resource List

Title Category Year Link
Adaptive Collaboration Strategy for LLMs in Medical Decision Making Agent Collaboration 2024 link
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs Agent Collaboration 2024 link
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework Agent Collaboration 2024 link
Debating with More Persuasive LLMs Leads to More Truthful Answers Agent Collaboration 2024 link
Roco: Dialectic multi-robot collaboration with large language models Agent Collaboration 2024 link
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning Agent Collaboration 2024 link
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding Agent Collaboration 2024 link
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate Agent Collaboration 2024 link
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors Agent Collaboration 2024 link
ChatDev: Communicative Agents for Software Development Agent Collaboration 2024 link
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate Agent Collaboration 2024 link
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration Agent Collaboration 2024 link
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Agent Collaboration 2023 link
Improving Factuality and Reasoning in Language Models through Multiagent Debate Agent Collaboration 2023 link
Autonomous chemical research with large language models Agent Collaboration 2023 link
Planning with Multi-Constraints via Collaborative Language Agents Agent Construction 2025 link
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making Agent Construction 2025 link
AutoAgents: A Framework for Automatic Agent Generation Agent Construction 2024 link
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework Agent Construction 2024 link
Cognitive Architectures for Language Agents Agent Construction 2024 link
Executable Code Actions Elicit Better LLM Agents Agent Construction 2024 link
ChatDev: Communicative Agents for Software Development Agent Construction 2024 link
Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents Agent Construction 2024 link
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration Agent Construction 2024 link
More Agents Is All You Need Agent Construction 2024 link
Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents Agent Construction 2024 link
Empowering biomedical discovery with AI agents Agent Construction 2024 link
SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models Agent Construction 2024 link
Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions Agent Construction 2024 link
Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning Agent Construction 2024 link
PlanCritic: Formal Planning with Human Feedback Agent Construction 2024 link
Enhancing Robot Task Planning: Integrating Environmental Information and Feedback Insights through Large Language Models Agent Construction 2024 link
Devil's Advocate: Anticipatory Reflection for LLM Agents Agent Construction 2024 link
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios Agent Construction 2024 link
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society Agent Construction 2023 link
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Agent Construction 2023 link
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation Agent Construction 2023 link
War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars Agent Construction 2023 link
Describe, Explain, Plan and Select: Interactive Planning with LLMs Enables Open-World Multi-Task Agents Agent Construction 2023 link
TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage Agent Construction 2023 link
Evolutionary optimization of model merging recipes Agent Evolution 2025 link
CREAM: Consistency Regularized Self-Rewarding Language Models Agent Evolution 2025 link
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents Agent Evolution 2025 link
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation Agent Evolution 2024 link
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization Agent Evolution 2024 link
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning Agent Evolution 2024 link
A Survey on Self-Evolution of Large Language Models Agent Evolution 2024 link
LLM-Evolve: Evaluation for LLM’s Evolving Capability on Benchmarks Agent Evolution 2024 link
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing Agent Evolution 2024 link
Iterative Translation Refinement with Large Language Models Agent Evolution 2024 link
Agent Alignment in Evolving Social Norms Agent Evolution 2024 link
Mitigating the Alignment Tax of RLHF Agent Evolution 2024 link
Self-Rewarding Language Models Agent Evolution 2024 link
V-STaR: Training Verifiers for Self-Taught Reasoners Agent Evolution 2024 link
RLCD: Reinforcement learning from contrastive distillation for LM alignment Agent Evolution 2024 link
Language Model Self-Improvement by Reinforcement Learning Contemplation Agent Evolution 2024 link
ProAgent: Building Proactive Cooperative Agents with Large Language Models Agent Evolution 2024 link
Agent Planning with World Knowledge Model Agent Evolution 2024 link
Refining Guideline Knowledge for Agent Planning Using Textgrad Agent Evolution 2024 link
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate Agent Evolution 2024 link
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error Agent Evolution 2024 link
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback Agent Evolution 2023 link
SELF-REFINE: Iterative Refinement with Self-Feedback Agent Evolution 2023 link
Self-Evolution Learning for Discriminative Language Model Pretraining Agent Evolution 2023 link
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning Agent Evolution 2023 link
SELFEVOLVE: A Code Evolution Framework via Large Language Models Agent Evolution 2023 link
SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions Agent Evolution 2023 link
Large Language Models are Better Reasoners with Self-Verification Agent Evolution 2023 link
CodeT: Code Generation with Generated Tests Agent Evolution 2023 link
Evolving Diverse Red-team Language Models in Multi-round Multi-agent Games Agent Evolution 2023 link
Improving Factuality and Reasoning in Language Models through Multiagent Debate Agent Evolution 2023 link
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society Agent Evolution 2023 link
STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning Agent Evolution 2022 link
An active inference strategy for prompting reliable responses from large language models in medical practice Applications 2025 link
An evaluation framework for clinical use of large language models in patient interaction tasks Applications 2025 link
Large Language Models lack essential metacognition for reliable medical reasoning Applications 2025 link
Balancing autonomy and expertise in autonomous synthesis laboratories Applications 2025 link
Motif: Intrinsic Motivation from Artificial Intelligence Feedback Applications 2024 link
Baba Is AI: Break the Rules to Beat the Benchmark Applications 2024 link
Large language model-empowered agents for simulating macroeconomic activities Applications 2024 link
CompeteAI: Understanding the Competition Dynamics in Large Language Model-based Agents Applications 2024 link
Understanding the benefits and challenges of using large language model-based conversational agents for mental well-being support Applications 2024 link
Exploring Collaboration Mechanisms for LLM Agents Applications 2024 link
Simulating Human Society with Large Language Model Agents: City, Social Media, and Economic System Applications 2024 link
Can large language models transform computational social science? Applications 2024 link
AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems Applications 2024 link
On Generative Agents in Recommendation Applications 2024 link
ChatDev: Communicative Agents for Software Development Applications 2024 link
CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Applications 2024 link
SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning Applications 2024 link
Medical large language models are susceptible to targeted misinformation attacks Applications 2024 link
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents Applications 2023 link
Language Models Meet World Models: Embodied Experiences Enhance Language Models Applications 2023 link
ChessGPT: Bridging Policy Learning and Language Modeling Applications 2023 link
Mindagent: Emergent gaming interaction Applications 2023 link
Exploring large language models for communication games: An empirical study on Werewolf Applications 2023 link
Language as reality: a co-creative storytelling game experience in 1001 nights using generative AI Applications 2023 link
TradingGPT: Multi-Agent System with Layered Memory and Distinct Characters for Enhanced Financial Trading Performance Applications 2023 link
Using large language models to simulate multiple humans and replicate human subject studies Applications 2023 link
Generative Agents: Interactive Simulacra of Human Behavior Applications 2023 link
Self-collaboration Code Generation via ChatGPT Applications 2023 link
Language models can solve computer tasks Applications 2023 link
ChemCrow: Augmenting large-language models with chemistry tools Applications 2023 link
AlphaFlow: autonomous discovery and optimization of multi-step chemistry using a self-driven fluidic lab guided by reinforcement learning Applications 2023 link
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents Applications 2022 link
Stress-testing the resilience of the Austrian healthcare system using agent-based simulation Applications 2022 link
AgentHarm: Benchmarking Robustness of LLM Agents on Harmful Tasks Datasets & Benchmarks 2025 link
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator Datasets & Benchmarks 2025 link
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation Datasets & Benchmarks 2025 link
DCA-Bench: A Benchmark for Dataset Curation Agents Datasets & Benchmarks 2025 link
MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents Datasets & Benchmarks 2025 link
MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering Datasets & Benchmarks 2025 link
EgoLife: Towards Egocentric Life Assistant Datasets & Benchmarks 2025 link
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? Datasets & Benchmarks 2025 link
AgentBench: Evaluating LLMs as Agents Datasets & Benchmarks 2024 link
AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents Datasets & Benchmarks 2024 link
BENCHAGENTS: Automated Benchmark Creation with Agent Interaction Datasets & Benchmarks 2024 link
Benchmarking Data Science Agents Datasets & Benchmarks 2024 link
Benchmarking Large Language Models as AI Research Agents Datasets & Benchmarks 2024 link
Benchmarking Large Language Models for Multi-agent Systems: A Comparative Analysis of AutoGen, CrewAI, and TaskWeaver Datasets & Benchmarks 2024 link
BLADE: Benchmarking Language Model Agents Datasets & Benchmarks 2024 link
CRAB: Cross-platform agent benchmark for multi-modal embodied language model agents Datasets & Benchmarks 2024 link
CToolEval: A Chinese Benchmark for LLM-Powered Agent Evaluation in Real-World API Interactions Datasets & Benchmarks 2024 link
DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models Datasets & Benchmarks 2024 link
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making Datasets & Benchmarks 2024 link
GTA: A Benchmark for General Tool Agents Datasets & Benchmarks 2024 link
LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs Datasets & Benchmarks 2024 link
ML Research Benchmark Datasets & Benchmarks 2024 link
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Datasets & Benchmarks 2024 link
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web Datasets & Benchmarks 2024 link
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Datasets & Benchmarks 2024 link
Revisiting Benchmark and Assessment: An Agent-based Exploratory Dynamic Evaluation Framework for LLMs Datasets & Benchmarks 2024 link
Seal-Tools: Self-instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark Datasets & Benchmarks 2024 link
Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents Datasets & Benchmarks 2024 link
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Datasets & Benchmarks 2024 link
Tur[k]ingBench: A Challenge Benchmark for Web Agents Datasets & Benchmarks 2024 link
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Datasets & Benchmarks 2024 link
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories Datasets & Benchmarks 2024 link
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning Datasets & Benchmarks 2024 link
AgentTuning: Enabling Generalized Agent Abilities for LLMs Datasets & Benchmarks 2024 link
Executable Code Actions Elicit Better LLM Agents Datasets & Benchmarks 2024 link
FireAct: Toward Language Agent Fine-tuning Datasets & Benchmarks 2023 link
Medical large language models are vulnerable to data-poisoning attacks Ethics 2025 link
Foundation Models and Fair Use Ethics 2024 link
Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model Ethics 2023 link
LLaMA: Open and Efficient Foundation Language Models Ethics 2023 link
Predictability and Surprise in Large Generative Models Ethics 2022 link
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 Ethics 2021 link
Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets Ethics 2021 link
GPT-3: Its Nature, Scope, Limits, and Consequences Ethics 2020 link
Energy and Policy Considerations for Modern Deep Learning Research Ethics 2020 link
Defending Against Neural Fake News Ethics 2019 link
RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage Security 2025 link
Red-Teaming LLM Multi-Agent Systems via Communication Attacks Security 2025 link
Unveiling Privacy Risks in LLM Agent Memory Security 2025 link
AEIA-MN: Evaluating the Robustness of Multimodal LLM-Powered Mobile Agents Against Active Environmental Injection Attacks Security 2025 link
Firewalls to Secure Dynamic LLM Agentic Networks Security 2025 link
AutoHijacker: Automatic Indirect Prompt Injection Against Black-box LLM Agents Security 2025 link
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways Security 2025 link
DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent Security 2025 link
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models Security 2025 link
G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems Security 2025 link
AgentHarm: Benchmarking Robustness of LLM Agents on Harmful Tasks Security 2025 link
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks Security 2025 link
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems Security 2025
LLM-based Multi-Agent Systems: Techniques and Business Perspectives Security 2024 link
BlockAgents: Towards Byzantine-Robust LLM-Based Multi-Agent Coordination via Blockchain Security 2024 link
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems Security 2024 link
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents Security 2024 link
AGENTPOISON: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Security 2024 link
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks Security 2024 link
Imprompter: Tricking LLM Agents into Improper Tool Use Security 2024 link
Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation Security 2024 link
Prompt Injection as a Defense Against LLM-driven Cyberattacks Security 2024 link
Evil Geniuses: Delving into the Safety of LLM-based Agents Security 2024 link
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Security 2024 link
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents Security 2024 link
CLAS 2024: The Competition for LLM and Agent Safety Security 2024 link
The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents Security 2024 link
WIPI: A New Web Threat for LLM-Driven Web Agents Security 2024 link
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast Security 2024 link
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models Security 2024 link
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety Security 2024 link
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In Security 2024 link
AGENT-SAFETYBENCH: Evaluating the Safety of LLM Agents Security 2024 link
INJECAGENT: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents Security 2024 link
TrustAgent: Towards Safe and Trustworthy LLM-based Agents Security 2024 link
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents Security 2024 link
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents Security 2024 link
NetSafe: Exploring the Topological Safety of Multi-agent Networks Security 2024 link
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents Security 2024 link
Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey Survey 2025 link
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks Survey 2025 link
Multi-Agent Collaboration Mechanisms: A Survey of LLMs Survey 2025 link
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways Survey 2025 link
Large Model Based Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends Survey 2024 link
Agent AI: Surveying the Horizons of Multimodal Interaction Survey 2024 link
Large Language Model based Multi-Agents: A Survey of Progress and Challenges Survey 2024 link
Large Multimodal Agents: A Survey Survey 2024 link
Understanding the planning of LLM agents: A survey Survey 2024 link
Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective Survey 2024 link
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security Survey 2024 link
The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey Survey 2024 link
Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects Survey 2024 link
Position Paper: Agent AI Towards a Holistic Intelligence Survey 2024 link
LLM With Tools: A Survey Survey 2024 link
A Survey on the Memory Mechanism of Large Language Model based Agents Survey 2024 link
A Survey on Large Language Model-Based Game Agents Survey 2024 link
Large Language Models and Games: A Survey and Roadmap Survey 2024 link
Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents Survey 2024 link
Security of AI Agents Survey 2024 link
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies Survey 2024 link
Inferring the Goals of Communicating Agents from Actions and Instructions Survey 2024 link
Recent advancements in LLM Red-Teaming: Techniques, Defenses, and Ethical Considerations Survey 2024 link
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas: A Survey Survey 2024 link
A survey on large language model based autonomous agents Survey 2023 link
The rise and potential of large language model based agents: a survey Survey 2023 link
Large Language Model Alignment: A Survey Survey 2023 link
Ethical and social risks of harm from Language Models Survey 2021 link
On the Opportunities and Risks of Foundation Models Survey 2021 link
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims Survey 2020 link
Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products Survey 2019 link
ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models Tools 2025 link
Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool Retrieval Tools 2024 link
Chain of Tools: Large Language Model is an Automatic Multi-tool Learner Tools 2024 link
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction Tools 2024 link
ToolGen: Unified Tool Retrieval and Calling via Generation Tools 2024 link
ToolNet: Connecting Large Language Models with Massive Tools via Tool Graph Tools 2024 link
ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback Tools 2024 link
Making Language Models Better Tool Learners with Execution Feedback Tools 2024 link
Leveraging Large Language Models to Improve REST API Testing Tools 2024 link
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error Tools 2024 link
Skills-in-Context: Unlocking Compositionality in Large Language Models Tools 2024 link
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs Tools 2024 link
Gorilla: Large Language Model Connected with Massive APIs Tools 2024 link
Large Language Models as Tool Makers Tools 2024 link
Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents Tools 2023 link
Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations Tools 2023 link
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Tools 2023 link
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems Tools 2023 link
TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage Tools 2023 link
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction Tools 2023 link
API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs Tools 2023 link
ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models Tools 2023 link
ToolQA: A Dataset for LLM Question Answering with External Tools Tools 2023 link
On the Tool Manipulation Capability of Open-source Large Language Models Tools 2023 link
RestGPT: Connecting Large Language Models with Real-World RESTful APIs Tools 2023 link
Toolformer: Language Models Can Teach Themselves to Use Tools Tools 2023 link
WebCPM: Interactive Web Search for Chinese Long-form Question Answering Tools 2023 link
ToolCoder: Teach Code Generation Models to use API search tools Tools 2023 link
ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases Tools 2023 link
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings Tools 2023 link
MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting Tools 2023 link
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models Tools 2023 link
GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution Tools 2023 link
Dify Tools 2023 link
LangChain Tools 2023 link
WebGPT: Browser-assisted question-answering with human feedback Tools 2022 link
Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance Tools 2020 link
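
Each row above ends with a category label, a four-digit year, and usually the word "link", so the list can be tallied by category with a short script. The following is a minimal sketch, not part of the repository: `Paper`, `parse_row`, and `count_by_category` are hypothetical names, and it assumes rows are available as plain text in the layout shown above, with the category drawn from the labels used in this table.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Category labels as they appear in the table above.
CATEGORIES = [
    "Agent Collaboration", "Agent Construction", "Agent Evolution",
    "Applications", "Datasets & Benchmarks", "Ethics", "Security",
    "Survey", "Tools",
]

# Greedy title, then a known category, a four-digit year, and an optional "link".
# The greedy title means the LAST category label before the year is taken as
# the category, so titles that end in "Survey" or "Tools" still parse.
ROW = re.compile(
    r"^(?P<title>.+) (?P<category>"
    + "|".join(re.escape(c) for c in CATEGORIES)
    + r") (?P<year>\d{4})(?: link)?$"
)


class Paper(NamedTuple):
    title: str
    category: str
    year: int


def parse_row(line: str) -> Optional[Paper]:
    """Parse one flattened table row; return None if it does not match."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    return Paper(m["title"], m["category"], int(m["year"]))


def count_by_category(lines: Iterable[str]) -> Counter:
    """Count successfully parsed rows per category, skipping non-matching lines."""
    papers = (parse_row(line) for line in lines)
    return Counter(p.category for p in papers if p is not None)


if __name__ == "__main__":
    sample = [
        "LLM With Tools: A Survey Survey 2024 link",
        "Dify Tools 2023 link",
        "Toolformer: Language Models Can Teach Themselves to Use Tools Tools 2023 link",
    ]
    for category, n in count_by_category(sample).most_common():
        print(category, n)
```

Lines that do not fit the pattern, such as the header row, are skipped rather than guessed at, which keeps the counts conservative.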

🤝 Contributing

We welcome contributions to expand our collection.

We regularly update the repository to include new research.

📝 Citation

If you find our survey helpful, please consider citing our work:


@article{agentsurvey2025,
  title={Large Language Model Agent: A Survey on Methodology, Applications and Challenges},
  author={Junyu Luo and Weizhi Zhang and Ye Yuan and Yusheng Zhao and Junwei Yang and Yiyang Gu and Bohan Wu and Binqi Chen and Ziyue Qiao and Qingqing Long and Rongcheng Tu and Xiao Luo and Wei Ju and Zhiping Xiao and Yifan Wang and Meng Xiao and Chenwu Liu and Jingyang Yuan and Shichang Zhang and Yiqiao Jin and Fan Zhang and Xian Wu and Hanqing Zhao and Dacheng Tao and Philip S. Yu and Ming Zhang},
  journal={arXiv preprint arXiv:2503.21460},
  year={2025}
}


For questions or suggestions, please open an issue or contact the repository maintainers.
