LLM-Agent-Survey

Survey on LLM Agents (Published at CoLing 2025)

LLM-Agent-Survey is a comprehensive repository that provides a curated list of papers related to Large Language Model (LLM) agents. The repository categorizes papers based on LLM-Profiled Roles and includes high-quality publications from prestigious conferences and journals. It aims to offer a systematic understanding of LLM-based agents, covering topics such as tool use, planning, and feedback learning. The repository also includes unpublished papers with insightful analysis and novelty, marked for future updates. Users can explore a wide range of surveys, tool use cases, planning workflows, and benchmarks related to LLM agents.

README:

A Reading List for LLM-Agents (Updated: 5 Mar 2025)

Xinzhe Li

This Repository vs. Others

Our GitHub repository follows the selection criteria below:

Other GitHub repositories summarize related papers with less constrained selection criteria:

Other GitHub repositories summarize related papers focusing on specific perspectives:

🎁 Surveys

  • A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning, CoLing 2025 [paper]
  • A Survey on Large Language Model based Autonomous Agents, Frontiers of Computer Science 2024 [paper] | [code]
  • Augmented Language Models: a Survey, TMLR [paper]
  • Understanding the planning of LLM agents: A survey, arXiv [paper] 💡
  • The Rise and Potential of Large Language Model Based Agents: A Survey, arXiv [paper] 💡
  • A Survey on the Memory Mechanism of Large Language Model based Agents, arXiv [paper] 💡

🚀 Tool Use

  • ReAct: Synergizing Reasoning and Acting in Language Models, ICLR 2023 [paper]
  • Toolformer: Language Models Can Teach Themselves to Use Tools, NeurIPS 2023 [paper]
  • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023 [paper]
  • API-Bank: A Benchmark for Tool-Augmented LLMs, EMNLP 2023 [paper]
  • ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings, NeurIPS 2023 [paper]
  • MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting, ACL 2023 [paper]
  • ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models, EMNLP 2023 [paper]
  • ART: Automatic multi-step reasoning and tool-use for large language models, arXiv.2303.09014 [paper] 💡
  • TALM: Tool Augmented Language Models, arXiv.2205.12255 [paper] 💡
  • On the Tool Manipulation Capability of Open-source Large Language Models, arXiv.2305.16504 [paper] 💡
  • Large Language Models as Tool Makers, arXiv.2305.17126 [paper] 💡
  • GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution, arXiv.2307.08775 [paper] 💡
  • ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, arXiv.2307.16789 [paper] 💡
  • Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models, arXiv.2308.00675 [paper] 💡
  • MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback, arXiv.2309.10691 [paper] 💡
  • Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning, arXiv.2309.10814 [paper] 💡

🧠 Planning

Base Workflows

  • On the Planning Abilities of Large Language Models -- A Critical Investigation, NeurIPS 2023 [paper]

Search Workflows

Details are on the page (to be published).

  • Alphazero-like Tree-Search can guide large language model decoding and training, ICML 2024 [paper]
    • Search Algorithm: MCTS
  • Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models, ICML 2024 [paper]
    • Search Algorithm: MCTS
  • When is Tree Search Useful for LLM Planning? It Depends on the Discriminator, ACL 2024 [paper]
    • Search Algorithm: MCTS
  • Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation, ACL findings 2024 [paper]
    • Search Algorithm: MCTS
  • Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs, ACL 2024 [paper]
    • Search Algorithm: BFS/DFS
  • LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, EMNLP findings 2024 [paper] | [code]
    • Search Algorithm: A*
  • LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models, COLM 2024 [paper] | [code]
  • Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models, arXiv.2310.04406 [paper] 💡
  • Large Language Model Guided Tree-of-Thought, arXiv.2305.08291 [paper] 💡
  • Tree Search for Language Model Agents, Under Review [paper] 💡
    • Search Algorithm: Best-First Search
  • Q*: Improving multi-step reasoning for llms with deliberative planning, Under Review [paper] 💡
    • Search Algorithm: A*
  • Planning with Large Language Models for Code Generation, ICLR 2023 [paper]
    • Search Algorithm: MCTS
  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023 [paper]
    • Search Algorithm: BFS/DFS
  • LLM-MCTS: Large Language Models as Commonsense Knowledge for Large-Scale Task Planning, NeurIPS 2023 [paper] | [code]
    • Search Algorithm: MCTS
  • Self-Evaluation Guided Beam Search for Reasoning, NeurIPS 2023 [paper]
    • Search Algorithm: BFS/DFS
  • PathFinder: Guided Search over Multi-Step Reasoning Paths, NeurIPS 2023 R0-FoMo [paper]
    • Search Algorithm: Beam Search
  • Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts, EMNLP 2023 [paper]
  • RAP: Reasoning with Language Model is Planning with World Model, EMNLP 2023 [paper]
    • Search Algorithm: MCTS
  • Prompt-Based Monte-Carlo Tree Search for Goal-oriented Dialogue Policy Planning, EMNLP 2023 [paper]
    • Search Algorithm: MCTS
  • Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design, EMNLP findings 2023 [paper]
    • Search Algorithm: MCTS
  • Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents, arXiv.2408.07199 [paper] 💡
    • Search Algorithm: MCTS

Decomposition

  • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023 [paper] | [code]
  • Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, ICLR 2023 [paper]

PDDL+Local Search

  • Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning, NeurIPS 2023 [paper] | [code]
  • On the Planning Abilities of Large Language Models - A Critical Investigation, NeurIPS 2023 [paper] | [code]
  • PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change, NeurIPS 2023 [paper] | [code]

Others

  • LLM+P: Empowering Large Language Models with Optimal Planning Proficiency, arXiv.2304.11477 [paper] 💡

🔄 Feedback Learning

  • Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023 [paper]
  • Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023 [paper]
  • SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning, ICLR 2024 [paper] | [code]
  • Learning From Correctness Without Prompting Makes LLM Efficient Reasoner, COLM 2024 [paper]
  • Learning From Mistakes Makes LLM Better Reasoner, arXiv [paper] | [code] 💡
  • LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback, ACL 2024 [paper]

🧩 Composition

Planning + Feedback Learning

  • AdaPlanner: Adaptive Planning from Feedback with Language Models, NeurIPS 2023 [paper]
  • CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, ICLR 2024 [paper]
  • ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning, arXiv.2308.13724 [paper] 💡

Planning + Tool Use

  • ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search, ICLR 2024 [paper]
  • TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents, FMDM @ NeurIPS 2023 [paper]
  • TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems, LLMAgents @ ICLR 2024 [paper]

πŸ—ΊοΈ World Modeling

  • Can Language Models Serve as Text-Based World Simulators?, ACL 2024 [paper] | [code]
  • Making Large Language Models into World Models with Precondition and Effect Knowledge, arXiv [paper] 💡
  • Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning, NeurIPS 2023 [paper] | [code]
  • ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games, EMNLP 2023 [paper] | [code]

📊 Benchmarks

Tool-Use Benchmarks

  • MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use, arXiv.2310.03128 [paper] 💡
  • TaskBench: Benchmarking Large Language Models for Task Automation, arXiv.2311.18760 [paper] 💡

Planning Benchmarks

  • Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change), NeurIPS 2023 [paper]

πŸ“ Citation

If you find our work helpful, you can cite this paper as:

@inproceedings{li2024review,
  title={A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning},
  author={Li, Xinzhe},
  booktitle={Proceedings of the 31st International Conference on Computational Linguistics},
  year={2025}
}
@article{li2025survey,
  title={A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks},
  author={Li, Xinzhe},
  journal={arXiv preprint arXiv:2501.10069},
  year={2025}
}
