LLM-Agents-Papers
A repository listing papers related to LLM-based agents.
Stars: 712
A repository that lists papers related to Large Language Model (LLM) based agents. The repository covers various topics including survey, planning, feedback & reflection, memory mechanism, role playing, game playing, tool usage & human-agent interaction, benchmark & evaluation, environment & platform, agent framework, multi-agent system, and agent fine-tuning. It provides a comprehensive collection of research papers on LLM-based agents, exploring different aspects of AI agent architectures and applications.
README:
Last Updated Time: 2024/5/25
A repository listing papers related to LLM-based agents. It includes:
- Survey
- Planning, Feedback & Reflection, Memory Mechanism
- Role Playing, Game Playing, Tool Usage & Human-Agent Interaction
- Benchmark & Evaluation, Environment & Platform
- Agent Framework, Multi-Agent System
- Agent Fine-tuning
For more comprehensive reading, we also recommend other paper lists:
- zjunlp/LLMAgentPapers: Must-read Papers on Large Language Model Agents.
- teacherpeterpan/self-correction-llm-papers: This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
- Paitesanshi/LLM-Agent-Survey: A Survey on LLM-based Autonomous Agents.
- woooodyy/llm-agent-paper-list: Must-read papers for LLM-based agents.
- git-disl/awesome-LLM-game-agent-papers: Must-read papers for LLM-based Game agents.
Survey
-
[2024/05/16] Agent Design Pattern Catalogue: A Collection of Architectural Patterns for Foundation Model based Agents | [paper] | [code]
-
[2024/04/17] The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey | [paper] | [code]
-
[2024/04/17] Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions | [paper] | [code]
-
[2024/04/03] Empowering Biomedical Discovery with AI Agents | [paper] | [code]
-
[2024/04/02] A Survey on Large Language Model-Based Game Agents | [paper] | [code]
-
[2024/03/26] Large Language Models for Human-Robot Interaction: Opportunities and Risks | [paper] | [code]
-
[2024/03/07] Promising and worth-to-try future directions for advancing state-of-the-art surrogates methods of agent-based models in social and health computational sciences | [paper] | [code]
-
[2024/02/28] Large Language Models and Games: A Survey and Roadmap | [paper] | [code]
-
[2024/02/28] A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems | [paper] | [code]
-
[2024/02/07] Can Large Language Model Agents Simulate Human Trust Behaviors? | [paper] | [code]
-
[2024/02/06] Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science | [paper] | [code]
-
[2024/02/05] Understanding the planning of LLM agents: A survey | [paper] | [code]
-
[2024/02/02] Reasoning Capacity in Multi-Agent Systems: Limitations, Challenges and Human-Centered Solutions | [paper] | [code]
-
[2024/01/01] If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents | [paper] | [code]
-
[2023/12/31] A Survey of Personality, Persona, and Profile in Conversational Agents and Chatbots | [paper] | [code]
-
[2023/12/19] Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives | [paper] | [code]
-
[2023/09/14] The Rise and Potential of Large Language Model Based Agents: A Survey | [paper] | [code]
-
[2023/08/22] A Survey on Large Language Model based Autonomous Agents | [paper] | [code]
-
[2023/06/27] Next Steps for Human-Centered Generative AI: A Technical Perspective | [paper] | [code]
-
[2023/04/06] Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions | [paper] | [code]
Planning
-
[2024/04/28] Logic Agent: Enhancing Validity with Logic Rule Invocation | [paper] | [code]
-
[2024/04/21] Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following | [paper] | [code]
-
[2024/03/13] AutoGuide: Automated Generation and Selection of State-Aware Guidelines for Large Language Model Agents | [paper] | [code]
-
[2024/03/12] AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production | [paper] | [code]
-
[2024/03/11] Strength Lies in Differences! Towards Effective Non-collaborative Dialogues via Tailored Strategy Planning | [paper] | [code]
-
[2024/03/10] TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision | [paper] | [code]
-
[2024/03/05] KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | [paper] | [code]
-
[2024/03/05] Language Guided Exploration for RL Agents in Text Environments | [paper] | [code]
-
[2024/02/29] PlanGPT: Enhancing Urban Planning with Tailored Language Model and Efficient Retrieval | [paper] | [code]
-
[2024/02/28] Data Interpreter: An LLM Agent For Data Science | [paper] | [code]
-
[2024/02/18] What's the Plan? Evaluating and Developing Planning-Aware Techniques for LLMs | [paper] | [code]
-
[2024/02/18] PreAct: Predicting Future in ReAct Enhances Agent's Planning Ability | [paper] | [code]
-
[2024/02/16] When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | [paper] | [code]
-
[2024/02/09] Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty | [paper] | [code]
-
[2024/02/06] RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents | [paper] | [code]
-
[2024/02/02] TravelPlanner: A Benchmark for Real-World Planning with Language Agents | [paper] | [code]
-
[2024/01/10] AUTOACT: Automatic Agent Learning from Scratch via Self-Planning | [paper] | [code]
-
[2023/11/19] TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems | [paper] | [code]
-
[2023/10/09] Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena | [paper] | [code]
-
[2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]
-
[2023/05/26] AdaPlanner: Adaptive Planning from Feedback with Language Models | [paper] | [code]
-
[2023/05/24] Reasoning with Language Model is Planning with World Model | [paper] | [code]
-
[2023/05/24] Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning | [paper] | [code]
-
[2023/03/29] Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks | [paper] | [code]
-
[2023/02/03] Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents | [paper] | [code]
-
[2022/12/08] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | [paper] | [code]
Feedback & Reflection
-
[2024/03/18] QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction | [paper] | [code]
-
[2024/03/17] Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback | [paper] | [code]
-
[2024/03/08] ChatASU: Evoking LLM's Reflexion to Truly Understand Aspect Sentiment in Dialogues | [paper] | [code]
-
[2024/03/04] Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents | [paper] | [code]
-
[2024/02/27] Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization | [paper] | [code]
-
[2024/02/26] SelectIT: Selective Instruction Tuning for Large Language Models via Uncertainty-Aware Self-Reflection | [paper] | [code]
-
[2024/02/24] Empowering Large Language Model Agents through Action Learning | [paper] | [code]
-
[2024/02/22] Mirror: A Multiple-perspective Self-Reflection Method for Knowledge-rich Reasoning | [paper] | [code]
-
[2024/02/19] A Critical Evaluation of AI Feedback for Aligning Large Language Models | [paper] | [code]
-
[2024/02/06] AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls | [paper] | [code]
-
[2024/02/02] StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback | [paper] | [code]
-
[2024/02/01] Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with a Foundational Language Model | [paper] | [code]
-
[2023/12/18] CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update | [paper] | [code]
-
[2023/11/14] The ART of LLM Refinement: Ask, Refine, and Trust | [paper] | [code]
-
[2023/10/31] Learning From Mistakes Makes LLM Better Reasoner | [paper] | [code]
-
[2023/08/01] SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | [paper] | [code]
-
[2023/07/27] PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback | [paper] | [code]
-
[2023/05/26] AdaPlanner: Adaptive Planning from Feedback with Language Models | [paper] | [code]
-
[2023/05/22] Making Language Models Better Tool Learners with Execution Feedback | [paper] | [code]
-
[2023/04/11] Teaching Large Language Models to Self-Debug | [paper] | [code]
-
[2023/03/30] Self-Refine: Iterative Refinement with Self-Feedback | [paper] | [code]
-
[2023/02/03] Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents | [paper] | [code]
Memory Mechanism
-
[2024/04/15] Memory Sharing for Large Language Model based Agents | [paper] | [code]
-
[2024/02/27] Evaluating Very Long-Term Conversational Memory of LLM Agents | [paper] | [code]
-
[2024/02/19] Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations | [paper] | [code]
-
[2024/02/07] InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory | [paper] | [code]
-
[2024/02/06] RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents | [paper] | [code]
-
[2023/12/22] Empowering Working Memory for Large Language Model Agents | [paper] | [code]
-
[2023/12/22] Evolving Large Language Model Assistant with Long-Term Conditional Memory | [paper] | [code]
-
[2023/11/10] JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models | [paper] | [code]
-
[2023/10/16] CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization | [paper] | [code]
-
[2023/06/06] ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | [paper] | [code]
-
[2023/05/31] Monotonic Location Attention for Length Generalization | [paper] | [code]
-
[2023/05/26] Randomized Positional Encodings Boost Length Generalization of Transformers | [paper] | [code]
-
[2023/05/25] Landmark Attention: Random-Access Infinite Context Length for Transformers | [paper] | [code]
-
[2023/05/24] Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration | [paper] | [code]
-
[2023/05/24] Adapting Language Models to Compress Contexts | [paper] | [code]
-
[2023/05/23] RET-LLM: Towards a General Read-Write Memory for Large Language Models | [paper] | [code]
-
[2023/05/22] RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | [paper] | [code]
-
[2023/05/19] ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings | [paper] | [code]
-
[2023/05/17] MemoryBank: Enhancing Large Language Models with Long-Term Memory | [paper] | [code]
-
[2023/05/15] Small Models are Valuable Plug-ins for Large Language Models | [paper] | [code]
-
[2023/05/02] Unlimiformer: Long-Range Transformers with Unlimited Length Input | [paper] | [code]
-
[2023/05/01] Learning to Reason and Memorize with Self-Notes | [paper] | [code]
-
[2023/04/27] ChatLog: Recording and Analyzing ChatGPT Across Time | [paper] | [code]
-
[2023/04/26] Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System | [paper] | [code]
-
[2023/04/21] Emergent and Predictable Memorization in Large Language Models | [paper] | [code]
-
[2023/03/17] CoLT5: Faster Long-Range Transformers with Conditional Computation | [paper] | [code]
Role Playing
-
[2024/05/12] Exploring the Potential of Conversational AI Support for Agent-Based Social Simulation Model Design | [paper] | [code]
-
[2024/05/10] LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play | [paper] | [code]
-
[2024/05/06] Large Language Models (LLMs) as Agents for Augmented Democracy | [paper] | [code]
-
[2024/05/02] GAIA: A General AI Assistant for Intelligent Accelerator Operations | [paper] | [code]
-
[2024/05/01] "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time | [paper] | [code]
-
[2024/04/30] PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games | [paper] | [code]
-
[2024/04/30] Large Language Model Agent for Fake News Detection | [paper] | [code]
-
[2024/04/27] CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments | [paper] | [code]
-
[2024/04/26] Large Language Model Agent as a Mechanical Designer | [paper] | [code]
-
[2024/04/25] Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents | [paper] | [code]
-
[2024/04/22] How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO | [paper] | [code]
-
[2024/04/19] Cooperative Sentiment Agents for Multimodal Sentiment Analysis | [paper] | [code]
-
[2024/04/19] Towards Human-centered Proactive Conversational Agents | [paper] | [code]
-
[2024/04/13] LLMSat: A Large Language Model-Based Goal-Oriented Agent for Autonomous Space Exploration | [paper] | [code]
-
[2024/04/10] Apollonion: Profile-centric Dialog Agent | [paper] | [code]
-
[2024/04/09] SurveyAgent: A Conversational System for Personalized and Efficient Research Survey | [paper] | [code]
-
[2024/03/31] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model | [paper] | [code]
-
[2024/03/29] DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries | [paper] | [code]
-
[2024/03/23] EduAgent: Generative Student Agents in Learning | [paper] | [code]
-
[2024/03/22] CACA Agent: Capability Collaboration based AI Agent | [paper] | [code]
-
[2024/03/19] Characteristic AI Agents via Large Language Models | [paper] | [code]
-
[2024/03/15] VideoAgent: Long-form Video Understanding with Large Language Model as Agent | [paper] | [code]
-
[2024/03/05] ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary | [paper] | [code]
-
[2024/03/05] SimuCourt: Building Judicial Decision-Making Agents with Real-world Judgement Documents | [paper] | [code]
-
[2024/03/02] SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code | [paper] | [code]
-
[2024/02/29] On the Decision-Making Abilities in Role-Playing using Large Language Models | [paper] | [code]
-
[2024/02/28] Prospect Personalized Recommendation on Large Language Model-based Agent Platform | [paper] | [code]
-
[2024/02/28] Data Interpreter: An LLM Agent For Data Science | [paper] | [code]
-
[2024/02/27] BASES: Large-scale Web Search User Simulation with Large Language Model based Agents | [paper] | [code]
-
[2024/02/26] Language Agents as Optimizable Graphs | [paper] | [code]
-
[2024/02/26] Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation | [paper] | [code]
-
[2024/02/25] Understanding Public Perceptions of AI Conversational Agents: A Cross-Cultural Analysis | [paper] | [code]
-
[2024/02/25] Bootstrapping Cognitive Agents with a Large Language Model | [paper] | [code]
-
[2024/02/23] On the Multi-turn Instruction Following for Conversational Web Agents | [paper] | [code]
-
[2024/02/22] Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation | [paper] | [code]
-
[2024/02/21] Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent | [paper] | [code]
-
[2024/02/20] Can Large Language Models be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues | [paper] | [code]
-
[2024/02/20] Soft Self-Consistency Improves Language Model Agents | [paper] | [code]
-
[2024/02/20] CHATATC: Large Language Model-Driven Conversational Agents for Supporting Strategic Air Traffic Flow Management | [paper] | [code]
-
[2024/02/19] Polarization of Autonomous Generative AI Agents Under Echo Chambers | [paper] | [code]
-
[2024/02/19] LLM Agents for Psychology: A Study on Gamified Assessments | [paper] | [code]
-
[2024/02/19] Stick to your Role! Stability of Personal Values Expressed in Large Language Models | [paper] | [code]
-
[2024/02/19] WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment | [paper] | [code]
-
[2024/02/18] Modelling Political Coalition Negotiations Using LLM-based Agents | [paper] | [code]
-
[2024/02/17] Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents | [paper] | [code]
-
[2024/02/15] Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients | [paper] | [code]
-
[2024/02/13] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast | [paper] | [code]
-
[2024/02/06] Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies | [paper] | [code]
-
[2024/02/06] Can Generative Agents Predict Emotion? | [paper] | [code]
-
[2024/02/05] LLM Agents in Interaction: Measuring Personality Consistency and Linguistic Alignment in Interacting Populations of Large Language Models | [paper] | [code]
-
[2024/02/05] GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models | [paper] | [code]
-
[2024/02/04] NavHint: Vision and Language Navigation Agent with a Hint Generator | [paper] | [code]
-
[2024/02/02] TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution | [paper] | [code]
-
[2024/02/01] Executable Code Actions Elicit Better LLM Agents | [paper] | [code]
-
[2024/01/31] LLMs Simulate Big Five Personality Traits: Further Evidence | [paper] | [code]
-
[2024/01/29] Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues | [paper] | [code]
-
[2024/01/09] Agent Alignment in Evolving Social Norms | [paper] | [code]
-
[2023/12/28] Experiential Co-Learning of Software-Developing Agents | [paper] | [code]
-
[2023/12/27] Automating Knowledge Acquisition for Content-Centric Cognitive Agents Using LLMs | [paper] | [code]
-
[2023/12/21] ChatGPT as a commenter to the news: can LLMs generate human-like opinions? | [paper] | [code]
-
[2023/12/19] Can ChatGPT be Your Personal Medical Assistant? | [paper] | [code]
-
[2023/12/06] LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem | [paper] | [code]
-
[2023/11/28] War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars | [paper] | [code]
-
[2023/11/23] Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach | [paper] | [code]
-
[2023/11/10] Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations | [paper] | [code]
-
[2023/10/01] RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models | [paper] | [code]
-
[2023/09/08] Unleashing the Power of Graph Learning through LLM-based Autonomous Agents | [paper] | [code]
-
[2023/09/05] Cognitive Architectures for Language Agents | [paper] | [code]
-
[2023/08/22] Towards an On-device Agent for Text Rewriting | [paper] | [code]
-
[2023/08/14] ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate | [paper] | [code]
-
[2023/07/24] To Infinity and Beyond: SHOW-1 and Showrunner Agents in Multi-Agent Simulations | [paper] | [code]
-
[2023/06/28] Inferring the Goals of Communicating Agents from Actions and Instructions | [paper] | [code]
-
[2023/05/30] Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate | [paper] | [code]
-
[2023/05/27] SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks | [paper] | [code]
-
[2023/05/26] Training Socially Aligned Language Models in Simulated Human Society | [paper] | [code]
-
[2023/05/25] Role-Play with Large Language Models | [paper] | [code]
-
[2023/05/17] Tree of Thoughts: Deliberate Problem Solving with Large Language Models | [paper] | [code]
-
[2023/05/09] TidyBot: Personalized Robot Assistance with Large Language Models | [paper] | [code]
-
[2023/05/02] The Role of Summarization in Generative Agents: A Preliminary Perspective | [paper] | [code]
-
[2023/04/26] Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models | [paper] | [code]
-
[2023/04/24] ChatLLM Network: More brains, More intelligence | [paper] | [code]
-
[2023/04/21] Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback | [paper] | [code]
-
[2023/04/19] Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | [paper] | [code]
-
[2023/04/15] Self-collaboration Code Generation via ChatGPT | [paper] | [code]
-
[2023/04/07] Generative Agents: Interactive Simulacra of Human Behavior | [paper] | [code]
-
[2023/03/31] CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society | [paper] | [code]
-
[2022/12/08] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | [paper] | [code]
Game Playing
-
[2024/05/23] Human-Agent Cooperation in Games under Incomplete Information through Natural Language Communication | [paper] | [code]
-
[2024/05/08] LLMs with Personalities in Multi-issue Negotiation Games | [paper] | [code]
-
[2024/04/03] Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game | [paper] | [code]
-
[2024/03/26] Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following Policies | [paper] | [code]
-
[2024/02/19] LLM Agents for Psychology: A Study on Gamified Assessments | [paper] | [code]
-
[2024/02/13] Large Language Models as Minecraft Agents | [paper] | [code]
-
[2024/02/12] Large Language Models as Agents in Two-Player Games | [paper] | [code]
-
[2024/02/07] Can Large Language Model Agents Simulate Human Trust Behaviors? | [paper] | [code]
-
[2024/02/04] Enhance Reasoning for Large Language Models in the Game Werewolf | [paper] | [code]
-
[2024/02/02] PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models | [paper] | [code]
-
[2023/12/29] Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game | [paper] | [code]
-
[2023/11/10] JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models | [paper] | [code]
-
[2023/10/31] Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models | [paper] | [code]
-
[2023/09/29] Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 | [paper] | [code]
-
[2023/09/18] MindAgent: Emergent Gaming Interaction | [paper] | [code]
-
[2023/09/10] An Appraisal-Based Chain-Of-Emotion Architecture for Affective Language Model Game Agents | [paper] | [code]
-
[2023/09/09] Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf | [paper] | [code]
-
[2023/08/23] Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis | [paper] | [code]
-
[2023/05/31] Recursive Metropolis-Hastings Naming Game: Symbol Emergence in a Multi-agent System based on Probabilistic Generative Models | [paper] | [code]
-
[2023/05/26] Playing repeated games with Large Language Models | [paper] | [code]
-
[2023/05/25] Voyager: An Open-Ended Embodied Agent with Large Language Models | [paper] | [code]
-
[2023/05/25] Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | [paper] | [code]
-
[2023/05/19] Examining the Inter-Consistency of Large Language Models: An In-depth Analysis via Debate | [paper] | [code]
-
[2023/05/17] Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback | [paper] | [code]
-
[2023/05/08] Knowledge-enhanced Agents for Interactive Text Games | [paper] | [code]
-
[2023/03/29] Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks | [paper] | [code]
-
[2023/02/03] Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents | [paper] | [code]
Tool Usage & Human-Agent Interaction
-
[2024/05/23] Human-Agent Cooperation in Games under Incomplete Information through Natural Language Communication | [paper] | [code]
-
[2024/05/17] Latent State Estimation Helps UI Agents to Reason | [paper] | [code]
-
[2024/05/02] CACTUS: Chemistry Agent Connecting Tool-Usage to Science | [paper] | [code]
-
[2024/05/01] Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning | [paper] | [code]
-
[2024/05/01] "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time | [paper] | [code]
-
[2024/04/23] Aligning LLM Agents by Learning Latent Preference from User Edits | [paper] | [code]
-
[2024/04/16] Search Beyond Queries: Training Smaller Language Models for Web Interactions via Reinforcement Learning | [paper] | [code]
-
[2024/04/09] SurveyAgent: A Conversational System for Personalized and Efficient Research Survey | [paper] | [code]
-
[2024/04/04] AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent | [paper] | [code]
-
[2024/03/12] AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production | [paper] | [code]
-
[2024/03/05] InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | [paper] | [code]
-
[2024/03/05] Android in the Zoo: Chain-of-Action-Thought for GUI Agents | [paper] | [code]
-
[2024/02/27] BASES: Large-scale Web Search User Simulation with Large Language Model based Agents | [paper] | [code]
-
[2024/02/26] Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models | [paper] | [code]
-
[2024/02/23] On the Multi-turn Instruction Following for Conversational Web Agents | [paper] | [code]
-
[2024/02/20] Large Language Model-based Human-Agent Collaboration for Complex Task Solving | [paper] | [code]
-
[2024/02/20] AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning | [paper] | [code]
-
[2024/02/18] SciAgent: Tool-augmented Language Models for Scientific Reasoning | [paper] | [code]
-
[2024/02/18] Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models | [paper] | [code]
-
[2024/02/17] Human-AI Interactions in the Communication Era: Autophagy Makes Large Models Achieving Local Optima | [paper] | [code]
-
[2024/02/16] ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages | [paper] | [code]
-
[2024/02/14] Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications | [paper] | [code]
-
[2024/02/09] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models | [paper] | [code]
-
[2024/02/08] UFO: A UI-Focused Agent for Windows OS Interaction | [paper] | [code]
-
[2024/02/06] AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls | [paper] | [code]
-
[2024/01/11] EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction | [paper] | [code]
-
[2024/01/03] GPT-4V(ision) is a Generalist Web Agent, if Grounded | [paper] | [code]
-
[2023/12/21] Team Flow at DRC2023: Building Common Ground and Text-based Turn-taking in a Travel Agent Spoken Dialogue System | [paper] | [code]
-
[2023/12/21] AppAgent: Multimodal Agents as Smartphone Users | [paper] | [code]
-
[2023/12/18] CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update | [paper] | [code]
-
[2023/12/14] CogAgent: A Visual Language Model for GUI Agents | [paper] | [code]
-
[2023/11/19] TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems | [paper] | [code]
-
[2023/10/18] MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models | [paper] | [code]
-
[2023/10/13] AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems | [paper] | [code]
-
[2023/10/12] A Zero-Shot Language Agent for Computer Control with Structured Reflection | [paper] | [code]
-
[2023/09/02] ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models | [paper] | [code]
-
[2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]
-
[2023/06/05] When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm | [paper] | [code]
Benchmark & Evaluation
-
[2024/05/23] ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation | [paper] | [code]
-
[2024/05/23] AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents | [paper] | [code]
-
[2024/05/16] Speaker Verification in Agent-Generated Conversations | [paper] | [code]
-
[2024/05/13] AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments | [paper] | [code]
-
[2024/05/01] WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting | [paper] | [code]
-
[2024/04/23] Evaluating Tool-Augmented Agents in Remote Sensing Platforms | [paper] | [code]
-
[2024/04/15] MMInA: Benchmarking Multihop Multimodal Internet Agents | [paper] | [code]
-
[2024/04/11] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments | [paper] | [code]
-
[2024/04/09] AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents | [paper] | [code]
-
[2024/03/20] RoleInteract: Evaluating the Social Interaction of Role-Playing Agents | [paper] | [code]
-
[2024/03/18] How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments | [paper] | [code]
-
[2024/03/18] Tur[k]ingBench: A Challenge Benchmark for Web Agents | [paper] | [code]
-
[2024/03/13] Evaluating Large Language Models as Generative User Simulators for Conversational Recommendation | [paper] | [code]
-
[2024/03/05] InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | [paper] | [code]
-
[2024/02/27] Benchmarking Data Science Agents | [paper] | [code]
-
[2024/02/27] OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web | [paper] | [code]
-
[2024/02/18] Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation | [paper] | [code]
-
[2024/02/18] MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization | [paper] | [code]
-
[2024/01/02] CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation | [paper] | [code]
-
[2023/12/28] How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation | [paper] | [code]
-
[2023/12/26] RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models | [paper] | [code]
-
[2023/11/17] Testing Language Model Agents Safely in the Wild | [paper] | [code]
-
[2023/11/16] ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks | [paper] | [code]
-
[2023/11/15] ToolTalk: Evaluating Tool-Usage in a Conversational Setting | [paper] | [code]
-
[2023/10/24] FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions | [paper] | [code]
-
[2023/10/09] Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena | [paper] | [code]
-
[2023/10/02] SmartPlay: A Benchmark for LLMs as Intelligent Agents | [paper] | [code]
-
[2023/09/18] MindAgent: Emergent Gaming Interaction | [paper] | [code]
-
[2023/08/11] BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | [paper] | [code]
-
[2023/08/07] AgentBench: Evaluating LLMs as Agents | [paper] | [code]
-
[2023/07/31] HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution | [paper] | [code]
-
[2024/05/23] AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents | [paper] | [code]
-
[2024/04/01] Rapid Mobile App Development for Generative AI Agents on MIT App Inventor | [paper] | [code]
-
[2024/03/28] MineLand: Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs | [paper] | [code]
-
[2024/03/26] Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following Policies | [paper] | [code]
-
[2023/03/14] CB2: Collaborative Natural Language Interaction Research Platform | [paper] | [code]
-
[2024/04/11] Behavior Trees Enable Structured Programming of Language Model Agents | [paper] | [code]
-
[2024/04/05] Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents | [paper] | [code]
-
[2024/03/29] ITCMA: A Generative Agent Based on a Computational Consciousness Structure | [paper] | [code]
-
[2024/03/18] QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction | [paper] | [code]
-
[2024/02/26] RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation | [paper] | [code]
-
[2024/02/26] Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | [paper] | [code]
-
[2024/02/22] Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering | [paper] | [code]
-
[2024/02/17] KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph | [paper] | [code]
-
[2024/01/05] AFSPP: Agent Framework for Shaping Preference and Personality with Large Language Models | [paper] | [code]
-
[2023/11/02] ProAgent: From Robotic Process Automation to Agentic Process Automation | [paper] | [code]
-
[2023/09/29] Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency | [paper] | [code]
-
[2023/09/14] Agents: An Open-source Framework for Autonomous Language Agents | [paper] | [code]
-
[2023/08/22] ProAgent: Building Proactive Cooperative AI with Large Language Models | [paper] | [code]
-
[2023/06/09] Mind2Web: Towards a Generalist Agent for the Web | [paper] | [code]
-
[2024/05/23] CityGPT: Towards Urban IoT Learning, Analysis and Interaction with Multi-Agent System | [paper] | [code]
-
[2024/05/20] (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts | [paper] | [code]
-
[2024/05/17] LLM-based Multi-Agent Reinforcement Learning: Current and Future Directions | [paper] | [code]
-
[2024/05/07] Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework | [paper] | [code]
-
[2024/05/06] Conformity, Confabulation, and Impersonation: Persona Inconstancy in Multi-Agent LLM Collaboration | [paper] | [code]
-
[2024/05/05] Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation | [paper] | [code]
-
[2024/04/28] ComposerX: Multi-Agent Symbolic Music Composition with LLMs | [paper] | [code]
-
[2024/04/25] Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents | [paper] | [code]
-
[2024/04/23] BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis | [paper] | [code]
-
[2024/04/23] CT-Agent: Clinical Trial Multi-Agent with Large Language Model-based Reasoning | [paper] | [code]
-
[2024/04/14] Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation | [paper] | [code]
-
[2024/04/12] Leveraging Multi-AI Agents for Cross-Domain Knowledge Discovery | [paper] | [code]
-
[2024/04/10] MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education | [paper] | [code]
-
[2024/04/09] Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems | [paper] | [code]
-
[2024/04/08] 360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System | [paper] | [code]
-
[2024/04/06] MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | [paper] | [code]
-
[2024/04/03] Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game | [paper] | [code]
-
[2024/04/02] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization | [paper] | [code]
-
[2024/04/02] CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models | [paper] | [code]
-
[2024/04/01] TraveLER: A Multi-LMM Agent Framework for Video Question-Answering | [paper] | [code]
-
[2024/03/28] MATEval: A Multi-Agent Discussion Framework for Advancing Open-Ended Text Evaluation | [paper] | [code]
-
[2024/03/26] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution | [paper] | [code]
-
[2024/03/21] Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering | [paper] | [code]
-
[2024/03/20] Agent Group Chat: An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior | [paper] | [code]
-
[2024/03/19] Embodied LLM Agents Learn to Cooperate in Organized Teams | [paper] | [code]
-
[2024/03/18] How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments | [paper] | [code]
-
[2024/03/12] Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations | [paper] | [code]
-
[2024/03/02] AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks | [paper] | [code]
-
[2024/02/28] Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key? | [paper] | [code]
-
[2024/02/26] Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation | [paper] | [code]
-
[2024/02/26] Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | [paper] | [code]
-
[2024/02/26] LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments | [paper] | [code]
-
[2024/02/21] LLM Based Multi-Agent Generation of Semi-structured Documents from Semantic Templates in the Public Administration Domain | [paper] | [code]
-
[2024/02/20] What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents | [paper] | [code]
-
[2024/02/18] Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation | [paper] | [code]
-
[2024/02/18] LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration | [paper] | [code]
-
[2024/02/15] TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation | [paper] | [code]
-
[2024/02/03] More Agents Is All You Need | [paper] | [code]
-
[2024/02/02] A Multi-Agent Conversational Recommender System | [paper] | [code]
-
[2024/01/27] ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning | [paper] | [code]
-
[2024/01/11] Combating Adversarial Attacks with Multi-Agent Debate | [paper] | [code]
-
[2024/01/08] MARG: Multi-Agent Review Generation for Scientific Papers | [paper] | [code]
-
[2024/01/08] SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | [paper] | [code]
-
[2024/01/08] Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet | [paper] | [code]
-
[2023/12/20] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation | [paper] | [code]
-
[2023/12/01] Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games | [paper] | [code]
-
[2023/10/31] Multi-Agent Consensus Seeking via Large Language Models | [paper] | [code]
-
[2023/10/25] MultiPrompter: Cooperative Prompt Optimization with Multi-Agent Reinforcement Learning | [paper] | [code]
-
[2023/10/10] MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents | [paper] | [code]
-
[2023/10/03] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View | [paper] | [code]
-
[2023/09/22] Learning to Coordinate with Anyone | [paper] | [code]
-
[2023/09/18] MindAgent: Emergent Gaming Interaction | [paper] | [code]
-
[2023/08/21] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents | [paper] | [code]
-
[2023/08/03] InterAct: Exploring the Potentials of ChatGPT as a Cooperative Agent | [paper] | [code]
-
[2023/08/01] MetaGPT: Meta Programming for Multi-Agent Collaborative Framework | [paper] | [code]
-
[2023/07/16] Communicative Agents for Software Development | [paper] | [code]
-
[2023/07/11] Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration | [paper] | [code]
-
[2023/07/05] Building Cooperative Embodied Agents Modularly with Large Language Models | [paper] | [code]
-
[2023/06/05] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents | [paper] | [code]
-
[2024/05/16] Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning | [paper] | [code]
-
[2024/05/01] Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning | [paper] | [code]
-
[2024/04/17] Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent | [paper] | [code]
-
[2024/04/16] Search Beyond Queries: Training Smaller Language Models for Web Interactions via Reinforcement Learning | [paper] | [code]
-
[2024/04/05] Social Skill Training with Large Language Models | [paper] | [code]
-
[2024/04/02] CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models | [paper] | [code]
-
[2024/03/29] Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning | [paper] | [code]
-
[2024/03/21] ReAct Meets ActRe: Autonomous Annotation of Agent Trajectories for Contrastive Self-Training | [paper] | [code]
-
[2024/03/19] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | [paper] | [code]
-
[2024/03/18] EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents | [paper] | [code]
-
[2024/02/23] AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning | [paper] | [code]
-
[2024/02/21] Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent | [paper] | [code]
-
[2024/02/19] A Critical Evaluation of AI Feedback for Aligning Large Language Models | [paper] | [code]
-
[2024/02/18] Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents | [paper] | [code]
-
[2024/02/17] Training Language Model Agents without Modifying Language Models | [paper] | [code]
-
[2024/01/10] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | [paper] | [code]
-
[2024/01/10] Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk | [paper] | [code]
-
[2024/01/05] From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models | [paper] | [code]
-
[2023/12/22] Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning | [paper] | [code]
-
[2023/12/20] Machine Mindset: An MBTI Exploration of Large Language Models | [paper] | [code]
-
[2023/11/28] Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld | [paper] | [code]
-
[2023/10/19] AgentTuning: Enabling Generalized Agent Abilities for LLMs | [paper] | [code]
-
[2023/10/09] FireAct: Toward Language Agent Fine-tuning | [paper] | [code]
-
[2023/10/01] Adapting LLM Agents Through Communication | [paper] | [code]
Similar Open Source Tools
awesome-LLM-game-agent-papers
This repository provides a comprehensive survey of research papers on large language model (LLM)-based game agents. LLMs are powerful AI models that can understand and generate human language, and they have shown great promise for developing intelligent game agents. This survey covers a wide range of topics, including adventure games, crafting and exploration games, simulation games, competition games, cooperation games, communication games, and action games. For each topic, the survey provides an overview of the state-of-the-art research, as well as a discussion of the challenges and opportunities for future work.
do-research-in-AI
This repository is a collection of research lectures and experience sharing posts from frontline researchers in the field of AI. It aims to help individuals upgrade their research skills and knowledge through insightful talks and experiences shared by experts. The content covers various topics such as evaluating research papers, choosing research directions, research methodologies, and tips for writing high-quality scientific papers. The repository also includes discussions on academic career paths, research ethics, and the emotional aspects of research work. Overall, it serves as a valuable resource for individuals interested in advancing their research capabilities in the field of AI.
Awesome-Robotics-3D
Awesome-Robotics-3D is a curated list of 3D vision papers related to the robotics domain, focusing on large models such as LLMs and VLMs. It includes papers on policy learning, pretraining, VLMs and LLMs, representations, simulations, datasets, and benchmarks. The repository is maintained by Zubair Irshad and welcomes contributions and suggestions for adding papers. It serves as a valuable resource for researchers and practitioners in the fields of robotics and computer vision.
Paper-Reading-ConvAI
Paper-Reading-ConvAI is a repository that contains a list of papers, datasets, and resources related to Conversational AI, mainly covering dialogue systems and natural language generation. The repository is updated continually.
Awesome-LLM-Robotics
This repository contains a curated list of papers using Large Language/Multi-Modal Models for Robotics/RL. The template is from awesome-Implicit-NeRF-Robotics. Please feel free to send pull requests or email to add papers. If you find this repository useful, please consider citing and starring this list, and feel free to share it with others. Overview: Surveys, Reasoning, Planning, Manipulation, Instructions and Navigation, Simulation Frameworks, Citation.
Awesome-Code-LLM
ABigSurveyOfLLMs
ABigSurveyOfLLMs is a repository that compiles surveys on Large Language Models (LLMs) to provide a comprehensive overview of the field. It includes surveys on various aspects of LLMs such as transformers, alignment, prompt learning, data management, evaluation, societal issues, safety, misinformation, attributes of LLMs, efficient LLMs, learning methods for LLMs, multimodal LLMs, knowledge-based LLMs, extension of LLMs, LLMs applications, and more. The repository aims to help individuals quickly understand the advancements and challenges in the field of LLMs through a collection of recent surveys and research papers.
Awesome_papers_on_LLMs_detection
This repository is a curated list of papers focused on the detection of Large Language Models (LLMs)-generated content. It includes the latest research papers covering detection methods, datasets, attacks, and more. The repository is regularly updated to include the most recent papers in the field.
Awesome-LLM-Interpretability
Awesome-LLM-Interpretability is a curated list of materials related to LLM (Large Language Models) interpretability, covering tutorials, code libraries, surveys, videos, papers, and blogs. It includes resources on transformer mechanistic interpretability, visualization, interventions, probing, fine-tuning, feature representation, learning dynamics, knowledge editing, hallucination detection, and redundancy analysis. The repository aims to provide a comprehensive overview of tools, techniques, and methods for understanding and interpreting the inner workings of large language models.
prompt-in-context-learning
An Open-Source Engineering Guide for Prompt-in-context-learning from EgoAlpha Lab. 📝 Papers | ⚡️ Playground | 🛠 Prompt Engineering | 🌍 ChatGPT Prompt | ⛳ LLMs Usage Guide. These are fresh, daily-updated resources for in-context learning and prompt engineering. As Artificial General Intelligence (AGI) is approaching, let’s take action and become a super learner so as to position ourselves at the forefront of this exciting era and strive for personal and professional greatness. The resources include: Papers: the latest papers about In-Context Learning, Prompt Engineering, Agent, and Foundation Models. Playground: large language models (LLMs) that enable prompt experimentation. Prompt Engineering: prompt techniques for leveraging large language models. ChatGPT Prompt: prompt examples that can be applied in our work and daily lives. LLMs Usage Guide: a method for quickly getting started with large language models by using LangChain. In the future, there will likely be two types of people on Earth (perhaps even on Mars, but that's a question for Musk): those who enhance their abilities through the use of AIGC, and those whose jobs are replaced by AI automation. 💎 EgoAlpha: Hello, human 👤, are you ready?
Everything-LLMs-And-Robotics
The Everything-LLMs-And-Robotics repository is the world's largest GitHub repository focusing on the intersection of Large Language Models (LLMs) and Robotics. It provides educational resources, research papers, project demos, and Twitter threads related to LLMs, Robotics, and their combination. The repository covers topics such as reasoning, planning, manipulation, instructions and navigation, simulation frameworks, perception, and more, showcasing the latest advancements in the field.
Efficient_Foundation_Model_Survey
Efficient Foundation Model Survey is a comprehensive analysis of resource-efficient large language models (LLMs) and multimodal foundation models. The survey covers algorithmic and systemic innovations to support the growth of large models in a scalable and environmentally sustainable way. It explores cutting-edge model architectures, training/serving algorithms, and practical system designs. The goal is to provide insights on tackling resource challenges posed by large foundation models and inspire future breakthroughs in the field.
ChatGPT-On-CS
This project is an intelligent customer service dialogue tool based on a large model. It supports access to platforms such as WeChat, Qianniu, Bilibili, Douyin Enterprise, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operation, Xiaohongshu, and Zhihu. Users can choose GPT3.5, GPT4.0, or Lazy Treasure Box as the model (more will be supported in the future). The tool can process text, voice, and pictures, can access external resources such as operating systems and the Internet through plug-ins, and supports enterprise AI applications customized on a company's own knowledge base.
For similar tasks
OSWorld
OSWorld is a benchmarking tool designed to evaluate multimodal agents for open-ended tasks in real computer environments. It provides a platform for running experiments, setting up virtual machines, and interacting with the environment using Python scripts. Users can install the tool on their desktop or server, manage dependencies with Conda, and run benchmark tasks. The tool supports actions like executing commands, checking for specific results, and evaluating agent performance. OSWorld aims to facilitate research in AI by providing a standardized environment for testing and comparing different agent baselines.
council
Council is an open-source platform designed for the rapid development and deployment of customized generative AI applications using teams of agents. It extends the LLM tool ecosystem by providing advanced control flow and scalable oversight for AI agents. Users can create sophisticated agents with predictable behavior by leveraging Council's powerful approach to control flow using Controllers, Filters, Evaluators, and Budgets. The framework allows for automated routing between agents, comparing, evaluating, and selecting the best results for a task. Council aims to facilitate packaging and deploying agents at scale on multiple platforms while enabling enterprise-grade monitoring and quality control.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: it is self-contained, with no need for a DBMS or cloud service; it provides an OpenAPI interface that is easy to integrate with existing infrastructure (e.g., a Cloud IDE); and it supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.