awesome-lifelong-llm-agent

This repository collects awesome survey, resource, and paper for lifelong learning LLM agents

Stars: 55

Visit

This repository is a collection of papers and resources related to Lifelong Learning of Large Language Model (LLM) based Agents. It focuses on continual learning and incremental learning of LLM agents, identifying key modules such as Perception, Memory, and Action. The repository serves as a roadmap for understanding lifelong learning in LLM agents and provides a comprehensive overview of related research and surveys.

README:

Lifelong Learning of Large Language Model based Agents: A Roadmap

Welcome to the repository accompanying our survey paper on Lifelong Learning of Large Language Model based Agents: A Roadmap. This repository collects awesome paper for lifelong learning (also known as, continual learning and incremental learning) of LLM agent. We identify three key modules-Perception, Memory, and Action-that are integral to agent's ability to perform lifelong learning. Please refer to this survey for detailed introduction. Additionally, for other papers, surveys, and resources on lifelong learning (continual learning, incremental learning) of LLMs, you can refer to this repository. A chinese version of this README is provided in this file.

📢 News

2025-1-14: We released a survey paper "Lifelong Learning of Large Language Model based Agents: A Roadmap". Feel free to cite or open pull requests.

📒 Table of Contents

Perception Module

Single-Modal Perception

Title	Venue	Date
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agent	arXiv	2024-10
GPT-4V(ision) is a Generalist Web Agent, if Grounded	ICLR	2024-01
Webarena: A realistic web environment for building autonomous agents	ICLR	2023-07
Synapse: Trajectory-asexemplar prompting with memory for computer control	ICLR	2023-06
Multimodal web navigation with instruction-finetuned foundation models	ICLR	2023-05

Multi-Modal Perception

Title	Venue	Date
Llms can evolve continually on modality for x-modal reasoning	NeurIPS	2024-10
Modaverse: Efficiently transforming modalities with llms	CVPR	2024-01
Omnivore: A single model for many visual modalities	CVPR	2022-01
Perceiver: General perception with iterative attention	ICML	2021-07
Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text	NeurIPS	2021-04

Memory Module

Working Memory

Title	Venue	Date
Character-llm: A trainable agent for role-playing	EMNLP	2023-10
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers	ICLR	2023-09
Adapting Language Models to Compress Contexts	ACL	2023-05
Critic: Large language models can self-correct with tool-interactive critiquing}	NeurIPS Workshop	2023-05
Cogltx: Applying bert to long texts	NeurIPS	2020-12

Episodic Memory

Title	Venue	Date
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory	arXiv	2024-10
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation	arXiv	2023-08
RET-LLM: Towards a General Read-Write Memory for Large Language Models	ICLR	2023-05
Bring Evanescent Representations to Life in Lifelong Class Incremental Learning	CVPR	2022-06
iCaRL: Incremental Classifier and Representation Learning	CVPR	2017-04

Semantic Memory

Title	Venue	Date
Fast and Continual Knowledge Graph Embedding via Incremental LoRA	IJCAL	2024-07
PromptDSI: Prompt-based Rehearsal-free Instance-wise Incremental Learning for Document Retrieval	arXiv	2024-06
CorpusBrain++: A Continual Generative Pre-Training Framework for Knowledge-Intensive Language Tasks	arXiv	2024-02
Continual Multimodal Knowledge Graph Construction	IJCAI	2023-05
Lifelong embedding learning and transfer for growing knowledge graphs	AAAI	2022-11

Parametric Memory

Title	Venue	Date
ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA	arXiv	2024-08
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models	NeurIPS	2024-05
WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing	ACL	2024-02
Aging with GRACE: Lifelong Model Editing with Key-Value Adaptors	ICLR	2022-11
Plug-and-Play Adaptation for Continuously-updated QA	ACL	2022-04

Action Module

Grounding Actions

Tool Environment

Title	Venue	Date
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error	ACL	2024-03
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction	ICLR Workshop	2024-01
Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum	AAAI	2023-08
GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution	EACL	2023-07
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs	ICLR	2023-07
Large Language Models as Tool Makers	ICLR	2023-05
Toolformer: Language Models Can Teach Themselves to Use Tools	NeurIPS	2023-05
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings	NeurIPS	2023-05
On the Tool Manipulation Capability of Open-source Large Language Models	arXiv	2023-05
ART: Automatic multi-step reasoning and tool-use for large language models	arXiv	2023-03

Web Environment

Title	Venue	Date
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agent	arXiv	2024-10
WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration	arXiv	2024-08
SteP: Stacked LLM Policies for Web Actions	COLM	2024-07
LASER: LLM Agent with State-Space Exploration for Web Navigation	NeurIPS Workshop	2023-09
Large Language Models Are Semi-Parametric Reinforcement Learning Agent	NeurIPS	2023-06

Game Environment

Title	Venue	Date
VillagerAgent: A Graph-Based Multi-Agent Framework for Coordinating Complex Task Dependencies in Minecraft	ACL	2024-06
See and Think: Embodied Agent in Virtual Environment	ECCV	2023-11
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models	TPAMI	2023-11
Voyager: An Open-Ended Embodied Agent with Large Language Models	arXiv	2023-05
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents	NeurIPS	2023-02

Retrieval Actions

Retrieval from Semantic memory

Title	Venue	Date
See and Think: Embodied Agent in Virtual Environment	ECCV	2023-11
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory	arXiv	2023-05
Planning with Large Language Models via Corrective Re-prompting	NeurIPS Workshop	2022-10
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents	ICML	2022-01

Retrieval from Episodic memory

Title	Venue	Date
VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs	arXiv	2024-06
Large Language Models as Tool Makers	ICLR	2023-05
On the Tool Manipulation Capability of Open-source Large Language Models	arXiv	2023-05
Voyager: An Open-Ended Embodied Agent with Large Language Models	arXiv	2023-05
ART: Automatic multi-step reasoning and tool-use for large language models	arXiv	2023-03

Reasoning Actions

Intra-Episodic Memory

Title	Venue	Date
Reasoning with Language Model is Planning with World Model	EMNLP	2023-05
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning	NeurIPS	2023-05
Tree of Thoughts: Deliberate Problem Solving with Large Language Models	NeurIPS	2023-05
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks	NeurIPS2023	2023-05
Reflexion: Language Agents with Verbal Reinforcement Learning	NeurIPS	2023-03
ReAct: Synergizing Reasoning and Acting in Language Models	ICLR	2022-10

Inter-Episodic Memory

Title	Venue	Date
VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs	arXiv	2024-06
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error	ACL	2024-03
See and Think: Embodied Agent in Virtual Environment	ECCV	2023-11
Large Language Models Are Semi-Parametric Reinforcement Learning Agent	NeurIPS	2023-06
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory	arXiv	2023-05
Voyager: An Open-Ended Embodied Agent with Large Language Models	arXiv	2023-05

📚 Cite Our Work

@article{zheng2025lifelong,
      title={Lifelong Learning of Large Language Model based Agents: A Roadmap}, 
      author={Zheng, Junhao and Shi, Chengming and Cai, Xidi and Li, Qiuke and Zhang, Duzhen and Li, Chenxing and Yu, Dong and Ma, Qianli},
      journal={arXiv preprint arXiv:2501.07278},
      year={2025},
}

😍 Star History

For Tasks:

Click tags to check more tools for each tasks

analyze papers explore resources understand lifelong learning implement llm models conduct research

For Jobs:

research scientist machine learning engineer data scientist ai researcher natural language processing specialist

Alternative AI tools for awesome-lifelong-llm-agent

Similar Open Source Tools

awesome-lifelong-llm-agent

github

: 55

awesome-sound_event_detection

The 'awesome-sound_event_detection' repository is a curated reading list focusing on sound event detection and Sound AI. It includes research papers covering various sub-areas such as learning formulation, network architecture, pooling functions, missing or noisy audio, data augmentation, representation learning, multi-task learning, few-shot learning, zero-shot learning, knowledge transfer, polyphonic sound event detection, loss functions, audio and visual tasks, audio captioning, audio retrieval, audio generation, and more. The repository provides a comprehensive collection of papers, datasets, and resources related to sound event detection and Sound AI, making it a valuable reference for researchers and practitioners in the field.

github

: 147

AceCoder

AceCoder is a tool that introduces a fully automated pipeline for synthesizing large-scale reliable tests used for reward model training and reinforcement learning in the coding scenario. It curates datasets, trains reward models, and performs RL training to improve coding abilities of language models. The tool aims to unlock the potential of RL training for code generation models and push the boundaries of LLM's coding abilities.

github

: 74

Awesome-LLM4AD

github

: 911

Awesome-CVPR2024-ECCV2024-AIGC

A Collection of Papers and Codes for CVPR 2024 AIGC. This repository compiles and organizes research papers and code related to CVPR 2024 and ECCV 2024 AIGC (Artificial Intelligence and Graphics Computing). It serves as a valuable resource for individuals interested in the latest advancements in the field of computer vision and artificial intelligence. Users can find a curated list of papers and accompanying code repositories for further exploration and research. The repository encourages collaboration and contributions from the community through stars, forks, and pull requests.

github

: 427

ERNIE

ERNIE 4.5 is a family of large-scale multimodal models with 10 distinct variants, including Mixture-of-Experts (MoE) models with 47B and 3B active parameters. The models feature a novel heterogeneous modality structure supporting parameter sharing across modalities while allowing dedicated parameters for each individual modality. Trained with optimal efficiency using PaddlePaddle deep learning framework, ERNIE 4.5 models achieve state-of-the-art performance across text and multimodal benchmarks, enhancing multimodal understanding without compromising performance on text-related tasks. The open-source development toolkits for ERNIE 4.5 offer industrial-grade capabilities, resource-efficient training and inference workflows, and multi-hardware compatibility.

github

: 7.5k

CuMo

CuMo is a project focused on scaling multimodal Large Language Models (LLMs) with Co-Upcycled Mixture-of-Experts. It introduces CuMo, which incorporates Co-upcycled Top-K sparsely-gated Mixture-of-experts blocks into the vision encoder and the MLP connector, enhancing the capabilities of multimodal LLMs. The project adopts a three-stage training approach with auxiliary losses to stabilize the training process and maintain a balanced loading of experts. CuMo achieves comparable performance to other state-of-the-art multimodal LLMs on various Visual Question Answering (VQA) and visual-instruction-following benchmarks.

github

: 94

OpenCatEsp32

OpenCat code running on BiBoard, a high-performance ESP32 quadruped robot development board. The board is mainly designed for developers and engineers working on multi-degree-of-freedom (MDOF) Multi-legged robots with up to 12 servos.

github

: 106

MiniAI-Face-Recognition-LivenessDetection-ServerSDK

The MiniAiLive Face Recognition LivenessDetection Server SDK provides system integrators with fast, flexible, and extremely precise facial recognition that can be deployed across various scenarios, including security, access control, public safety, fintech, smart retail, and home protection. The SDK is fully on-premise, meaning all processing happens on the hosting server, and no data leaves the server. The project structure includes bin, cpp, flask, model, python, test_image, and Dockerfile directories. To set up the project on Linux, download the repo, install system dependencies, and copy libraries into the system folder. For Windows, contact MiniAiLive via email. The C++ example involves replacing the license key in main.cpp, building the project, and running it. The Python example requires installing dependencies and running the project. The Python Flask example involves replacing the license key in app.py, installing dependencies, and running the project. The Docker Flask example includes building the docker image and running it. To request a license, contact MiniAiLive. Contributions to the project are welcome by following specific steps. An online demo is available at https://demo.miniai.live. Related products include MiniAI-Face-Recognition-LivenessDetection-AndroidSDK, MiniAI-Face-Recognition-LivenessDetection-iOS-SDK, MiniAI-Face-LivenessDetection-AndroidSDK, MiniAI-Face-LivenessDetection-iOS-SDK, MiniAI-Face-Matching-AndroidSDK, and MiniAI-Face-Matching-iOS-SDK. MiniAiLive is a leading AI solutions company specializing in computer vision and machine learning technologies.

github

: 83

LLM-Fine-Tuning-Azure

A fine-tuning guide for both OpenAI and Open-Source Large Language Models on Azure. Fine-Tuning retrains an existing pre-trained LLM using example data, resulting in a new 'custom' fine-tuned LLM optimized for task-specific examples. Use cases include improving LLM performance on specific tasks and introducing information not well represented by the base LLM model. Suitable for cases where latency is critical, high accuracy is required, and clear evaluation metrics are available. Learning path includes labs for fine-tuning GPT and Llama2 models via Dashboards and Python SDK.

github

: 103

FATE-LLM

FATE-LLM is a framework supporting federated learning for large and small language models. It promotes training efficiency of federated LLMs using Parameter-Efficient methods, protects the IP of LLMs using FedIPR, and ensures data privacy during training and inference through privacy-preserving mechanisms.

github

: 135

cad-recode

CAD-Recode is a 3D CAD reverse engineering method implemented in Python using the CadQuery library. It transforms point clouds into 3D CAD models by leveraging a pre-trained model and additional linear layers. The repository includes an inference demo for users to generate CAD models from point clouds. CAD-Recode has achieved state-of-the-art performance in CAD reconstruction benchmarks such as DeepCAD, Fusion360, and CC3D. Researchers and engineers can utilize this tool to reverse engineer CAD code from point clouds efficiently.

github

: 85

Agent

Agent is a RustSBI specialized domain knowledge quiz LLM tool that extracts domain knowledge from various sources such as Rust Documentation, RISC-V Documentation, Bouffalo Docs, Bouffalo SDK, and Xiangshan Docs. It also provides resources for LLM prompt engineering and RAG engineering, including guides and existing projects related to retrieval-augmented generation (RAG) systems.

github

: 101

MiniAI-Face-Recognition-LivenessDetection-AndroidSDK

MiniAiLive provides system integrators with fast, flexible and extremely precise facial recognition with 3D passive face liveness detection (face anti-spoofing) that can be deployed across a number of scenarios, including security, access control, public safety, fintech, smart retail and home protection.

github

: 307

enhance_llm

The enhance_llm repository contains three main parts: 1. Vector model domain fine-tuning based on llama_index and qwen fine-tuning BGE vector model. 2. Large model domain fine-tuning based on PEFT fine-tuning qwen1.5-7b-chat, with sft and dpo. 3. High-order retrieval enhanced generation (RAG) system based on the above domain work, implementing a two-stage RAG system. It includes query rewriting, recall reordering, retrieval reordering, multi-turn dialogue, and more. The repository also provides hardware and environment configurations along with star history and licensing information.

github

: 142

Awesome-LLM-Quantization

Awesome-LLM-Quantization is a curated list of resources related to quantization techniques for Large Language Models (LLMs). Quantization is a crucial step in deploying LLMs on resource-constrained devices, such as mobile phones or edge devices, by reducing the model's size and computational requirements.

github

: 310

For similar tasks

Awesome-LLM-RAG

This repository, Awesome-LLM-RAG, aims to record advanced papers on Retrieval Augmented Generation (RAG) in Large Language Models (LLMs). It serves as a resource hub for researchers interested in promoting their work related to LLM RAG by updating paper information through pull requests. The repository covers various topics such as workshops, tutorials, papers, surveys, benchmarks, retrieval-enhanced LLMs, RAG instruction tuning, RAG in-context learning, RAG embeddings, RAG simulators, RAG search, RAG long-text and memory, RAG evaluation, RAG optimization, and RAG applications.

github

: 733

Awesome_LLM_System-PaperList

Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of papers on LLMs inference and serving.

github

: 184

LLM-Tool-Survey

This repository contains a collection of papers related to tool learning with large language models (LLMs). The papers are organized according to the survey paper 'Tool Learning with Large Language Models: A Survey'. The survey focuses on the benefits and implementation of tool learning with LLMs, covering aspects such as task planning, tool selection, tool calling, response generation, benchmarks, evaluation, challenges, and future directions in the field. It aims to provide a comprehensive understanding of tool learning with LLMs and inspire further exploration in this emerging area.

github

: 220

Awesome-CVPR2024-ECCV2024-AIGC

github

: 427

LLMs-in-science

The 'LLMs-in-science' repository is a collaborative environment for organizing papers related to large language models (LLMs) and autonomous agents in the field of chemistry. The goal is to discuss trend topics, challenges, and the potential for supporting scientific discovery in the context of artificial intelligence. The repository aims to maintain a systematic structure of the field and welcomes contributions from the community to keep the content up-to-date and relevant.

github

: 103

Awesome-Papers-Autonomous-Agent

Awesome-Papers-Autonomous-Agent is a curated collection of recent papers focusing on autonomous agents, specifically interested in RL-based agents and LLM-based agents. The repository aims to provide a comprehensive resource for researchers and practitioners interested in intelligent agents that can achieve goals, acquire knowledge, and continually improve. The collection includes papers on various topics such as instruction following, building agents based on world models, using language as knowledge, leveraging LLMs as a tool, generalization across tasks, continual learning, combining RL and LLM, transformer-based policies, trajectory to language, trajectory prediction, multimodal agents, training LLMs for generalization and adaptation, task-specific designing, multi-agent systems, experimental analysis, benchmarking, applications, algorithm design, and combining with RL.

github

: 521

awesome-lifelong-llm-agent

github

: 55

LLM-Agent-Survey

LLM-Agent-Survey is a comprehensive repository that provides a curated list of papers related to Large Language Model (LLM) agents. The repository categorizes papers based on LLM-Profiled Roles and includes high-quality publications from prestigious conferences and journals. It aims to offer a systematic understanding of LLM-based agents, covering topics such as tool use, planning, and feedback learning. The repository also includes unpublished papers with insightful analysis and novelty, marked for future updates. Users can explore a wide range of surveys, tool use cases, planning workflows, and benchmarks related to LLM agents.

github

: 113

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675