AgentSquare

The official implementation of the paper "AgentSquare: Automatic LLM Agent Search in Modular Design Space"

Stars: 163

AgentSquare is the official implementation of the paper 'AgentSquare: Automatic LLM Agent Search in Modular Design Space'. It provides code, prompts, and results for automatic LLM agent search. Users set up an OpenAI API key, install dependencies, and run tasks such as ALFWorld, WebShop, M3Tooleval, and Sciworld. Users can also contribute new modules to the modular design challenge by standardizing their LLM agents with the recommended I/O interfaces. The project aims to offer a platform for fully exploiting successful agent designs and consolidating the efforts of the LLM agent research community.

README:

[ICLR 2025] AgentSquare: Automatic LLM Agent Search In Modular Design Space

Code License Python 3.8+

🌐 Website | 📃 Paper

The official implementation of the paper "AgentSquare: Automatic LLM Agent Search in Modular Design Space", with code, prompts, and results.

🎉 News

🌎 Setup

  1. Set up your OpenAI API key and store it in an environment variable.
export OPENAI_API_KEY=<YOUR KEY HERE>
  2. Install dependencies
git clone https://github.com/tsinghua-fib-lab/AgentSquare.git
conda create -n agentsquare python=3.9.12
conda activate agentsquare
cd AgentSquare
pip install -r requirements.txt
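
Before launching a task, it can help to confirm that the key from step 1 is actually visible to the current shell. The sketch below is an illustrative check, not part of the repository; `check_openai_key` is a hypothetical helper name.

```shell
# Hypothetical helper (not shipped with AgentSquare): fail early if the
# OPENAI_API_KEY environment variable is missing or empty.
check_openai_key() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
    # Report success without echoing the secret itself.
    echo "OPENAI_API_KEY is set"
}
```

Calling `check_openai_key` in the same shell where you ran `export` prints a confirmation, or returns a non-zero status if the variable is missing.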

🚀 Quick Start: Demo with ALFWorld

https://github.com/user-attachments/assets/23090869-8c60-4ee8-98ec-75dd6f4255a0

An example script that combines different agent modules to solve ALFWorld tasks:

export ALFWORLD_DATA=<Your path>/AgentSquare/tasks/alfworld
cd tasks/alfworld
sh run.sh
# or
python3 alfworld_run.py \
    --planning deps \
    --reasoning cot \
    --tooluse none \
    --memory dilu \
    --model gpt-3.5-turbo-0125
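
To compare several module combinations, one option is to generate the commands first and inspect them before spending API calls. The following is a minimal sketch, not part of the repository: `alfworld_cmd` and `alfworld_grid` are hypothetical helpers, and the sketch assumes `alfworld_run.py` accepts `none` for `--planning` and `--memory` (this README shows that value for those flags only with the Sciworld script).

```shell
# Hypothetical helper: print (do not run) the ALFWorld command for one
# module combination. Arguments: planning reasoning tooluse memory model.
alfworld_cmd() {
    printf 'python3 alfworld_run.py --planning %s --reasoning %s --tooluse %s --memory %s --model %s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical helper: enumerate a small grid of planning and memory modules.
# The "none" values for planning and memory are an assumption for ALFWorld.
alfworld_grid() {
    for planning in deps none; do
        for memory in dilu none; do
            alfworld_cmd "$planning" cot none "$memory" gpt-3.5-turbo-0125
        done
    done
}

alfworld_grid
```

Once the printed commands look right, they can be run one at a time from `tasks/alfworld`.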

🔎 Run Other Tasks

Install dependencies

cd tasks
pip install -r requirements.txt

WebShop

Install the WebShop environment following the instructions here and launch the WebShop webpage.

cd tasks/webshop
sh run.sh

M3Tooleval

cd tasks/m3tooleval
sh run.sh

Sciworld

Install the Sciworld environment following the instructions here.

cd tasks/sciworld/agentboard
python3 eval_main_sci.py \
    --cfg-path ../eval_configs/main_results_all_tasks.yaml \
    --tasks scienceworld \
    --wandb \
    --log_path ../results/gpt-4o-2024-08-06 \
    --project_name evaluate-gpt-4o-2024-08-06 \
    --baseline_dir ../data/baseline_results \
    --model gpt-4o-2024-08-06 \
    --planning none \
    --reasoning cot \
    --tooluse none \
    --memory none

🌟 Modular Design Challenge

We kindly invite you to participate in the modular design challenge by standardizing your LLM agents with our recommended I/O interfaces. Let's work together to offer a platform for fully exploiting the potential of successful agent designs and consolidating the collective efforts of the LLM agent research community!

Contribute New Modules

For guidance on standardizing the I/O interfaces of the four types of agent modules (planning, reasoning, tool use, and memory), please refer to module pools, which provides some existing modules, and to the complete interface description in module interface description. Click here for a detailed procedure. You can submit your standardized modules through this link. The .py file format is preferred; examples can be found in the modules folder. We will review your submission promptly, and once it is approved we will cite and acknowledge your work in this repository.

💡 How to Add Your Own Task

To add your own task, refer to workflow.py and integrate it with your encapsulated task, following the example in tasks/alfworld.

Citations

Please consider citing our paper and starring this repo if you use AgentSquare and find it useful. Thanks! Feel free to contact [email protected] or open an issue if you have any questions.

@article{shang2024agentsquare,
  title={AgentSquare: Automatic LLM Agent Search in Modular Design Space},
  author={Shang, Yu and Li, Yu and Zhao, Keyu and Ma, Likai and Liu, Jiahe and Xu, Fengli and Li, Yong},
  journal={arXiv preprint arXiv:2410.06153},
  year={2024}
}
