
ms-agent
MS-Agent: Lightweight Framework for Empowering Agents with Autonomous Exploration in Complex Task Scenarios
Stars: 3446

MS-Agent is a lightweight framework designed to empower agents with autonomous exploration capabilities. It provides a flexible and extensible architecture for creating agents capable of tasks like code generation, data analysis, and tool calling with MCP support. The framework supports multi-agent interactions, deep research, and code generation, and is easy to extend for various applications.
README:
MS-Agent is a lightweight framework designed to empower agents with autonomous exploration capabilities. It provides a flexible and extensible architecture that allows developers to create agents capable of performing complex tasks, such as code generation, data analysis, and general-purpose tool calling with MCP (Model Context Protocol) support.
- Multi-Agent for general purpose: Chat with an agent that has tool-calling capabilities based on MCP.
- Deep Research: Enables advanced capabilities for autonomous exploration and complex task execution.
- Code Generation: Supports code generation tasks with artifacts.
- Lightweight and Extensible: Easy to extend and customize for various applications.
[WARNING] For historical archive versions, please refer to: https://github.com/modelscope/ms-agent/tree/0.8.0
- 🚀 Aug 28, 2025: Release MS-Agent v1.2.0, which includes the following updates:
  - DocResearch now supports pushing to ModelScope, HuggingFace, and GitHub for easy sharing of research reports. Refer to Doc Research for more details.
  - DocResearch now supports exporting the Markdown report to HTML, PDF, PPTX, and DOCX formats. Refer to Doc Research for more details.
  - DocResearch now supports TXT file processing and file preprocessing. Refer to Doc Research for more details.
- 🚀 July 31, 2025: Release MS-Agent v1.1.0, which includes the following updates:
  - 🔥 Support Doc Research, demo: DocResearchStudio
  - Add General Web Search Engine for Agentic Insight (DeepResearch)
  - Add Max Continuous Runs for Agent chat with MCP.
- 🚀 July 18, 2025: Release MS-Agent v1.0.0, improve the experience of Agent chat with MCP, and update the readme for Agentic Insight.
- 🚀 July 16, 2025: Release MS-Agent v1.0.0rc0, which includes the following updates:
  - Support for Agent chat with MCP (Model Context Protocol)
  - Support for Deep Research (Agentic Insight), refer to: Report_Demo, Script_Demo
  - Support for MCP-Playground
  - Add callback mechanism for Agent chat
Archive
- 🔥🔥🔥 Aug 8, 2024: CodexGraph, a new graph-based code generation tool, was released by Modelscope-Agent. It has proven effective and versatile on various code-related tasks; please check the example.
- 🔥🔥 Aug 1, 2024: A highly efficient and reliable Data Science Assistant is running on Modelscope-Agent; please find details in the example.
- 🔥 July 17, 2024: Parallel tool calling is available on Modelscope-Agent-Server; please find details in the doc.
- 🔥 June 17, 2024: Upgraded the RAG flow based on LlamaIndex, allowing users to run hybrid search over knowledge with different strategies and modalities; please find details in the doc.
- 🔥 June 6, 2024: With Modelscope-Agent-Server, Qwen2 can be used through the OpenAI SDK with tool calling ability; please find details in the doc.
- 🔥 June 4, 2024: Modelscope-Agent supports Mobile-Agent-V2 (arXiv), based on the Android ADB environment; please check the application.
- 🔥 May 17, 2024: Modelscope-Agent supports multi-role room chat in Gradio.
- May 14, 2024: Modelscope-Agent supports image input in RolePlay agents with the latest OpenAI model, GPT-4o. Developers can try this feature by specifying the image_url parameter.
- May 10, 2024: Modelscope-Agent launched a user-friendly Assistant API, and also provided a Tools API that executes utilities in isolated, secure containers; please find the document.
- Apr 12, 2024: The Ray version of the multi-agent solution is available on modelscope-agent; please find the document.
- Mar 15, 2024: Modelscope-Agent and AgentFabric (an open-source version of GPTs) are running in the production environment of ModelScope Studio.
- Feb 10, 2024: During Chinese New Year, we upgraded Modelscope-Agent to v0.3, making it easier for developers to customize various types of agents through code and to build multi-agent demos. For more details, refer to #267 and #293.
- Nov 26, 2023: AgentFabric now supports collaborative use in ModelScope's Creation Space, allowing for the sharing of custom applications in the Creation Space. The update also includes the latest GTE text embedding integration.
- Nov 17, 2023: AgentFabric released, which is an interactive framework to facilitate creation of agents tailored to various real-world applications.
- Oct 30, 2023: Facechain Agent released a version that can be run locally. For detailed usage instructions, please refer to Facechain Agent.
- Oct 25, 2023: Story Agent released a version for generating storybook illustrations that can be run locally. For detailed usage instructions, please refer to Story Agent.
- Sep 20, 2023: ModelScope GPT offers a version through Gradio that can be run locally. You can navigate to the demo/msgpt/ directory and execute bash run_msgpt.sh.
- Sep 4, 2023: Three demos, demo_qwen, demo_retrieval_agent and demo_register_tool, have been added, along with detailed tutorials.
- Sep 2, 2023: The preprint paper associated with this project was published.
- Aug 22, 2023: Support accessing various AI model APIs using ModelScope tokens.
- Aug 7, 2023: The initial version of the modelscope-agent repository was released.
# For the basic functionalities
pip install ms-agent
# For the deep research functionalities
pip install 'ms-agent[research]'
# For installation from source
git clone https://github.com/modelscope/ms-agent.git
cd ms-agent
pip install -e .
[!WARNING] As the project has been renamed to ms-agent, for versions v0.8.0 or earlier, you can install using the following command:
pip install 'modelscope-agent<=0.8.0'
To import relevant dependencies using modelscope_agent:
from modelscope_agent import ...
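Because the rename leaves two import names in circulation, a script that has to run in both environments could check which one is installed before importing anything. The following is a minimal sketch and not part of MS-Agent: the helper name `detect_agent_package` is hypothetical, and it relies only on the standard-library `importlib.util.find_spec` together with the two module names mentioned above (`ms_agent` and `modelscope_agent`).

```python
import importlib.util


def detect_agent_package(find_spec=importlib.util.find_spec):
    """Return the importable module name, preferring the renamed package.

    Hypothetical helper for illustration; `find_spec` is injectable so the
    lookup can be exercised without installing either package.
    """
    for name in ("ms_agent", "modelscope_agent"):
        # find_spec returns None when a top-level module cannot be found.
        if find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    found = detect_agent_package()
    if found is None:
        print("Neither package is installed; try: pip install ms-agent")
    else:
        print(f"Using module: {found}")
```

Checking `ms_agent` first reflects the assumption that new code should target the renamed package whenever both happen to be present.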
This project supports interaction with models via the MCP (Model Context Protocol). Below is a complete example showing how to configure and run an LLMAgent with MCP support.
✅ Chat with agents using the MCP protocol: MCP Playground
By default, the agent uses ModelScope's API inference service. Before running the agent, make sure to set your ModelScope API key.
export MODELSCOPE_API_KEY={your_modelscope_api_key}
You can find or generate your API key at https://modelscope.cn/my/myaccesstoken.
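A forgotten or unreplaced key usually surfaces only as an authentication error at request time, so it can help to check the environment before starting the agent. The sketch below is an assumption-laden illustration rather than MS-Agent behavior: the function name `require_modelscope_api_key` is hypothetical, and the only detail taken from this README is the `MODELSCOPE_API_KEY` variable name.

```python
import os


def require_modelscope_api_key(env=None):
    """Return the API key or raise a descriptive error.

    Hypothetical pre-flight check; MS-Agent itself may validate the key
    differently. `env` defaults to os.environ and is injectable for testing.
    """
    env = os.environ if env is None else env
    key = env.get("MODELSCOPE_API_KEY", "").strip()
    # Reject an unset variable and an unreplaced "{...}" placeholder alike.
    if not key or (key.startswith("{") and key.endswith("}")):
        raise RuntimeError(
            "MODELSCOPE_API_KEY is not set; export it before running the agent."
        )
    return key
```

Calling `require_modelscope_api_key()` at the top of a script fails fast with a readable message instead of a later network error.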
import asyncio

from ms_agent import LLMAgent

# Configure MCP server
mcp = {
    "mcpServers": {
        "fetch": {
            "type": "sse",
            "url": "https://{your_mcp_url}.api-inference.modelscope.net/sse"
        }
    }
}


async def main():
    # Initialize the agent with MCP configuration
    llm_agent = LLMAgent(mcp_config=mcp)
    # Run a task
    await llm_agent.run('Briefly introduce modelscope.cn')


if __name__ == '__main__':
    # Launch the async main function
    asyncio.run(main())
💡 Tip: You can find available MCP server configurations at modelscope.cn/mcp.
For example: https://modelscope.cn/mcp/servers/@modelcontextprotocol/fetch.
Replace the url in mcp["mcpServers"]["fetch"] with your own MCP server endpoint.
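Since the endpoint is the only part of the configuration that changes between servers, the dict can be generated from a URL and checked for a leftover placeholder at the same time. This is a minimal sketch under stated assumptions: `build_mcp_config` is a hypothetical helper, the default `server_name` and `transport` values simply mirror the example above, and only the dict layout is taken from this README.

```python
from urllib.parse import urlparse


def build_mcp_config(url, server_name="fetch", transport="sse"):
    """Build an MCP config dict in the shape used by the example above.

    Hypothetical helper; only the dict layout comes from the README example.
    """
    parsed = urlparse(url)
    # Catch an unreplaced "{your_mcp_url}" placeholder or a non-HTTP URL early.
    if parsed.scheme not in ("http", "https") or not parsed.netloc or "{" in url:
        raise ValueError(f"Not a usable MCP endpoint: {url!r}")
    return {"mcpServers": {server_name: {"type": transport, "url": url}}}
```

The returned dict can be passed the same way as in the example, for instance `LLMAgent(mcp_config=build_mcp_config(my_url))`, where `my_url` stands for your own endpoint.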
This project provides a framework for Deep Research, enabling agents to autonomously explore and execute complex tasks.
- Autonomous Exploration - Autonomous exploration for various complex tasks
- Multimodal - Capable of processing diverse data modalities and generating research reports rich in both text and images.
- Lightweight & Efficient - Supports a "search-then-execute" mode, completing complex research tasks within a few minutes and significantly reducing token consumption.
Here is a demonstration of the Agentic Insight framework in action, showcasing its capabilities in handling complex research tasks efficiently.
- User query
  - Chinese: 在计算化学这个领域,我们通常使用Gaussian软件模拟各种情况下分子的结构和性质计算,比如在关键词中加入'field=x+100'代表了在x方向增加了电场。但是,当体系是经典的单原子催化剂时,它属于分子催化剂,在反应环境中分子的朝向是不确定的,那么理论模拟的x方向电场和实际电场是不一致的。 请问:通常情况下,理论计算是如何模拟外加电场存在的情况?
  - English: In the field of computational chemistry, we often use Gaussian software to simulate the structure and properties of molecules under various conditions. For instance, adding 'field=x+100' to the keywords signifies an electric field applied along the x-direction. However, when dealing with a classical single-atom catalyst, which falls under molecular catalysis, the orientation of the molecule in the reaction environment is uncertain. This means the x-directional electric field in the theoretical simulation might not align with the actual electric field. So, how are external electric fields typically simulated in theoretical calculations?
https://github.com/user-attachments/assets/b1091dfc-9429-46ad-b7f8-7cbd1cf3209b
For more details, please refer to Deep Research.
This project provides a framework for Doc Research, enabling agents to autonomously explore and execute complex tasks related to document analysis and research.
- 🔍 Deep Document Research - Support deep analysis and summarization of documents
- 📝 Multiple Input Types - Support multi-file uploads and URL inputs
- 📊 Multimodal Reports - Support text and image reports in Markdown format
- 🚀 High Efficiency - Leverages powerful LLMs for fast and accurate research, with key information extraction techniques to further reduce token usage
- ⚙️ Flexible Deployment - Support local run and ModelScope Studio
- 💰 Free Model Inference - Free LLM API inference calls for ModelScope users, refer to ModelScope API-Inference
1. ModelScope Studio DocResearchStudio
2. Local Gradio Application
- Research Report for UniME: Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
For more details, refer to Doc Research
- A news collection agent ms-agent/newspaper
This project is licensed under the Apache License (Version 2.0).
Alternative AI tools for ms-agent
Similar Open Source Tools

agentic
Agentic is a lightweight and flexible Python library for building multi-agent systems. It provides a simple and intuitive API for creating and managing agents, defining their behaviors, and simulating interactions in a multi-agent environment. With Agentic, users can easily design and implement complex agent-based models to study emergent behaviors, social dynamics, and decentralized decision-making processes. The library supports various agent architectures, communication protocols, and simulation scenarios, making it suitable for a wide range of research and educational applications in the fields of artificial intelligence, machine learning, social sciences, and robotics.

atomic-agents
The Atomic Agents framework is a modular and extensible tool designed for creating powerful applications. It leverages Pydantic for data validation and serialization. The framework follows the principles of Atomic Design, providing small and single-purpose components that can be combined. It integrates with Instructor for AI agent architecture and supports various APIs like Cohere, Anthropic, and Gemini. The tool includes documentation, examples, and testing features to ensure smooth development and usage.

slime
Slime is an LLM post-training framework for RL scaling that provides high-performance training and flexible data generation capabilities. It connects Megatron with SGLang for efficient training and enables custom data generation workflows through server-based engines. The framework includes modules for training, rollout, and data buffer management, offering a comprehensive solution for RL scaling.

OpenManus-RL
OpenManus-RL is an open-source initiative focused on enhancing reasoning and decision-making capabilities of large language models (LLMs) through advanced reinforcement learning (RL)-based agent tuning. The project explores novel algorithmic structures, diverse reasoning paradigms, sophisticated reward strategies, and extensive benchmark environments. It aims to push the boundaries of agent reasoning and tool integration by integrating insights from leading RL tuning frameworks and continuously updating progress in a dynamic, live-streaming fashion.

open-webui-tools
Open WebUI Tools Collection is a set of tools for structured planning, arXiv paper search, Hugging Face text-to-image generation, prompt enhancement, and multi-model conversations. It enhances LLM interactions with academic research, image generation, and conversation management. Tools include arXiv Search Tool and Hugging Face Image Generator. Function Pipes like Planner Agent offer autonomous plan generation and execution. Filters like Prompt Enhancer improve prompt quality. Installation and configuration instructions are provided for each tool and pipe.

ragflow
RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that combines deep document understanding with Large Language Models (LLMs) to provide accurate question-answering capabilities. It offers a streamlined RAG workflow for businesses of all sizes, enabling them to extract knowledge from unstructured data in various formats, including Word documents, slides, Excel files, images, and more. RAGFlow's key features include deep document understanding, template-based chunking, grounded citations with reduced hallucinations, compatibility with heterogeneous data sources, and an automated and effortless RAG workflow. It supports multiple recall paired with fused re-ranking, configurable LLMs and embedding models, and intuitive APIs for seamless integration with business applications.

ktransformers
KTransformers is a flexible Python-centric framework designed to enhance the user's experience with advanced kernel optimizations and placement/parallelism strategies for Transformers. It provides a Transformers-compatible interface, RESTful APIs compliant with OpenAI and Ollama, and a simplified ChatGPT-like web UI. The framework aims to serve as a platform for experimenting with innovative LLM inference optimizations, focusing on local deployments constrained by limited resources and supporting heterogeneous computing opportunities like GPU/CPU offloading of quantized models.

trae-agent
Trae-agent is a Python library for building and training reinforcement learning agents. It provides a simple and flexible framework for implementing various reinforcement learning algorithms and experimenting with different environments. With Trae-agent, users can easily create custom agents, define reward functions, and train them on a variety of tasks. The library also includes utilities for visualizing agent performance and analyzing training results, making it a valuable tool for both beginners and experienced researchers in the field of reinforcement learning.

nekro-agent
Nekro Agent is an AI chat plugin and proxy execution bot that is highly scalable, offers high freedom, and has minimal deployment requirements. It features context-aware chat for group/private chats, custom character settings, sandboxed execution environment, interactive image resource handling, customizable extension development interface, easy deployment with docker-compose, integration with Stable Diffusion for AI drawing capabilities, support for various file types interaction, hot configuration updates and command control, native multimodal understanding, visual application management control panel, CoT (Chain of Thought) support, self-triggered timers and holiday greetings, event notification understanding, and more. It allows for third-party extensions and AI-generated extensions, and includes features like automatic context trigger based on LLM, and a variety of basic commands for bot administrators.

youtu-graphrag
Youtu-GraphRAG is a vertically unified agentic paradigm that connects the entire framework based on graph schema, allowing seamless domain transfer with minimal intervention. It introduces key innovations like schema-guided hierarchical knowledge tree construction, dually-perceived community detection, agentic retrieval, advanced construction and reasoning capabilities, fair anonymous dataset 'AnonyRAG', and unified configuration management. The framework demonstrates robustness with lower token cost and higher accuracy compared to state-of-the-art methods, enabling enterprise-scale deployment with minimal manual intervention for new domains.

context-portal
Context-portal is a versatile tool for managing and visualizing data in a collaborative environment. It provides a user-friendly interface for organizing and sharing information, making it easy for teams to work together on projects. With features such as customizable dashboards, real-time updates, and seamless integration with popular data sources, Context-portal streamlines the data management process and enhances productivity. Whether you are a data analyst, project manager, or team leader, Context-portal offers a comprehensive solution for optimizing workflows and driving better decision-making.

mcp-context-forge
MCP Context Forge is a powerful tool for generating context-aware data for machine learning models. It provides functionalities to create diverse datasets with contextual information, enhancing the performance of AI algorithms. The tool supports various data formats and allows users to customize the context generation process easily. With MCP Context Forge, users can efficiently prepare training data for tasks requiring contextual understanding, such as sentiment analysis, recommendation systems, and natural language processing.

deepflow
DeepFlow is an open-source project that provides deep observability for complex cloud-native and AI applications. It offers Zero Code data collection with eBPF for metrics, distributed tracing, request logs, and function profiling. DeepFlow is integrated with SmartEncoding to achieve Full Stack correlation and efficient access to all observability data. With DeepFlow, cloud-native and AI applications automatically gain deep observability, removing the burden of developers continually instrumenting code and providing monitoring and diagnostic capabilities covering everything from code to infrastructure for DevOps/SRE teams.

ml-engineering
This repository provides a comprehensive collection of methodologies, tools, and step-by-step instructions for successful training of large language models (LLMs) and multi-modal models. It is a technical resource suitable for LLM/VLM training engineers and operators, containing numerous scripts and copy-n-paste commands to facilitate quick problem-solving. The repository is an ongoing compilation of the author's experiences training BLOOM-176B and IDEFICS-80B models, and currently focuses on the development and training of Retrieval Augmented Generation (RAG) models at Contextual.AI. The content is organized into six parts: Insights, Hardware, Orchestration, Training, Development, and Miscellaneous. It includes key comparison tables for high-end accelerators and networks, as well as shortcuts to frequently needed tools and guides. The repository is open to contributions and discussions, and is licensed under Attribution-ShareAlike 4.0 International.

tensorzero
TensorZero is an open-source platform that helps LLM applications graduate from API wrappers into defensible AI products. It enables a data & learning flywheel for LLMs by unifying inference, observability, optimization, and experimentation. The platform includes a high-performance model gateway, structured schema-based inference, observability, experimentation, and data warehouse for analytics. TensorZero Recipes optimize prompts and models, and the platform supports experimentation features and GitOps orchestration for deployment.
For similar tasks

Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (containing a demo web application, Power BI reports, Synapse resources, AML Notebooks, etc.) that can be deployed in a customer’s subscription using the CAPE tool within a few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.

sorrentum
Sorrentum is an open-source project that aims to combine open-source development, startups, and brilliant students to build machine learning, AI, and Web3 / DeFi protocols geared towards finance and economics. The project provides opportunities for internships, research assistantships, and development grants, as well as the chance to work on cutting-edge problems, learn about startups, write academic papers, and get internships and full-time positions at companies working on Sorrentum applications.

tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.

zep-python
Zep is an open-source platform for building and deploying large language model (LLM) applications. It provides a suite of tools and services that make it easy to integrate LLMs into your applications, including chat history memory, embedding, vector search, and data enrichment. Zep is designed to be scalable, reliable, and easy to use, making it a great choice for developers who want to build LLM-powered applications quickly and easily.

telemetry-airflow
This repository codifies the Airflow cluster that is deployed at workflow.telemetry.mozilla.org (behind SSO) and commonly referred to as "WTMO" or simply "Airflow". Some links relevant to users and developers of WTMO: * The `dags` directory in this repository contains some custom DAG definitions * Many of the DAGs registered with WTMO don't live in this repository, but are instead generated from ETL task definitions in bigquery-etl * The Data SRE team maintains a WTMO Developer Guide (behind SSO)

mojo
Mojo is a new programming language that bridges the gap between research and production by combining Python syntax and ecosystem with systems programming and metaprogramming features. Mojo is still young, but it is designed to become a superset of Python over time.

pandas-ai
PandasAI is a Python library that makes it easy to ask questions to your data in natural language. It helps you to explore, clean, and analyze your data using generative AI.

databend
Databend is an open-source cloud data warehouse that serves as a cost-effective alternative to Snowflake. With its focus on fast query execution and data ingestion, it's designed for complex analysis of the world's largest datasets.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: it is self-contained, with no need for a DBMS or cloud service; it exposes an OpenAPI interface that is easy to integrate with existing infrastructure (e.g., Cloud IDE); and it supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.