
ai-manus
AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.
Stars: 959

AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment. It offers deployment with minimal dependencies, supports multiple tools like Terminal, Browser, File, Web Search, and messaging tools, allocates separate sandboxes for tasks, manages session history, supports stopping and interrupting conversations, file upload and download, and is multilingual. The system also provides user login and authentication. The project primarily relies on Docker for development and deployment, with model capability requirements and recommended Deepseek and GPT models.
README:
AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.
Enjoy your own agent with AI Manus!
👏 Join QQ Group(1005477581)
https://github.com/user-attachments/assets/37060a09-c647-4bcb-920c-959f7fa73ebe
- Task: Latest LLM papers
https://github.com/user-attachments/assets/4e35bc4d-024a-4617-8def-a537a94bd285
- Task: Write a complex Python example
https://github.com/user-attachments/assets/765ea387-bb1c-4dc2-b03e-716698feef77
- Deployment: Minimal deployment requires only an LLM service, with no dependency on other external services.
- Tools: Supports Terminal, Browser, File, Web Search, and messaging tools with real-time viewing and takeover capabilities, supports external MCP tool integration.
- Sandbox: Each task is allocated a separate sandbox that runs in a local Docker environment.
- Task Sessions: Session history is managed through MongoDB/Redis, supporting background tasks.
- Conversations: Supports stopping and interrupting, file upload and download.
- Multilingual: Supports both Chinese and English.
- Authentication: User login and authentication.
- Tools: Support for Deploy & Expose.
- Sandbox: Support for mobile and Windows computer access.
- Deployment: Support for K8s and Docker Swarm multi-cluster deployment.
When a user initiates a conversation:
- Web sends a request to create an Agent to the Server, which creates a Sandbox through
/var/run/docker.sock
and returns a session ID. - The Sandbox is an Ubuntu Docker environment that starts Chrome browser and API services for tools like File/Shell.
- Web sends user messages to the session ID, and when the Server receives user messages, it forwards them to the PlanAct Agent for processing.
- During processing, the PlanAct Agent calls relevant tools to complete tasks.
- All events generated during Agent processing are sent back to Web via SSE.
When users browse tools:
- Browser:
- The Sandbox's headless browser starts a VNC service through xvfb and x11vnc, and converts VNC to websocket through websockify.
- Web's NoVNC component connects to the Sandbox through the Server's Websocket Forward, enabling browser viewing.
- Other tools: Other tools work on similar principles.
This project primarily relies on Docker for development and deployment, requiring a relatively new version of Docker:
- Docker 20.10+
- Docker Compose
Model capability requirements:
- Compatible with OpenAI interface
- Support for FunctionCall
- Support for Json Format output
Deepseek and GPT models are recommended.
Docker Compose is recommended for deployment:
services:
frontend:
image: simpleyyt/manus-frontend
ports:
- "5173:80"
depends_on:
- backend
restart: unless-stopped
networks:
- manus-network
environment:
- BACKEND_URL=http://backend:8000
backend:
image: simpleyyt/manus-backend
depends_on:
- sandbox
restart: unless-stopped
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
#- ./mcp.json:/etc/mcp.json # Mount MCP servers directory
networks:
- manus-network
environment:
# OpenAI API base URL
- API_BASE=https://api.openai.com/v1
# OpenAI API key, replace with your own
- API_KEY=sk-xxxx
# LLM model name
- MODEL_NAME=gpt-4o
# LLM temperature parameter, controls randomness
- TEMPERATURE=0.7
# Maximum tokens for LLM response
- MAX_TOKENS=2000
# MongoDB connection URI
#- MONGODB_URI=mongodb://mongodb:27017
# MongoDB database name
#- MONGODB_DATABASE=manus
# MongoDB username (optional)
#- MONGODB_USERNAME=
# MongoDB password (optional)
#- MONGODB_PASSWORD=
# Redis server hostname
#- REDIS_HOST=redis
# Redis server port
#- REDIS_PORT=6379
# Redis database number
#- REDIS_DB=0
# Redis password (optional)
#- REDIS_PASSWORD=
# Sandbox server address (optional)
#- SANDBOX_ADDRESS=
# Docker image used for the sandbox
- SANDBOX_IMAGE=simpleyyt/manus-sandbox
# Prefix for sandbox container names
- SANDBOX_NAME_PREFIX=sandbox
# Time-to-live for sandbox containers in minutes
- SANDBOX_TTL_MINUTES=30
# Docker network for sandbox containers
- SANDBOX_NETWORK=manus-network
# Chrome browser arguments for sandbox (optional)
#- SANDBOX_CHROME_ARGS=
# HTTPS proxy for sandbox (optional)
#- SANDBOX_HTTPS_PROXY=
# HTTP proxy for sandbox (optional)
#- SANDBOX_HTTP_PROXY=
# No proxy hosts for sandbox (optional)
#- SANDBOX_NO_PROXY=
# Search engine configuration
# Options: baidu, google, bing
- SEARCH_PROVIDER=bing
# Google search configuration, only used when SEARCH_PROVIDER=google
#- GOOGLE_SEARCH_API_KEY=
#- GOOGLE_SEARCH_ENGINE_ID=
# Auth configuration
# Options: password, none, local
- AUTH_PROVIDER=password
# Password auth configuration, only used when AUTH_PROVIDER=password
- PASSWORD_SALT=
- PASSWORD_HASH_ROUNDS=10
# Local auth configuration, only used when AUTH_PROVIDER=local
#- [email protected]
#- LOCAL_AUTH_PASSWORD=admin
# JWT configuration
- JWT_SECRET_KEY=your-secret-key-here
- JWT_ALGORITHM=HS256
- JWT_ACCESS_TOKEN_EXPIRE_MINUTES=30
- JWT_REFRESH_TOKEN_EXPIRE_DAYS=7
# Email configuration
# Only used when AUTH_PROVIDER=password
#- EMAIL_HOST=smtp.gmail.com
#- EMAIL_PORT=587
#- [email protected]
#- EMAIL_PASSWORD=your-password
#- [email protected]
# MCP configuration file path
#- MCP_CONFIG_PATH=/etc/mcp.json
# Application log level
- LOG_LEVEL=INFO
sandbox:
image: simpleyyt/manus-sandbox
command: /bin/sh -c "exit 0" # prevent sandbox from starting, ensure image is pulled
restart: "no"
networks:
- manus-network
mongodb:
image: mongo:7.0
volumes:
- mongodb_data:/data/db
restart: unless-stopped
#ports:
# - "27017:27017"
networks:
- manus-network
redis:
image: redis:7.0
restart: unless-stopped
networks:
- manus-network
volumes:
mongodb_data:
name: manus-mongodb-data
networks:
manus-network:
name: manus-network
driver: bridge
Save as docker-compose.yml
file, and run:
docker compose up -d
Note: If you see
sandbox-1 exited with code 0
, this is normal, as it ensures the sandbox image is successfully pulled locally.
Open your browser and visit http://localhost:5173 to access Manus.
This project consists of three independent sub-projects:
-
frontend
: manus frontend -
backend
: Manus backend -
sandbox
: Manus sandbox
- Download the project:
git clone https://github.com/simpleyyt/ai-manus.git
cd ai-manus
- Copy the configuration file:
cp .env.example .env
- Modify the configuration file:
# Model provider configuration
API_KEY=
API_BASE=http://mockserver:8090/v1
# Model configuration
MODEL_NAME=deepseek-chat
TEMPERATURE=0.7
MAX_TOKENS=2000
# MongoDB configuration
#MONGODB_URI=mongodb://mongodb:27017
#MONGODB_DATABASE=manus
#MONGODB_USERNAME=
#MONGODB_PASSWORD=
# Redis configuration
#REDIS_HOST=redis
#REDIS_PORT=6379
#REDIS_DB=0
#REDIS_PASSWORD=
# Sandbox configuration
#SANDBOX_ADDRESS=
SANDBOX_IMAGE=simpleyyt/manus-sandbox
SANDBOX_NAME_PREFIX=sandbox
SANDBOX_TTL_MINUTES=30
SANDBOX_NETWORK=manus-network
#SANDBOX_CHROME_ARGS=
#SANDBOX_HTTPS_PROXY=
#SANDBOX_HTTP_PROXY=
#SANDBOX_NO_PROXY=
# Search engine configuration
# Options: baidu, google, bing
SEARCH_PROVIDER=bing
# Google search configuration, only used when SEARCH_PROVIDER=google
#GOOGLE_SEARCH_API_KEY=
#GOOGLE_SEARCH_ENGINE_ID=
# Auth configuration
# Options: password, none, local
AUTH_PROVIDER=password
# Password auth configuration, only used when AUTH_PROVIDER=password
PASSWORD_SALT=
PASSWORD_HASH_ROUNDS=10
# Local auth configuration, only used when AUTH_PROVIDER=local
#[email protected]
#LOCAL_AUTH_PASSWORD=admin
# JWT configuration
JWT_SECRET_KEY=your-secret-key-here
JWT_ALGORITHM=HS256
JWT_ACCESS_TOKEN_EXPIRE_MINUTES=30
JWT_REFRESH_TOKEN_EXPIRE_DAYS=7
# Email configuration
# Only used when AUTH_PROVIDER=password
#EMAIL_HOST=smtp.gmail.com
#EMAIL_PORT=587
#[email protected]
#EMAIL_PASSWORD=your-password
#[email protected]
# MCP configuration
#MCP_CONFIG_PATH=/etc/mcp.json
# Log configuration
LOG_LEVEL=INFO
- Run in debug mode:
# Equivalent to docker compose -f docker-compose-development.yaml up
./dev.sh up
All services will run in reload mode, and code changes will be automatically reloaded. The exposed ports are as follows:
- 5173: Web frontend port
- 8000: Server API service port
- 8080: Sandbox API service port
- 5900: Sandbox VNC port
- 9222: Sandbox Chrome browser CDP port
Note: In Debug mode, only one sandbox will be started globally
- When dependencies change (requirements.txt or package.json), clean up and rebuild:
# Clean up all related resources
./dev.sh down -v
# Rebuild images
./dev.sh build
# Run in debug mode
./dev.sh up
export IMAGE_REGISTRY=your-registry-url
export IMAGE_TAG=latest
# Build images
./run build
# Push to the corresponding image repository
./run push
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for ai-manus
Similar Open Source Tools

ai-manus
AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment. It offers deployment with minimal dependencies, supports multiple tools like Terminal, Browser, File, Web Search, and messaging tools, allocates separate sandboxes for tasks, manages session history, supports stopping and interrupting conversations, file upload and download, and is multilingual. The system also provides user login and authentication. The project primarily relies on Docker for development and deployment, with model capability requirements and recommended Deepseek and GPT models.

ClaraVerse
ClaraVerse is a privacy-first AI assistant and agent builder that allows users to chat with AI, create intelligent agents, and turn them into fully functional apps. It operates entirely on open-source models running on the user's device, ensuring data privacy and security. With features like AI assistant, image generation, intelligent agent builder, and image gallery, ClaraVerse offers a versatile platform for AI interaction and app development. Users can install ClaraVerse through Docker, native desktop apps, or the web version, with detailed instructions provided for each option. The tool is designed to empower users with control over their AI stack and leverage community-driven innovations for AI development.

JetStream
JetStream is a throughput and memory optimized engine for Large Language Model (LLM) inference on XLA devices, specifically TPUs. It provides reference engine implementations for Jax and Pytorch models, along with documentation for online inference, serving Gemma using TPUs on GKE, benchmarking, observability, profiling, and standalone local setup. Users can easily set up a local server, run tests, and test core modules. JetStream aims to enhance the performance of LLM inference on XLA devices.

finite-monkey-engine
FiniteMonkey is an advanced vulnerability mining engine powered purely by GPT, requiring no prior knowledge base or fine-tuning. Its effectiveness significantly surpasses most current related research approaches. The tool is task-driven, prompt-driven, and focuses on prompt design, leveraging 'deception' and hallucination as key mechanics. It has helped identify vulnerabilities worth over $60,000 in bounties. The tool requires PostgreSQL database, OpenAI API access, and Python environment for setup. It supports various languages like Solidity, Rust, Python, Move, Cairo, Tact, Func, Java, and Fake Solidity for scanning. FiniteMonkey is best suited for logic vulnerability mining in real projects, not recommended for academic vulnerability testing. GPT-4-turbo is recommended for optimal results with an average scan time of 2-3 hours for medium projects. The tool provides detailed scanning results guide and implementation tips for users.

aichat
Aichat is an AI-powered CLI chat and copilot tool that seamlessly integrates with over 10 leading AI platforms, providing a powerful combination of chat-based interaction, context-aware conversations, and AI-assisted shell capabilities, all within a customizable and user-friendly environment.

J.A.R.V.I.S
J.A.R.V.I.S (Just A Rather Very Intelligent System) is an advanced AI assistant inspired by Iron Man's Jarvis, designed to assist with various tasks, from navigating websites to controlling your PC with natural language commands.

ChatTTS
ChatTTS is a generative speech model optimized for dialogue scenarios, providing natural and expressive speech synthesis with fine-grained control over prosodic features. It supports multiple speakers and surpasses most open-source TTS models in terms of prosody. The model is trained with 100,000+ hours of Chinese and English audio data, and the open-source version on HuggingFace is a 40,000-hour pre-trained model without SFT. The roadmap includes open-sourcing additional features like VQ encoder, multi-emotion control, and streaming audio generation. The tool is intended for academic and research use only, with precautions taken to limit potential misuse.

auto-subs
Auto-subs is a tool designed to automatically transcribe editing timelines using OpenAI Whisper and Stable-TS for extreme accuracy. It generates subtitles in a custom style, is completely free, and runs locally within Davinci Resolve. It works on Mac, Linux, and Windows, supporting both Free and Studio versions of Resolve. Users can jump to positions on the timeline using the Subtitle Navigator and translate from any language to English. The tool provides a user-friendly interface for creating and customizing subtitles for video content.

WilliamButcherBot
WilliamButcherBot is a Telegram Group Manager Bot and Userbot written in Python using Pyrogram. It provides features for managing Telegram groups and users, with ready-to-use methods available. The bot requires Python 3.9, Telegram API Key, Telegram Bot Token, and MongoDB URI. Users can install it locally or on a VPS, run it directly, generate Pyrogram session for Heroku, or use Docker for deployment. Additionally, users can write new modules to extend the bot's functionality by adding them to the wbb/modules/ directory.

AutoRAG
AutoRAG is an AutoML tool designed to automatically find the optimal RAG pipeline for your data. It simplifies the process of evaluating various RAG modules to identify the best pipeline for your specific use-case. The tool supports easy evaluation of different module combinations, making it efficient to find the most suitable RAG pipeline for your needs. AutoRAG also offers a cloud beta version to assist users in running and optimizing the tool, along with building RAG evaluation datasets for a starting price of $9.99 per optimization.

KsanaLLM
KsanaLLM is a high-performance engine for LLM inference and serving. It utilizes optimized CUDA kernels for high performance, efficient memory management, and detailed optimization for dynamic batching. The tool offers flexibility with seamless integration with popular Hugging Face models, support for multiple weight formats, and high-throughput serving with various decoding algorithms. It enables multi-GPU tensor parallelism, streaming outputs, and an OpenAI-compatible API server. KsanaLLM supports NVIDIA GPUs and Huawei Ascend NPU, and seamlessly integrates with verified Hugging Face models like LLaMA, Baichuan, and Qwen. Users can create a docker container, clone the source code, compile for Nvidia or Huawei Ascend NPU, run the tool, and distribute it as a wheel package. Optional features include a model weight map JSON file for models with different weight names.

AIaW
AIaW is a next-generation LLM client with full functionality, lightweight, and extensible. It supports various basic functions such as streaming transfer, image uploading, and latex formulas. The tool is cross-platform with a responsive interface design. It supports multiple service providers like OpenAI, Anthropic, and Google. Users can modify questions, regenerate in a forked manner, and visualize conversations in a tree structure. Additionally, it offers features like file parsing, video parsing, plugin system, assistant market, local storage with real-time cloud sync, and customizable interface themes. Users can create multiple workspaces, use dynamic prompt word variables, extend plugins, and benefit from detailed design elements like real-time content preview, optimized code pasting, and support for various file types.

Bavarder
Bavarder is an AI-powered chit-chat tool designed for informal conversations about unimportant matters. Users can engage in light-hearted discussions with the AI, simulating casual chit-chat scenarios. The tool provides a platform for users to interact with AI in a fun and entertaining way, offering a unique experience of engaging with artificial intelligence in a conversational manner.

ChatGPT-API-Faucet
ChatGPT API Faucet is a frontend project for the public platform ChatGPT API Faucet, inspired by the crypto project MultiFaucet. It allows developers in the AI ecosystem to claim $1.00 for free every 24 hours. The program is developed using the Next.js framework and React library, with key components like _app.tsx for initializing pages, index.tsx for main modifications, and Layout.tsx for defining layout components. Users can deploy the project by installing dependencies, building the project, starting the project, configuring reverse proxies or using port:IP access, and running a development server. The tool also supports token balance queries and is related to projects like one-api, ChatGPT-Cost-Calculator, and Poe.Monster. It is licensed under the MIT license.

fastRAG
fastRAG is a research framework designed to build and explore efficient retrieval-augmented generative models. It incorporates state-of-the-art Large Language Models (LLMs) and Information Retrieval to empower researchers and developers with a comprehensive tool-set for advancing retrieval augmented generation. The framework is optimized for Intel hardware, customizable, and includes key features such as optimized RAG pipelines, efficient components, and RAG-efficient components like ColBERT and Fusion-in-Decoder (FiD). fastRAG supports various unique components and backends for running LLMs, making it a versatile tool for research and development in the field of retrieval-augmented generation.

MaterialSearch
MaterialSearch is a tool for searching local images and videos using natural language. It provides functionalities such as text search for images, image search for images, text search for videos (providing matching video clips), image search for videos (searching for the segment in a video through a screenshot), image-text similarity calculation, and Pexels video search. The tool can be deployed through the source code or Docker image, and it supports GPU acceleration. Users can configure the tool through environment variables or a .env file. The tool is still under development, and configurations may change frequently. Users can report issues or suggest improvements through issues or pull requests.
For similar tasks

LLMs-at-DoD
This repository contains tutorials for using Large Language Models (LLMs) in the U.S. Department of Defense. The tutorials utilize open-source frameworks and LLMs, allowing users to run them in their own cloud environments. The repository is maintained by the Defense Digital Service and welcomes contributions from users.

ai-manus
AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment. It offers deployment with minimal dependencies, supports multiple tools like Terminal, Browser, File, Web Search, and messaging tools, allocates separate sandboxes for tasks, manages session history, supports stopping and interrupting conversations, file upload and download, and is multilingual. The system also provides user login and authentication. The project primarily relies on Docker for development and deployment, with model capability requirements and recommended Deepseek and GPT models.

kilocode
Kilo Code is an open-source VS Code AI agent that allows users to generate code from natural language, check its own work, run terminal commands, automate the browser, and utilize the latest AI models. It offers features like task automation, automated refactoring, and integration with MCP servers. Users can access 400+ AI models and benefit from transparent pricing. Kilo Code is a fork of Roo Code and Cline, with improvements and unique features developed independently.

MiniSearch
MiniSearch is a minimalist search engine with integrated browser-based AI. It is privacy-focused, easy to use, cross-platform, integrated, time-saving, efficient, optimized, and open-source. MiniSearch can be used for a variety of tasks, including searching the web, finding files on your computer, and getting answers to questions. It is a great tool for anyone who wants a fast, private, and easy-to-use search engine.

search_with_ai
Build your own conversation-based search with AI, a simple implementation with Node.js & Vue3. Live Demo Features: * Built-in support for LLM: OpenAI, Google, Lepton, Ollama(Free) * Built-in support for search engine: Bing, Sogou, Google, SearXNG(Free) * Customizable pretty UI interface * Support dark mode * Support mobile display * Support local LLM with Ollama * Support i18n * Support Continue Q&A with contexts.

search2ai
S2A allows your large model API to support networking, searching, news, and web page summarization. It currently supports OpenAI, Gemini, and Moonshot (non-streaming). The large model will determine whether to connect to the network based on your input, and it will not connect to the network for searching every time. You don't need to install any plugins or replace keys. You can directly replace the custom address in your commonly used third-party client. You can also deploy it yourself, which will not affect other functions you use, such as drawing and voice.

Tiger
Tiger is a community-driven project developing a reusable and integrated tool ecosystem for LLM Agent Revolution. It utilizes Upsonic for isolated tool storage, profiling, and automatic document generation. With Tiger, you can create a customized environment for your agents or leverage the robust and publicly maintained Tiger curated by the community itself.

chat-xiuliu
Chat-xiuliu is a bidirectional voice assistant powered by ChatGPT, capable of accessing the internet, executing code, reading/writing files, and supporting GPT-4V's image recognition feature. It can also call DALL·E 3 to generate images. The project is a fork from a background of a virtual cat girl named Xiuliu, with removed live chat interaction and added voice input. It can receive questions from microphone or interface, answer them vocally, upload images and PDFs, process tasks through function calls, remember conversation content, search the web, generate images using DALL·E 3, read/write local files, execute JavaScript code in a sandbox, open local files or web pages, customize the cat girl's speaking style, save conversation screenshots, and support Azure OpenAI and other API endpoints in openai format. It also supports setting proxies and various AI models like GPT-4, GPT-3.5, and DALL·E 3.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.