
Whimbox
Whimbox (奇想盒): a game AI agent based on large language models and image recognition, bringing you a brand-new gaming experience!
Stars: 72

Whimbox is a game AI agent based on large language models and image recognition technology, providing users with a new gaming experience. It automates daily tasks such as mining, material collection, and wish checking, as well as features like route recording, image recognition, and AI dialogue. The tool does not modify game files or memory, only captures screenshots and simulates mouse and keyboard actions. It is designed for games running in a 1920x1080 windowed mode on mid to high-end PCs, with plans for future cloud gaming support. Whimbox is grateful to open-source projects like GIA and BetterGI, as well as AI models and programming tools like chatgpt and cursor. Developers interested in contributing to the project can join the development community and explore various functionalities that need development and adaptation.
README:
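- Install dependencies (requires Python 3.12)
- Developers are advised to install dependencies manually
pip install -r requirements.txt
# Install the PaddleOCR runtime (optional; RapidOCR is currently the default, so this step can be skipped)
python -m pip install paddlepaddle-gpu==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
- Other users can run the automatic installation script
setup_env.bat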
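- Create the configuration file
Rename config_example.ini in the config directory to config.ini
Change the settings under Agent to point at your own LLM API (any OpenAI-format API works)
- Create the prompt
Rename prompt_example.txt in the config directory to prompt.txt
Add prompts to your liking, or leave the file unchanged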
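- Open the game and set it to windowed mode at 1920*1080 resolution
- Developers: run your IDE with administrator privileges, then run
whimbox.py
- Other users can run the one-click launch script with administrator privileges
run.bat
- After the program starts, wait a moment. Once the 📦 icon appears on the left side of the game screen, press / to open the dialog box and press esc to close it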
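As a rough illustration of the configuration step above, the sketch below reads an Agent section from INI text with Python's standard configparser. The key names (base_url, api_key, model) and the example values are assumptions made for illustration; the actual keys are whatever config_example.ini defines.

```python
import configparser

# Hypothetical contents of config/config.ini. The real key names in
# config_example.ini may differ; these are placeholders for illustration.
EXAMPLE_INI = """
[Agent]
base_url = https://api.example.com/v1
api_key = sk-your-key-here
model = your-model-name
"""


def load_agent_config(ini_text: str) -> dict:
    """Return the [Agent] section as a plain dict, failing loudly if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section("Agent"):
        raise KeyError("config.ini has no [Agent] section")
    return dict(parser["Agent"])


agent = load_agent_config(EXAMPLE_INI)
print(agent["base_url"], agent["model"])
```

Failing early on a missing section makes a forgotten rename of config_example.ini easier to spot than a later API error would.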
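- Daily tasks
- Automatic digging (美鸭梨挖掘)
- Automatic material escalation realm (素材激化幻境)
- Automatic check of daily wishes (朝夕心愿)
- Automatic route running
- Route recording and editing
- Automatic route running (currently only the open world and 星海 are supported)
- Automatic gathering
- AI dialogue
- Orchestrate all of the above features through natural language
- Interrupt a task at any time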
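- Framework improvements: rollback mechanism, retry mechanism.
- Multi-map adaptation
- Automatic combat, fishing, bug catching, cleaning
- Automatic piano playing (I must perform 春日影 right now!)
- Home adaptation
- Standalone launcher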
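The rollback and retry mechanisms in the list above are planned items, and the source does not describe their design. Purely as an illustrative sketch of what such a helper could look like, the following wraps a step in a bounded retry loop and calls an optional rollback hook between attempts. The names run_with_retry, rollback, and flaky_click are hypothetical and are not taken from the Whimbox codebase.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retry(
    step: Callable[[], T],
    retries: int = 3,
    delay_s: float = 0.0,
    rollback: Optional[Callable[[], None]] = None,
) -> T:
    """Call `step` up to `retries` times, running `rollback` between failed attempts."""
    last_error: Optional[Exception] = None
    for attempt in range(1, retries + 1):
        try:
            return step()
        except Exception as exc:
            last_error = exc
            if attempt < retries:
                if rollback is not None:
                    rollback()  # e.g. go back to a known screen before retrying
                time.sleep(delay_s)
    raise RuntimeError(f"step failed after {retries} attempts") from last_error


# Usage: a step that fails twice before succeeding (stands in for a UI action).
calls = {"n": 0}
recoveries = []


def flaky_click() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("button not found")
    return "clicked"


result = run_with_retry(
    flaky_click,
    retries=3,
    rollback=lambda: recoveries.append("back to main screen"),
)
print(result, calls["n"], len(recoveries))
```

Keeping the rollback hook separate from the step lets the same retry loop serve tasks that need different recovery actions.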
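- Whimbox does not modify game files or read or write game memory; it only takes screenshots and simulates mouse and keyboard input, so in theory it should not get an account banned. However, the game's user terms are very thorough and cover every possible situation, so you bear all consequences of using Whimbox yourself.
- The game itself already consumes much of the PC's performance, and image recognition adds to that load, so only mid- to high-end PCs are currently supported. A cloud gaming version will follow the official release.
- Whimbox currently only supports the game running in a 1920x1080 window.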
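Thanks to the pioneers of the open-source projects for open-world games, which Whimbox studied and used as references.
Thanks to chatgpt, cursor, claude, and the other AI models and AI programming tools.
So far the project has only validated its basic framework; many features still need to be developed and adapted. If this interests you, you are welcome to join in. Developer QQ group: 821908945.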
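Whimbox/
├── assets/
│ ├── imgs/ # Image resources
│ │ ├── Game/ # Unpacked game assets
│ │ ├── Maps/ # Map-related resources
│ │ ├── Windows/ # Game UI screenshots
│ ├── paths/ # Auto-pathfinding scripts
│ └── PPOCRModels/ # OCR model files
├── source/
│ ├── action/ # Action modules (pickup, fishing, combat, etc.)
│ ├── api/ # Third-party models such as OCR and YOLO
│ ├── common/ # Shared modules (logging, utilities, etc.)
│ ├── config/ # Configuration module
│ ├── dev_tool/ # Development tools
│ ├── ingame_ui/ # In-game chat box
│ ├── interaction/ # Core interaction module (screenshots, input)
│ ├── map/ # Map module (minimap recognition, world map operations)
│ ├── task/ # Task module (feature scripts, called via MCP)
│ │ ├── daily_task/ # Scripts for the various daily tasks
│ │ └── navigation_task/ # Auto-pathfinding scripts
│ ├── ui/ # Game UI module (pages, UI)
│ ├── view_and_move/ # Camera and movement module
│ ├── mcp_agent.py # LLM agent
│ └── mcp_server.py # MCP server
├── config/ # Configuration files
│ ├── config.ini # Program configuration file
│ └── prompt.txt # LLM prompt
├── Logs/ # Log files
├── whimbox.py # Main program entry point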
To add a feature, use the tasks in source\task\daily_task as a reference and register yours in source\mcp_server.py; the LLM can then call it.
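The register-then-call pattern described above can be sketched with a plain dictionary registry. This is a stand-in only: the real source\mcp_server.py presumably relies on an MCP server library whose usage is not shown here, and the names TOOLS, register_task, daily_dig_task, and call_tool are hypothetical.

```python
from typing import Callable, Dict

# Stand-in for the tool registry. The real source/mcp_server.py presumably
# uses an MCP server library; its API is not shown in this document.
TOOLS: Dict[str, Callable[..., str]] = {}


def register_task(name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator that exposes a task function under `name`."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        if name in TOOLS:
            raise ValueError(f"task {name!r} is already registered")
        TOOLS[name] = func
        return func
    return decorator


@register_task("daily_dig")
def daily_dig_task(max_rounds: int = 1) -> str:
    """Hypothetical daily task; a real one would drive the game UI."""
    return f"dig finished after {max_rounds} round(s)"


def call_tool(name: str, **kwargs) -> str:
    """Dispatch a call by tool name, as an agent would after choosing a tool."""
    return TOOLS[name](**kwargs)


print(call_tool("daily_dig", max_rounds=2))
```

The docstring and typed parameters matter in this pattern because tool descriptions are typically what the model sees when deciding which task to call.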
For details, see "How to record and edit routes".
Alternative AI tools for Whimbox
Similar Open Source Tools


interview-coder-cn
This is a coding problem-solving assistant for Chinese users, tailored to the domestic AI ecosystem, simple and easy to use. It provides real-time problem-solving ideas and code analysis for coding interviews, avoiding detection during screen sharing. Users can also extend its functionality for other scenarios by customizing prompt words. The tool supports various programming languages and has stealth capabilities to hide its interface from interviewers even when screen sharing.

everyday
Everyday is a story generator tool that uses AI to weave fantasy stories based on daily quotes. It features an intelligent writing engine that expands quotes into captivating short stories, a time capsule storage system for story archiving, an immersive document site with a real-time story gallery, and cloud automation for daily story generation. Users can clone the repository, activate the Python environment, configure AI keys, and start the story furnace to witness quotes transform into complete stories. The project follows the MIT open convention, allowing users to freely use, modify, and share the generated stories while preserving the original magic touch.

OpenAI-Whisper-GUI
OpenAI Whisper GUI is a modern GUI application designed to transcribe and translate audio/video files using OpenAI Whisper. It features a modern UI with light/dark mode, the ability to export transcribed text, add subtitles to videos, and more. The latest version includes updates to widgets, layouts, and themes, as well as new features such as a config handler, GPU info retrieval, a new app logo, settings interface, and bug fixes like code refactoring and fixing Cuda not found warning message. Users can easily install the tool by cloning the GitHub repository and running setup.py and main.py scripts. For more information, users can visit the OpenAI Whisper GitHub repository.

AI-Translation-Assistant-Pro
AI Translation Assistant Pro is a powerful AI-driven platform for multilingual translation and content processing. It offers features such as text translation, image recognition, PDF processing, speech recognition, and video processing. The platform includes a subscription system with different membership levels, user management functionalities, quota management, and real-time usage statistics. It utilizes technologies like Next.js, React, TypeScript for the frontend, Node.js, PostgreSQL for the backend, NextAuth.js for authentication, Stripe for payments, and integrates with cloud services like Aliyun OSS and Tencent Cloud for AI services.

AINO
AINO is a no-code system construction platform that includes front-end applications, back-end API services, and database management tools. The project structure consists of AINO-server for back-end API services, AINO-studio for front-end applications, AINO-APP for front-end client applications, docs for project documentation, start-all.sh for one-click starting of all services, stop-all.sh for stopping all services, status.sh for checking service status, and logs for service logs. AINO utilizes Hono + TypeScript + Drizzle ORM for the back-end, Next.js + React + TypeScript for the front-end, PostgreSQL for the database, and pnpm as the package manager.

NotHotDog
NotHotDog is an open-source platform for testing, evaluating, and simulating AI agents. It offers a robust framework for generating test cases, running conversational scenarios, and analyzing agent performance.

aigc-platform-server
This project aims to integrate mainstream open-source large models to achieve the coordination and cooperation between different types of large models, providing comprehensive and flexible AI content generation services.

ShitCodify
ShitCodify is an AI-powered tool that transforms normal, readable, and maintainable code into hard-to-understand, hard-to-maintain 'shit code'. It uses large language models like GPT-4 to analyze code and apply various 'anti-patterns' and bad practices to reduce code readability and maintainability while keeping the code functional.

Archon
Archon is an AI meta-agent designed to autonomously build, refine, and optimize other AI agents. It serves as a practical tool for developers and an educational framework showcasing the evolution of agentic systems. Through iterative development, Archon demonstrates the power of planning, feedback loops, and domain-specific knowledge in creating robust AI agents.

paper-ai
Paper-ai is a tool that helps you write papers using artificial intelligence. It provides features such as AI writing assistance, reference searching, and editing and formatting tools. With Paper-ai, you can quickly and easily create high-quality papers.

anki_packager
anki_packager is an intelligent tool for generating high-quality Anki flashcards for English vocabulary. It integrates multiple curated dictionaries, provides automated learning experiences, supports various features like Google TTS pronunciation and AI models for word summarization and story generation, offers convenient data import from other sources, ensures a good command-line interface, and can be run using Docker. Each flashcard includes detailed learning resources such as definitions, tenses, AI-generated roots for mnemonic aids, phrases, example sentences, word differentiations, and English explanations with AI-generated stories.

RookieAI_yolov8
RookieAI_yolov8 is an open-source project designed for developers and users interested in utilizing YOLOv8 models for object detection tasks. The project provides instructions for setting up the required libraries and Pytorch, as well as guidance on using custom or official YOLOv8 models. Users can easily train their own models and integrate them with the software. The tool offers features for packaging the code, managing model files, and organizing the necessary resources for running the software. It also includes updates and optimizations for better performance and functionality, with a focus on FPS game aimbot functionalities. The project aims to provide a comprehensive solution for object detection tasks using YOLOv8 models.

oba-live-tool
The oba live tool is a small tool for Douyin small shops and Kuaishou Baiying live broadcasts. It features multiple account management, intelligent message assistant, automatic product explanation, AI automatic reply, and AI intelligent assistant. The tool requires Windows 10 or above, Chrome or Edge browser, and a valid account for Douyin small shops or Kuaishou Baiying. Users can download the tool from the Releases page, connect to the control panel, set API keys for AI functions, and configure auto-reply prompts. The tool is licensed under the MIT license.

farfalle
Farfalle is an open-source AI-powered search engine that allows users to run their own local LLM or utilize the cloud. It provides a tech stack including Next.js for frontend, FastAPI for backend, Tavily for search API, Logfire for logging, and Redis for rate limiting. Users can get started by setting up prerequisites like Docker and Ollama, and obtaining API keys for Tavily, OpenAI, and Groq. The tool supports models like llama3, mistral, and gemma. Users can clone the repository, set environment variables, run containers using Docker Compose, and deploy the backend and frontend using services like Render and Vercel.

mushroom
MRCMS is a Java-based content management system that uses data model + template + plugin implementation, providing built-in article model publishing functionality. The goal is to quickly build small to medium websites.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.