OmniSteward

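🐼 An all-around steward built on an LLM agent. Through voice or text interaction, it calls tools to control smart home devices (HA/Mi Home) and your computer. Highly extensible, with unlimited possibilities.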

Stars: 66

OmniSteward is an AI-powered steward system based on large language models that can interact with users through voice or text to help control smart home devices and computer programs. It supports multi-turn dialogue, tool calling for complex tasks, multiple LLM models, voice recognition, smart home control, computer program management, online information retrieval, command line operations, and file management. The system is highly extensible, allowing users to customize and share their own tools.

README:

OmniSteward (All-Around Steward)

Chinese documentation

Note: This project is still under active development and some features may be unstable. Please use it with caution.

The English README is automatically generated. Refer to the Chinese version for the most accurate information.

This is an AI-powered steward system based on large language models that can interact with users through voice or text to help control smart home devices and computer programs.

News

  • 2024-12-18: Added support for HomeAssistant. OmniSteward can now control HomeAssistant/Mi Home devices; see omni-ha for details.

Highlights

  • Supports multi-turn dialogue for continuous user interaction
  • Supports tool calling to execute complex tasks on your computer
  • Supports multiple LLM models that can be switched as needed
  • Highly extensible - you can easily customize and share your own tools

Main Features

  • 🎤 Voice recognition and interaction
  • 🏠 Smart home control (HomeAssistant/Bemfa devices/Mi Home devices)
  • 💻 Computer program management (start/stop programs)
  • 🔍 Online information retrieval (via Stepfun Web Search or Kimi AI)
  • ⌨️ Command line operations
  • 📂 File management (file search/read/write/compress/list directory)

Demo Video

We have prepared a series of demo videos. Watch them to understand the main features and usage of the system.

System Requirements

  • Python 3.8+
  • Chrome browser (for Kimi AI functionality)
  • Windows OS (some features support only Windows; Linux and macOS are untested)

Installation

  1. Clone repository
git clone https://github.com/OmniSteward/OmniSteward.git
cd OmniSteward
  2. Install dependencies
pip install -r requirements.txt

Environment Variables Configuration

See examples/env.cmd file

OPENAI_API_BASE=your_api_base # OpenAI format API base URL
OPENAI_API_KEY=your_api_key   # OpenAI format API key
SILICON_FLOW_API_KEY=your_api_key   # Silicon Flow API key for ASR, Rerank, see [LLM Platforms](docs/PLATFORM.md)
BEMFA_UID=your_bemfa_uid            # Bemfa platform UID (optional, for smart home control)
BEMFA_TOPIC=your_bemfa_topic        # Bemfa platform Topic (optional, for smart home control)
KIMI_PROFILE_PATH=path_to_chrome_profile    # Chrome user data directory (optional, for Kimi AI, uses default path if not set)
LOCATION=your_location                     # Your geographic location (optional, for system prompts)
LLM_MODEL=your_llm_model                   # LLM model to use, optional, defaults to Qwen2.5-7B-Instruct

To obtain an OpenAI-format API key and base URL, see LLM Platforms (docs/PLATFORM.md).
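
The variables above can be checked before launching anything. The following is a minimal sketch, not code from the project: the `StewardEnv` class and `load_env` function are hypothetical names, and treating the three variables not marked "optional" as required is an assumption based on the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: the variables not marked "optional" in the README are required.
REQUIRED = ("OPENAI_API_BASE", "OPENAI_API_KEY", "SILICON_FLOW_API_KEY")


@dataclass
class StewardEnv:
    """Hypothetical container for the documented environment variables."""
    openai_api_base: str
    openai_api_key: str
    silicon_flow_api_key: str
    bemfa_uid: Optional[str]
    bemfa_topic: Optional[str]
    kimi_profile_path: Optional[str]
    location: Optional[str]
    llm_model: str


def load_env(env: Mapping[str, str] = os.environ) -> StewardEnv:
    """Read the documented variables and fail early if a required one is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return StewardEnv(
        openai_api_base=env["OPENAI_API_BASE"],
        openai_api_key=env["OPENAI_API_KEY"],
        silicon_flow_api_key=env["SILICON_FLOW_API_KEY"],
        bemfa_uid=env.get("BEMFA_UID"),
        bemfa_topic=env.get("BEMFA_TOPIC"),
        kimi_profile_path=env.get("KIMI_PROFILE_PATH"),
        location=env.get("LOCATION"),
        # Default model name taken from the README.
        llm_model=env.get("LLM_MODEL") or "Qwen2.5-7B-Instruct",
    )
```

Calling `load_env()` with no arguments reads the current process environment, so it only sees the values if it runs in the same command prompt session where examples\env.cmd was applied.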

Launch

This project supports two usage modes:

  • Command Line Interface (CLI): interact directly through the command line.
  • Web Mode: interact through the WebUI (requires the frontend project). It can be used remotely from a phone, tablet, or computer to manage smart home devices.

Command Line Mode (CLI)

First, configure the environment variables in the examples/env.cmd file (see Environment Variables Configuration).

Microphone Voice Input Mode

First start the VAD service:

python -m servers.vad_rpc

Then open a new command prompt window and run:

REM Apply environment variables
call examples\env.cmd
REM Run CLI
python -m core.cli --config configs/cli.py

See examples/cli_voice.cmd for more details

Text Input Mode

REM Apply environment variables
call examples\env.cmd
python -m core.cli --query "open NetEase Music" --config configs/cli.py
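
When text mode is driven from a script rather than typed by hand, building the argument list explicitly avoids shell quoting problems with queries that contain spaces. This is a small sketch: `build_cli_command` is a hypothetical helper, and only the `--query` and `--config` flags shown above are used.

```python
import sys
from typing import List


def build_cli_command(query: str, config: str = "configs/cli.py") -> List[str]:
    """Return the argument list for a one-shot text query to the CLI."""
    # sys.executable keeps the call inside the currently active Python environment.
    return [sys.executable, "-m", "core.cli", "--query", query, "--config", config]
```

The resulting list can be passed to `subprocess.run(..., cwd=<project root>)`, provided the environment variables have already been applied in the calling process.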

Adding Simple Custom Tools

REM Apply environment variables
call examples\env.cmd
python -m core.cli --query "print hello" --config configs/cli_custom_tool.py

This example adds a simple print tool in configs/cli_custom_tool.py that can print any string. Check this file to learn how to add your own custom tools.
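
For readers unfamiliar with tool calling in general, the sketch below shows the common OpenAI-style pattern: a function, a JSON schema describing it to the model, and a dispatcher that runs the call the model requests. It is not the content of configs/cli_custom_tool.py and does not use OmniSteward's tool API; `print_text`, `PRINT_TOOL_SCHEMA`, `TOOLS`, and `dispatch` are all hypothetical names.

```python
import json
from typing import Any, Callable, Dict


def print_text(text: str) -> str:
    """Hypothetical tool: print a string and report what was printed."""
    print(text)
    return f"printed: {text}"


# Tool description in the OpenAI function-calling format, sent to the model
# so it knows the tool exists and which arguments it accepts.
PRINT_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "print_text",
        "description": "Print any string to the console.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "The string to print."}
            },
            "required": ["text"],
        },
    },
}

# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"print_text": print_text}


def dispatch(name: str, arguments_json: str) -> Any:
    """Run the tool the model asked for; arguments arrive as a JSON string."""
    func = TOOLS[name]
    return func(**json.loads(arguments_json))
```

For the query "print hello", a model using this schema would be expected to request `print_text` with `{"text": "hello"}`, which `dispatch` then executes.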

Web Mode

  • Requires the frontend WebUI, called OmniSteward-Frontend
  • Environment variables must be configured, especially the Silicon Flow API key
  • The frontend WebUI should run on http://localhost:3000; once started, the backend forwards requests to the frontend
  • The backend service should run on http://localhost:8000
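
If the page does not load, a quick check of the two ports listed above can tell which service is down. This is a sketch using only the standard library; `is_listening` and `check_web_mode` are hypothetical helper names, and the port numbers are the ones stated in the list.

```python
import socket
from typing import Dict


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def check_web_mode(host: str = "localhost") -> Dict[str, bool]:
    """Report whether the frontend (3000) and backend (8000) accept connections."""
    return {
        "frontend:3000": is_listening(host, 3000),
        "backend:8000": is_listening(host, 8000),
    }
```

A successful TCP connection only shows that something is listening on the port, not that it is the expected service.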

Start Backend Service

First, configure the environment variables in the examples/env.cmd file (see Environment Variables Configuration), then run the following from the project root:

REM Apply environment variables
call examples\env.cmd
python -m servers.steward --config configs/backend.py

Start Frontend Service

See OmniSteward-Frontend project.

Usage

Open http://localhost:8000 in Chrome or Edge to start using the system.

Note: For external network access, Chrome and Edge block the microphone on plain HTTP pages by default. Open chrome://flags/#unsafely-treat-insecure-origin-as-secure and set it to http://ip:port; otherwise the microphone cannot be used. See tutorial for reference.

Mobile phones can also use Chrome or Edge: open http://ip:port and apply the same setting as above.
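
The `http://ip:port` value has to match the address other devices actually use. The sketch below guesses the machine's LAN address and formats the origin string for the Chrome flag. `guess_lan_ip` and `flag_origin` are hypothetical helpers, and the default port 8000 is an assumption based on the backend address given earlier.

```python
import socket


def guess_lan_ip() -> str:
    """Best-effort LAN address of this machine; falls back to 127.0.0.1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route.
        # 192.0.2.1 is a reserved documentation address and is never contacted.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def flag_origin(host: str, port: int = 8000) -> str:
    """Format the origin to enter in the insecure-origin Chrome flag."""
    return f"http://{host}:{port}"
```

On machines with several network interfaces the guessed address may not be the one reachable from the phone, so it is worth confirming against the system's network settings.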

Available Tools and Customization

See TOOL_LIST.md

Notes

  • Some features require specific API keys and environment configuration
  • Command line tools require user confirmation before execution
  • Smart home control features require corresponding hardware support
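
The confirmation step for command line tools can be pictured as follows. This is a generic sketch of the idea, not the project's implementation: `run_with_confirmation` is a hypothetical function, and the prompt wording and accepted answers are assumptions.

```python
import subprocess
import sys
from typing import Callable, List, Optional

# Harmless example command: ask the current Python interpreter to print "hi".
EXAMPLE_COMMAND: List[str] = [sys.executable, "-c", "print('hi')"]


def run_with_confirmation(
    command: List[str],
    ask: Callable[[str], str] = input,
) -> Optional[subprocess.CompletedProcess]:
    """Show the command, run it only on an explicit yes, otherwise return None."""
    answer = ask(f"Run {' '.join(command)!r}? [y/N] ").strip().lower()
    if answer not in ("y", "yes"):
        return None  # anything other than yes is treated as a refusal
    # Passing a list (no shell=True) avoids shell interpretation of the arguments.
    return subprocess.run(command, capture_output=True, text=True)
```

Defaulting to "no" means an accidental Enter keypress does not execute a command the model proposed.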

Contributing

This project is currently maintained by ElliottZheng. Issues and pull requests are welcome!

Thanks

Thanks to the Stepfun Stars Program for supporting this project.

License

MIT License

Copyright (c) 2024-present ElliottZheng

More Custom Tool Examples

See the steward-utils project for more custom tool examples.