xiaozhi-esp32

Build your own AI friend

Stars: 10176

Visit

The xiaozhi-esp32 repository is the first hardware project by Xia Ge, focusing on creating an AI chatbot using ESP32, SenseVoice, and Qwen72B. The project aims to help beginners in AI hardware development understand how to apply language models to hardware devices. It supports various functionalities such as Wi-Fi configuration, offline voice wake-up, multilingual speech recognition, voiceprint recognition, TTS using large models, and more. The project encourages participation for learning and improvement, providing resources for hardware and firmware development.

README:

小智 AI 聊天机器人（XiaoZhi AI Chatbot）

（中文 | English | 日本語）

这是虾哥的第一个硬件作品。

👉 ESP32+SenseVoice+Qwen72B打造你的AI聊天伴侣！【bilibili】

👉 给小智装上 DeepSeek 的聪明大脑【bilibili】

👉 手工打造你的 AI 女友，新手入门教程【bilibili】

项目目的

本项目是一个开源项目，以 MIT 许可证发布，允许任何人免费使用，并可以用于商业用途。

我们希望通过这个项目，能够帮助更多人入门 AI 硬件开发，了解如何将当下飞速发展的大语言模型应用到实际的硬件设备中。无论你是对 AI 感兴趣的学生，还是想要探索新技术的开发者，都可以通过这个项目获得宝贵的学习经验。

欢迎所有人参与到项目的开发和改进中来。如果你有任何想法或建议，请随时提出 Issue 或加入群聊。

学习交流 QQ 群：376893254

已实现功能

Wi-Fi / ML307 Cat.1 4G
BOOT 键唤醒和打断，支持点击和长按两种触发方式
离线语音唤醒 ESP-SR
流式语音对话（WebSocket 或 UDP 协议）
支持国语、粤语、英语、日语、韩语 5 种语言识别 SenseVoice
声纹识别，识别是谁在喊 AI 的名字 3D Speaker
大模型 TTS（火山引擎或 CosyVoice）
大模型 LLM（Qwen, DeepSeek, Doubao）
可配置的提示词和音色（自定义角色）
短期记忆，每轮对话后自我总结
OLED / LCD 显示屏，显示信号强弱或对话内容
支持 LCD 显示图片表情
支持多语言（中文、英文）

硬件部分

面包板手工制作实践

详见飞书文档教程：

👉 《小智 AI 聊天机器人百科全书》

面包板效果图如下：

已支持的开源硬件

固件部分

免开发环境烧录

新手第一次操作建议先不要搭建开发环境，直接使用免开发环境烧录的固件。

固件默认接入 xiaozhi.me 官方服务器，目前个人用户注册账号可以免费使用 Qwen 实时模型。

👉 Flash烧录固件（无IDF开发环境）

开发环境

Cursor 或 VSCode
安装 ESP-IDF 插件，选择 SDK 版本 5.3 或以上
Linux 比 Windows 更好，编译速度快，也免去驱动问题的困扰
使用 Google C++ 代码风格，提交代码时请确保符合规范

智能体配置

如果你已经拥有一个小智 AI 聊天机器人设备，可以登录 xiaozhi.me 控制台进行配置。

👉 后台操作视频教程（旧版界面）

技术原理与私有化部署

👉 一份详细的 WebSocket 通信协议文档

在个人电脑上部署服务器，可以参考另一位作者同样以 MIT 许可证开源的项目 xiaozhi-esp32-server

Star History

For Tasks:

Click tags to check more tools for each tasks

build ai chatbot configure ai roles develop voice recognition implement offline voice wake-up customize tts voices

For Jobs:

ai hardware developer student interested in ai developer exploring new technologies voice recognition engineer firmware developer

Alternative AI tools for xiaozhi-esp32

Similar Open Source Tools

xiaozhi-esp32

github

: 10.2k

chatgpt-plus

ChatGPT-PLUS is an open-source AI assistant solution based on AI large language model API, with a built-in operational management backend for easy deployment. It integrates multiple large language models from platforms like OpenAI, Azure, ChatGLM, Xunfei Xinghuo, and Wenxin Yanyan. Additionally, it includes MidJourney and Stable Diffusion AI drawing features. The system offers a complete open-source solution with ready-to-use frontend and backend applications, providing a seamless typing experience via Websocket. It comes with various pre-trained role applications such as Xiaohongshu writer, English translation master, Socrates, Confucius, Steve Jobs, and weekly report assistant to meet various chat and application needs. Users can enjoy features like Suno Wensheng music, integration with MidJourney/Stable Diffusion AI drawing, personal WeChat QR code for payment, built-in Alipay and WeChat payment functions, support for various membership packages and point card purchases, and plugin API integration for developing powerful plugins using large language model functions.

github

: 2.8k

geekai

GeekAI is an open-source AI assistant solution based on AI large language model API, featuring a complete system with ready-to-use front-end and back-end management, providing a seamless typing experience via Websocket. It integrates various pre-trained character applications like Xiaohongshu writing assistant, English translation master, Socrates, Confucius, Steve Jobs, and weekly report assistant. The tool supports multiple large language models from platforms like OpenAI, Azure, Wenxin Yanyan, Xunfei Xinghuo, and Tsinghua ChatGLM. Additionally, it includes MidJourney and Stable Diffusion AI drawing functionalities for creating various artworks such as text-based images, face swapping, and blending images. Users can utilize personal WeChat QR codes for payment without the need for enterprise payment channels, and the tool offers integrated payment options like Alipay and WeChat Pay with support for multiple membership packages and point card purchases. It also features a plugin API for developing powerful plugins using large language model functions, including built-in plugins for Weibo hot search, today's headlines, morning news, and AI drawing functions.

github

: 3.2k

NarratoAI

NarratoAI is an automated video narration tool that provides an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation. It is powered by LLM to enhance efficient content creation. The tool aims to simplify the process of creating film commentary and editing videos by automating various tasks such as script writing and voice-over generation. NarratoAI offers a user-friendly interface for users to easily generate video scripts, edit videos, and customize video parameters. With future plans to optimize story generation processes and support additional large models, NarratoAI is a versatile tool for content creators looking to streamline their video production workflow.

github

: 4.0k

AIBotPublic

AIBotPublic is an open-source version of AIBotPro, a comprehensive AI tool that provides various features such as knowledge base construction, AI drawing, API hosting, and more. It supports custom plugins and parallel processing of multiple files. The tool is built using bootstrap4 for the frontend, .NET6.0 for the backend, and utilizes technologies like SqlServer, Redis, and Milvus for database and vector database functionalities. It integrates third-party dependencies like Baidu AI OCR, Milvus C# SDK, Google Search, and more to enhance its capabilities.

github

: 199

langchat

LangChat is an enterprise AIGC project solution in the Java ecosystem. It integrates AIGC large model functionality on top of the RBAC permission system to help enterprises quickly customize AI knowledge bases and enterprise AI robots. It supports integration with various large models such as OpenAI, Gemini, Ollama, Azure, Zhifu, Alibaba Tongyi, Baidu Qianfan, etc. The project is developed solely by TyCoding and is continuously evolving. It features multi-modality, dynamic configuration, knowledge base support, advanced RAG capabilities, function call customization, multi-channel deployment, workflows visualization, AIGC client application, and more.

github

: 614

Jarvis

Jarvis is a powerful virtual AI assistant designed to simplify daily tasks through voice command integration. It features automation, device management, and personalized interactions, transforming technology engagement. Built using Python and AI models, it serves personal and administrative needs efficiently, making processes seamless and productive.

github

: 67

chatgpt-infinity

ChatGPT Infinity is a free and powerful add-on that makes ChatGPT generate infinite answers on any topic. It offers customizable topic selection, multilingual support, adjustable response interval, and auto-scroll feature for a seamless chat experience.

github

: 305

easyAi

EasyAi is a lightweight, beginner-friendly Java artificial intelligence algorithm framework. It can be seamlessly integrated into Java projects with Maven, requiring no additional environment configuration or dependencies. The framework provides pre-packaged modules for image object detection and AI customer service, as well as various low-level algorithm tools for deep learning, machine learning, reinforcement learning, heuristic learning, and matrix operations. Developers can easily develop custom micro-models tailored to their business needs.

github

: 75

awesome-cuda-tensorrt-fpga

Okay, here is a JSON object with the requested information about the awesome-cuda-tensorrt-fpga repository:

github

: 103

awesome-hpc-cuda-fpga

github

: 104

chatgpt-auto-refresh

ChatGPT Auto Refresh is a userscript that keeps ChatGPT sessions fresh by eliminating network errors and Cloudflare checks. It removes the 10-minute time limit from conversations when Chat History is disabled, ensuring a seamless experience. The tool is safe, lightweight, and a time-saver, allowing users to keep their sessions alive without constant copy/paste/refresh actions. It works even in background tabs, providing convenience and efficiency for users interacting with ChatGPT. The tool relies on the chatgpt.js library and is compatible with various browsers using Tampermonkey, making it accessible to a wide range of users.

github

: 200

bravegpt

BraveGPT is a userscript that brings the power of ChatGPT to Brave Search. It allows users to engage with a conversational AI assistant directly within their search results, providing instant and personalized responses to their queries. BraveGPT is powered by GPT-4, the latest and most advanced language model from OpenAI, ensuring accurate and comprehensive answers. With BraveGPT, users can ask questions, get summaries, generate creative content, and more, all without leaving the Brave Search interface. The tool is easy to install and use, making it accessible to users of all levels. BraveGPT is a valuable addition to the Brave Search experience, enhancing its capabilities and providing users with a more efficient and informative search experience.

github

: 176

duckduckgpt

DuckDuckGPT brings the magic of ChatGPT to DDG (powered by GPT-4!). DuckDuckGPT is a browser extension that allows you to use ChatGPT within DuckDuckGo. This means you can ask ChatGPT questions, get help with tasks, and generate creative content, all without leaving DuckDuckGo. DuckDuckGPT is easy to use. Once you have installed the extension, simply type your question into the DuckDuckGo search bar and hit enter. ChatGPT will then generate a response that will appear below the search results. DuckDuckGPT is a powerful tool that can help you with a wide variety of tasks. Here are just a few examples of what you can use it for: * Get help with research * Write essays and other creative content * Translate languages * Get coding help * Answer trivia questions * And much more! DuckDuckGPT is still in development, but it is already a very powerful tool. As GPT-4 continues to improve, DuckDuckGPT will only get better. So if you are looking for a way to make your DuckDuckGo searches more productive, be sure to give DuckDuckGPT a try.

github

: 234

Juggle

Juggle is a low-code tool for interface orchestration, which can quickly orchestrate simple APIs into a complex interface. The orchestrated interface can be directly used by the front end, greatly improving development efficiency and reducing development costs.

github

: 799

prompt-optimizer

Prompt Optimizer is a powerful AI prompt optimization tool that helps you write better AI prompts, improving AI output quality. It supports both web application and Chrome extension usage. The tool features intelligent optimization for prompt words, real-time testing to compare before and after optimization, integration with multiple mainstream AI models, client-side processing for security, encrypted local storage for data privacy, responsive design for user experience, and more.

github

: 1.6k

For similar tasks

Virtual_Avatar_ChatBot

Virtual_Avatar_ChatBot is a free AI Chatbot with visual movement that runs on your local computer with minimal GPU requirement. It supports various features like Oogbabooga, betacharacter.ai, and Locall LLM. The tool requires Windows 7 or above, Python, C++ Compiler, Git, and other dependencies. Users can contribute to the open-source project by reporting bugs, creating pull requests, or suggesting new features. The goal is to enhance Voicevox functionality, support local LLM inference, and give the waifu access to the internet. The project references various tools like desktop-waifu, CharacterAI, Whisper, PYVTS, COQUI-AI, VOICEVOX, and VOICEVOX API.

github

: 92

xiaozhi-esp32

github

: 10.2k

fridon-ai

FridonAI is an open-source project offering AI-powered tools for cryptocurrency analysis and blockchain operations. It includes modules like FridonAnalytics for price analysis, FridonSearch for technical indicators, FridonNotifier for custom alerts, FridonBlockchain for blockchain operations, and FridonChat as a unified chat interface. The platform empowers users to create custom AI chatbots, access crypto tools, and interact effortlessly through chat. The core functionality is modular, with plugins, tools, and utilities for easy extension and development. FridonAI implements a scoring system to assess user interactions and incentivize engagement. The application uses Redis extensively for communication and includes a Nest.js backend for system operations.

github

: 82

For similar jobs

xiaozhi-esp32

github

: 10.2k

frame-codebase

The Frame Firmware & RTL Codebase is a comprehensive repository containing code for the Frame hardware system architecture. It includes sections for nRF52 Application, nRF52 Bootloader, and FPGA RTL. The nRF52 handles system operation, Lua scripting, Bluetooth networking, AI tasks, and power management, while the FPGA accelerates graphics and camera processing. The repository provides instructions for firmware development, debugging in VSCode, and FPGA development using tools like ARM GCC Toolchain, nRF Command Line Tools, Yosys, Project Oxide, and nextpnr. Users can build and flash projects for nRF52840 DK, modify FPGA RTL, and access pre-built accelerators bundled in the repo.

github

: 271

Awesome-Embedded

Awesome-Embedded is a curated list of resources for embedded systems enthusiasts. It covers a wide range of topics including MCU programming, RTOS, Linux kernel development, assembly programming, machine learning & AI on MCU, utilities, tips & tricks, and more. The repository provides valuable information, tutorials, and tools for individuals interested in embedded systems development.

github

: 5.4k

AIOC

AIOC is an All-in-one-Cable for Ham Radio enthusiasts, providing a cheap and hackable digital mode USB interface with features like sound-card, virtual tty, and CM108 compatible HID endpoint. It supports various software and tested radios for functions like programming, APRS, and Dual-PTT HTs. Users can fabricate and assemble the AIOC using specific instructions, and program it using STM32CubeIDE. The tool can be used for tasks like programming radios, asserting PTT, and accessing audio data channels. Future work includes configurable AIOC settings, virtual-PTT, and virtual-COS features.

github

: 1.0k

aixt

Aixt is a programming framework for microcontrollers using a modern language syntax based on V, with components including the Aixt programming language, Aixt to C Transpiler, and Aixt API. It is designed to be modular, allowing easy incorporation of new devices and boards through a TOML configuration file. The Aixt to C Transpiler translates Aixt source code to C for specific microcontroller compilers. The Aixt language implements a subset of V with differences in variables, strings, arrays, default integers size, structs, functions, and preprocessor commands. The Aixt API provides functions for digital I/O, analog inputs, PWM outputs, and serial ports.

github

: 69

xiaozhi-esp32

README:

小智 AI 聊天机器人 （XiaoZhi AI Chatbot）

项目目的

已实现功能

硬件部分

面包板手工制作实践

已支持的开源硬件

固件部分

免开发环境烧录

开发环境

智能体配置

技术原理与私有化部署

Star History

For Tasks:

For Jobs:

Alternative AI tools for xiaozhi-esp32

Similar Open Source Tools

xiaozhi-esp32

chatgpt-plus

geekai

NarratoAI

AIBotPublic

langchat

Jarvis

chatgpt-infinity

easyAi

awesome-cuda-tensorrt-fpga

awesome-hpc-cuda-fpga

chatgpt-auto-refresh

bravegpt

duckduckgpt

Juggle

prompt-optimizer

For similar tasks

Virtual_Avatar_ChatBot

xiaozhi-esp32

fridon-ai

For similar jobs

xiaozhi-esp32

frame-codebase

Awesome-Embedded

AIOC

aixt

小智 AI 聊天机器人（XiaoZhi AI Chatbot）