xiaozhi-esp32

Build your own AI friend

Stars: 10176

The xiaozhi-esp32 repository is Xia Ge's first hardware project: an AI chatbot built on ESP32, SenseVoice, and Qwen72B. It aims to help newcomers to AI hardware development learn how to apply large language models to real hardware devices. Features include Wi-Fi configuration, offline voice wake-up, multilingual speech recognition, voiceprint recognition, and TTS using large models. Contributions are welcome, and the project provides resources for both hardware and firmware development.

README:

XiaoZhi AI Chatbot (小智 AI 聊天机器人)

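This is Xia Ge's first hardware project.

👉 Build your own AI chat companion with ESP32 + SenseVoice + Qwen72B! [bilibili]

👉 Give XiaoZhi the clever brain of DeepSeek [bilibili]

👉 Handcraft your own AI girlfriend: a beginner's tutorial [bilibili]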

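Project Purpose

This is an open-source project released under the MIT license. Anyone may use it free of charge, including for commercial purposes.

We hope this project helps more people get started with AI hardware development and understand how to apply today's rapidly evolving large language models to real hardware devices. Whether you are a student interested in AI or a developer exploring new technologies, you can gain valuable learning experience from this project.

Everyone is welcome to take part in developing and improving the project. If you have any ideas or suggestions, feel free to open an Issue or join the group chat.

Learning and discussion QQ group: 376893254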

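Implemented Features

Wi-Fi / ML307 Cat.1 4G
BOOT button wake-up and interruption, with both click and long-press triggers
Offline voice wake-up (ESP-SR)
Streaming voice dialogue (WebSocket or UDP protocol)
Speech recognition in 5 languages: Mandarin, Cantonese, English, Japanese, and Korean (SenseVoice)
Voiceprint recognition to identify who is calling the AI's name (3D Speaker)
Large-model TTS (Volcano Engine or CosyVoice)
Large language models (Qwen, DeepSeek, Doubao)
Configurable prompts and voice timbres (custom characters)
Short-term memory, with self-summarization after each conversation round
OLED / LCD display showing signal strength or conversation content
LCD display of image emoticons
Multi-language support (Chinese, English)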
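
The short-term memory item in the list above (a self-summary after each conversation round) can be illustrated with a minimal sketch. The README does not describe how the feature is implemented, so everything here is an assumption made for illustration: the `ShortTermMemory` class, the `summarize` callable (a stand-in for a call to a large language model), and the prompt wording are all hypothetical.

```python
from typing import Callable


class ShortTermMemory:
    """Keeps one rolling summary that is refreshed after every round.

    Hypothetical sketch: `summarize` stands in for a call to a large
    language model; the real project may work differently.
    """

    def __init__(self, summarize: Callable[[str], str]) -> None:
        self._summarize = summarize
        self.summary = ""  # empty until the first round completes

    def end_round(self, user_text: str, reply_text: str) -> str:
        """Fold the latest round into the rolling summary and return it."""
        prompt = (
            f"Previous summary: {self.summary or '(none)'}\n"
            f"User: {user_text}\n"
            f"Assistant: {reply_text}\n"
            "Summarize what should be remembered."
        )
        self.summary = self._summarize(prompt)
        return self.summary


def last_user_line(prompt: str) -> str:
    """Trivial stand-in summarizer: remember only the latest user line."""
    for line in prompt.splitlines():
        if line.startswith("User: "):
            return "User said: " + line[len("User: "):]
    return ""
```

In this sketch the previous summary is fed back into each new prompt, so the stored memory stays a single short string instead of a growing transcript. Replacing `last_user_line` with a function that calls an actual model would be the natural next step.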

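Hardware

Breadboard DIY Build

See the Feishu document tutorial:

👉 XiaoZhi AI Chatbot Encyclopedia

[Image: finished breadboard build]

Supported Open-Source Hardware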

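Firmware

Flashing Without a Development Environment

If this is your first time, we recommend skipping the development environment setup and flashing the prebuilt firmware directly.

By default the firmware connects to the official xiaozhi.me server. Currently, individual users who register an account can use the Qwen real-time model for free.

👉 Flash the firmware (no IDF development environment required)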

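Development Environment

Cursor or VSCode
Install the ESP-IDF plugin and select SDK version 5.3 or above
Linux is preferable to Windows: it compiles faster and avoids driver problems
Use the Google C++ code style, and make sure submitted code conforms to it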

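Agent Configuration

If you already own a XiaoZhi AI chatbot device, you can log in to the xiaozhi.me console to configure it.

👉 Console operation video tutorial (old interface)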

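Technical Principles and Private Deployment

👉 A detailed WebSocket communication protocol document

To deploy the server on a personal computer, see xiaozhi-esp32-server, a project by another author that is also open-sourced under the MIT license.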

For Tasks:

build ai chatbot, configure ai roles, develop voice recognition, implement offline voice wake-up, customize tts voices

For Jobs:

ai hardware developer, student interested in ai, developer exploring new technologies, voice recognition engineer, firmware developer

Alternative AI tools for xiaozhi-esp32

Similar Open Source Tools

xiaozhi-esp32

github: 10.2k

chatgpt-plus

ChatGPT-PLUS is an open-source AI assistant solution based on large language model APIs, with a built-in operational management backend for easy deployment. It integrates multiple large language models from platforms like OpenAI, Azure, ChatGLM, Xunfei Xinghuo, and Wenxin Yiyan. Additionally, it includes MidJourney and Stable Diffusion AI drawing features. The system offers a complete open-source solution with ready-to-use frontend and backend applications, providing a seamless typing experience via WebSocket. It comes with various pre-trained role applications such as Xiaohongshu writer, English translation master, Socrates, Confucius, Steve Jobs, and weekly report assistant to meet various chat and application needs. Other features include Suno text-to-music generation, integration with MidJourney/Stable Diffusion AI drawing, personal WeChat QR code for payment, built-in Alipay and WeChat payment functions, support for various membership packages and point card purchases, and plugin API integration for developing powerful plugins using large language model functions.

github: 2.8k

NarratoAI

NarratoAI is an automated video narration tool that provides an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation. It is powered by LLM to enhance efficient content creation. The tool aims to simplify the process of creating film commentary and editing videos by automating various tasks such as script writing and voice-over generation. NarratoAI offers a user-friendly interface for users to easily generate video scripts, edit videos, and customize video parameters. With future plans to optimize story generation processes and support additional large models, NarratoAI is a versatile tool for content creators looking to streamline their video production workflow.

github: 6.6k

langchat

LangChat is an enterprise AIGC project solution in the Java ecosystem. It integrates AIGC large model functionality on top of the RBAC permission system to help enterprises quickly customize AI knowledge bases and enterprise AI robots. It supports integration with various large models such as OpenAI, Gemini, Ollama, Azure, Zhipu, Alibaba Tongyi, Baidu Qianfan, etc. The project is developed solely by TyCoding and is continuously evolving. It features multi-modality, dynamic configuration, knowledge base support, advanced RAG capabilities, function call customization, multi-channel deployment, workflow visualization, an AIGC client application, and more.

github: 614

Jarvis

Jarvis is a powerful virtual AI assistant designed to simplify daily tasks through voice command integration. It features automation, device management, and personalized interactions, transforming technology engagement. Built using Python and AI models, it serves personal and administrative needs efficiently, making processes seamless and productive.

github: 67

easyAi

EasyAi is a lightweight, beginner-friendly Java artificial intelligence algorithm framework. It can be seamlessly integrated into Java projects with Maven, requiring no additional environment configuration or dependencies. The framework provides pre-packaged modules for image object detection and AI customer service, as well as various low-level algorithm tools for deep learning, machine learning, reinforcement learning, heuristic learning, and matrix operations. Developers can easily develop custom micro-models tailored to their business needs.

github: 75

chatgpt-infinity

ChatGPT Infinity is a free and powerful add-on that makes ChatGPT generate infinite answers on any topic. It offers customizable topic selection, multilingual support, adjustable response interval, and auto-scroll feature for a seamless chat experience.

github: 334

awesome-cuda-tensorrt-fpga

github: 103

awesome-hpc-cuda-fpga

github: 104

chatgpt-auto-refresh

ChatGPT Auto Refresh is a userscript that keeps ChatGPT sessions fresh by eliminating network errors and Cloudflare checks. It removes the 10-minute time limit from conversations when Chat History is disabled, ensuring a seamless experience. The tool is safe, lightweight, and a time-saver, allowing users to keep their sessions alive without constant copy/paste/refresh actions. It works even in background tabs, providing convenience and efficiency for users interacting with ChatGPT. The tool relies on the chatgpt.js library and is compatible with various browsers using Tampermonkey, making it accessible to a wide range of users.

github: 206

awesome-yolo-object-detection

github: 1.2k

awesome-cuda-and-hpc

github: 129

awesome-yolo-object-detection

github: 1.4k

chatgpt-auto-continue

ChatGPT Auto-Continue is a userscript that automatically continues generating ChatGPT responses when chats cut off. It relies on the powerful chatgpt.js library and is easy to install and use. Simply install Tampermonkey and ChatGPT Auto-Continue, and visit chat.openai.com as normal. Multi-reply conversations will automatically continue generating when cut-off!

github: 172

awesome-cuda-triton-hpc

github: 211

AI-YinMei

AI-YinMei is a development tool for AI virtual anchors (VTubers), in an NVIDIA GPU version. It supports knowledge-base chat through fastgpt, with a complete LLM stack of [fastgpt] + [one-api] + [Xinference]. For live streaming, it can reply to bilibili live-stream comments and greet viewers entering the stream. Speech synthesis options are Microsoft edge-tts, Bert-VITS2, and GPT-SoVITS. It supports expression control through Vtuber Studio, image generation with stable-diffusion-webui output to an OBS live room, NSFW image detection (public-NSFW-y-distinguish), search and image search through duckduckgo (requires a proxy to access), and image search through Baidu (no proxy required). It also provides an AI reply chat box [html plug-in], AI singing (Auto-Convert-Music), a playlist [html plug-in], dancing, expression video playback, head-touching and gift-smashing actions, automatic dancing when singing starts, automatic swaying during chat and singing, multi-scene switching, background music switching, automatic day and night scene switching, and open singing and painting, where the AI judges the content automatically.

github: 529

For similar tasks

Virtual_Avatar_ChatBot

Virtual_Avatar_ChatBot is a free AI chatbot with visual movement that runs on your local computer with minimal GPU requirements. It supports backends such as Oobabooga, betacharacter.ai, and local LLMs. The tool requires Windows 7 or above, Python, a C++ compiler, Git, and other dependencies. Users can contribute to the open-source project by reporting bugs, creating pull requests, or suggesting new features. The goal is to enhance Voicevox functionality, support local LLM inference, and give the waifu access to the internet. The project references various tools like desktop-waifu, CharacterAI, Whisper, PYVTS, COQUI-AI, VOICEVOX, and VOICEVOX API.

github: 92

xiaozhi-esp32

github: 10.2k

fridon-ai

FridonAI is an open-source project offering AI-powered tools for cryptocurrency analysis and blockchain operations. It includes modules like FridonAnalytics for price analysis, FridonSearch for technical indicators, FridonNotifier for custom alerts, FridonBlockchain for blockchain operations, and FridonChat as a unified chat interface. The platform empowers users to create custom AI chatbots, access crypto tools, and interact effortlessly through chat. The core functionality is modular, with plugins, tools, and utilities for easy extension and development. FridonAI implements a scoring system to assess user interactions and incentivize engagement. The application uses Redis extensively for communication and includes a Nest.js backend for system operations.

github: 82

For similar jobs

xiaozhi-esp32

github: 10.2k

frame-codebase

The Frame Firmware & RTL Codebase is a comprehensive repository containing code for the Frame hardware system architecture. It includes sections for nRF52 Application, nRF52 Bootloader, and FPGA RTL. The nRF52 handles system operation, Lua scripting, Bluetooth networking, AI tasks, and power management, while the FPGA accelerates graphics and camera processing. The repository provides instructions for firmware development, debugging in VSCode, and FPGA development using tools like ARM GCC Toolchain, nRF Command Line Tools, Yosys, Project Oxide, and nextpnr. Users can build and flash projects for nRF52840 DK, modify FPGA RTL, and access pre-built accelerators bundled in the repo.

github: 271

Awesome-Embedded

Awesome-Embedded is a curated list of resources for embedded systems enthusiasts. It covers a wide range of topics including MCU programming, RTOS, Linux kernel development, assembly programming, machine learning & AI on MCU, utilities, tips & tricks, and more. The repository provides valuable information, tutorials, and tools for individuals interested in embedded systems development.

github: 5.4k

AIOC

AIOC is an All-in-one-Cable for Ham Radio enthusiasts, providing a cheap and hackable digital mode USB interface with features like sound-card, virtual tty, and CM108 compatible HID endpoint. It supports various software and tested radios for functions like programming, APRS, and Dual-PTT HTs. Users can fabricate and assemble the AIOC using specific instructions, and program it using STM32CubeIDE. The tool can be used for tasks like programming radios, asserting PTT, and accessing audio data channels. Future work includes configurable AIOC settings, virtual-PTT, and virtual-COS features.

github: 1.2k

aixt

Aixt is a programming framework for microcontrollers using a modern language syntax based on V, with components including the Aixt programming language, the Aixt to C Transpiler, and the Aixt API. It is designed to be modular, allowing easy incorporation of new devices and boards through a TOML configuration file. The Aixt to C Transpiler translates Aixt source code to C for specific microcontroller compilers. The Aixt language implements a subset of V with differences in variables, strings, arrays, default integer size, structs, functions, and preprocessor commands. The Aixt API provides functions for digital I/O, analog inputs, PWM outputs, and serial ports.

github: 69
