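I. New Features
1. Automatic exit from the listening state on idle timeout
- When the device is in the listening state with no user interaction (no voice dialogue, no button presses, no audio playback) for more than 60 seconds,
the device automatically closes the audio channel and returns to the idle standby state; the behavior is the same as manually pressing the BOOT button to exit
- The timeout is configurable through the Kconfig option CONFIG_LISTENING_IDLE_TIMEOUT_SECONDS (range 30 to 300 seconds, default 60 seconds)
- Timing is paused during the speaking state and restarts from 0 on returning to listening, so the user gets the full time to think
2. External reset interface for the listening idle timer
- Added the public method ResetListeningIdleTimer() for board-level button/touch callbacks to call
- Reset triggers: touch keys (head pat, etc.), the BOOT button, the story button, and receipt of stt/tts/music_control/story messages from the server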
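The timeout and reset rules above can be sketched roughly as follows. This is a minimal illustration, not the firmware's actual code: the `ListeningIdleTimer` class, the `DeviceState` enum, and the once-per-second `Tick()` call are all hypothetical names and assumptions. In the firmware, the timeout value would come from CONFIG_LISTENING_IDLE_TIMEOUT_SECONDS and the reset would go through ResetListeningIdleTimer().

```cpp
#include <cstdint>

// Hypothetical state names for illustration; the firmware's own enum may differ.
enum class DeviceState { kIdle, kListening, kSpeaking };

// Hypothetical helper that counts idle seconds while the device is listening.
class ListeningIdleTimer {
public:
    explicit ListeningIdleTimer(uint32_t timeout_seconds)
        : timeout_seconds_(timeout_seconds) {}

    // Restart the countdown; meant to be called from button/touch callbacks
    // and when a server message (stt/tts/music_control/story) arrives.
    void Reset() { elapsed_seconds_ = 0; }

    // Assumed to be called once per second with the current device state.
    // Returns true when the device should leave listening and go back to idle.
    bool Tick(DeviceState state) {
        if (state != DeviceState::kListening) {
            // Not listening (e.g. speaking): do not count, and start from 0
            // the next time the device is listening.
            elapsed_seconds_ = 0;
            return false;
        }
        ++elapsed_seconds_;
        return elapsed_seconds_ >= timeout_seconds_;
    }

    uint32_t elapsed_seconds() const { return elapsed_seconds_; }

private:
    uint32_t timeout_seconds_;
    uint32_t elapsed_seconds_ = 0;
};
```

Clearing the counter whenever the state is not listening gives the "pause during speaking, restart from 0" behavior without tracking state transitions separately.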
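II. Bug Fixes
3. Fixed the standby sound effect being silent after a timeout exit
- Cause: in the timeout exit path, audio_processor_.Stop() turned off the power amplifier before the standby sound effect was played
- Fix: call codec->EnableOutput(true) before SetDeviceState(kDeviceStateIdle) plays the standby sound effect, so the amplifier is on
4. Fixed a crash and reboot caused by a race between WebSocket disconnection and tts:stop
- Cause: when tts:stop and a WebSocket disconnect happen at the same time, the device switches to listening and SendStartListening fails;
the race corrupts the heap in WakeWordDetect (StoreProhibited crash)
- Fix: the tts:stop handler first checks IsAudioChannelOpened() and falls back directly to idle when the audio channel is unavailable
5. Fixed incorrect logic when the audio channel is unavailable in the listening state
- Cause: "staying in the listening state" while the audio channel was unavailable left later state handling inconsistent
- Fix: fall back directly to the idle state instead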
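The fallback rule shared by fixes 4 and 5 can be sketched as a single decision. This is an illustration only: `State` and `StateAfterTtsStop` are hypothetical names, the boolean parameter stands in for the result of IsAudioChannelOpened(), and the sketch assumes the device would otherwise resume listening after tts:stop.

```cpp
// Hypothetical state enum for illustration only.
enum class State { kIdle, kListening };

// Decide which state to enter after a tts:stop message.
// audio_channel_opened stands in for the result of IsAudioChannelOpened().
inline State StateAfterTtsStop(bool audio_channel_opened) {
    // Only resume listening when the audio channel is still usable;
    // otherwise fall back to idle instead of trying to start listening.
    return audio_channel_opened ? State::kListening : State::kIdle;
}
```

Checking the channel before the transition means the start-listening request is never attempted on a closed connection.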
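III. Improvements and Adjustments
6. Version bumped from 1.7.5 to 1.7.6
7. ADC battery-level sampling interval shortened from 10 ms to 5 ms to improve sampling efficiency
8. Logging configuration adjusted: log output re-enabled for debugging
9. Added code comments to the WiFi component
10. Added .gitignore, ignoring the build directory and .vscode/settings.json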
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
XiaoZhi AI Chatbot
Introduction
👉 Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】
👉 Equipping XiaoZhi with DeepSeek's smart brain【bilibili】
👉 Build your own AI companion, a beginner's guide【bilibili】
Project Purpose
This is an open-source project released under the MIT license, allowing anyone to use it freely, including for commercial purposes.
Through this project, we aim to help more people get started with AI hardware development and understand how to implement rapidly evolving large language models in actual hardware devices. Whether you're a student interested in AI or a developer exploring new technologies, this project offers valuable learning experiences.
Everyone is welcome to participate in the project's development and improvement. If you have any ideas or suggestions, please feel free to raise an Issue or join the chat group.
Learning & Discussion QQ Group: 376893254
Implemented Features
- Wi-Fi / ML307 Cat.1 4G
- BOOT button wake-up and interruption, supporting both click and long-press triggers
- Offline voice wake-up (ESP-SR)
- Streaming voice dialogue (WebSocket or UDP protocol)
- Support for 5 languages: Mandarin, Cantonese, English, Japanese, Korean (SenseVoice)
- Voiceprint recognition to identify who is calling the AI's name (3D Speaker)
- Large model TTS (Volcano Engine or CosyVoice)
- Large Language Models (Qwen, DeepSeek, Doubao)
- Configurable prompts and voice tones (custom characters)
- Short-term memory, self-summarizing after each conversation round
- OLED / LCD display showing signal strength or conversation content
- Support for LCD image expressions
- Multi-language support (Chinese, English)
Hardware Section
Breadboard DIY Practice
See the Feishu document tutorial:
👉 XiaoZhi AI Chatbot Encyclopedia
Breadboard demonstration:
Supported Open Source Hardware
- LiChuang ESP32-S3 Development Board
- Espressif ESP32-S3-BOX3
- M5Stack CoreS3
- AtomS3R + Echo Base
- AtomMatrix + Echo Base
- Magic Button 2.4
- Waveshare ESP32-S3-Touch-AMOLED-1.8
- LILYGO T-Circle-S3
- XiaGe Mini C3
- Moji XiaoZhi AI Derivative Version
- CuiCan AI pendant
- WMnologo-Xingzhi-1.54TFT
- SenseCAP Watcher
Firmware Section
Flashing Without Development Environment
For beginners, it's recommended to first use the firmware that can be flashed without setting up a development environment.
The firmware connects to the official xiaozhi.me server by default. Currently, personal users can register an account to use the Qwen real-time model for free.
👉 Flash Firmware Guide (No IDF Environment)
Development Environment
- Cursor or VSCode
- Install ESP-IDF plugin, select SDK version 5.3 or above
- Linux is preferred over Windows for faster compilation and fewer driver issues
- Use Google C++ code style, ensure compliance when submitting code
Developer Documentation
- Board Customization Guide - Learn how to create custom board adaptations for XiaoZhi
- IoT Control Module - Understand how to control IoT devices through AI voice commands
AI Agent Configuration
If you already have a XiaoZhi AI chatbot device, you can configure it through the xiaozhi.me console.
👉 Backend Operation Tutorial (Old Interface)
Technical Principles and Private Deployment
👉 Detailed WebSocket Communication Protocol Documentation
For server deployment on a personal computer, refer to another MIT-licensed project, xiaozhi-esp32-server.
