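Completed Phase 5 (subtitle display restoration) as planned under the GSD framework in .planning/milestones/digital_human_rtc/phases/phase_05_subtitle_restore/.

## Core changes (main/dzbj/ai_chat_ui.c)

### 1. chat_container refactoring
- Added a static chat_container variable (an lv_obj parent container)
- Size 320×56 (= 2 lines of text + 4px padding × 2)
- Position LV_ALIGN_BOTTOM_MID, 10px above the bottom edge
- Fully transparent background (LV_OPA_TRANSP), no gray backdrop
- Initially HIDDEN, shown when there is content

### 2. chat_label changes
- Black text (user feedback: white text was hard to read on light backgrounds)
- Size 312×48, limiting the label to at most 2 lines
- LV_LABEL_LONG_WRAP → LV_LABEL_LONG_DOT, so text beyond 2 lines is truncated with "..."
- font_puhui_20_4 Chinese font unchanged

### 3. ai_chat_set_chat_message() implementation
Previously an empty function (it simply returned during the PoC). This phase implements it fully:
- Deduplication outside the lock: static last_content[256], identical content returns immediately
- lvgl_port_lock timeout 200ms → 500ms (allows a longer wait while GIF decoding is busy)
- Hides the container when the content is empty, shows it otherwise
- Caches last_content after a successful update

### 4. z-index fix
Call lv_obj_move_foreground(chat_container) immediately after bg_gif_demo_start(). Otherwise bg_img/gif_obj, which are created after chat_container, cover the subtitles.

## Measured verification (with user cooperation)
During a 60s conversation:
- ✅ AI subtitles pushed in full 3 times (including a 54-character long subtitle)
- ✅ LVGL lock timeouts 14 → 0 (deduplication outside the lock took effect)
- ✅ Expression switching and subtitle sync working
- ✅ Long subtitles automatically truncated to 2 lines
- ✅ No abort/reboot

## Call chain (hooked into the existing application.cc logic, no protocol-layer changes needed)
RTC subtitle → display->SetChatMessage(role, msg) → AiChatDisplay::SetChatMessage → ai_chat_set_chat_message() ← implemented in this phase

## GSD documents
- PLAN.md
- SUBTITLE_REPORT.md (includes the lock optimization comparison, layout plan, and user decision record)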
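The deduplicate-then-lock flow listed under item 3 of the change notes can be sketched roughly as follows. This is a host-compilable illustration under stated assumptions, not the code in ai_chat_ui.c: `subtitle_set_message` and every `stub_*` name are hypothetical stand-ins (the real function takes the LVGL port lock and updates the label and container objects), and treating a NULL pointer as empty content is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the LVGL port lock and the widgets, so the
 * sketch builds without LVGL. Real code would lock LVGL and update the
 * label/container objects instead. */
static bool stub_lock_available = true;   /* simulate lock success/timeout */
static int  stub_lock_calls = 0;          /* how often the lock was requested */
static bool stub_container_hidden = true; /* container starts hidden */
static char stub_label_text[256] = "";

static bool stub_lvgl_lock(uint32_t timeout_ms) {
    (void)timeout_ms;
    stub_lock_calls++;
    return stub_lock_available;
}

static void stub_lvgl_unlock(void) {}

#define SUBTITLE_LOCK_TIMEOUT_MS 500 /* raised from 200 ms per the notes */

static char last_content[256] = ""; /* cache of the last applied subtitle */

/* Returns true only when the (stubbed) UI was actually updated. */
static bool subtitle_set_message(const char *content) {
    if (content == NULL) {
        content = ""; /* assumption: NULL is treated as "no subtitle" */
    }

    /* Deduplicate before touching the lock: a repeated subtitle costs
     * nothing and cannot contribute to lock timeouts. */
    if (strncmp(content, last_content, sizeof(last_content) - 1) == 0) {
        return false;
    }

    if (!stub_lvgl_lock(SUBTITLE_LOCK_TIMEOUT_MS)) {
        return false; /* timed out: cache untouched, so a retry can succeed */
    }

    if (content[0] == '\0') {
        stub_container_hidden = true;  /* empty content hides the container */
    } else {
        snprintf(stub_label_text, sizeof(stub_label_text), "%s", content);
        stub_container_hidden = false; /* non-empty content shows it */
    }
    stub_lvgl_unlock();

    /* Cache only after a successful update. */
    snprintf(last_content, sizeof(last_content), "%s", content);
    assert(strlen(last_content) < sizeof(last_content));
    return true;
}
```

Because the cache is written only after a successful update, a subtitle that was dropped on a lock timeout is not mistaken for a duplicate when it is pushed again.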
XiaoZhi AI Chatbot
Introduction
👉 Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】
👉 Equipping XiaoZhi with DeepSeek's smart brain【bilibili】
👉 Build your own AI companion, a beginner's guide【bilibili】
Project Purpose
This is an open-source project released under the MIT license, allowing anyone to use it freely, including for commercial purposes.
Through this project, we aim to help more people get started with AI hardware development and understand how to implement rapidly evolving large language models in actual hardware devices. Whether you're a student interested in AI or a developer exploring new technologies, this project offers valuable learning experiences.
Everyone is welcome to participate in the project's development and improvement. If you have any ideas or suggestions, please feel free to raise an Issue or join the chat group.
Learning & Discussion QQ Group: 376893254
Implemented Features
- Wi-Fi / ML307 Cat.1 4G
- BOOT button wake-up and interruption, supporting both click and long-press triggers
- Offline voice wake-up (ESP-SR)
- Streaming voice dialogue (WebSocket or UDP protocol)
- Support for 5 languages: Mandarin, Cantonese, English, Japanese, Korean (SenseVoice)
- Voiceprint recognition to identify who is calling the AI's name (3D Speaker)
- Large model TTS (Volcano Engine or CosyVoice)
- Large Language Models (Qwen, DeepSeek, Doubao)
- Configurable prompts and voice tones (custom characters)
- Short-term memory, self-summarizing after each conversation round
- OLED / LCD display showing signal strength or conversation content
- Support for LCD image expressions
- Multi-language support (Chinese, English)
Hardware Section
Breadboard DIY Practice
See the Feishu document tutorial:
👉 XiaoZhi AI Chatbot Encyclopedia
Breadboard demonstration:
Supported Open Source Hardware
- LiChuang ESP32-S3 Development Board
- Espressif ESP32-S3-BOX3
- M5Stack CoreS3
- AtomS3R + Echo Base
- AtomMatrix + Echo Base
- Magic Button 2.4
- Waveshare ESP32-S3-Touch-AMOLED-1.8
- LILYGO T-Circle-S3
- XiaGe Mini C3
- Moji XiaoZhi AI Derivative Version
- CuiCan AI pendant
- WMnologo-Xingzhi-1.54TFT
- SenseCAP Watcher
Firmware Section
Flashing Without Development Environment
For beginners, it's recommended to first use the firmware that can be flashed without setting up a development environment.
The firmware connects to the official xiaozhi.me server by default. Currently, individual users can register an account to use the Qwen real-time model for free.
👉 Flash Firmware Guide (No IDF Environment)
Development Environment
- Cursor or VSCode
- Install ESP-IDF plugin, select SDK version 5.3 or above
- Linux is preferred over Windows for faster compilation and fewer driver issues
- Follow the Google C++ code style and ensure compliance when submitting code
Developer Documentation
- Board Customization Guide - Learn how to create custom board adaptations for XiaoZhi
- IoT Control Module - Understand how to control IoT devices through AI voice commands
AI Agent Configuration
If you already have a XiaoZhi AI chatbot device, you can configure it through the xiaozhi.me console.
👉 Backend Operation Tutorial (Old Interface)
Technical Principles and Private Deployment
👉 Detailed WebSocket Communication Protocol Documentation
For server deployment on a personal computer, refer to another MIT-licensed project, xiaozhi-esp32-server.
