
Rdzleo 22b7a70d7d fix: sync Kapi's five soft-RTC-exit fixes to the digital human project (standby prompt + welcome-message noise)

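Ports all 5 fixes from Kapi commits b1577d8 / a3a476f, covering three classes of problems:
1. After boot/wake, pressing BOOT to enter the RTC room produces 1-3 s of noise before the welcome message
2. After a soft RTC exit (Dialog watchdog triggered by 41 s without conversation), the standby prompt "卡卡正在待命" ("Kaka is standing by") is silent, noisy, or cut off
3. After a soft exit, pressing BOOT to wake produces noise before the welcome message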

[Fix 1] OnAudioChannelOpened: feed 200 ms of silence immediately after EnableOutput(true)
  - Prevents noise during the 1-3 s gap between I2S DMA startup and the arrival of real RTC PCM
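
The silence primer in Fix 1 could look roughly like the sketch below. `FakeCodec`, `SilenceSamples`, and `PrimeOutputWithSilence` are hypothetical names invented for illustration, and the 16 kHz mono format in the demo is an assumption; the commit only names `EnableOutput(true)` and the 200 ms duration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the project's audio codec; the real type is not shown in the commit.
struct FakeCodec {
  bool output_enabled = false;
  std::vector<std::int16_t> written;  // every sample "sent" towards the I2S DMA

  void EnableOutput(bool on) { output_enabled = on; }
  void Write(const std::vector<std::int16_t>& pcm) {
    written.insert(written.end(), pcm.begin(), pcm.end());
  }
};

// Number of 16-bit samples (across all channels) that make up duration_ms.
constexpr std::size_t SilenceSamples(int sample_rate_hz, int channels, int duration_ms) {
  return static_cast<std::size_t>(sample_rate_hz) * channels * duration_ms / 1000;
}

// Enable the output, then immediately queue zero-valued samples so the DMA
// has defined data to play until real PCM arrives.
void PrimeOutputWithSilence(FakeCodec& codec, int sample_rate_hz, int channels,
                            int duration_ms = 200) {
  assert(sample_rate_hz > 0 && channels > 0 && duration_ms >= 0);
  codec.EnableOutput(true);
  codec.Write(std::vector<std::int16_t>(
      SilenceSamples(sample_rate_hz, channels, duration_ms), 0));
}

// Usage sketch: mono 16 kHz output (both values are assumptions).
void DemoPrimeOutput() {
  FakeCodec codec;
  PrimeOutputWithSilence(codec, 16000, 1);
  assert(codec.output_enabled);
  assert(codec.written.size() == 3200u);  // 16000 samples/s * 0.2 s
}
```

The point of the design is ordering: the zeros are queued in the same call that enables the output, so there is no window in which the DMA runs on stale buffer contents.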

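[Fix 2] LeaveRoom gains a notify_closed parameter (default true, so existing call paths are unchanged)
  - The hibernate path passes false to skip the on_audio_channel_closed_ callback
  - Avoids the callback chain player_pipeline_close → EnableOutput(false) closing the codec by mistake,
    which left the standby prompt silent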
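
A minimal sketch of the `notify_closed` parameter from Fix 2, assuming the callback is stored as a `std::function`. The class name `RtcChannelSketch` and the `JoinRoom` helper are hypothetical; only `LeaveRoom`, `notify_closed`, and `on_audio_channel_closed_` come from the commit message.

```cpp
#include <cassert>
#include <functional>
#include <utility>

class RtcChannelSketch {
 public:
  void OnAudioChannelClosed(std::function<void()> callback) {
    on_audio_channel_closed_ = std::move(callback);
  }

  void JoinRoom() { in_room_ = true; }  // hypothetical helper for the demo

  // notify_closed defaults to true, so existing callers behave as before.
  // The hibernate path would pass false to keep the codec output alive.
  void LeaveRoom(bool notify_closed = true) {
    in_room_ = false;
    if (notify_closed && on_audio_channel_closed_) {
      on_audio_channel_closed_();
    }
  }

  bool in_room() const { return in_room_; }

 private:
  bool in_room_ = false;
  std::function<void()> on_audio_channel_closed_;
};

// Usage sketch: the callback fires on the default path only.
void DemoLeaveRoomNotify() {
  int closed_calls = 0;
  RtcChannelSketch rtc;
  rtc.OnAudioChannelClosed([&closed_calls] { ++closed_calls; });

  rtc.JoinRoom();
  rtc.LeaveRoom();  // normal path: callback runs
  assert(closed_calls == 1);

  rtc.JoinRoom();
  rtc.LeaveRoom(false);  // hibernate path: callback skipped
  assert(closed_calls == 1);
}
```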

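[Fix 3] LeaveRoom no longer calls volc_rtc_destroy and keeps rtc_handle_
  - On wake, OpenAudioChannel calls volc_rtc_start directly and reuses the handle, with no infinite loop
  - The server-side AI task is cleaned up by the 180 s fallback mechanism even without destroy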
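
The handle-reuse idea in Fix 3 can be sketched as follows. `FakeRtcEngine` and `RtcSessionSketch` are hypothetical stand-ins, and the counters replace the real `volc_rtc_*` calls, whose signatures are not shown in the commit and are not reproduced here.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for whatever rtc_handle_ points to.
struct FakeRtcEngine {
  int start_count = 0;
};

class RtcSessionSketch {
 public:
  // Creates the engine on first use only; later calls reuse the kept handle.
  void OpenAudioChannel() {
    if (!rtc_handle_) {
      rtc_handle_.reset(new FakeRtcEngine());
      ++create_count_;
    }
    ++rtc_handle_->start_count;  // stands in for the "start" call
    in_room_ = true;
  }

  // Leaves the room but deliberately keeps rtc_handle_ alive.
  void LeaveRoom() {
    assert(rtc_handle_ && "LeaveRoom before OpenAudioChannel");
    in_room_ = false;
  }

  int create_count() const { return create_count_; }
  int start_count() const { return rtc_handle_ ? rtc_handle_->start_count : 0; }
  bool in_room() const { return in_room_; }

 private:
  std::unique_ptr<FakeRtcEngine> rtc_handle_;
  int create_count_ = 0;
  bool in_room_ = false;
};

// Usage sketch: one creation, two starts across a leave/rejoin cycle.
void DemoHandleReuse() {
  RtcSessionSketch session;
  session.OpenAudioChannel();
  session.LeaveRoom();
  session.OpenAudioChannel();
  assert(session.create_count() == 1);
  assert(session.start_count() == 2);
}
```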

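[Fix 4 - the most subtle] LeaveRoom resets downlink_is_pcm_ = false at the end
  - The Volcano Engine RTC downlink is PCM, so DataCallback sets downlink_is_pcm_=true
  - Without the reset → OnAudioOutput treats PlaySound's Opus packets as a raw PCM byte stream
    and writes them straight to the codec → noise instead of the standby prompt
  - After reconnecting on wake, DataCallback sets the flag again automatically when the next packet arrives, so the welcome message is unaffected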
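
The stale-flag problem behind Fix 4 is easy to see in a reduced model. The sketch below is an assumption-heavy simplification: `DownlinkFormatSketch`, `OutputRoute`, `OnRtcDownlinkPacket`, and `RouteNextPacket` are hypothetical names; only `downlink_is_pcm_` and `LeaveRoom` appear in the commit message.

```cpp
#include <cassert>

// How the output path would treat the next queued packet.
enum class OutputRoute { kDecodeOpus, kWriteRawPcm };

class DownlinkFormatSketch {
 public:
  // Plays the role of DataCallback: an RTC downlink packet marks the stream as PCM.
  void OnRtcDownlinkPacket() { downlink_is_pcm_ = true; }

  // Resetting the flag here is the fix; without it the flag outlives the room.
  void LeaveRoom() { downlink_is_pcm_ = false; }

  OutputRoute RouteNextPacket() const {
    return downlink_is_pcm_ ? OutputRoute::kWriteRawPcm : OutputRoute::kDecodeOpus;
  }

 private:
  bool downlink_is_pcm_ = false;
};

// Usage sketch: a local Opus prompt queued after LeaveRoom is decoded, not
// written to the codec as raw bytes.
void DemoDownlinkReset() {
  DownlinkFormatSketch audio;
  audio.OnRtcDownlinkPacket();
  assert(audio.RouteNextPacket() == OutputRoute::kWriteRawPcm);

  audio.LeaveRoom();
  assert(audio.RouteNextPacket() == OutputRoute::kDecodeOpus);

  audio.OnRtcDownlinkPacket();  // reconnect: flag is set again by the next packet
  assert(audio.RouteNextPacket() == OutputRoute::kWriteRawPcm);
}
```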

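[Fix 5] hibernating_ guard added at the entry of OnAudioInput
  - Disables the input side during hibernate, preventing access to a closed codec → std::bad_alloc abort
  - OnAudioOutput is not frozen, so the standby-prompt queue is still consumed normally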
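
A sketch of the asymmetric guard described in Fix 5. Whether the real `hibernating_` flag is a `std::atomic<bool>` is not stated in the commit; that choice, the class name `AudioLoopSketch`, and the counters are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>

class AudioLoopSketch {
 public:
  void set_hibernating(bool value) { hibernating_.store(value); }

  // Input side: returns early (touching nothing) while hibernating.
  bool OnAudioInput() {
    if (hibernating_.load()) return false;
    ++input_reads_;  // stands in for reading from the codec
    return true;
  }

  // Output side is intentionally not guarded, so queued prompts still drain.
  void OnAudioOutput() { ++output_writes_; }

  int input_reads() const { return input_reads_; }
  int output_writes() const { return output_writes_; }

 private:
  std::atomic<bool> hibernating_{false};
  int input_reads_ = 0;
  int output_writes_ = 0;
};

// Usage sketch: input is blocked during hibernate, output keeps running.
void DemoHibernateGuard() {
  AudioLoopSketch loop;
  loop.set_hibernating(true);
  assert(!loop.OnAudioInput());
  loop.OnAudioOutput();
  assert(loop.input_reads() == 0);
  assert(loop.output_writes() == 1);

  loop.set_hibernating(false);
  assert(loop.OnAudioInput());
  assert(loop.input_reads() == 1);
}
```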

[EnterIdleHibernate rewrite] Adopts Kapi's new ordering:
  Step 0: hibernating_=true + 50ms (lets the OnAudioInput guard take effect)
  Step 1: LeaveRoom(false) (codec output stays enabled)
  Step 2: background_task->WaitForCompletion
  Step 3: clear audio_decode_queue_
  Step 4: EnableInput(false) + close recorder_pipeline
  Step 5: force-disable Light Sleep via esp_pm
  Step 5.5: EnableOutput(false→true) + 200ms silence (clears LeaveRoom side effects)
  Step 6: SetDeviceState(idle) → PlaySound standby prompt
  Step 7: WaitForAudioPlayback (queue fully consumed)
  Step 7.5: background_task->WaitForCompletion + vTaskDelay(1000)
            (DMA + ES8311 FIFO + power-amplifier tail decay, prevents the tail from being cut off)
  Step 8: player_pipeline_close
  Step 9: NVS idle_cycles_++
  Step 10: show the subtitle "已自动退出RTC对话..." ("Automatically exited the RTC conversation...") (digital-human specific, kept)
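
The ordering constraints in the step list above can be checked with a small trace model. This is only a sketch: no real firmware call is made, the step labels are shortened paraphrases, and `Trace`, `IndexOf`, `HappensBefore`, and `EnterIdleHibernateSketch` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using Trace = std::vector<std::string>;

// Position of step in the trace, or -1 if absent.
int IndexOf(const Trace& trace, const std::string& step) {
  auto it = std::find(trace.begin(), trace.end(), step);
  return it == trace.end() ? -1 : static_cast<int>(it - trace.begin());
}

// True when both steps are present and earlier precedes later.
bool HappensBefore(const Trace& trace, const std::string& earlier,
                   const std::string& later) {
  const int a = IndexOf(trace, earlier);
  const int b = IndexOf(trace, later);
  return a >= 0 && b >= 0 && a < b;
}

// Records the listed steps in order; each label stands in for the real call.
Trace EnterIdleHibernateSketch() {
  return Trace{
      "hibernating_=true",             // Step 0 (+ 50 ms)
      "LeaveRoom(false)",              // Step 1
      "WaitForCompletion (1st)",       // Step 2
      "clear audio_decode_queue_",     // Step 3
      "EnableInput(false)",            // Step 4
      "disable light sleep",           // Step 5
      "re-prime output with silence",  // Step 5.5
      "PlaySound standby prompt",      // Step 6
      "WaitForAudioPlayback",          // Step 7
      "WaitForCompletion (2nd)",       // Step 7.5 (+ 1000 ms)
      "player_pipeline_close",         // Step 8
      "idle_cycles_++",                // Step 9
      "show subtitle",                 // Step 10
  };
}

// Usage sketch: the invariants the commit message calls out.
void DemoHibernateOrder() {
  const Trace t = EnterIdleHibernateSketch();
  assert(t.size() == 13u);
  assert(HappensBefore(t, "hibernating_=true", "LeaveRoom(false)"));
  assert(HappensBefore(t, "re-prime output with silence", "PlaySound standby prompt"));
  assert(HappensBefore(t, "WaitForAudioPlayback", "player_pipeline_close"));
}
```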

[WakeFromHibernate] Reorders the hibernating_=false assignment
  - Clears hibernating_ first so the OnAudioInput guard passes during ToggleChatState
  - Otherwise the audio uplink is slow to open during ToggleChatState
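
Why the order of the `hibernating_=false` assignment matters can be shown with a toy model. This is a deliberate simplification under stated assumptions: here the uplink either starts or does not during `ToggleChatState`, whereas the commit only says it is slow to open. `WakeOrderSketch` and the `clear_flag_first` switch are hypothetical.

```cpp
#include <cassert>

class WakeOrderSketch {
 public:
  // clear_flag_first=true models the reordered WakeFromHibernate;
  // false models the earlier order, kept here only for comparison.
  void WakeFromHibernate(bool clear_flag_first) {
    if (clear_flag_first) {
      hibernating_ = false;
      ToggleChatState();
    } else {
      ToggleChatState();
      hibernating_ = false;
    }
  }

  bool uplink_started() const { return uplink_started_; }

 private:
  // Same guard idea as in OnAudioInput: input is refused while hibernating.
  bool OnAudioInput() const { return !hibernating_; }

  // The uplink only starts if the input guard lets audio through.
  void ToggleChatState() { uplink_started_ = OnAudioInput(); }

  bool hibernating_ = true;
  bool uplink_started_ = false;
};

// Usage sketch: only the reordered variant opens the uplink during the toggle.
void DemoWakeOrder() {
  WakeOrderSketch reordered;
  reordered.WakeFromHibernate(true);
  assert(reordered.uplink_started());

  WakeOrderSketch old_order;
  old_order.WakeFromHibernate(false);
  assert(!old_order.uplink_started());
}
```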

Build: kapi.bin 0x41c000 (4.21MB), partition 25% free.
Tested, all three pass: clean welcome message + clear and complete standby prompt + clean welcome message on wake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-18 10:11:36 +08:00

.cache/clangd/index

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

.planning

plan(rtc-only): Phase 9 cancelled + Phase 10/11/12 planning (LVGL → esp_emote_gfx)

2026-05-15 13:37:34 +08:00

.vscode

feat(Rtc_AIavatar): digital human transparent GIF display PoC complete (background image + transparent GIF overlay)

2026-05-12 17:14:49 +08:00

audios_new_p3

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

audios_p3

Supplementary initial commit

2026-02-24 15:57:32 +08:00

components

Supplementary initial commit

2026-02-24 15:57:32 +08:00

docs

docs(rtc-only): comparison report v2.1-v2.4: root cause of stutter located + resource-ceiling diagnosis + P4 upgrade path

2026-05-14 16:51:34 +08:00

dzbj @ 9223fd5a7d

Supplementary commit: add the dzbj folder

2026-02-27 10:44:58 +08:00

esp-spot

Supplementary initial commit

2026-02-24 15:57:32 +08:00

main

fix: sync Kapi's five soft-RTC-exit fixes to the digital human project (standby prompt + welcome-message noise)

2026-05-18 10:11:36 +08:00

managed_components

feat(ui): Phase 10 step 1+2 - background image + Chinese subtitles + digital human transparency

2026-05-15 17:38:31 +08:00

scripts

Supplementary initial commit

2026-02-24 15:57:32 +08:00

spiffs_image

feat(assets): add hiyori-assets.bin (Phase 10 EAF digital human asset pack)

2026-05-15 15:56:33 +08:00

tests

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

tools

feat(ui): Phase 10 step 1+2 - background image + Chinese subtitles + digital human transparency

2026-05-15 17:38:31 +08:00

.gitignore

feat: cheer-light anti-tearing optimization - direct DMA fill of GRAM + LVGL flush interception + PWM black-screen masking

2026-03-30 15:18:41 +08:00

00Kapi_Rtc_火山RTC整合移植方案.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

01Kapi_Rtc_WebSocket_替换为_火山RTC_技术分析报告.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

02Kapi_Rtc_火山RTC替换实现方案.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

03AEC_VOICE_INTERRUPT_PORTING_PLAN.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

04-2025-11-21音频优化记录.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

05-最新日志.txt

fix: automatically skip invalid images that fail to decode when swiping between images

2026-03-24 18:22:53 +08:00

06-AI对话和电子吧唧双模式适配说明 copy.md

docs: update the Claude Code plugin guide + add development reference docs

2026-04-02 10:09:08 +08:00

AEC_VAD_OPTIMIZATION.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

audios_new_p3.zip

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

BLE_JSON_通讯模块开发计划.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

BLE图片传输问题分析与优化建议.md

feat: enable BLE 5.0 2M PHY for faster image transfer + remove the unused BluFi component + fix the BLE disconnect memory leak

2026-03-24 17:12:35 +08:00

BluFi蓝牙配网小程序开发需求说明书.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

BOOT_BUTTON_IMPLEMENTATION_COMPARISON.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

BOOT_BUTTON_LISTENING_STATE_IMPLEMENTATION_TEST.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

BOOT_BUTTON_MODIFICATION_SUMMARY.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

BOOT_BUTTON_NEW_IMPLEMENTATION_TEST.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

build.log

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

Claude Code 十个最值得装的 Skills copy.md

docs: update the Claude Code plugin guide + add development reference docs

2026-04-02 10:09:08 +08:00

Claude Code插件高效运用指南.md

chore: update plugin guide/sdkconfig/IDF serial port config + remove .DS_Store from the repo

2026-05-11 14:10:36 +08:00

CMakeLists.txt

feat: complete the refactor that fully isolates the AI and badge (吧唧) modes + touch-coordinate logging + SPIFFS pre-flashing

2026-02-28 10:23:04 +08:00

dependencies.lock

feat(ui): Phase 10 - full switch of digital human mode from LVGL to esp_emote_gfx

2026-05-15 15:53:21 +08:00

idf_component.yml

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

ip_query_test.py

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

LICENSE

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

partitions_4M.csv

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

partitions_8M.csv

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

partitions_32M_sensecap.csv

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

partitions.csv

feat(rtc-only): Phase 2 - 16MB Flash partition adjustment (OTA 5.5MB + SPIFFS 4.9MB)

2026-05-13 11:00:35 +08:00

postman请求.md

1. Updated the postman parameters, constraints, and the Function Call parameters for voice storytelling requests;

2026-03-02 17:27:03 +08:00

QMI8658A_IMU_Sensor_Development_Guide.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

QMI8658A驱动适配方案_B站驱动.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

QMI8658替换方案_Github驱动.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

README_en.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

README_ja.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

README_RTC.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

README.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

sdkconfig

feat(assets): add hiyori-assets.bin (Phase 10 EAF digital human asset pack)

2026-05-15 15:56:33 +08:00

sdkconfig.bak

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

sdkconfig.custom_wake_word

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

sdkconfig.defaults

feat(rtc-only): Phase 1 - disable the electronic badge mode via CONFIG_BAJI_BADGE_MODE

2026-05-13 10:22:48 +08:00

sdkconfig.defaults.esp32c3

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

sdkconfig.defaults.esp32s3

1. Enabled the LVGL GIF decoder (CONFIG_LV_USE_GIF=y) to support BLE transfer and playback of GIF images in badge mode;

2026-03-10 17:36:18 +08:00

sdkconfig.defaults.prod生产环境

feat: enable BLE 5.0 2M PHY for faster image transfer + remove the unused BluFi component + fix the BLE disconnect memory leak

2026-03-24 17:12:35 +08:00

URGENT_INTERRUPT_FIX.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

VOICE_INTERRUPT_FEATURE.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

VOICE_INTERRUPT_OPTIMIZATION_GUIDE.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

volc_device_manager.o

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

和风天气运行日志.txt

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

自定义唤醒词移植说明.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

自定义唤醒词配置使用手册.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

蓝牙配网功能实现总结.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

蓝牙配网集成指南.md

1. Initial code, adaptation pending...

2026-02-24 15:28:34 +08:00

资深嵌入式工程师开发思维.md

docs: update the Claude Code plugin guide + add development reference docs

2026-04-02 10:09:08 +08:00

README_en.md

XiaoZhi AI Chatbot

(中文 | English | 日本語)

Introduction

👉 Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】

👉 Equipping XiaoZhi with DeepSeek's smart brain【bilibili】

👉 Build your own AI companion, a beginner's guide【bilibili】

Project Purpose

This is an open-source project released under the MIT license, allowing anyone to use it freely, including for commercial purposes.

Through this project, we aim to help more people get started with AI hardware development and understand how to implement rapidly evolving large language models in actual hardware devices. Whether you're a student interested in AI or a developer exploring new technologies, this project offers valuable learning experiences.

Everyone is welcome to participate in the project's development and improvement. If you have any ideas or suggestions, please feel free to raise an Issue or join the chat group.

Learning & Discussion QQ Group: 376893254

Implemented Features

Wi-Fi / ML307 Cat.1 4G
BOOT button wake-up and interruption, supporting both click and long-press triggers
Offline voice wake-up (ESP-SR)
Streaming voice dialogue (WebSocket or UDP protocol)
Support for 5 languages: Mandarin, Cantonese, English, Japanese, Korean (SenseVoice)
Voiceprint recognition to identify who is calling the AI's name (3D Speaker)
Large model TTS (Volcano Engine or CosyVoice)
Large Language Models (Qwen, DeepSeek, Doubao)
Configurable prompts and voice tones (custom characters)
Short-term memory, self-summarizing after each conversation round
OLED / LCD display showing signal strength or conversation content
Support for LCD image expressions
Multi-language support (Chinese, English)

Hardware Section

Breadboard DIY Practice

See the Feishu document tutorial:

👉 XiaoZhi AI Chatbot Encyclopedia

Breadboard demonstration:

Supported Open Source Hardware

Firmware Section

Flashing Without Development Environment

For beginners, it's recommended to first use the firmware that can be flashed without setting up a development environment.

The firmware connects to the official xiaozhi.me server by default. Currently, personal users can register an account to use the Qwen real-time model for free.

👉 Flash Firmware Guide (No IDF Environment)

Development Environment

Cursor or VSCode
Install ESP-IDF plugin, select SDK version 5.3 or above
Linux is preferred over Windows for faster compilation and fewer driver issues
Use the Google C++ code style and make sure submitted code complies with it

Developer Documentation

Board Customization Guide - Learn how to create custom board adaptations for XiaoZhi
IoT Control Module - Understand how to control IoT devices through AI voice commands

AI Agent Configuration

If you already have a XiaoZhi AI chatbot device, you can configure it through the xiaozhi.me console.

👉 Backend Operation Tutorial (Old Interface)

Technical Principles and Private Deployment

👉 Detailed WebSocket Communication Protocol Documentation

For server deployment on personal computers, refer to another MIT-licensed project, xiaozhi-esp32-server.


Languages

C 93.5%

C++ 1.9%

Jupyter Notebook 1.8%

Python 1.6%

Assembly 0.7%

Other 0.1%