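Phase 3 (digital-human expression GIF resource processing) is complete, following the GSD framework plan in .planning/milestones/digital_human_rtc/phases/phase_03_gif_resources/.

## Processing method (same as hiyori_m05.gif in the PoC phase)

```bash
gifsicle --resize _x360 -O3 input.gif -o output.gif
```

- Height = LCD height of 360 px; width is computed automatically from the source aspect ratio → 209 px
- No cropping (the complete character from the source GIF is kept)
- No --lossy / --colors (keeps 256 colors; image quality takes priority)
- Only -O3 is used to optimize file size

## Results

| GIF | Use | Source | Processed | Saved |
|-----|-----|--------|-----------|-------|
| m03 | Negative / serious | 407×700, 3.3 MB | 209×360, 1.15 MB | 66% |
| m06 | Default / positive | 407×700, 1.3 MB | 209×360, 0.44 MB | 66% |
| m07 | Thinking / tired | 407×700, 1.2 MB | 209×360, 0.40 MB | 66% |
| Total | - | 5.7 MB | 1.94 MB | 66% |

## Decision record (to avoid repeating the mistake)

The first Phase 3 draft cropped each GIF to 240×320 using a PIL center crop on the all-frame bounding box. After flashing, the user reported that it "did not look good": the character appeared squashed horizontally (240×320 has a width-to-height ratio of 0.75, versus 0.581 for the 407×700 source). Returning to the PoC's proportional scaling produced results consistent with the PoC.

The PoC processing standard has been written to the user-level feedback memory (feedback_hiyori_gif_processing.md). All future hiyori GIF processing uses this method unless the user explicitly asks for a change.

## Display result (visually confirmed by the user)

A 209×360 GIF centered on the 360×360 LCD:

- Vertical: 360 = 360, fills the screen completely
- Horizontal: 209 < 360, leaving a 75.5 px margin on each side where the background image shows
- Character proportions: the source GIF's 407:700 ≈ 0.581 width-to-height ratio is fully preserved, so the figure looks naturally tall and slim

## Removed

- spiffs_image/hiyori_m05.gif (2.3 MB) deleted, replaced by m06/m07/m03

The file history is kept in git and can be restored with git show eb96130:spiffs_image/hiyori_m05.gif

## Default expression switch

main/dzbj/ai_chat_ui.c:234:

- PoC: bg_gif_demo_start(..., "/spiflash/hiyori_m05.gif")
- Phase 3: bg_gif_demo_start(..., "/spiflash/hiyori_m06.gif")

## Flash and runtime verification

- 0 reboots after flashing (18 seconds of continuous monitoring)
- BG_GIF: GIF 已加载到 PSRAM: /spiflash/hiyori_m06.gif (441.8 KB) (log line: GIF loaded into PSRAM)
- AudioCodec: Audio codec started (succeeded directly on the first cold boot)
- The user visually confirmed that the display looks good

## GSD documents (committed together)

- .planning/milestones/digital_human_rtc/phases/phase_03_gif_resources/PLAN.md
- .planning/milestones/digital_human_rtc/phases/phase_03_gif_resources/GIF_REPORT.md

## SPIFFS capacity

The new SPIFFS partition is 4.94 MB. Current usage is ~2 MB (40%), leaving ~2.94 MB, which is ample.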
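
The scaling and layout figures quoted in this note can be re-derived with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and not part of the repository, and it assumes the scaled width is rounded to the nearest pixel, which matches the 209 px reported here.

```python
def scaled_width(src_w: int, src_h: int, target_h: int) -> int:
    """Width after proportional scaling to target_h (assumed: round to nearest pixel)."""
    return round(src_w * target_h / src_h)


def side_margin(lcd_w: int, gif_w: int) -> float:
    """Margin left on each side when the GIF is centered horizontally."""
    return (lcd_w - gif_w) / 2


def aspect_ratio(w: int, h: int) -> float:
    """Width-to-height ratio, rounded to three decimals."""
    return round(w / h, 3)


# Figures taken from this note: 407x700 source, 360x360 LCD.
width = scaled_width(407, 700, 360)
margin = side_margin(360, width)

print(width, margin)           # scaled width and per-side margin
print(aspect_ratio(407, 700))  # source ratio
print(aspect_ratio(240, 320))  # ratio of the rejected 240x320 crop
```

Comparing the two ratios shows why the 240×320 crop changed the character's apparent proportions, while height-only scaling keeps them intact.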
XiaoZhi AI Chatbot
Introduction
👉 Build your AI chat companion with ESP32+SenseVoice+Qwen72B!【bilibili】
👉 Equipping XiaoZhi with DeepSeek's smart brain【bilibili】
👉 Build your own AI companion, a beginner's guide【bilibili】
Project Purpose
This is an open-source project released under the MIT license, allowing anyone to use it freely, including for commercial purposes.
Through this project, we aim to help more people get started with AI hardware development and understand how to implement rapidly evolving large language models in actual hardware devices. Whether you're a student interested in AI or a developer exploring new technologies, this project offers valuable learning experiences.
Everyone is welcome to participate in the project's development and improvement. If you have any ideas or suggestions, please feel free to raise an Issue or join the chat group.
Learning & Discussion QQ Group: 376893254
Implemented Features
- Wi-Fi / ML307 Cat.1 4G
- BOOT button wake-up and interruption, supporting both click and long-press triggers
- Offline voice wake-up (ESP-SR)
- Streaming voice dialogue (WebSocket or UDP protocol)
- Speech recognition in 5 languages: Mandarin, Cantonese, English, Japanese, Korean (SenseVoice)
- Voiceprint recognition to identify who is calling the AI's name (3D Speaker)
- Large model TTS (Volcano Engine or CosyVoice)
- Large Language Models (Qwen, DeepSeek, Doubao)
- Configurable prompts and voice tones (custom characters)
- Short-term memory, self-summarizing after each conversation round
- OLED / LCD display showing signal strength or conversation content
- Support for LCD image expressions
- Multi-language support (Chinese, English)
Hardware Section
Breadboard DIY Practice
See the Feishu document tutorial:
👉 XiaoZhi AI Chatbot Encyclopedia
Supported Open Source Hardware
- LiChuang ESP32-S3 Development Board
- Espressif ESP32-S3-BOX3
- M5Stack CoreS3
- AtomS3R + Echo Base
- AtomMatrix + Echo Base
- Magic Button 2.4
- Waveshare ESP32-S3-Touch-AMOLED-1.8
- LILYGO T-Circle-S3
- XiaGe Mini C3
- Moji XiaoZhi AI Derivative Version
- CuiCan AI pendant
- WMnologo-Xingzhi-1.54TFT
- SenseCAP Watcher
Firmware Section
Flashing Without Development Environment
For beginners, it's recommended to first use the firmware that can be flashed without setting up a development environment.
The firmware connects to the official xiaozhi.me server by default. Currently, personal users can register an account to use the Qwen real-time model for free.
👉 Flash Firmware Guide (No IDF Environment)
Development Environment
- Cursor or VSCode
- Install ESP-IDF plugin, select SDK version 5.3 or above
- Linux is preferred over Windows for faster compilation and fewer driver issues
- Follow the Google C++ code style and make sure submitted code complies with it
Developer Documentation
- Board Customization Guide - Learn how to create custom board adaptations for XiaoZhi
- IoT Control Module - Understand how to control IoT devices through AI voice commands
AI Agent Configuration
If you already have a XiaoZhi AI chatbot device, you can configure it through the xiaozhi.me console.
👉 Backend Operation Tutorial (Old Interface)
Technical Principles and Private Deployment
👉 Detailed WebSocket Communication Protocol Documentation
For server deployment on a personal computer, refer to xiaozhi-esp32-server, another MIT-licensed project.
