diff --git a/API相关/语音合成大模型-单向流式websocket-V3-支持复刻混音mix.md b/API相关/语音合成大模型-单向流式websocket-V3-支持复刻混音mix.md new file mode 100644 index 0000000..fd679ef --- /dev/null +++ b/API相关/语音合成大模型-单向流式websocket-V3-支持复刻混音mix.md @@ -0,0 +1,1205 @@ + +# 1 接口功能 +单向流式API为用户提供文本转语音的能力,支持多语种、多方言,同时支持websocket协议流式输出。 + +## 1.1 最佳实践 +推荐使用链接复用,可降低耗时约70ms左右。 +对比v1单向流式接口,不同的音色优化程度不同,以具体测试结果为准,理论上相对会有几十ms的提升。 + +# 2 接口说明 + +## 2.1 请求Request + +### 请求路径 +`wss://openspeech.bytedance.com/api/v3/tts/unidirectional/stream` + +### 建连&鉴权 + +#### Request Headers + +| | | | | \ +|Key |说明 |是否必须 |Value示例 | +|---|---|---|---| +| | | | | \ +|X-Api-App-Id |\ +| |使用火山引擎控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F) |是 |\ +| | | |your-app-id |\ +| | | | | +| | | | | \ +|X-Api-Access-Key |\ +| |使用火山引擎控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F) |是 |\ +| | | |your-access-key |\ +| | | | | +| | | | | \ +|X-Api-Resource-Id |\ +| |表示调用服务的资源信息 ID |\ +| | |\ +| |* 豆包语音合成模型1.0: |\ +| | * seed-tts-1.0 或者 volc.service_type.10029(字符版) |\ +| | * seed-tts-1.0-concurr 或者 volc.service_type.10048(并发版) |\ +| |* 豆包语音合成模型2.0: |\ +| | * seed-tts-2.0 (字符版) |\ +| |* 声音复刻: |\ +| | * seed-icl-1.0(声音复刻1.0字符版) |\ +| | * seed-icl-1.0-concurr(声音复刻1.0并发版) |\ +| | * seed-icl-2.0 (声音复刻2.0字符版) |\ +| | |\ +| |**注意:** |\ +| | |\ +| |* "豆包语音合成模型1.0"的资源信息ID仅适用于["豆包语音合成模型1.0"的音色](https://www.volcengine.com/docs/6561/1257544) |\ +| |* "豆包语音合成模型2.0"的资源信息ID仅适用于["豆包语音合成模型2.0"的音色](https://www.volcengine.com/docs/6561/1257544) |是 |\ +| | | |* 豆包语音合成模型1.0: |\ +| | | | * seed-tts-1.0 |\ +| | 
| | * seed-tts-1.0-concurr |\ +| | | |* 豆包语音合成模型2.0: |\ +| | | | * seed-tts-2.0 |\ +| | | |* 声音复刻: |\ +| | | | * seed-icl-1.0(声音复刻1.0字符版) |\ +| | | | * seed-icl-1.0-concurr(声音复刻1.0并发版) |\ +| | | | * seed-icl-2.0 (声音复刻2.0字符版) | +| | | | | \ +|X-Api-Request-Id |标识客户端请求ID,uuid随机字符串 |否 |67ee89ba-7050-4c04-a3d7-ac61a63499b3 | +| | | | | \ +|X-Control-Require-Usage-Tokens-Return |请求消耗的用量返回控制标记。当携带此字段,在SessionFinish事件(152)中会携带用量数据 |否 |* 设置为*,表示返回已支持的用量数据。 |\ +| | | |* 也设置为具体的用量数据标记,如text_words;多个用逗号分隔 |\ +| | | |* 当前已支持的用量数据 |\ +| | | | * text_words,表示计费字符数 | + + +#### Response Headers + +| | | | \ +|Key |说明 |Value示例 | +|---|---|---| +| | | | \ +|X-Tt-Logid |服务端返回的 logid,建议用户获取和打印方便定位问题 |2025041513355271DF5CF1A0AE0508E78C | + + +### WebSocket 二进制协议 +WebSocket 使用二进制协议传输数据。 +协议的组成由至少 4 个字节的可变 header、payload size 和 payload 三部分组成,其中 + +* header 描述消息类型、序列化方式以及压缩格式等信息; +* payload size 是 payload 的长度; +* payload 是具体负载内容,依据消息类型不同 payload 内容不同; + +需注意:协议中整数类型的字段都使用**大端**表示。 + +##### 二进制帧 + +| | | | | \ +|Byte |Left 4-bit |Right 4-bit |说明 | +|---|---|---|---| +| | | | | \ +|0 - Left half |Protocol version | |目前只有v1,始终填0b0001 | +| | | | | \ +|0 - Right half | |Header size (4x) |目前只有4字节,始终填0b0001 | +| | | | | \ +|1 - Left half |Message type | |固定为0b001 | +| | | | | \ +|1 - Right half | |Message type specific flags |在sendText时,为0 |\ +| | | |在finishConnection时,为0b100 | +| | | | | \ +|2 - Left half |Serialization method | |0b0000:Raw(无特殊序列化方式,主要针对二进制音频数据)0b0001:JSON(主要针对文本类型消息) | +| | | | | \ +|2 - Right half | |Compression method |0b0000:无压缩0b0001:gzip | +| | || | \ +|3 |Reserved | |留空(0b0000 0000) | +| | || | \ +|[4 ~ 7] |[Optional field,like event number,...] | |取决于Message type specific flags,可能有、也可能没有 | +| | || | \ +|... 
|Payload | |可能是音频数据、文本数据、音频文本混合数据 | + + +###### payload请求参数 + +| | | | | | \ +|字段 |描述 |是否必须 |类型 |默认值 | +|---|---|---|---|---| +| | | | | | \ +|user |用户信息 | | | | +| | | | | | \ +|user.uid |用户uid | | | | +| | | | | | \ +|event |请求的事件 | | | | +| | | | | | \ +|namespace |请求方法 | |string |BidirectionalTTS | +| | | | | | \ +|req_params.text |输入文本 | |string | | +| | | | | | \ +|req_params.model |\ +| |模型版本,传`seed-tts-1.1`较默认版本音质有提升,并且延时更优,不传为默认效果。 |\ +| |注:若使用1.1模型效果,在复刻场景中会放大训练音频prompt特质,因此对prompt的要求更高,使用高质量的训练音频,可以获得更优的音质效果。 |\ +| | |\ +| |以下参数仅针对声音复刻2.0的音色生效,即音色ID的前缀为`saturn_`的音色。音色的取值为以下两种: |\ +| | |\ +| |* `seed-tts-2.0-expressive`:表现力较强,支持QA和Cot能力,不过可能存在抽卡的情况。 |\ +| |* `seed-tts-2.0-standard`:表现力上更加稳定,但是不支持QA和Cot能力。如果此时使用QA或Cot能力,则拒绝请求。 |\ +| |* 如果不传model参数,默认使用`seed-tts-2.0-expressive`模型。 | |string |\ +| | | | | | +| | | | | | \ +|req_params.ssml |* 当文本格式是ssml时,需要将文本赋值为ssml,此时文本处理的优先级高于text。ssml和text字段,至少有一个不为空 |\ +| |* ["豆包语音合成模型2.0"的音色](https://www.volcengine.com/docs/6561/1257544) 暂不支持 |\ +| |* 豆包声音复刻模型2.0(icl 2.0)的音色暂不支持 | |string | | +| | | | | | \ +|req_params.speaker |发音人,具体见[发音人列表](https://www.volcengine.com/docs/6561/1257544) |√ |string | | +| | | | | | \ +|req_params.audio_params |音频参数,便于服务节省音频解码耗时 |√ |object | | +| | | | | | \ +|req_params.audio_params.format |音频编码格式,mp3/ogg_opus/pcm。接口传入wav并不会报错,在流式场景下传入wav会多次返回wav header,这种场景建议使用pcm。 | |string |mp3 | +| | | | | | \ +|req_params.audio_params.sample_rate |音频采样率,可选值 [8000,16000,22050,24000,32000,44100,48000] | |number |24000 | +| | | | | | \ +|req_params.audio_params.bit_rate |音频比特率,可传16000、32000等。 |\ +| |bit_rate默认设置范围为64k~160k,传了disable_default_bit_rate为true后可以设置到64k以下 |\ +| |GoLang示例:additions = fmt.Sprintf("{"disable_default_bit_rate":true}") |\ +| |注:bit_rate只针对MP3格式,wav计算比特率跟pcm一样是 比特率 (bps) = 采样率 × 位深度 × 声道数 |\ +| |目前大模型TTS只能改采样率,所以对于wav格式来说只能通过改采样率来变更音频的比特率 | |number | | +| | | | | | \ +|req_params.audio_params.emotion |设置音色的情感。示例:"emotion": "angry" |\ +| |注:当前仅部分音色支持设置情感,且不同音色支持的情感范围存在不同。 |\ +| 
|详见:[大模型语音合成API-音色列表-多情感音色](https://www.volcengine.com/docs/6561/1257544) | |string | | +| | | | | | \ +|req_params.audio_params.emotion_scale |调用emotion设置情感参数后可使用emotion_scale进一步设置情绪值,范围1~5,不设置时默认值为4。 |\ +| |注:理论上情绪值越大,情感越明显。但情绪值1~5实际为非线性增长,可能存在超过某个值后,情绪增加不明显,例如设置3和5时情绪值可能接近。 | |number |4 | +| | | | | | \ +|req_params.audio_params.speech_rate |语速,取值范围[-50,100],100代表2.0倍速,-50代表0.5倍数 | |number |0 | +| | | | | | \ +|req_params.audio_params.loudness_rate |音量,取值范围[-50,100],100代表2.0倍音量,-50代表0.5倍音量(mix音色暂不支持) | |number |0 | +| | | | | | \ +|req_params.audio_params.enable_timestamp |\ +|([仅TTS1.0支持](https://www.volcengine.com/docs/6561/1257544)) |设置 "enable_timestamp": true 返回句级别字的时间戳(默认为 false,参数传入 true 即表示启用) |\ +| |开启后,在原有返回的事件`event=TTSSentenceEnd`中,新增该子句的时间戳信息。 |\ +| | |\ +| |* 一个子句的时间戳返回之后才会开始返回下一句音频。 |\ +| |* 合成有多个子句会多次返回`TTSSentenceStart`和`TTSSentenceEnd`。开启字幕后字幕跟随`TTSSentenceEnd`返回。 |\ +| |* 字/词粒度的时间戳,其中字/词是tn。具体可以看下面的例子。 |\ +| |* 支持中、英,其他语种、方言暂时不支持。 |\ +| | |\ +| |注:该字段仅适用于["豆包语音合成模型1.0"的音色](https://www.volcengine.com/docs/6561/1257544) | |bool |false | +| | | | | | \ +|req_params.audio_params.enable_subtitle |设置 "enable_subtitle": true 返回句级别字的时间戳(默认为 false,参数传入 true 即表示启用) |\ +| |开启后,新增返回事件`event=TTSSubtitle`,包含字幕信息。 |\ +| | |\ +| |* 在一句音频合成之后,不会立即返回该句的字幕。合成进度不会被字幕识别阻塞,当一句的字幕识别完成后立即返回。可能一个子句的字幕返回的时候,已经返回下一句的音频帧给调用方了。 |\ +| |* 合成有多个子句,仅返回一次`TTSSentenceStart`和`TTSSentenceEnd`。开启字幕后会多次返回`TTSSubtitle`。 |\ +| |* 字/词粒度的时间戳,其中字/词是原文。具体可以看下面的例子。 |\ +| |* 支持中、英,其他语种、方言暂时不支持; |\ +| |* latex公式不支持 |\ +| | * req_params.additions.enable_latex_tn为true时,不开启字幕识别功能,即不返回字幕; |\ +| |* ssml不支持 |\ +| | * req_params.ssml 不传时,不开启字幕识别功能,即不返回字幕; |\ +| | |\ +| |注:该参数只在TTS2.0、ICL2.0生效。 | |bool |false | +| | | | | | \ +|req_params.additions |用户自定义参数 | |jsonstring | | +| | | | | | \ +|req_params.additions.silence_duration |设置该参数可在句尾增加静音时长,范围0~30000ms。(注:增加的句尾静音主要针对传入文本最后的句尾,而非每句话的句尾) | |number |0 | +| | | | | | \ +|req_params.additions.enable_language_detector |自动识别语种 | |bool |false | +| | | 
| | | \ +|req_params.additions.disable_markdown_filter |是否开启markdown解析过滤, |\ +| |为true时,解析并过滤markdown语法,例如,`**你好**`,会读为“你好”, |\ +| |为false时,不解析不过滤,例如,`**你好**`,会读为“星星‘你好’星星” | |bool |false | +| | | | | | \ +|req_params.additions.disable_emoji_filter |开启emoji表情在文本中不过滤显示,默认为false,建议搭配时间戳参数一起使用。 |\ +| |GoLang示例:`additions = fmt.Sprintf("{"disable_emoji_filter":true}")` | |bool |false | +| | | | | | \ +|req_params.additions.mute_cut_remain_ms |该参数需配合mute_cut_threshold参数一起使用,其中: |\ +| |"mute_cut_threshold": "400", // 静音判断的阈值(音量小于该值时判定为静音) |\ +| |"mute_cut_remain_ms": "50", // 需要保留的静音长度 |\ +| |注:参数和value都为string格式 |\ +| |Golang示例:`additions = fmt.Sprintf("{"mute_cut_threshold":"400", "mute_cut_remain_ms": "1"}")` |\ +| |特别提醒: |\ +| | |\ +| |* 因MP3格式的特殊性,句首始终会存在100ms内的静音无法消除,WAV格式的音频句首静音可全部消除,建议依照自身业务需求综合判断选择 | |string | | +| | | | | | \ +|req_params.additions.enable_latex_tn |是否可以播报latex公式,需将disable_markdown_filter设为true | |bool |false | +| | | | | | \ +|req_params.additions.latex_parser |是否使用lid 能力播报latex公式,相较于latex_tn 效果更好; |\ +| |值为“v2”时支持lid能力解析公式,值为“”时不支持lid; |\ +| |需同时将disable_markdown_filter设为true; | |string | | +| | | | | | \ +|req_params.additions.max_length_to_filter_parenthesis |是否过滤括号内的部分,0为不过滤,100为过滤 | |int |100 | +| | | | | | \ +|req_params.additions.explicit_language(明确语种) |仅读指定语种的文本 |\ +| |**精品音色和 声音复刻 ICL1.0场景:** |\ +| | |\ +| |* 不给定参数,正常中英混 |\ +| |* `crosslingual` 启用多语种前端(包含`zh/en/ja/es-ms/id/pt-br`) |\ +| |* `zh-cn` 中文为主,支持中英混 |\ +| |* `en` 仅英文 |\ +| |* `ja` 仅日文 |\ +| |* `es-mx` 仅墨西 |\ +| |* `id` 仅印尼 |\ +| |* `pt-br` 仅巴葡 |\ +| | |\ +| |**DIT 声音复刻场景:** |\ +| |当音色是使用model_type=2训练的,即采用dit标准版效果时,建议指定明确语种,目前支持: |\ +| | |\ +| |* 不给定参数,启用多语种前端`zh,en,ja,es-mx,id,pt-br,de,fr` |\ +| |* `zh,en,ja,es-mx,id,pt-br,de,fr` 启用多语种前端 |\ +| |* `zh-cn` 中文为主,支持中英混 |\ +| |* `en` 仅英文 |\ +| |* `ja` 仅日文 |\ +| |* `es-mx` 仅墨西 |\ +| |* `id` 仅印尼 |\ +| |* `pt-br` 仅巴葡 |\ +| |* `de` 仅德语 |\ +| |* `fr` 仅法语 |\ +| | |\ +| |当音色是使用model_type=3训练的,即采用dit还原版效果时,必须指定明确语种,目前支持: |\ +| | |\ +| |* 
不给定参数,正常中英混 |\ +| |* `zh-cn` 中文为主,支持中英混 |\ +| |* `en` 仅英文 |\ +| | |\ +| |**声音复刻 ICL2.0场景:** |\ +| |当音色是使用model_type=4训练的 |\ +| | |\ +| |* 不给定参数,正常中英混 |\ +| |* `zh-cn` 中文为主,支持中英混 |\ +| |* `en` 仅英文 |\ +| | |\ +| |GoLang示例:`additions = fmt.Sprintf("{"explicit_language": "zh"}")` | |string | | +| | | | | | \ +|req_params.additions.context_language(参考语种) |给模型提供参考的语种 |\ +| | |\ +| |* 不给定 西欧语种采用英语 |\ +| |* id 西欧语种采用印尼 |\ +| |* es 西欧语种采用墨西 |\ +| |* pt 西欧语种采用巴葡 | |string | | +| | | | | | \ +|req_params.additions.unsupported_char_ratio_thresh |默认: 0.3,最大值: 1.0 |\ +| |检测出不支持合成的文本超过设置的比例,则会返回错误。 | |float |0.3 | +| | | | | | \ +|req_params.additions.aigc_watermark |默认:false |\ +| |是否在合成结尾增加音频节奏标识 | |bool |false | +| | | | | | \ +|req_params.additions.aigc_metadata (meta 水印) |在合成音频 header加入元数据隐式表示,支持 mp3/wav/ogg_opus | |object | | +| | | | | | \ +|req_params.additions.aigc_metadata.enable |是否启用隐式水印 | |bool |false | +| | | | | | \ +|req_params.additions.aigc_metadata.content_producer |合成服务提供者的名称或编码 | |string |"" | +| | | | | | \ +|req_params.additions.aigc_metadata.produce_id |内容制作编号 | |string |"" | +| | | | | | \ +|req_params.additions.aigc_metadata.content_propagator |内容传播服务提供者的名称或编码 | |string |"" | +| | | | | | \ +|req_params.additions.aigc_metadata.propagate_id |内容传播编号 | |string |"" | +| | | | | | \ +|req_params.additions.cache_config(缓存相关参数) |开启缓存,开启后合成相同文本时,服务会直接读取缓存返回上一次合成该文本的音频,可明显加快相同文本的合成速率,缓存数据保留时间1小时。 |\ +| |(通过缓存返回的数据不会附带时间戳) |\ +| |Golang示例:`additions = fmt.Sprintf("{"disable_default_bit_rate":true, "cache_config": {"text_type": 1,"use_cache": true}}")` | |object | | +| | | | | | \ +|req_params.additions.cache_config.text_type(缓存相关参数) |和use_cache参数一起使用,需要开启缓存时传1 | |int |1 | +| | | | | | \ +|req_params.additions.cache_config.use_cache(缓存相关参数) |和text_type参数一起使用,需要开启缓存时传true | |bool |true | +| | | | | | \ +|req_params.additions.post_process |后处理配置 |\ +| |Golang示例:`additions = fmt.Sprintf("{"post_process":{"pitch":12}}")` | |object | | +| | | | | | \ 
+|req_params.additions.post_process.pitch |音调取值范围是[-12,12] | |int |\ +| | | | |0 | +| | | | | | \ +|req_params.additions.context_texts |\ +|([仅TTS2.0支持](https://www.volcengine.com/docs/6561/1257544)) |语音合成的辅助信息,用于模型对话式合成,能更好的体现语音情感; |\ +| |可以探索,比如常见示例有以下几种: |\ +| | |\ +| |1. 语速调整 |\ +| | 1. 比如:context_texts: ["你可以说慢一点吗?"] |\ +| |2. 情绪/语气调整 |\ +| | 1. 比如:context_texts=["你可以用特别特别痛心的语气说话吗?"] |\ +| | 2. 比如:context_texts=["嗯,你的语气再欢乐一点"] |\ +| |3. 音量调整 |\ +| | 1. 比如:context_texts=["你嗓门再小点。"] |\ +| |4. 音感调整 |\ +| | 1. 比如:context_texts=["你能用骄傲的语气来说话吗?"] |\ +| | |\ +| |注意: |\ +| | |\ +| |1. 该字段仅适用于["豆包语音合成模型2.0"的音色](https://www.volcengine.com/docs/6561/1257544) |\ +| |2. 当前字符串列表只第一个值有效 |\ +| |3. 该字段文本不参与计费 | |string list |null | +| | | | | | \ +|req_params.additions.section_id |\ +|([仅TTS2.0支持](https://www.volcengine.com/docs/6561/1257544)) |其他合成语音的会话id(session_id),用于辅助当前语音合成,提供更多的上下文信息; |\ +| |取值,参见接口交互中的session_id |\ +| |示例: |\ +| | |\ +| |1. section_id="bf5b5771-31cd-4f7a-b30c-f4ddcbf2f9da" |\ +| | |\ +| |注意: |\ +| | |\ +| |1. 该字段仅适用于["豆包语音合成模型2.0"的音色](https://www.volcengine.com/docs/6561/1257544) |\ +| |2. 历史上下文的session_id 有效期: |\ +| | 1. 最长30轮 |\ +| | 2. 最长10分钟 | |string |"" | +| | | | | | \ +|req_params.additions.use_tag_parser |是否开启cot解析能力。cot能力可以辅助当前语音合成,对语速、情感等进行调整。 |\ +| |注意: |\ +| | |\ +| |1. 音色支持范围:仅限声音复刻2.0复刻的音色 |\ +| |2. 文本长度:单句的text字符长度最好小于64(cot标签也计算在内) |\ +| |3. cot能力生效的范围是单句 |\ +| | |\ +| |示例: |\ +| |支持单组和多组cot标签:`工作占据了生活的绝大部分,只有去做自己认为伟大的工作,才能获得满足感。不管生活再苦再累,都绝不放弃寻找。` | |bool |false | +| | | | | | \ +|req_params.mix_speaker |混音参数结构 |\ +| |注意: |\ +| | |\ +| |1. 该字段仅适用于["豆包语音合成模型1.0"的音色](https://www.volcengine.com/docs/6561/1257544) | |object | | +| | | | | | \ +|req_params.mix_speaker.speakers |混音音色名以及影响因子列表 |\ +| | |\ +| |1. 最多支持3个音色混音 |\ +| |2. 混音影响因子和必须=1 |\ +| |3. 使用复刻音色时,需要使用查询接口获取的icl_的speakerid,而非S_开头的speakerid |\ +| |4. 
音色风格差异较大的两个音色(如男女混),以0.5-0.5同等比例混合时,可能出现偶发跳变,建议尽量避免 |\
+| | |\
+| |注意:使用Mix能力时,req_params.speaker = custom_mix_bigtts | |list |null |
+| | | | | | \
+|req_params.mix_speaker.speakers[i].source_speaker |混音源音色名(支持大小模型音色和复刻2.0音色) | |string |"" |
+| | | | | | \
+|req_params.mix_speaker.speakers[i].mix_factor |混音源音色名影响因子 | |float |0 |
+
+单音色请求参数示例:
+```JSON
+{
+    "user": {
+        "uid": "12345"
+    },
+    "req_params": {
+        "text": "明朝开国皇帝朱元璋也称这本书为,万物之根",
+        "speaker": "zh_female_shuangkuaisisi_moon_bigtts",
+        "audio_params": {
+            "format": "mp3",
+            "sample_rate": 24000
+        }
+    }
+}
+```
+
+mix请求参数示例:
+```JSON
+{
+    "user": {
+        "uid": "12345"
+    },
+    "req_params": {
+        "text": "明朝开国皇帝朱元璋也称这本书为万物之根",
+        "speaker": "custom_mix_bigtts",
+        "audio_params": {
+            "format": "mp3",
+            "sample_rate": 24000
+        },
+        "mix_speaker": {
+            "speakers": [{
+                "source_speaker": "zh_male_bvlazysheep",
+                "mix_factor": 0.3
+            }, {
+                "source_speaker": "BV120_streaming",
+                "mix_factor": 0.3
+            }, {
+                "source_speaker": "zh_male_ahu_conversation_wvae_bigtts",
+                "mix_factor": 0.4
+            }]
+        }
+    }
+}
+```
+
+
+## 2.2 响应Response
+
+### 建连响应
+主要关注建连阶段 HTTP Response 的状态码和 Body
+
+* 建连成功:状态码为 200
+* 建连失败:状态码不为 200,Body 中提供错误原因说明
+
+
+### WebSocket 传输响应
+
+#### 二进制帧 - 正常响应帧
+
+| | | | | \
+|Byte |Left 4-bit |Right 4-bit |说明 |
+|---|---|---|---|
+| | | | | \
+|0 - Left half |Protocol version | |目前只有v1,始终填0b0001 |
+| | | | | \
+|0 - Right half | |Header size (4x) |目前只有4字节,始终填0b0001 |
+| | | | | \
+|1 - Left half |Message type | |音频帧返回:0b1011 |\
+| | | |其他帧返回:0b1001 |
+| | | | | \
+|1 - Right half | |Message type specific flags |固定为0b0100 |
+| | | | | \
+|2 - Left half |Serialization method | |0b0000:Raw(无特殊序列化方式,主要针对二进制音频数据)0b0001:JSON(主要针对文本类型消息) |
+| | | | | \
+|2 - Right half | |Compression method |0b0000:无压缩0b0001:gzip |
+| | || | \
+|3 |Reserved | |留空(0b0000 0000) |
+| | || | \
+|[4 ~ 7] |[Optional field,like event number,...] |\
+| | | |取决于Message type specific flags,可能有、也可能没有 |
+| | || | \
+|...
|Payload | |可能是音频数据、文本数据、音频文本混合数据 | + + +##### payload响应参数 + +| | | | \ +|字段 |描述 |类型 | +|---|---|---| +| | | | \ +|data |返回的二进制数据包 |[]byte | +| | | | \ +|event |返回的事件类型 |number | +| | | | \ +|res_params.text |经文本分句后的句子 |string | + + +#### 二进制帧 - 错误响应帧 + +| | | | | \ +|Byte |Left 4-bit |Right 4-bit |说明 | +|---|---|---|---| +| | | | | \ +|0 - Left half |Protocol version | |目前只有v1,始终填0b0001 | +| | | | | \ +|0 - Right half | |Header size (4x) |目前只有4字节,始终填0b0001 | +| | | | | \ +|1 |Message type |Message type specific flags |0b11110000 | +| | | | | \ +|2 - Left half |Serialization method | |0b0000:Raw(无特殊序列化方式,主要针对二进制音频数据)0b0001:JSON(主要针对文本类型消息) | +| | | | | \ +|2 - Right half | |Compression method |0b0000:无压缩0b0001:gzip | +| | || | \ +|3 |Reserved | |留空(0b0000 0000) | +| | || | \ +|[4 ~ 7] |Error code | |错误码 | +| | || | \ +|... |Payload | |错误消息对象 | + + +## 2.3 event定义 +在发送文本转TTS阶段,不需要客户端发送上行的event帧。event类型如下: + +| | | | | \ +|Event code |含义 |事件类型 |应用阶段:上行/下行 | +|---|---|---|---| +| | | | | \ +|152 |SessionFinished,会话已结束(上行&下行) |\ +| |标识语音一个完整的语音合成完成 |Session 类 |下行 | +| | | | | \ +|350 |TTSSentenceStart,TTS 返回句内容开始 |数据类 |下行 | +| | | | | \ +|351 |TTSSentenceEnd,TTS 返回句内容结束 |数据类 |下行 | +| | | | | \ +|352 |TTSResponse,TTS 返回句的音频内容 |数据类 |下行 | + +在关闭连接阶段,需要客户端传递上行event帧去关闭连接。event类型如下: + +| | | | | \ +|Event code |含义 |事件类型 |应用阶段:上行/下行 | +|---|---|---|---| +| | | | | \ +|2 |FinishConnection,结束连接 |Connect 类 |上行 | +| | | | | \ +|52 |ConnectionFinished 结束连接成功 |Connect 类 |下行 | + +交互示例: +![Image](https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/a9005d7ddd564ad79ad6dda9699a4a65~tplv-goo7wpa0wc-image.image =419x) + +## 2.4 不同类型帧举例说明 + +### SendText + +#### 请求Request + +| | | | || \ +|Byte |Left 4-bit |Right 4-bit |说明 | | +|---|---|---|---|---| +| | | | | | \ +|0 |0001 |0001 |v1 |4-byte header | +| | | | | | \ +|1 |0001 |0000 |Full-client request |with no event number | +| | | | | | \ +|2 |0001 |0000 |JSON |no compression | +| | | | | | \ +|3 |0000 |0000 | | | +| | || || \ +|4 ~ 7 
|uint32(...) | |len(payload_json) | |
+| | || || \
+|8 ~ ... |\
+| |{...} |\
+| | | |文本 |\
+| | | | | |
+
+payload
+```JSON
+{
+    "user": {
+        "uid": "12345"
+    },
+    "req_params": {
+        "text": "明朝开国皇帝朱元璋也称这本书为,万物之根",
+        "speaker": "zh_female_shuangkuaisisi_moon_bigtts",
+        "audio_params": {
+            "format": "mp3",
+            "sample_rate": 24000
+        }
+    }
+}
+```
+
+
+#### 响应Response
+
+##### TTSSentenceStart
+
+| | | | || \
+|Byte |Left 4-bit |Right 4-bit |说明 | |
+|---|---|---|---|---|
+| | | | | | \
+|0 |0001 |0001 |v1 |4-byte header |
+| | | | | | \
+|1 |1001 |0100 |Full-client request |with event number |
+| | | | | | \
+|2 |0001 |0000 |JSON |no compression |
+| | | | | | \
+|3 |0000 |0000 | | |
+| | || || \
+|4 ~ 7 |TTSSentenceStart | |event type | |
+| | || || \
+|8 ~ 11 |uint32(12) | |len() | |
+| | || || \
+|12 ~ 23 |nxckjoejnkegf | |session_id | |
+| | || || \
+|24 ~ 27 |uint32( ...) | |len(text_binary) | |
+| | || || \
+|28 ~ ... |\
+| |{...} | |text_binary | |
+
+
+##### TTSResponse
+
+| | | | || \
+|Byte |Left 4-bit |Right 4-bit |说明 | |
+|---|---|---|---|---|
+| | | | | | \
+|0 |0001 |0001 |v1 |4-byte header |
+| | | | | | \
+|1 |1011 |0100 |Audio-only response |with event number |
+| | | | | | \
+|2 |0001 |0000 |JSON |no compression |
+| | | | | | \
+|3 |0000 |0000 | | |
+| | || | | \
+|4 ~ 7 |TTSResponse | |event type | |
+| | || | | \
+|8 ~ 11 |uint32(12) | |len() | |
+| | || | | \
+|12 ~ 23 |nxckjoejnkegf | |session_id | |
+| | || | | \
+|24 ~ 27 |uint32( ...) | |len(audio_binary) | |
+| | || | | \
+|28 ~ ...
|{...} |\ +| | | |audio_binary |\ +| | | | | | + + +##### TTSSentenceEnd + +| | | | || \ +|Byte |Left 4-bit |Right 4-bit |说明 | | +|---|---|---|---|---| +| | | | | | \ +|0 |0001 |0001 |v1 |4-byte header | +| | | | | | \ +|1 |1001 |0100 |Full-client request |with event number | +| | | | | | \ +|2 |0001 |0000 |JSON |no compression | +| | | | | | \ +|3 |0000 |0000 | | | +| | || || \ +|4 ~ 7 |TTSSentenceEnd | |event type | | +| | || || \ +|8 ~ 11 |uint32(12) | |len() | | +| | || || \ +|12 ~ 23 |nxckjoejnkegf | |session_id | | +| | || || \ +|24 ~ 27 |uint32( ...) | |len(payload) | | +| | || || \ +|28 ~ ... |{...} |\ +| | | |payload |\ +| | | | | | + + +##### SessionFinished + +| | | | || \ +|Byte |Left 4-bit |Right 4-bit |说明 | | +|---|---|---|---|---| +| | | | | | \ +|0 |0001 |0001 |v1 |4-byte header | +| | | | | | \ +|1 |1001 |0100 |Full-client request |with event number | +| | | | | | \ +|2 |0001 |0000 |JSON |no compression | +| | | | | | \ +|3 |0000 |0000 | | | +| | || | | \ +|4 ~ 7 |SessionFinished | |event type | | +| | || || \ +|8 ~ 11 |uint32(12) | |len() | | +| | || || \ +|12 ~ 23 |nxckjoejnkegf | |session_id | | +| | || || \ +|24 ~ 27 |uint32( ...) | |len(response_meta_json) | | +| | || || \ +|28 ~ ... |{ |\ +| | "status_code": 20000000, |\ +| | "message": "ok", |\ +| |"usage": { |\ +| | "text_words":4 |\ +| | } |\ +| |} |\ +| | | |response_meta_json |\ +| | | | |\ +| | | |* 仅含status_code和message字段 |\ +| | | |* usage仅当header中携带X-Control-Require-Usage-Tokens-Return存在 | | + + +#### FinishConnection + +##### 请求request + +| | | | || \ +|Byte |Left 4-bit |Right 4-bit |说明 | | +|---|---|---|---|---| +| | | | | | \ +|0 |0001 |0001 |v1 |4-byte header | +| | | | | | \ +|1 |0001 |0100 |Full-client request |with event number | +| | | | | | \ +|2 |0001 |0000 |JSON |no compression | +| | | | | | \ +|3 |0000 |0000 | | | +| | || || \ +|4-7 |uint32(...) | |len(payload_json) | | +| | || || \ +|8 ~ ... 
|\ +| |{...} |\ +| | | |payload_json |\ +| | | |扩展保留,暂留空JSON | | + + +##### 响应response + +| | | | || \ +|Byte |Left 4-bit |Right 4-bit |说明 | | +|---|---|---|---|---| +| | | | | | \ +|0 |0001 |0001 |v1 |4-byte header | +| | | | | | \ +|1 |1001 |0100 |Full-client request |with event number | +| | | | | | \ +|2 |0001 |0000 |JSON |no compression | +| | | | | | \ +|3 |0000 |0000 | | | +| | || || \ +|4 ~ 7 |ConnectionFinished | |event type | | +| | || || \ +|8 ~ 11 |uint32(7) | |len() | | +| | || || \ +|12 ~ 15 |uint32(58) | |len() | | +| | || || \ +|28 ~ ... |{ |\ +| | "status_code": 20000000, |\ +| | "message": "ok" |\ +| |} | |response_meta_json |\ +| | | | |\ +| | | |* 仅含status_code和message字段 |\ +| | | | |\ +| | | | | | + + +## 2.5 时间戳句子格式说明 + +| | | | \ +| |\ +| |\ +|# |**TTS1.0** |\ +| |**ICL1.0** |**TTS2.0** |\ +| | |**ICL2.0** | +|---|---|---| +| | | | \ +|事件交互区别 |合成有多个子句会多次返回`TTSSentenceStart`和`TTSSentenceEnd`。开启字幕后字幕跟随`TTSSentenceEnd`返回。 |合成有多个子句,仅返回一次`TTSSentenceStart`和`TTSSentenceEnd`。 |\ +| | |开启字幕后会多次返回`TTSSubtitle`。 | +| | | | \ +|返回时机 |一个子句的时间戳返回之后才会开始返回下一句音频。 |\ +| | |在一句音频合成之后,不会立即返回该句的字幕。 |\ +| | |合成进度不会被字幕识别阻塞,当一句的字幕识别完成后立即返回。 |\ +| | |可能一个子句的字幕返回的时候,已经返回下一句的音频帧给调用方了。 | +| | | | \ +|句子返回格式 |\ +| |字幕信息是基于tn打轴 |\ +| |:::tip |\ +| |1. text字段对应于:原文 |\ +| |2. 
words内文本字段对应于:tn |\ +| |::: |\ +| |第一句: |\ +| |```JSON |\ +| |{ |\ +| | "phonemes": [ |\ +| | ], |\ +| | "text": "2019年1月8日,软件2.0版本于格萨拉彝族乡应时而生。发布会当日,一场瑞雪将天地映衬得纯净无瑕。", |\ +| | "words": [ |\ +| | { |\ +| | "confidence": 0.8766515, |\ +| | "endTime": 0.295, |\ +| | "startTime": 0.155, |\ +| | "word": "二" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.95224416, |\ +| | "endTime": 0.425, |\ +| | "startTime": 0.295, |\ +| | "word": "零" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.9108828, |\ +| | "endTime": 0.575, |\ +| | "startTime": 0.425, |\ +| | "word": "一" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.9609025, |\ +| | "endTime": 0.755, |\ +| | "startTime": 0.575, |\ +| | "word": "九" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.96244556, |\ +| | "endTime": 1.005, |\ +| | "startTime": 0.755, |\ +| | "word": "年" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.85796577, |\ +| | "endTime": 1.155, |\ +| | "startTime": 1.005, |\ +| | "word": "一" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.8460129, |\ +| | "endTime": 1.275, |\ +| | "startTime": 1.155, |\ +| | "word": "月" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.90833753, |\ +| | "endTime": 1.505, |\ +| | "startTime": 1.275, |\ +| | "word": "八" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.9403977, |\ +| | "endTime": 1.935, |\ +| | "startTime": 1.505, |\ +| | "word": "日," |\ +| | }, |\ +| | |\ +| | ... 
|\ +| | |\ +| | { |\ +| | "confidence": 0.9415791, |\ +| | "endTime": 10.505, |\ +| | "startTime": 10.355, |\ +| | "word": "无" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.903162, |\ +| | "endTime": 10.895, // 第一句结束时间 |\ +| | "startTime": 10.505, |\ +| | "word": "瑕。" |\ +| | } |\ +| | ] |\ +| |} |\ +| |``` |\ +| | |\ +| |第二句: |\ +| |```JSON |\ +| |{ |\ +| | "phonemes": [ |\ +| | |\ +| | ], |\ +| | "text": "这仿佛一则自然寓言:我们致力于在不断的版本迭代中,为您带来如雪后初霁般清晰、焕然一新的体验。", |\ +| | "words": [ |\ +| | { |\ +| | "confidence": 0.8970245, |\ +| | "endTime": 11.6953745, |\ +| | "startTime": 11.535375, // 第二句开始时间,是相对整个session的位置 |\ +| | "word": "这" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.86508185, |\ +| | "endTime": 11.875375, |\ +| | "startTime": 11.6953745, |\ +| | "word": "仿" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.73354065, |\ +| | "endTime": 12.095375, |\ +| | "startTime": 11.875375, |\ +| | "word": "佛" |\ +| | }, |\ +| | { |\ +| | "confidence": 0.8525295, |\ +| | "endTime": 12.275374, |\ +| | "startTime": 12.095375, |\ +| | "word": "一" |\ +| | }... |\ +| | ] |\ +| |} |\ +| |``` |\ +| | |字幕信息是基于原文打轴 |\ +| | |:::tip |\ +| | |1. text字段对应于:原文 |\ +| | |2. 
words内文本字段对应于:原文 |\ +| | |::: |\ +| | |第一句: |\ +| | |```JSON |\ +| | |{ |\ +| | | "phonemes": [ |\ +| | | ], |\ +| | | "text": "2019年1月8日,软件2.0版本于格萨拉彝族乡应时而生。", |\ +| | | "words": [ |\ +| | | { |\ +| | | "confidence": 0.11120544, |\ +| | | "endTime": 0.615, |\ +| | | "startTime": 0.585, |\ +| | | "word": "2019" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.8413397, |\ +| | | "endTime": 0.845, |\ +| | | "startTime": 0.615, |\ +| | | "word": "年" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.2413961, |\ +| | | "endTime": 0.875, |\ +| | | "startTime": 0.845, |\ +| | | "word": "1" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.8487973, |\ +| | | "endTime": 1.055, |\ +| | | "startTime": 0.875, |\ +| | | "word": "月" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.509697, |\ +| | | "endTime": 1.225, |\ +| | | "startTime": 1.165, |\ +| | | "word": "8" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.9516253, |\ +| | | "endTime": 1.485, |\ +| | | "startTime": 1.225, |\ +| | | "word": "日," |\ +| | | }, |\ +| | | |\ +| | | ... |\ +| | | |\ +| | | { |\ +| | | "confidence": 0.6933777, |\ +| | | "endTime": 5.435, |\ +| | | "startTime": 5.325, |\ +| | | "word": "而" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.921702, |\ +| | | "endTime": 5.695, // 第一句结束时间 |\ +| | | "startTime": 5.435, |\ +| | | "word": "生。" |\ +| | | } |\ +| | | ] |\ +| | |} |\ +| | |``` |\ +| | | |\ +| | | |\ +| | |第二句: |\ +| | |```JSON |\ +| | |{ |\ +| | | "phonemes": [ |\ +| | | |\ +| | | ], |\ +| | | "text": "发布会当日,一场瑞雪将天地映衬得纯净无瑕。", |\ +| | | "words": [ |\ +| | | { |\ +| | | "confidence": 0.7016578, |\ +| | | "endTime": 6.3550415, |\ +| | | "startTime": 6.2150416, // 第二句开始时间,是相对整个session的位置 |\ +| | | "word": "发" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.6800497, |\ +| | | "endTime": 6.4450417, |\ +| | | "startTime": 6.3550415, |\ +| | | "word": "布" |\ +| | | }, |\ +| | | |\ +| | | ... 
|\ +| | | |\ +| | | { |\ +| | | "confidence": 0.8818264, |\ +| | | "endTime": 10.145041, |\ +| | | "startTime": 9.945042, |\ +| | | "word": "净" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.87248623, |\ +| | | "endTime": 10.285042, |\ +| | | "startTime": 10.145041, |\ +| | | "word": "无" |\ +| | | }, |\ +| | | { |\ +| | | "confidence": 0.8069703, |\ +| | | "endTime": 10.505041, |\ +| | | "startTime": 10.285042, |\ +| | | "word": "瑕。" |\ +| | | } |\ +| | | ] |\ +| | |} |\ +| | |``` |\ +| | | |\ +| | | | +| | | | \ +|语种 |中、英,不支持小语种、方言 |中、英,不支持小语种、方言 | +| | | | \ +|latex |enable_latex_tn=true,有字幕返回 |enable_latex_tn=true,无字幕返回,接口不报错 | +| | | | \ +|ssml |req_params.ssml不为空,有字幕返回 |req_params.ssml不为空,无字幕返回,接口不报错 | + + +# 3 错误码 + +| | | | \ +|Code |Message |说明 | +|---|---|---| +| | | | \ +|20000000 |ok |音频合成结束的成功状态码 | +| | | | \ +|45000000 |\ +| |speaker permission denied: get resource id: access denied |音色鉴权失败,一般是speaker指定音色未授权或者错误导致 |\ +| | | | +|^^| | | \ +| |quota exceeded for types: concurrency |并发限流,一般是请求并发数超过限制 | +| | | | \ +|55000000 |服务端一些error |服务端通用错误 | + + +# 4 调用示例 + +```mixin-react +return ( + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### Python环境 + +* Python:3.9版本及以上。 +* Pip:25.1.1版本及以上。您可以使用下面命令安装。 + +\`\`\`Bash +python3 -m pip install --upgrade pip +\`\`\` + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash 
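+# 提示:以下命令为示例,假设示例包 volcengine_unidirectional_stream_demo.tar.gz 已下载至当前目录
+# 注意:source .venv/bin/activate 仅在当前 shell 会话内生效,新开终端需重新执行激活命令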
+mkdir -p volcengine_unidirectional_stream_demo +tar xvzf volcengine_unidirectional_stream_demo.tar.gz -C ./volcengine_unidirectional_stream_demo +cd volcengine_unidirectional_stream_demo +python3 -m venv .venv +source .venv/bin/activate +python3 -m pip install --upgrade pip +pip3 install -e . +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +python3 examples/volcengine/unidirectional_stream.py --appid --access_token --voice_type --text "你好,我是火山引擎的语音合成服务。这是一个美好的旅程。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### Java环境 + +* Java:21版本及以上。 +* Maven:3.9.10版本及以上。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_unidirectional_stream_demo +tar xvzf volcengine_unidirectional_stream_demo.tar.gz -C ./volcengine_unidirectional_stream_demo +cd volcengine_unidirectional_stream_demo +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +mvn compile exec:java -Dexec.mainClass=com.speech.volcengine.UnidirectionalStream -DappId= -DaccessToken= -Dvoice= -Dtext="**你好**,我是豆包语音助手,很高兴认识你。这是一个愉快的旅程。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 
[控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### Go环境 + +* Go:1.21.0版本及以上。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_unidirectional_stream_demo +tar xvzf volcengine_unidirectional_stream_demo.tar.gz -C ./volcengine_unidirectional_stream_demo +cd volcengine_unidirectional_stream_demo +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +go run volcengine/unidirectional_stream/main.go --appid --access_token --voice_type --text "**你好**,我是火山引擎的语音合成服务。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### C#环境 + +* .Net 9.0版本。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p 
volcengine_unidirectional_stream_demo +tar xvzf volcengine_unidirectional_stream_demo.tar.gz -C ./volcengine_unidirectional_stream_demo +cd volcengine_unidirectional_stream_demo +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +dotnet run --project Volcengine/UnidirectionalStream/Volcengine.Speech.UnidirectionalStream.csproj -- --appid --access_token --voice_type --text "**你好**,这是一个测试文本。我们正在测试文本转语音功能。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### node环境 + +* node:v24.0版本及以上。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_unidirectional_stream_demo +tar xvzf volcengine_unidirectional_stream_demo.tar.gz -C ./volcengine_unidirectional_stream_demo +cd volcengine_unidirectional_stream_demo +npm install +npm install -g typescript +npm install -g ts-node +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`\`。 + +\`\`\`Bash +npx ts-node src/volcengine/unidirectional_stream.ts --appid --access_token --voice_type --text "**你好**,我是火山引擎的语音合成服务。" +\`\`\` + +`}>); + ``` + + diff --git a/API相关/语音合成大模型音色列表.md b/API相关/语音合成大模型音色列表.md new file mode 100644 index 0000000..e69de29 diff --git a/API相关/豆包大模型语音合成API.md b/API相关/豆包大模型语音合成API.md new file mode 100644 index 
0000000..3797797 --- /dev/null +++ b/API相关/豆包大模型语音合成API.md @@ -0,0 +1,627 @@ + + +# Websocket +> 使用账号申请部分申请到的 appid&access_token 进行调用 +> 文本一次性送入,后端边合成边返回音频数据 + + +## 1. 接口说明 +> V1: +> **wss://openspeech.bytedance.com/api/v1/tts/ws_binary (V1 单向流式)** +> **https://openspeech.bytedance.com/api/v1/tts (V1 http非流式)** +> V3: +> **wss://openspeech.bytedance.com/api/v3/tts/unidirectional/stream (V3 wss单向流式)** +> [V3 websocket单向流式文档](https://www.volcengine.com/docs/6561/1719100) +> **wss://openspeech.bytedance.com/api/v3/tts/bidirection (V3 wss双向流式)** +> [V3 websocket双向流式文档](https://www.volcengine.com/docs/6561/1329505) +> **https://openspeech.bytedance.com/api/v3/tts/unidirectional (V3 http单向流式)** +> [V3 http单向流式文档](https://www.volcengine.com/docs/6561/1598757) + +:::warning +大模型音色都推荐接入V3接口,时延上的表现会更好 +::: + +## 2. 身份认证 +认证方式使用 Bearer Token,在请求的 header 中加上`"Authorization": "Bearer; {token}"`,并在请求的 json 中填入对应的 appid。 +:::warning +Bearer 和 token 使用分号 ; 分隔,替换时请勿保留{} +::: +AppID/Token/Cluster 等信息可参考 [控制台使用FAQ-Q1](/docs/6561/196768#q1:哪里可以获取到以下参数appid,cluster,token,authorization-type,secret-key-?) + +## 3. 请求方式 + +### 3.1 二进制协议 + +#### 报文格式(Message format) +![Image](https://lf3-volc-editor.volccdn.com/obj/volcfe/sop-public/upload_cc1c1cdd61bf29f5bde066dc693dcb2b.png =1816x) +所有字段以 [Big Endian(大端序)](https://zh.wikipedia.org/wiki/%E5%AD%97%E8%8A%82%E5%BA%8F#%E5%A4%A7%E7%AB%AF%E5%BA%8F) 的方式存储。 +**字段描述** + +| | | | \ +|字段 Field (大小, 单位 bit) |描述 Description |值 Values | +|---|---|---| +| | | | \ +|协议版本(Protocol version) (4) |可能会在将来使用不同的协议版本,所以这个字段是为了让客户端和服务器在版本上保持一致。 |`0b0001` - 版本 1 (目前只有版本 1) | +| | | | \ +|报头大小(Header size) (4) |header 实际大小是 `header size value x 4` bytes. 
|\ +| |这里有个特殊值 `0b1111` 表示 header 大小大于或等于 60(15 x 4 bytes),也就是会存在 header extension 字段。 |`0b0001` - 报头大小 = 4 (1 x 4) |\ +| | |`0b0010` - 报头大小 = 8 (2 x 4) |\ +| | |`0b1010` - 报头大小 = 40 (10 x 4) |\ +| | |`0b1110` - 报头大小 = 56 (14 x 4) |\ +| | |`0b1111` - 报头大小为 60 或更大; 实际大小在 header extension 中定义 | +| | | | \ +|消息类型(Message type) (4) |定义消息类型。 |`0b0001` - full client request. |\ +| | |`~~0b1001~~` ~~- full server response(弃用).~~ |\ +| | |`0b1011` - Audio-only server response (ACK). |\ +| | |`0b1111` - Error message from server (例如错误的消息类型,不支持的序列化方法等等) | +| | | | \ +|Message type specific flags (4) |flags 含义取决于消息类型。 |\ +| |具体内容请看消息类型小节. | | +| | | | \ +|序列化方法(Message serialization method) (4) |定义序列化 payload 的方法。 |\ +| |注意:它只对某些特定的消息类型有意义 (例如 Audio-only server response `0b1011` 就不需要序列化). |`0b0000` - 无序列化 (raw bytes) |\ +| | |`0b0001` - JSON |\ +| | |`0b1111` - 自定义类型, 在 header extension 中定义 | +| | | | \ +|压缩方法(Message Compression) (4) |定义 payload 的压缩方法。 |\ +| |Payload size 字段不压缩(如果有的话,取决于消息类型),而且 Payload size 指的是 payload 压缩后的大小。 |\ +| |Header 不压缩。 |`0b0000` - 无压缩 |\ +| | |`0b0001` - gzip |\ +| | |`0b1111` - 自定义压缩方法, 在 header extension 中定义 | +| | | | \ +|保留字段(Reserved) (8) |保留字段,同时作为边界 (使整个报头大小为 4 个字节). |`0x00` - 目前只有 0 | + + +#### 消息类型详细说明 +目前所有 TTS websocket 请求都使用 full client request 格式,无论"query"还是"submit"。 + +#### Full client request + +* Header size为`b0001`(即 4B,没有 header extension)。 +* Message type为`b0001`. +* Message type specific flags 固定为`b0000`. +* Message serialization method为`b0001`JSON。字段参考上方表格。 +* 如果使用 gzip 压缩 payload,则 payload size 为压缩后的大小。 + + +#### Audio-only server response + +* Header size 应该为`b0001`. +* Message type为`b1011`. +* Message type specific flags 可能的值有: + * `b0000` - 没有 sequence number. + * `b0001` - sequence number > 0. + * `b0010`or`b0011` - sequence number < 0,表示来自服务器的最后一条消息,此时客户端应合并所有音频片段(如果有多条)。 +* Message serialization method为`b0000`(raw bytes). 
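上述二进制报文格式可以用下面的示意代码打包(仅为演示用草图,非官方实现;假设 payload 使用 JSON 序列化并经 gzip 压缩,函数名 `pack_full_client_request` 为自拟):

```python
import gzip
import json
import struct

def pack_full_client_request(payload: dict) -> bytes:
    """按上文描述的二进制协议打包一条 full client request(示意实现)。"""
    header = bytes([
        (0b0001 << 4) | 0b0001,  # 协议版本 1,报头大小 1 x 4 = 4 字节
        (0b0001 << 4) | 0b0000,  # 消息类型 full client request,flags 固定 0b0000
        (0b0001 << 4) | 0b0001,  # 序列化方法 JSON,压缩方法 gzip
        0x00,                    # 保留字段
    ])
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    # payload size 为压缩后的大小,以 4 字节大端序写入,随后是压缩后的 payload
    return header + struct.pack(">I", len(body)) + body

frame = pack_full_client_request({"request": {"text": "字节跳动语音合成"}})
# 首字节 0x11:高 4 位为协议版本 1,低 4 位为报头大小 1
assert frame[0] == 0x11
```

解析服务端返回的 Audio-only server response 时思路类似:先读取前 4 字节报头判断消息类型与 flags,再按大端序读取 payload size,取出对应长度的音频数据。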
+ + +## 4.注意事项 + +* 每次合成时reqid这个参数需要重新设置,且要保证唯一性(建议使用uuid.V4生成) +* websocket demo中单条链接仅支持单次合成,若需要合成多次,需自行实现。每次创建websocket连接后,按顺序串行发送每一包。一次合成结束后,可以发送新的合成请求。 +* operation需要设置为submit才是流式返回 +* 在 websocket 握手成功后,会返回这些 Response header +* 不支持["豆包语音合成模型2.0"的音色](https://www.volcengine.com/docs/6561/1257544),比如:"zh_female_vv_uranus_bigtts",如需使用推荐使用v3 接口 + + +| | | | \ +|Key |说明 |Value 示例 | +|---|---|---| +| | | | \ +|X-Tt-Logid |服务端返回的 logid,建议用户获取和打印方便定位问题 |202407261553070FACFE6D19421815D605 | + + +## 5.调用示例 + +```mixin-react +return ( + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### Python环境 + +* Python:3.9版本及以上。 +* Pip:25.1.1版本及以上。您可以使用下面命令安装。 + +\`\`\`Bash +python3 -m pip install --upgrade pip +\`\`\` + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_binary_demo +tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo +cd volcengine_binary_demo +python3 -m venv .venv +source .venv/bin/activate +python3 -m pip install --upgrade pip +pip3 install -e . 
+\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +python3 examples/volcengine/binary.py --appid --access_token --voice_type --text "你好,我是火山引擎的语音合成服务。这是一个美好的旅程。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### Java环境 + +* Java:21版本及以上。 +* Maven:3.9.10版本及以上。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_binary_demo +tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo +cd volcengine_binary_demo +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +mvn compile exec:java -Dexec.mainClass=com.speech.volcengine.Binary -DappId= -DaccessToken= -Dvoice= -Dtext="**你好**,我是豆包语音助手,很高兴认识你。这是一个愉快的旅程。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 
[控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### Go环境 + +* Go:1.21.0版本及以上。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_binary_demo +tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo +cd volcengine_binary_demo +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +go run volcengine/binary/main.go --appid --access_token --voice_type --text "**你好**,我是火山引擎的语音合成服务。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### C#环境 + +* .Net 9.0版本。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_binary_demo +tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo +cd volcengine_binary_demo +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`zh_female_cancan_mars_bigtts\`。 + +\`\`\`Bash +dotnet run --project Volcengine/Binary/Volcengine.Speech.Binary.csproj -- --appid --access_token --voice_type --text 
"**你好**,这是一个测试文本。我们正在测试文本转语音功能。" +\`\`\` + +`}> + +### 前提条件 + +* 调用之前,您需要获取以下信息: + * \`\`:使用控制台获取的APP ID,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:使用控制台获取的Access Token,可参考 [控制台使用FAQ-Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F)。 + * \`\`:您预期使用的音色ID,可参考 [大模型音色列表](https://www.volcengine.com/docs/6561/1257544)。 + + +### node环境 + +* node:v24.0版本及以上。 + + +### 下载代码示例 + + +### 解压缩代码包,安装依赖 +\`\`\`Bash +mkdir -p volcengine_binary_demo +tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo +cd volcengine_binary_demo +npm install +npm install -g typescript +npm install -g ts-node +\`\`\` + + +### 发起调用 +> \`\`替换为您的APP ID。 +> \`\`替换为您的Access Token。 +> \`\`替换为您预期使用的音色ID,例如\`\`。 + +\`\`\`Bash +npx ts-node src/volcengine/binary.ts --appid --access_token --voice_type --text "**你好**,我是火山引擎的语音合成服务。" +\`\`\` + +`}>); + ``` + + +# HTTP +> 使用账号申请部分申请到的 appid&access_token 进行调用 +> 文本全部合成完毕之后,一次性返回全部的音频数据 + + +## 1. 接口说明 +接口地址为 **https://openspeech.bytedance.com/api/v1/tts** + +## 2. 身份认证 +认证方式采用 Bearer Token. +1)需要在请求的 Header 中填入"Authorization":"Bearer;${token}" +:::warning +Bearer 和 token 使用分号 ; 分隔,替换时请勿保留${} +::: +AppID/Token/Cluster 等信息可参考 [控制台使用FAQ-Q1](/docs/6561/196768#q1:哪里可以获取到以下参数appid,cluster,token,authorization-type,secret-key-?) + +## 3. 
注意事项 + +* 使用 HTTP Post 方式进行请求,返回的结果为 JSON 格式,需要进行解析 +* 因 json 格式无法直接携带二进制音频,音频经 base64 编码。使用 base64 解码后,即为二进制音频 +* 每次合成时 reqid 这个参数需要重新设置,且要保证唯一性(建议使用 UUID/GUID 等生成) +* 不支持["豆包语音合成模型2.0"的音色](https://www.volcengine.com/docs/6561/1257544),比如:"zh_female_vv_uranus_bigtts",如需使用推荐使用v3 接口 + + +# 参数列表 +> Websocket 与 Http 调用参数相同 + + +## 请求参数 + +| | | | | | | \ +|字段 |含义 |层级 |格式 |必需 |备注 | +|---|---|---|---|---|---| +| | | | | | | \ +|app |应用相关配置 |1 |dict |✓ | | +| | | | | | | \ +|appid |应用标识 |2 |string |✓ |需要申请 | +| | | | | | | \ +|token |应用令牌 |2 |string |✓ |无实际鉴权作用的Fake token,可传任意非空字符串 | +| | | | | | | \ +|cluster |业务集群 |2 |string |✓ |volcano_tts | +| | | | | | | \ +|user |用户相关配置 |1 |dict |✓ | | +| | | | | | | \ +|uid |用户标识 |2 |string |✓ |可传任意非空字符串,传入值可以通过服务端日志追溯 | +| | | | | | | \ +|audio |音频相关配置 |1 |dict |✓ | | +| | | | | | | \ +|voice_type |音色类型 |2 |string |✓ | | +| | | | | | | \ +|emotion |音色情感 |2 |string | |设置音色的情感。示例:"emotion": "angry" |\ +| | | | | |注:当前仅部分音色支持设置情感,且不同音色支持的情感范围存在不同。 |\ +| | | | | |详见:[大模型语音合成API-音色列表-多情感音色](https://www.volcengine.com/docs/6561/1257544) | +| | | | | | | \ +|enable_emotion |开启音色情感 |2 |bool | |是否可以设置音色情感,需将enable_emotion设为true |\ +| | | | | |示例:"enable_emotion": True | +| | | | | | | \ +|emotion_scale |情绪值设置 |2 |float | |调用emotion设置情感参数后可使用emotion_scale进一步设置情绪值,范围1~5,不设置时默认值为4。 |\ +| | | | | |注:理论上情绪值越大,情感越明显。但情绪值1~5实际为非线性增长,可能存在超过某个值后,情绪增加不明显,例如设置3和5时情绪值可能接近。 | +| | | | | | | \ +|encoding |音频编码格式 |2 |string | |wav / pcm / ogg_opus / mp3,默认为 pcm |\ +| | | | | |注意:wav 不支持流式 | +| | | | | | | \ +|speed_ratio |语速 |2 |float | |[0.1,2],默认为 1,通常保留一位小数即可 | +| | | | | | | \ +|rate |音频采样率 |2 |int | |默认为 24000,可选8000,16000 | +| | | | | | | \ +|bitrate |比特率 |2 |int | |单位 kb/s,默认160 kb/s |\ +| | | | | |**注:** |\ +| | | | | |bitrate只针对MP3格式,wav计算比特率跟pcm一样是 比特率 (bps) = 采样率 × 位深度 × 声道数 |\ +| | | | | |目前大模型TTS只能改采样率,所以对于wav格式来说只能通过改采样率来变更音频的比特率 | +| | | | | | | \ +|explicit_language |明确语种 |2 |string | |仅读指定语种的文本 |\ +| | | | | |精品音色和 ICL 声音复刻场景: |\ +| | | 
| | | |\ +| | | | |* 不给定参数,正常中英混 |\ +| | | | |* `crosslingual` 启用多语种前端(包含`zh/en/ja/es-mx/id/pt-br`) |\ +| | | | |* `zh-cn` 中文为主,支持中英混 |\ +| | | | |* `en` 仅英文 |\ +| | | | |* `ja` 仅日文 |\ +| | | | |* `es-mx` 仅墨西 |\ +| | | | |* `id` 仅印尼 |\ +| | | | |* `pt-br` 仅巴葡 |\ +| | | | | |\ +| | | | |DIT 声音复刻场景: |\ +| | | | |当音色是使用model_type=2训练的,即采用dit标准版效果时,建议指定明确语种,目前支持: |\ +| | | | | |\ +| | | | |* 不给定参数,启用多语种前端`zh,en,ja,es-mx,id,pt-br,de,fr` |\ +| | | | |* `zh,en,ja,es-mx,id,pt-br,de,fr` 启用多语种前端 |\ +| | | | |* `zh-cn` 中文为主,支持中英混 |\ +| | | | |* `en` 仅英文 |\ +| | | | |* `ja` 仅日文 |\ +| | | | |* `es-mx` 仅墨西 |\ +| | | | |* `id` 仅印尼 |\ +| | | | |* `pt-br` 仅巴葡 |\ +| | | | |* `de` 仅德语 |\ +| | | | |* `fr` 仅法语 |\ +| | | | | |\ +| | | | |当音色是使用model_type=3训练的,即采用dit还原版效果时,必须指定明确语种,目前支持: |\ +| | | | | |\ +| | | | |* 不给定参数,正常中英混 |\ +| | | | |* `zh-cn` 中文为主,支持中英混 |\ +| | | | |* `en` 仅英文 | +| | | | | | | \ +|context_language |参考语种 |2 |string | |给模型提供参考的语种 |\ +| | | | | | |\ +| | | | | |* 不给定 西欧语种采用英语 |\ +| | | | | |* id 西欧语种采用印尼 |\ +| | | | | |* es 西欧语种采用墨西 |\ +| | | | | |* pt 西欧语种采用巴葡 | +| | | | | | | \ +|loudness_ratio |音量调节 |2 |float | |[0.5,2],默认为1,通常保留一位小数即可。0.5代表原音量0.5倍,2代表原音量2倍 | +| | | | | | | \ +|request |请求相关配置 |1 |dict |✓ | | +| | | | | | | \ +|reqid |请求标识 |2 |string |✓ |需要保证每次调用传入值唯一,建议使用 UUID | +| | | | | | | \ +|text |文本 |2 |string |✓ |合成语音的文本,长度限制 1024 字节(UTF-8 编码)建议小于300字符,超出容易增加badcase出现概率或报错 | +| | | | | | | \ +|model |模型版本 |\ +| | |2 |\ +| | | |string |否 |模型版本,传`seed-tts-1.1`较默认版本音质有提升,并且延时更优,不传为默认效果。 |\ +| | | | | |注:若使用1.1模型效果,在复刻场景中会放大训练音频prompt特质,因此对prompt的要求更高,使用高质量的训练音频,可以获得更优的音质效果。 | +| | | | | | | \ +|text_type |文本类型 |2 |string | |使用 ssml 时需要指定,值为"ssml" | +| | | | | | | \ +|silence_duration |句尾静音 |2 |float | |设置该参数可在句尾增加静音时长,范围0~30000ms。(注:增加的句尾静音主要针对传入文本最后的句尾,而非每句话的句尾)若启用该参数,必须在request下首先设置enable_trailing_silence_audio = true | +| | | | | | | \ +|with_timestamp |时间戳相关 |2 |int |\ +| | | |string | 
|传入1时表示启用,将返回TN后文本的时间戳,例如:2025。根据语义,TN后文本为“两千零二十五”或“二零二五”。 |\ +| | | | | |注:原文本中的多个标点连用或者空格仍会被处理,但不影响时间戳的连贯性(仅限大模型场景使用)。 |\ +| | | | | |附加说明(小模型和大模型时间戳原理差异): |\ +| | | | | |1. 小模型依据前端模型生成时间戳,然后合成音频。在处理时间戳时,TN前后文本进行了映射,所以小模型可返回TN前原文本的时间戳,即保留原文中的阿拉伯数字或者特殊符号等。 |\ +| | | | | |2. 大模型在对传入文本语义理解后合成音频,再针对合成音频进行TN后打轴以输出时间戳。若不采用TN后文本,输出的时间戳将与合成音频无法对齐,所以大模型返回的时间戳对应TN后的文本。 | +| | | | | | | \ +|operation |操作 |2 |string |✓ |query(非流式,http 只能 query) / submit(流式) | +| | | | | | | \ +|extra_param |附加参数 |2 |jsonstring | | | +| | | | | | | \ +|disable_markdown_filter | |3 |bool | |是否开启markdown解析过滤, |\ +| | | | | |为true时,解析并过滤markdown语法,例如,**你好**,会读为“你好”, |\ +| | | | | |为false时,不解析不过滤,例如,**你好**,会读为“星星‘你好’星星” |\ +| | | | | |示例:"disable_markdown_filter": True | +| | | | | | | \ +|enable_latex_tn | |3 |bool | |是否可以播报latex公式,需将disable_markdown_filter设为true |\ +| | | | | |示例:"enable_latex_tn": True | +| | | | | | | \ +|mute_cut_remain_ms |句首静音参数 |3 |string | |该参数需配合mute_cut_threshold参数一起使用,其中: |\ +| | | | | |"mute_cut_threshold": "400", // 静音判断的阈值(音量小于该值时判定为静音) |\ +| | | | | |"mute_cut_remain_ms": "50", // 需要保留的静音长度 |\ +| | | | | |注:参数和value都为string格式 |\ +| | | | | |以python为示例: |\ +| | | | | |```Python |\ +| | | | | |"extra_param":("{\"mute_cut_threshold\":\"400\", \"mute_cut_remain_ms\": \"0\"}") |\ +| | | | | |``` |\ +| | | | | | |\ +| | | | | |特别提醒: |\ +| | | | | | |\ +| | | | | |* 因MP3格式的特殊性,句首始终会存在100ms内的静音无法消除,WAV格式的音频句首静音可全部消除,建议依照自身业务需求综合判断选择 | +| | | | | | | \ +|disable_emoji_filter |emoji不过滤显示 |3 |bool | |开启emoji表情在文本中不过滤显示,默认为False,建议搭配时间戳参数一起使用。 |\ +| | | | | |Python示例:`"extra_param": json.dumps({"disable_emoji_filter": True})` | +| | | | | | | \ +|unsupported_char_ratio_thresh |不支持语种占比阈值 |3 |float | |默认: 0.3,最大值: 1.0 |\ +| | | | | |检测出不支持合成的文本超过设置的比例,则会返回错误。 |\ +| | | | | |Python示例:`"extra_param": json.dumps({"`unsupported_char_ratio_thresh`": 0.3})` | +| | | | | | | \ +|aigc_watermark |是否在合成结尾增加音频节奏标识 |3 |bool | |默认: false |\ +| | | | | |Python示例:`"extra_param": 
json.dumps({"aigc_watermark": True})` | +| | | | | | | \ +|cache_config |缓存相关参数 |3 |dict | |开启缓存,开启后合成相同文本时,服务会直接读取缓存返回上一次合成该文本的音频,可明显加快相同文本的合成速率,缓存数据保留时间1小时。 |\ +| | | | | |(通过缓存返回的数据不会附带时间戳) |\ +| | | | | |Python示例:`"extra_param": json.dumps({"cache_config": {"text_type": 1,"use_cache": True}})` | +| | | | | | | \ +|text_type |缓存相关参数 |4 |int | |和use_cache参数一起使用,需要开启缓存时传1 | +| | | | | | | \ +|use_cache |缓存相关参数 |4 |bool | |和text_type参数一起使用,需要开启缓存时传true | + + + + +备注: + +1. 已支持字级别时间戳能力(ssml文本类型不支持) +2. ssml 能力已支持,详见 [SSML 标记语言--豆包语音-火山引擎 (volcengine.com)](https://www.volcengine.com/docs/6561/1330194) +3. 暂时不支持音高调节 +4. 大模型音色语种支持中英混 +5. 大模型非双向流式已支持latex公式 +6. 在 websocket/http 握手成功后,会返回这些 Response header + + +| | | | \ +|Key |说明 |Value 示例 | +|---|---|---| +| | | | \ +|X-Tt-Logid |服务端返回的 logid,建议用户获取和打印方便定位问题,使用默认格式即可,不要自定义格式 |202407261553070FACFE6D19421815D605 | + +请求示例: +```go +{ + "app": { + "appid": "appid123", + "token": "access_token", + "cluster": "volcano_tts", + }, + "user": { + "uid": "uid123" + }, + "audio": { + "voice_type": "zh_male_M392_conversation_wvae_bigtts", + "encoding": "mp3", + "speed_ratio": 1.0, + }, + "request": { + "reqid": "uuid", + "text": "字节跳动语音合成", + "operation": "query", + } +} +``` + + +## 返回参数 + +| | | | | | \ +|字段 |含义 |层级 |格式 |备注 | +|---|---|---|---|---| +| | | | | | \ +|reqid |请求 ID |1 |string |请求 ID,与传入的参数中 reqid 一致 | +| | | | | | \ +|code |请求状态码 |1 |int |错误码,参考下方说明 | +| | | | | | \ +|message |请求状态信息 |1 |string |错误信息 | +| | | | | | \ +|sequence |音频段序号 |1 |int |负数表示合成完毕 | +| | | | | | \ +|data |合成音频 |1 |string |返回的音频数据,base64 编码 | +| | | | | | \ +|addition |额外信息 |1 |string |额外信息父节点 | +| | | | | | \ +|duration |音频时长 |2 |string |返回音频的长度,单位 ms | + +响应示例 +```go +{ + "reqid": "reqid", + "code": 3000, + "operation": "query", + "message": "Success", + "sequence": -1, + "data": "base64 encoded binary data", + "addition": { + "duration": "1960", + } +} +``` + + +## 注意事项 + +* websocket 单条链接仅支持单次合成,若需要合成多次,则需要多次建立链接 +* 每次合成时 reqid 
这个参数需要重新设置,且要保证唯一性(建议使用 uuid.V4 生成) +* operation 需要设置为 submit + + +# 返回码说明 + +| | | | | \ +|错误码 |错误描述 |举例 |建议行为 | +|---|---|---|---| +| | | | | \ +|3000 |请求正确 |正常合成 |正常处理 | +| | | | | \ +|3001 |无效的请求 |一些参数的值非法,比如 operation 配置错误 |检查参数 | +| | | | | \ +|3003 |并发超限 |超过在线设置的并发阈值 |重试;使用 sdk 的情况下切换离线 | +| | | | | \ +|3005 |后端服务忙 |后端服务器负载高 |重试;使用 sdk 的情况下切换离线 | +| | | | | \ +|3006 |服务中断 |请求已完成/失败之后,相同 reqid 再次请求 |检查参数 | +| | | | | \ +|3010 |文本长度超限 |单次请求超过设置的文本长度阈值 |检查参数 | +| | | | | \ +|3011 |无效文本 |参数有误或者文本为空、文本与语种不匹配、文本只含标点 |检查参数 | +| | | | | \ +|3030 |处理超时 |单次请求超过服务最长时间限制 |重试或检查文本 | +| | | | | \ +|3031 |处理错误 |后端出现异常 |重试;使用 sdk 的情况下切换离线 | +| | | | | \ +|3032 |等待获取音频超时 |后端网络异常 |重试;使用 sdk 的情况下切换离线 | +| | | | | \ +|3040 |后端链路连接错误 |后端网络异常 |重试 | +| | | | | \ +|3050 |音色不存在 |检查使用的 voice_type 代号 |检查参数 | + + +# 常见错误返回说明 + +1. 错误返回: + "message": "quota exceeded for types: xxxxxxxxx_lifetime" + **错误原因:试用版用量用完了,需要开通正式版才能继续使用** +2. 错误返回: + "message": "quota exceeded for types: concurrency" + **错误原因:并发超过了限定值,需要减少并发调用情况或者增购并发** +3. 错误返回: + "message": "Fail to feed text, reason Init Engine Instance failed" + **错误原因:voice_type / cluster 传递错误** +4. 错误返回: + "message": "illegal input text!" + **错误原因:传入的 text 无效,没有可合成的有效文本。比如全部是标点符号或者 emoji 表情,或者使用中文音色时,传递日语,以此类推。多语种音色,也需要使用 language 指定对应的语种** +5. 错误返回: + "message": "authenticate request: load grant: requested grant not found" + **错误原因:鉴权失败,需要检查 appid&token 的值是否设置正确,同时,鉴权的正确格式为** + **headers["Authorization"] = "Bearer;${token}"** +6. 
错误返回: + "message": "extract request resource id: get resource id: access denied" + **错误原因:语音合成已开通正式版且未拥有当前音色授权,需要在控制台购买该音色才能调用。标注免费的音色除 BV001_streaming 及 BV002_streaming 外,需要在控制台进行下单(支付 0 元)** + + diff --git a/Capybara audio/勇敢的小裁缝_1770727373.mp3 b/Capybara audio/勇敢的小裁缝_1770727373.mp3 new file mode 100644 index 0000000..cd8ffb9 Binary files /dev/null and b/Capybara audio/勇敢的小裁缝_1770727373.mp3 differ diff --git a/Capybara audio/卡皮巴拉的奇幻漂流_1770727390.mp3 b/Capybara audio/卡皮巴拉的奇幻漂流_1770727390.mp3 new file mode 100644 index 0000000..c33e659 Binary files /dev/null and b/Capybara audio/卡皮巴拉的奇幻漂流_1770727390.mp3 differ diff --git a/Capybara audio/小红帽与大灰狼_1770723087.mp3 b/Capybara audio/小红帽与大灰狼_1770723087.mp3 new file mode 100644 index 0000000..6debfbd Binary files /dev/null and b/Capybara audio/小红帽与大灰狼_1770723087.mp3 differ diff --git a/Capybara audio/杰克与魔豆_1770727355.mp3 b/Capybara audio/杰克与魔豆_1770727355.mp3 new file mode 100644 index 0000000..0274f5b Binary files /dev/null and b/Capybara audio/杰克与魔豆_1770727355.mp3 differ diff --git a/Capybara audio/海盗找朋友_1770718270.mp3 b/Capybara audio/海盗找朋友_1770718270.mp3 new file mode 100644 index 0000000..1d56aef Binary files /dev/null and b/Capybara audio/海盗找朋友_1770718270.mp3 differ diff --git a/Capybara audio/糖果屋历险记_1770721395.mp3 b/Capybara audio/糖果屋历险记_1770721395.mp3 new file mode 100644 index 0000000..1152f11 Binary files /dev/null and b/Capybara audio/糖果屋历险记_1770721395.mp3 differ diff --git a/Capybara music/lyrics/书房咔咔茶_1770634690.txt b/Capybara music/lyrics/书房咔咔茶_1770634690.txt new file mode 100644 index 0000000..3baa1ef --- /dev/null +++ b/Capybara music/lyrics/书房咔咔茶_1770634690.txt @@ -0,0 +1,17 @@ +在书房角落 沏上一杯茶 +窗外微风轻拂 摇曳着树梢 +咔咔坐在椅上 沉浸在思考 +书页轻轻翻动 世界变得渺小 +咔咔咔咔 书房里的我 +静享时光 悠然自得 +茶香飘散 心灵得到慰藉 +咔咔咔咔 享受这刻 +阳光透过窗帘 柔和又温暖 +每个字每个句 都是心灵的食粮 +咔咔轻轻点头 感受着文字的力量 +在这安静的角落 找到了自我方向 +咔咔咔咔 书房里的我 +静享时光 悠然自得 +茶香飘散 心灵得到慰藉 +咔咔咔咔 享受这刻 +(茶杯轻放的声音...) 
\ No newline at end of file diff --git a/Capybara music/lyrics/书房咔咔茶_1770637242.txt b/Capybara music/lyrics/书房咔咔茶_1770637242.txt new file mode 100644 index 0000000..c3b5f20 --- /dev/null +++ b/Capybara music/lyrics/书房咔咔茶_1770637242.txt @@ -0,0 +1,17 @@ +在书房角落里,我找到了安静 +一杯茶香飘来,思绪开始飞腾 +书页轻轻翻动,知识在心间 +咔咔我在这里,享受这宁静 +咔咔咔咔,独自享受 +书中的世界,如此美妙 +咔咔咔咔,心无旁骛 +沉浸在知识的海洋,自在飞翔 +窗外微风轻拂,阳光洒满书桌 +咔咔我在这里,与文字共舞 +每个字每个句,都像是音符 +奏出心灵的乐章,如此动听 +咔咔咔咔,独自享受 +书中的世界,如此美妙 +咔咔咔咔,心无旁骛 +沉浸在知识的海洋,自在飞翔 +(翻书声...风铃声...咔咔的呼吸声...) \ No newline at end of file diff --git a/Capybara music/lyrics/夜深了窗外下着小雨盖着被子准备入睡_1770627405.txt b/Capybara music/lyrics/夜深了窗外下着小雨盖着被子准备入睡_1770627405.txt new file mode 100644 index 0000000..7c9342a --- /dev/null +++ b/Capybara music/lyrics/夜深了窗外下着小雨盖着被子准备入睡_1770627405.txt @@ -0,0 +1,8 @@ +[verse] +窗外细雨轻敲窗, +被窝里温暖如常。 +[chorus] +咔咔咔咔,梦乡近了, +小雨伴我入眠床。 +[outro] +(雨声和咔咔的呼吸声...) \ No newline at end of file diff --git a/Capybara music/lyrics/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.txt b/Capybara music/lyrics/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.txt new file mode 100644 index 0000000..d090562 --- /dev/null +++ b/Capybara music/lyrics/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.txt @@ -0,0 +1 @@ +[Inst] \ No newline at end of file diff --git a/Capybara music/lyrics/洗脑咔咔舞_1770631313.txt b/Capybara music/lyrics/洗脑咔咔舞_1770631313.txt new file mode 100644 index 0000000..8b0c054 --- /dev/null +++ b/Capybara music/lyrics/洗脑咔咔舞_1770631313.txt @@ -0,0 +1,20 @@ +咔咔咔咔来跳舞,魔性旋律不停步 +跟着节奏摇摆身,洗脑神曲不放手 +重复的旋律像魔法,让人听了就上瘾 +咔咔咔咔的魔力,谁也挡不住 +洗脑咔咔舞,洗脑咔咔舞 +魔性的旋律,让人停不下来 +洗脑咔咔舞,洗脑咔咔舞 +跟着咔咔一起跳,快乐无边 +每个节拍都精准,咔咔的舞步最迷人 +不管走到哪里去,都能听到这魔音 +咔咔的舞蹈最独特,让人看了就想学 +洗脑神曲的魅力,就是让人忘不掉 +洗脑咔咔舞,洗脑咔咔舞 +魔性的旋律,让人停不下来 +洗脑咔咔舞,洗脑咔咔舞 +跟着咔咔一起跳,快乐无边 +咔咔咔咔,魔性洗脑舞 +重复的节奏,快乐的旋律 +洗脑咔咔舞,洗脑咔咔舞 +让快乐无限循环,直到永远 \ No newline at end of file diff --git a/Capybara music/lyrics/温泉发呆曲_1770628235.txt b/Capybara music/lyrics/温泉发呆曲_1770628235.txt new file mode 100644 index 0000000..deb4c0e --- /dev/null +++ b/Capybara music/lyrics/温泉发呆曲_1770628235.txt @@ -0,0 +1,26 @@ +[verse 1]\n + 懒懒的午后阳光暖,\n + 
温泉里我泡得欢。\n + 水声潺潺耳边响,\n + 什么都不想干。\n + \n + [chorus]\n + 咔咔咔咔,悠然自得,\n + 水波轻摇,心情舒畅。\n + 咔咔咔咔,享受此刻,\n + 懒懒午后,最是惬意。\n + \n + [verse 2]\n + 看着云朵慢慢飘,\n + 心思像水一样柔。\n + 闭上眼,世界都静了,\n + 只有我和这温泉。\n + \n + [chorus]\n + 咔咔咔咔,悠然自得,\n + 水波轻摇,心情舒畅。\n + 咔咔咔咔,享受此刻,\n + 懒懒午后,最是惬意。\n + \n + [outro]\n + (水声渐渐远去...) \ No newline at end of file diff --git a/Capybara music/lyrics/温泉发呆曲_1770630396.txt b/Capybara music/lyrics/温泉发呆曲_1770630396.txt new file mode 100644 index 0000000..931cb55 --- /dev/null +++ b/Capybara music/lyrics/温泉发呆曲_1770630396.txt @@ -0,0 +1,21 @@ +慵懒午后阳光暖,温泉里我发呆 + +水声潺潺耳边响,思绪飘向云外 + +咔咔咔咔,泡在温泉 + +心无杂念,享受此刻安宁 + +什么都不想去做,只想静静享受 + +水波轻抚我的背,世界变得温柔 + +咔咔咔咔,泡在温泉 + +心无杂念,享受此刻安宁 + +(水花声...) + +咔咔的午后,慵懒又自在 + +温泉里的世界,只有我和水声 \ No newline at end of file diff --git a/Capybara music/lyrics/温泉发呆曲_1770630635.txt b/Capybara music/lyrics/温泉发呆曲_1770630635.txt new file mode 100644 index 0000000..2e913e1 --- /dev/null +++ b/Capybara music/lyrics/温泉发呆曲_1770630635.txt @@ -0,0 +1,33 @@ +懒懒的午后阳光暖, + +温泉里我泡得欢。 + +水声潺潺耳边响, + +什么都不想干。 + +咔咔咔咔,发呆好时光, + +懒懒的我,享受这阳光。 + +咔咔咔咔,让思绪飘扬, + +在温泉里,找到我的天堂。 + +想法像泡泡一样浮上来, + +又慢慢沉下去,消失在水里。 + +时间仿佛静止,我自在如鱼, + +在这温暖的怀抱里。 + +咔咔咔咔,发呆好时光, + +懒懒的我,享受这阳光。 + +咔咔咔咔,让思绪飘扬, + +在温泉里,找到我的天堂。 + +(水声渐渐远去...) \ No newline at end of file diff --git a/Capybara music/lyrics/温泉发呆曲_1770639509.txt b/Capybara music/lyrics/温泉发呆曲_1770639509.txt new file mode 100644 index 0000000..f1457e9 --- /dev/null +++ b/Capybara music/lyrics/温泉发呆曲_1770639509.txt @@ -0,0 +1,33 @@ +懒懒的午后阳光暖, + +温泉里我泡得欢。 + +水声潺潺耳边响, + +什么都不想干。 + +咔咔咔咔,发呆真好, + +懒懒的我,享受这秒。 + +水波轻摇,心也飘, + +咔咔世界,别来无恙。 + +想着云卷云又舒, + +温泉里的我多舒服。 + +时间慢慢流,不急不徐, + +咔咔的梦,轻轻浮。 + +咔咔咔咔,发呆真好, + +懒懒的我,享受这秒。 + +水波轻摇,心也飘, + +咔咔世界,别来无恙。 + +(水声渐渐远去...) 
\ No newline at end of file diff --git a/Capybara music/lyrics/温泉里的咔咔_1770730481.txt b/Capybara music/lyrics/温泉里的咔咔_1770730481.txt new file mode 100644 index 0000000..7919ee9 --- /dev/null +++ b/Capybara music/lyrics/温泉里的咔咔_1770730481.txt @@ -0,0 +1,37 @@ +懒懒的午后阳光暖, + +温泉里我泡得欢。 + +水声潺潺耳边响, + +什么都不想干。 + +咔咔咔咔,悠然自得, + +水波荡漾心情悦。 + +咔咔咔咔,闭上眼, + +享受这刻的宁静。 + +想象自己是条鱼, + +在水里自由游来游去。 + +没有烦恼没有压力, + +只有我和这温泉池。 + +咔咔咔咔,悠然自得, + +水波荡漾心情悦。 + +咔咔咔咔,闭上眼, + +享受这刻的宁静。 + +(水花声...) + +咔咔,慵懒午后, + +水中世界最逍遥。 \ No newline at end of file diff --git a/Capybara music/lyrics/草地上的咔咔_1770628910.txt b/Capybara music/lyrics/草地上的咔咔_1770628910.txt new file mode 100644 index 0000000..b365ebd --- /dev/null +++ b/Capybara music/lyrics/草地上的咔咔_1770628910.txt @@ -0,0 +1,26 @@ +[verse 1]\n" + "阳光洒满草地绿\n" + "咔咔奔跑心情舒畅\n" + "风儿轻拂过脸庞\n" + "快乐就像泡泡糖\n" + "\n" + "[chorus]\n" + "咔咔咔咔 快乐无边\n" + "草地上的我自由自在\n" + "阳光下的影子拉得好长\n" + "咔咔咔咔 快乐无边\n" + "\n" + "[verse 2]\n" + "蝴蝶飞舞花儿笑\n" + "咔咔摇摆尾巴摇\n" + "每一步都跳着舞\n" + "生活就像一首歌\n" + "\n" + "[chorus]\n" + "咔咔咔咔 快乐无边\n" + "草地上的我自由自在\n" + "阳光下的影子拉得好长\n" + "咔咔咔咔 快乐无边\n" + "\n" + "[outro]\n" + "(草地上咔咔的笑声...) \ No newline at end of file diff --git a/Capybara music/lyrics/草地上的咔咔_1770629673.txt b/Capybara music/lyrics/草地上的咔咔_1770629673.txt new file mode 100644 index 0000000..76226a8 --- /dev/null +++ b/Capybara music/lyrics/草地上的咔咔_1770629673.txt @@ -0,0 +1,17 @@ +阳光洒满地 草香扑鼻来 +咔咔在草地上 跑得飞快 +风儿轻轻吹 摇曳着花海 +心情像彩虹 七彩斑斓开 +咔咔咔咔 快乐无边 +草地上的我 自由自在 +阳光下的梦 美好无限 +咔咔咔咔 快乐无边 +蝴蝶在飞舞 蜜蜂在歌唱 +咔咔跟着它们 一起欢唱 +天空蓝得像画 没有一丝阴霾 +咔咔的心里 只有满满的爱 +咔咔咔咔 快乐无边 +草地上的我 自由自在 +阳光下的梦 美好无限 +咔咔咔咔 快乐无边 +(草地上咔咔的笑声...) 
\ No newline at end of file diff --git a/Capybara music/lyrics/草地上的咔咔_1770640911.txt b/Capybara music/lyrics/草地上的咔咔_1770640911.txt new file mode 100644 index 0000000..b069c3f --- /dev/null +++ b/Capybara music/lyrics/草地上的咔咔_1770640911.txt @@ -0,0 +1,19 @@ +阳光洒满地 绿草如茵间 +咔咔跑起来 心情像飞燕 +风儿轻拂过 花香满径边 +快乐如此简单 每一步都新鲜 +咔咔咔咔 快乐咔咔 +草地上的我 自由自在 +阳光下的舞 轻松又欢快 +咔咔咔咔 快乐咔咔 +无忧无虑的我 最爱这蓝天 +蝴蝶翩翩起 蜜蜂忙采蜜 +咔咔我最棒 每个瞬间都美丽 +朋友在旁边 笑声传千里 +这世界多美好 有你有我有草地 +咔咔咔咔 快乐咔咔 +草地上的我 自由自在 +阳光下的舞 轻松又欢快 +咔咔咔咔 快乐咔咔 +无忧无虑的我 最爱这蓝天 +(草地上咔咔的笑声...) \ No newline at end of file diff --git a/Capybara music/lyrics/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.txt b/Capybara music/lyrics/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.txt new file mode 100644 index 0000000..0b50271 --- /dev/null +++ b/Capybara music/lyrics/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.txt @@ -0,0 +1,8 @@ +[verse] +阳光洒满草地,我跑得飞快 +心情像彩虹,七彩斑斓真美 +[chorus] +咔咔咔咔,快乐无边 +在阳光下,自由自在 +[outro] +(风吹草低见水豚) \ No newline at end of file diff --git a/Capybara music/lyrics/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.txt b/Capybara music/lyrics/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.txt new file mode 100644 index 0000000..d090562 --- /dev/null +++ b/Capybara music/lyrics/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.txt @@ -0,0 +1 @@ +[Inst] \ No newline at end of file diff --git a/Capybara music/书房咔咔茶_1770634690.mp3 b/Capybara music/书房咔咔茶_1770634690.mp3 new file mode 100644 index 0000000..835700b Binary files /dev/null and b/Capybara music/书房咔咔茶_1770634690.mp3 differ diff --git a/Capybara music/书房咔咔茶_1770637242.mp3 b/Capybara music/书房咔咔茶_1770637242.mp3 new file mode 100644 index 0000000..d5912b7 Binary files /dev/null and b/Capybara music/书房咔咔茶_1770637242.mp3 differ diff --git a/Capybara music/夜深了窗外下着小雨盖着被子准备入睡_1770627405.mp3 b/Capybara music/夜深了窗外下着小雨盖着被子准备入睡_1770627405.mp3 new file mode 100644 index 0000000..38c6ca1 Binary files /dev/null and b/Capybara music/夜深了窗外下着小雨盖着被子准备入睡_1770627405.mp3 differ diff --git a/Capybara music/惊喜咔咔派_1770642290.mp3 b/Capybara music/惊喜咔咔派_1770642290.mp3 new file mode 100644 index 0000000..47b7f4a 
Binary files /dev/null and b/Capybara music/惊喜咔咔派_1770642290.mp3 differ diff --git a/Capybara music/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.mp3 b/Capybara music/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.mp3 new file mode 100644 index 0000000..f2747e2 Binary files /dev/null and b/Capybara music/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.mp3 differ diff --git a/Capybara music/洗脑咔咔舞_1770631313.mp3 b/Capybara music/洗脑咔咔舞_1770631313.mp3 new file mode 100644 index 0000000..8c51742 Binary files /dev/null and b/Capybara music/洗脑咔咔舞_1770631313.mp3 differ diff --git a/Capybara music/温泉发呆曲_1770628235.mp3 b/Capybara music/温泉发呆曲_1770628235.mp3 new file mode 100644 index 0000000..d88634b Binary files /dev/null and b/Capybara music/温泉发呆曲_1770628235.mp3 differ diff --git a/Capybara music/温泉发呆曲_1770630396.mp3 b/Capybara music/温泉发呆曲_1770630396.mp3 new file mode 100644 index 0000000..5023065 Binary files /dev/null and b/Capybara music/温泉发呆曲_1770630396.mp3 differ diff --git a/Capybara music/温泉发呆曲_1770630635.mp3 b/Capybara music/温泉发呆曲_1770630635.mp3 new file mode 100644 index 0000000..a31ecc8 Binary files /dev/null and b/Capybara music/温泉发呆曲_1770630635.mp3 differ diff --git a/Capybara music/温泉发呆曲_1770639509.mp3 b/Capybara music/温泉发呆曲_1770639509.mp3 new file mode 100644 index 0000000..2f77a83 Binary files /dev/null and b/Capybara music/温泉发呆曲_1770639509.mp3 differ diff --git a/Capybara music/温泉里的咔咔_1770730481.mp3 b/Capybara music/温泉里的咔咔_1770730481.mp3 new file mode 100644 index 0000000..d07f22a Binary files /dev/null and b/Capybara music/温泉里的咔咔_1770730481.mp3 differ diff --git a/Capybara music/草地上的咔咔_1770628910.mp3 b/Capybara music/草地上的咔咔_1770628910.mp3 new file mode 100644 index 0000000..ebbde2c Binary files /dev/null and b/Capybara music/草地上的咔咔_1770628910.mp3 differ diff --git a/Capybara music/草地上的咔咔_1770629673.mp3 b/Capybara music/草地上的咔咔_1770629673.mp3 new file mode 100644 index 0000000..44f4ec4 Binary files /dev/null and b/Capybara music/草地上的咔咔_1770629673.mp3 differ diff --git a/Capybara 
music/草地上的咔咔_1770640911.mp3 b/Capybara music/草地上的咔咔_1770640911.mp3 new file mode 100644 index 0000000..2cab36e Binary files /dev/null and b/Capybara music/草地上的咔咔_1770640911.mp3 differ diff --git a/Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.mp3 b/Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.mp3 new file mode 100644 index 0000000..8f48317 Binary files /dev/null and b/Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.mp3 differ diff --git a/Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.mp3 b/Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.mp3 new file mode 100644 index 0000000..b16cbba Binary files /dev/null and b/Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.mp3 differ diff --git a/Capybara stories/海盗找朋友_1770647563.txt b/Capybara stories/海盗找朋友_1770647563.txt new file mode 100644 index 0000000..123215a --- /dev/null +++ b/Capybara stories/海盗找朋友_1770647563.txt @@ -0,0 +1,11 @@ +# 海盗找朋友 + +在蓝色的大海上,有一艘小小的海盗船,船上只有一个小海盗。他戴着歪歪的海盗帽,举着塑料做的小钩子手,每天对着海浪喊:“谁来和我玩呀?” + +这天,小海盗的船被海浪冲到了一座彩虹岛。岛上的沙滩上,躺着一个会发光的贝壳。小海盗刚捡起贝壳,贝壳突然“叮咚”响了一声,跳出一只圆滚滚的小海豚! + +“哇!你是我的宝藏吗?”小海盗举着贝壳问。小海豚摇摇头,用尾巴拍了拍海水:“我带你去找真正的宝藏!”它驮着小海盗游向海底,那里有一个藏着星星的洞穴。 + +洞穴里,小海豚拿出了一个会唱歌的海螺:“这是友谊海螺,对着它喊朋友的名字,就会有惊喜哦!”小海盗对着海螺喊:“我的朋友!”突然,从海螺里钻出一群小螃蟹,举着彩色的小旗子,还有一只会吹泡泡的章鱼! + +原来,小海豚早就听说小海盗很孤单,特意用友谊海螺召集了伙伴们。现在,小海盗的船上每天都飘着笑声,他再也不是孤单的小海盗啦! 
\ No newline at end of file diff --git a/airhub_app/lib/pages/music_creation_page.dart b/airhub_app/lib/pages/music_creation_page.dart index 5145f3b..ac90f91 100644 --- a/airhub_app/lib/pages/music_creation_page.dart +++ b/airhub_app/lib/pages/music_creation_page.dart @@ -417,9 +417,18 @@ class _MusicCreationPageState extends State // Actually play or pause audio try { if (_isPlaying) { + // Show now-playing bubble immediately (before await) + _playStickyText = '正在播放: ${_playlist[_currentTrackIndex].title}'; + setState(() { + _speechText = _playStickyText; + _speechVisible = true; + }); await _audioPlayer.play(); } else { await _audioPlayer.pause(); + // Hide bubble on pause + _playStickyText = null; + setState(() => _speechVisible = false); } } catch (e) { debugPrint('Playback error: $e'); @@ -428,6 +437,7 @@ class _MusicCreationPageState extends State // Revert UI state on error setState(() { _isPlaying = false; + _playStickyText = null; _vinylSpinController.stop(); _tonearmController.reverse(); }); @@ -474,7 +484,8 @@ class _MusicCreationPageState extends State } } - _showSpeech('正在播放: ${_playlist[index].title}'); + _playStickyText = '正在播放: ${_playlist[index].title}'; + _showSpeech(_playStickyText!, duration: 0); } // ── Mood Selection ── @@ -646,6 +657,7 @@ class _MusicCreationPageState extends State // ── Speech Bubble ── String? _genStickyText; // Persistent text during generation + String? 
_playStickyText; // Persistent text during playback void _showSpeech(String text, {int duration = 3000}) { // If this is a generation-related message (duration == 0), save it as sticky @@ -667,6 +679,12 @@ class _MusicCreationPageState extends State _speechText = _genStickyText; _speechVisible = true; }); + } else if (_isPlaying && _playStickyText != null) { + // If playing, restore the now-playing message + setState(() { + _speechText = _playStickyText; + _speechVisible = true; + }); } else { setState(() => _speechVisible = false); } @@ -800,7 +818,9 @@ class _MusicCreationPageState extends State child: _buildVinylWrapper(), ), // Speech bubble — positioned top-right - if (_speechVisible && _speechText != null) + // Always show during playback; otherwise use _speechVisible + if ((_speechVisible && _speechText != null) || + (_isPlaying && _playStickyText != null)) Positioned( top: 0, right: -24, // HTML: right: -24px @@ -1067,12 +1087,18 @@ class _MusicCreationPageState extends State Widget _buildSpeechBubble() { // HTML: .capy-speech-bubble with clip-path iMessage-style tail at bottom-left const tailH = 8.0; + // During playback, always show the playing text even if _speechVisible is false + final bool showBubble = _speechVisible || (_isPlaying && _playStickyText != null); + final String bubbleText = (_isPlaying && _playStickyText != null && !_speechVisible) + ? _playStickyText! + : (_speechText ?? ''); + return AnimatedOpacity( duration: const Duration(milliseconds: 200), - opacity: _speechVisible ? 1.0 : 0.0, + opacity: showBubble ? 1.0 : 0.0, child: AnimatedScale( duration: const Duration(milliseconds: 350), - scale: _speechVisible ? 1.0 : 0.7, + scale: showBubble ? 1.0 : 0.7, curve: const Cubic(0.34, 1.56, 0.64, 1.0), alignment: Alignment.bottomLeft, child: Column( @@ -1098,7 +1124,7 @@ class _MusicCreationPageState extends State ], ), child: Text( - _speechText ?? 
'', + bubbleText, style: GoogleFonts.dmSans( fontSize: 12.5, fontWeight: FontWeight.w500, @@ -1485,6 +1511,7 @@ class _MusicCreationPageState extends State builder: (ctx) => _PlaylistModalContent( tracks: _playlist, currentIndex: _currentTrackIndex, + isPlaying: _isPlaying, onSelect: (index) { Navigator.pop(ctx); _playTrack(index); @@ -1921,17 +1948,53 @@ class _InputModalContent extends StatelessWidget { } /// Playlist Modal — HTML: .playlist-container -class _PlaylistModalContent extends StatelessWidget { +class _PlaylistModalContent extends StatefulWidget { final List<_Track> tracks; final int currentIndex; + final bool isPlaying; final ValueChanged onSelect; const _PlaylistModalContent({ required this.tracks, required this.currentIndex, + required this.isPlaying, required this.onSelect, }); + @override + State<_PlaylistModalContent> createState() => _PlaylistModalContentState(); +} + +class _PlaylistModalContentState extends State<_PlaylistModalContent> + with SingleTickerProviderStateMixin { + late AnimationController _waveController; + + @override + void initState() { + super.initState(); + _waveController = AnimationController( + vsync: this, + duration: const Duration(milliseconds: 800), + ); + if (widget.isPlaying) _waveController.repeat(reverse: true); + } + + @override + void didUpdateWidget(covariant _PlaylistModalContent oldWidget) { + super.didUpdateWidget(oldWidget); + if (widget.isPlaying && !_waveController.isAnimating) { + _waveController.repeat(reverse: true); + } else if (!widget.isPlaying && _waveController.isAnimating) { + _waveController.stop(); + } + } + + @override + void dispose() { + _waveController.dispose(); + super.dispose(); + } + @override Widget build(BuildContext context) { final screenWidth = MediaQuery.of(context).size.width; @@ -2015,23 +2078,39 @@ class _PlaylistModalContent extends StatelessWidget { mainAxisSpacing: 8, childAspectRatio: 0.75, ), - itemCount: tracks.length, + itemCount: widget.tracks.length, itemBuilder: 
(context, index) { - final track = tracks[index]; - final isPlaying = index == currentIndex; + final track = widget.tracks[index]; + final isCurrent = index == widget.currentIndex; + final isPlaying = isCurrent && widget.isPlaying; // HTML: .record-slot { background: rgba(0,0,0,0.03); border-radius: 12px; // padding: 10px 4px; border: 1px solid rgba(0,0,0,0.02); } return GestureDetector( - onTap: () => onSelect(index), + onTap: () => widget.onSelect(index), child: Container( padding: const EdgeInsets.symmetric(horizontal: 4, vertical: 10), decoration: BoxDecoration( - color: Colors.black.withOpacity(0.03), + // Current track: warm golden background; others: subtle grey + color: isCurrent + ? const Color(0xFFFDF3E3) + : Colors.black.withOpacity(0.03), borderRadius: BorderRadius.circular(12), border: Border.all( - color: Colors.black.withOpacity(0.02)), + color: isCurrent + ? const Color(0xFFECCFA8).withOpacity(0.6) + : Colors.black.withOpacity(0.02), + width: isCurrent ? 1.5 : 1.0), + boxShadow: isCurrent + ? [ + BoxShadow( + color: const Color(0xFFECCFA8).withOpacity(0.25), + blurRadius: 8, + offset: const Offset(0, 2), + ), + ] + : null, ), child: Column( children: [ @@ -2043,10 +2122,8 @@ class _PlaylistModalContent extends StatelessWidget { decoration: BoxDecoration( shape: BoxShape.circle, color: const Color(0xFF18181B), - // HTML: .record-item.playing .record-cover-wrapper - // { box-shadow: 0 0 0 2px #ECCFA8, ... 
} boxShadow: [ - if (isPlaying) + if (isCurrent) const BoxShadow( color: Color(0xFFECCFA8), spreadRadius: 2, @@ -2096,23 +2173,57 @@ class _PlaylistModalContent extends StatelessWidget { ), ), ), + // Sound wave overlay for playing track + if (isPlaying) + Center( + child: AnimatedBuilder( + animation: _waveController, + builder: (context, child) { + return CustomPaint( + painter: _MiniWavePainter( + progress: _waveController.value, + ), + size: const Size(28, 20), + ); + }, + ), + ), ], ), ), ), ), const SizedBox(height: 8), - // HTML: .record-title { font-size: 12px; font-weight: 500; } - Text( - track.title, - style: GoogleFonts.dmSans( - fontSize: 12, - fontWeight: FontWeight.w500, - color: const Color(0xFF374151), - ), - textAlign: TextAlign.center, - maxLines: 1, - overflow: TextOverflow.ellipsis, + // Title with playing indicator + Row( + mainAxisAlignment: MainAxisAlignment.center, + mainAxisSize: MainAxisSize.min, + children: [ + if (isCurrent) + Padding( + padding: const EdgeInsets.only(right: 3), + child: Icon( + isPlaying ? Icons.volume_up_rounded : Icons.volume_off_rounded, + size: 12, + color: const Color(0xFFECCFA8), + ), + ), + Flexible( + child: Text( + track.title, + style: GoogleFonts.dmSans( + fontSize: 12, + fontWeight: isCurrent ? FontWeight.w600 : FontWeight.w500, + color: isCurrent + ? 
const Color(0xFFB8860B) + : const Color(0xFF374151), + ), + textAlign: TextAlign.center, + maxLines: 1, + overflow: TextOverflow.ellipsis, + ), + ), + ], ), ], ), @@ -2127,3 +2238,39 @@ class _PlaylistModalContent extends StatelessWidget { } } +/// Mini sound wave painter for playlist playing indicator +class _MiniWavePainter extends CustomPainter { + final double progress; + + _MiniWavePainter({required this.progress}); + + @override + void paint(Canvas canvas, Size size) { + final paint = Paint() + ..color = const Color(0xFFECCFA8) + ..strokeWidth = 2.5 + ..strokeCap = StrokeCap.round; + + const barCount = 4; + final barWidth = size.width / (barCount * 2 - 1); + final centerY = size.height / 2; + + for (int i = 0; i < barCount; i++) { + // Each bar has a different phase offset for wave effect + final phase = (progress + i * 0.25) % 1.0; + final height = size.height * (0.3 + 0.7 * (0.5 + 0.5 * sin(phase * 3.14159 * 2))); + final x = i * barWidth * 2 + barWidth / 2; + + canvas.drawLine( + Offset(x, centerY - height / 2), + Offset(x, centerY + height / 2), + paint, + ); + } + } + + @override + bool shouldRepaint(covariant _MiniWavePainter oldDelegate) => + oldDelegate.progress != progress; +} + diff --git a/airhub_app/lib/pages/story_detail_page.dart b/airhub_app/lib/pages/story_detail_page.dart index dfe34d4..fa96b33 100644 --- a/airhub_app/lib/pages/story_detail_page.dart +++ b/airhub_app/lib/pages/story_detail_page.dart @@ -1,9 +1,12 @@ +import 'dart:async'; import 'dart:ui' as ui; import 'package:flutter/material.dart'; -import 'package:flutter_svg/flutter_svg.dart'; +import 'package:just_audio/just_audio.dart'; import '../theme/design_tokens.dart'; import '../widgets/gradient_button.dart'; +import '../widgets/pill_progress_button.dart'; +import '../services/tts_service.dart'; import 'story_loading_page.dart'; enum StoryMode { generated, read } @@ -30,6 +33,14 @@ class _StoryDetailPageState extends State bool _hasGeneratedVideo = false; bool _isLoadingVideo = 
false; + // TTS — uses global TTSService singleton + final TTSService _ttsService = TTSService.instance; + final AudioPlayer _audioPlayer = AudioPlayer(); + StreamSubscription? _positionSub; + StreamSubscription? _playerStateSub; + Duration _audioDuration = Duration.zero; + Duration _audioPosition = Duration.zero; + // Genie Suck Animation bool _isSaving = false; AnimationController? _genieController; @@ -41,9 +52,9 @@ class _StoryDetailPageState extends State 'content': """ 在遥远的银河系边缘,有一个被星云包裹的神秘茶馆。今天,这里迎来了两位特殊的客人:刚执行完火星探测任务的宇航员波波,和正在追捕暗影怪兽的忍者小次郎。 -“这儿的重力好像有点不对劲?”波波飘在半空中,试图抓住飞来飞去的茶杯。小次郎则冷静地倒挂在天花板上,手里紧握着一枚手里剑——其实那是用来切月饼的。 +"这儿的重力好像有点不对劲?"波波飘在半空中,试图抓住飞来飞去的茶杯。小次郎则冷静地倒挂在天花板上,手里紧握着一枚手里剑——其实那是用来切月饼的。 -突然,桌上的魔法茶壶“噗”地一声喷出了七彩烟雾,一只会说话的卡皮巴拉钻了出来:“别打架,别打架,喝了这杯‘银河气泡茶’,我们都是好朋友!” +突然,桌上的魔法茶壶"噗"地一声喷出了七彩烟雾,一只会说话的卡皮巴拉钻了出来:"别打架,别打架,喝了这杯'银河气泡茶',我们都是好朋友!" 于是,宇宙中最奇怪的组合诞生了。他们决定,下一站,去黑洞边缘钓星星。 """, @@ -54,7 +65,6 @@ class _StoryDetailPageState extends State Map _initStory() { final source = widget.story ?? _defaultStory; final result = Map.from(source); - // 兜底:如果没有 content 就用默认故事内容 result['content'] ??= _defaultStory['content']; result['title'] ??= _defaultStory['title']; return result; @@ -64,18 +74,171 @@ class _StoryDetailPageState extends State void initState() { super.initState(); _currentStory = _initStory(); + + // Subscribe to TTSService changes + _ttsService.addListener(_onTTSChanged); + + // Listen to audio player state + _playerStateSub = _audioPlayer.playerStateStream.listen((state) { + if (!mounted) return; + if (state.processingState == ProcessingState.completed) { + setState(() { + _isPlaying = false; + _audioPosition = Duration.zero; + }); + } + }); + + // Listen to playback position for ring progress + _positionSub = _audioPlayer.positionStream.listen((pos) { + if (!mounted) return; + setState(() => _audioPosition = pos); + }); + + // Listen to duration changes + _audioPlayer.durationStream.listen((dur) { + if (!mounted || dur == null) return; + setState(() => 
_audioDuration = dur); + }); + + // Check if audio already exists (via TTSService) + final title = _currentStory['title'] as String? ?? ''; + _ttsService.checkExistingAudio(title); + } + + void _onTTSChanged() { + if (!mounted) return; + + // Auto-play when generation completes + if (_ttsService.justCompleted && + _ttsService.hasAudioFor(_currentStory['title'] ?? '')) { + // Delay slightly to let the completion flash play + Future.delayed(const Duration(milliseconds: 1500), () { + if (mounted) { + _ttsService.clearJustCompleted(); + final route = ModalRoute.of(context); + if (route != null && route.isCurrent) { + _playAudio(); + } + } + }); + } + + setState(() {}); } @override void dispose() { + _ttsService.removeListener(_onTTSChanged); + _positionSub?.cancel(); + _playerStateSub?.cancel(); + _audioPlayer.dispose(); _genieController?.dispose(); super.dispose(); } - /// Trigger Genie Suck animation matching HTML: - /// CSS: animation: genieSuck 0.8s cubic-bezier(0.6, -0.28, 0.735, 0.045) forwards - /// Phase 1 (0→15%): card scales up to 1.05 (tension) - /// Phase 2 (15%→100%): card shrinks to 0.05, moves toward bottom, blurs & fades + // ── TTS button logic ── + + bool _audioLoaded = false; // Track if audio URL is loaded in player + String? _loadedUrl; // Which URL is currently loaded + + TTSButtonState get _ttsState { + final title = _currentStory['title'] as String? ?? 
''; + + if (_ttsService.error != null && + !_ttsService.isGenerating && + _ttsService.audioUrl == null) { + return TTSButtonState.error; + } + if (_ttsService.isGeneratingFor(title)) { + return TTSButtonState.generating; + } + if (_ttsService.justCompleted && _ttsService.hasAudioFor(title)) { + return TTSButtonState.completed; + } + if (_isPlaying) { + return TTSButtonState.playing; + } + if (_ttsService.hasAudioFor(title) && !_audioLoaded) { + return TTSButtonState.ready; // audio ready, not yet played -> show "播放" + } + if (_audioLoaded) { + return TTSButtonState.paused; // was playing, now paused -> show "继续" + } + return TTSButtonState.idle; + } + + double get _ttsProgress { + final state = _ttsState; + switch (state) { + case TTSButtonState.generating: + return _ttsService.progress; + case TTSButtonState.ready: + return 0.0; + case TTSButtonState.completed: + return 1.0; + case TTSButtonState.playing: + case TTSButtonState.paused: + if (_audioDuration.inMilliseconds > 0) { + return (_audioPosition.inMilliseconds / _audioDuration.inMilliseconds) + .clamp(0.0, 1.0); + } + return 0.0; + default: + return 0.0; + } + } + + void _handleTTSTap() { + final state = _ttsState; + switch (state) { + case TTSButtonState.idle: + case TTSButtonState.error: + final title = _currentStory['title'] as String? ?? ''; + final content = _currentStory['content'] as String? ?? ''; + _ttsService.generate(title: title, content: content); + break; + case TTSButtonState.generating: + break; + case TTSButtonState.ready: + case TTSButtonState.completed: + case TTSButtonState.paused: + _playAudio(); + break; + case TTSButtonState.playing: + _audioPlayer.pause(); + setState(() => _isPlaying = false); + break; + } + } + + Future<void> _playAudio() async { + final title = _currentStory['title'] as String? ?? ''; + final url = _ttsService.hasAudioFor(title) ?
_ttsService.audioUrl : null; + if (url == null) return; + + try { + // If already loaded the same URL, seek to saved position and resume + if (_audioLoaded && _loadedUrl == url) { + await _audioPlayer.seek(_audioPosition); + _audioPlayer.play(); + } else { + // Load new URL and play from start + await _audioPlayer.setUrl(url); + _audioLoaded = true; + _loadedUrl = url; + _audioPlayer.play(); + } + if (mounted) { + setState(() => _isPlaying = true); + } + } catch (e) { + debugPrint('Audio play error: $e'); + } + } + + // ── Genie Suck Animation ── + void _triggerGenieSuck() { if (_isSaving) return; @@ -84,7 +247,6 @@ class _StoryDetailPageState extends State duration: const Duration(milliseconds: 800), ); - // Calculate how far the card should travel downward (toward the save button) final screenHeight = MediaQuery.of(context).size.height; _targetDY = screenHeight * 0.35; @@ -94,23 +256,20 @@ class _StoryDetailPageState extends State } }); - setState(() { - _isSaving = true; - }); + setState(() => _isSaving = true); _genieController!.forward(); } + // ── Build ── + @override Widget build(BuildContext context) { return Scaffold( - backgroundColor: AppColors.storyBackground, // #FDF9F3 + backgroundColor: AppColors.storyBackground, body: SafeArea( child: Column( children: [ - // Header + Content Card — animated together during genie suck Expanded(child: _buildAnimatedBody()), - - // Footer _buildFooter(), ], ), @@ -118,7 +277,6 @@ class _StoryDetailPageState extends State ); } - /// Wraps header + content card in genie suck animation Widget _buildAnimatedBody() { Widget body = Column( children: [ @@ -132,7 +290,7 @@ class _StoryDetailPageState extends State return AnimatedBuilder( animation: _genieController!, builder: (context, child) { - final t = _genieController!.value; // linear 0→1 + final t = _genieController!.value; double scale; double translateY; @@ -140,14 +298,12 @@ class _StoryDetailPageState extends State double blur; if (t <= 0.15) { - // Phase 1: 
tension — whole area scales up slightly final p = t / 0.15; scale = 1.0 + 0.05 * Curves.easeOut.transform(p); translateY = 0; opacity = 1.0; blur = 0; } else { - // Phase 2: suck — shrinks, moves down, fades and blurs final p = ((t - 0.15) / 0.85).clamp(0.0, 1.0); final curved = const Cubic(0.6, -0.28, 0.735, 0.045).transform(p); @@ -209,7 +365,7 @@ class _StoryDetailPageState extends State ), ), Text( - _currentStory['title'], + _currentStory['title'] ?? '', style: const TextStyle( fontSize: 17, fontWeight: FontWeight.w600, @@ -227,9 +383,9 @@ class _StoryDetailPageState extends State child: Row( mainAxisAlignment: MainAxisAlignment.center, children: [ - _buildTabBtn('📄 故事', 'text'), + _buildTabBtn('故事', 'text'), const SizedBox(width: 8), - _buildTabBtn('🎬 绘本', 'video'), + _buildTabBtn('绘本', 'video'), ], ), ); @@ -238,11 +394,7 @@ class _StoryDetailPageState extends State Widget _buildTabBtn(String label, String key) { bool isActive = _activeTab == key; return GestureDetector( - onTap: () { - setState(() { - _activeTab = key; - }); - }, + onTap: () => setState(() => _activeTab = key), child: Container( padding: const EdgeInsets.symmetric(horizontal: 16, vertical: 8), decoration: BoxDecoration( @@ -271,7 +423,6 @@ class _StoryDetailPageState extends State } Widget _buildContentCard() { - // HTML: .story-paper bool isVideoMode = _activeTab == 'video'; return Container( @@ -292,11 +443,11 @@ class _StoryDetailPageState extends State _currentStory['content'] .toString() .replaceAll(RegExp(r'\n+'), '\n\n') - .trim(), // Simple paragraph spacing + .trim(), style: const TextStyle( - fontSize: 16, // HTML: 16px - height: 2.0, // HTML: line-height 2.0 - color: AppColors.storyText, // #374151 + fontSize: 16, + height: 2.0, + color: AppColors.storyText, ), textAlign: TextAlign.justify, ), @@ -313,7 +464,7 @@ class _StoryDetailPageState extends State width: 40, height: 40, child: CircularProgressIndicator( - color: Color(0xFFF43F5E), // HTML: #F43F5E + color: 
Color(0xFFF43F5E), strokeWidth: 3, ), ), @@ -339,15 +490,14 @@ class _StoryDetailPageState extends State alignment: Alignment.center, children: [ AspectRatio( - aspectRatio: 16 / 9, // Assume landscape video + aspectRatio: 16 / 9, child: Container( color: Colors.black, child: const Center( child: Icon(Icons.videocam, color: Colors.white54, size: 48), - ), // Placeholder for Video Player + ), ), ), - // Play Button Overlay Container( width: 48, height: 48, @@ -372,7 +522,6 @@ class _StoryDetailPageState extends State child: _activeTab == 'text' ? _buildTextFooter() : _buildVideoFooter(), ); - // Fade out footer during genie suck animation if (_isSaving) { return IgnorePointer( child: AnimatedOpacity( @@ -387,12 +536,9 @@ class _StoryDetailPageState extends State } void _handleRewrite() async { - // 跳到 loading 页重新生成 final result = await Navigator.of(context).push( MaterialPageRoute(builder: (context) => const StoryLoadingPage()), ); - - // loading 完成后返回结果 if (mounted && result == 'saved') { Navigator.of(context).pop('saved'); } @@ -403,7 +549,6 @@ class _StoryDetailPageState extends State // Generator Mode: Rewrite + Save return Row( children: [ - // Rewrite (Secondary) Expanded( child: GestureDetector( onTap: _handleRewrite, @@ -415,19 +560,25 @@ class _StoryDetailPageState extends State color: Colors.white.withOpacity(0.8), ), alignment: Alignment.center, - child: const Text( - '↻ 重写', - style: TextStyle( - fontSize: 16, - fontWeight: FontWeight.w600, - color: Color(0xFF4B5563), - ), + child: const Row( + mainAxisAlignment: MainAxisAlignment.center, + children: [ + Icon(Icons.refresh_rounded, size: 18, color: Color(0xFF4B5563)), + SizedBox(width: 4), + Text( + '重写', + style: TextStyle( + fontSize: 16, + fontWeight: FontWeight.w600, + color: Color(0xFF4B5563), + ), + ), + ], ), ), ), ), const SizedBox(width: 16), - // Save (Primary) - Returns 'saved' to trigger add book animation Expanded( child: GradientButton( text: '保存故事', @@ -441,41 +592,14 @@ class 
_StoryDetailPageState extends State ], ); } else { - // Read Mode: TTS + Make Picture Book + // Read Mode: TTS pill button + Make Picture Book return Row( children: [ - // TTS Expanded( - child: GestureDetector( - onTap: () => setState(() => _isPlaying = !_isPlaying), - child: Container( - height: 48, - decoration: BoxDecoration( - border: Border.all(color: const Color(0xFFE5E7EB)), - borderRadius: BorderRadius.circular(24), - color: Colors.white.withOpacity(0.8), - ), - alignment: Alignment.center, - child: Row( - mainAxisAlignment: MainAxisAlignment.center, - children: [ - Icon( - _isPlaying ? Icons.pause : Icons.headphones, - size: 20, - color: const Color(0xFF4B5563), - ), - const SizedBox(width: 6), - Text( - _isPlaying ? '暂停' : '朗读', - style: const TextStyle( - fontSize: 16, - fontWeight: FontWeight.w600, - color: Color(0xFF4B5563), - ), - ), - ], - ), - ), + child: PillProgressButton( + state: _ttsState, + progress: _ttsProgress, + onTap: _handleTTSTap, ), ), const SizedBox(width: 16), @@ -500,7 +624,7 @@ class _StoryDetailPageState extends State children: [ Expanded( child: GradientButton( - text: '↻ 重新生成', + text: '重新生成', onPressed: _startVideoGeneration, gradient: const LinearGradient( colors: AppColors.btnCapybaraGradient, @@ -517,7 +641,6 @@ class _StoryDetailPageState extends State _isLoadingVideo = true; _activeTab = 'video'; }); - // Mock delay Future.delayed(const Duration(seconds: 2), () { if (mounted) { setState(() { diff --git a/airhub_app/lib/services/tts_service.dart b/airhub_app/lib/services/tts_service.dart new file mode 100644 index 0000000..5c6458a --- /dev/null +++ b/airhub_app/lib/services/tts_service.dart @@ -0,0 +1,190 @@ +import 'dart:convert'; +import 'package:flutter/foundation.dart'; +import 'package:http/http.dart' as http; + +/// Singleton service that manages TTS generation in the background. +/// Survives page navigation — when user leaves and comes back, +/// generation continues and result is available. 
+class TTSService extends ChangeNotifier { + TTSService._(); + static final TTSService instance = TTSService._(); + + static const String _kServerBase = 'http://localhost:3000'; + + // ── Current task state ── + bool _isGenerating = false; + double _progress = 0.0; // 0.0 ~ 1.0 + String _statusMessage = ''; + String? _currentStoryTitle; // Which story is being generated + + // ── Result ── + String? _audioUrl; + String? _completedStoryTitle; // Which story the audio belongs to + bool _justCompleted = false; // Flash animation trigger + + // ── Error ── + String? _error; + + // ── Getters ── + bool get isGenerating => _isGenerating; + double get progress => _progress; + String get statusMessage => _statusMessage; + String? get currentStoryTitle => _currentStoryTitle; + String? get audioUrl => _audioUrl; + String? get completedStoryTitle => _completedStoryTitle; + bool get justCompleted => _justCompleted; + String? get error => _error; + + /// Check if audio is ready for a specific story. + bool hasAudioFor(String title) { + return _completedStoryTitle == title && _audioUrl != null; + } + + /// Check if currently generating for a specific story. + bool isGeneratingFor(String title) { + return _isGenerating && _currentStoryTitle == title; + } + + /// Clear the "just completed" flag (after flash animation plays). + void clearJustCompleted() { + _justCompleted = false; + notifyListeners(); + } + + /// Set audio URL directly (e.g. from pre-check). + void setExistingAudio(String title, String url) { + _completedStoryTitle = title; + _audioUrl = url; + _justCompleted = false; + notifyListeners(); + } + + /// Check server for existing audio file. 
+ Future<void> checkExistingAudio(String title) async { + if (title.isEmpty) return; + try { + final resp = await http.get( + Uri.parse( + '$_kServerBase/api/tts_check?title=${Uri.encodeComponent(title)}', + ), + ); + if (resp.statusCode == 200) { + final data = jsonDecode(resp.body); + if (data['exists'] == true && data['audio_url'] != null) { + _completedStoryTitle = title; + _audioUrl = '$_kServerBase/${data['audio_url']}'; + notifyListeners(); + } + } + } catch (_) {} + } + + /// Start TTS generation. Safe to call even if page navigates away. + Future<void> generate({ + required String title, + required String content, + }) async { + if (_isGenerating) return; + + _isGenerating = true; + _progress = 0.0; + _statusMessage = '正在连接...'; + _currentStoryTitle = title; + _audioUrl = null; + _completedStoryTitle = null; + _justCompleted = false; + _error = null; + notifyListeners(); + + try { + final client = http.Client(); + final request = http.Request( + 'POST', + Uri.parse('$_kServerBase/api/create_tts'), + ); + request.headers['Content-Type'] = 'application/json'; + request.body = jsonEncode({'title': title, 'content': content}); + + final streamed = await client.send(request); + + await for (final chunk in streamed.stream.transform(utf8.decoder)) { + for (final line in chunk.split('\n')) { + if (!line.startsWith('data: ')) continue; + try { + final data = jsonDecode(line.substring(6)); + final stage = data['stage'] as String? ?? ''; + final message = data['message'] as String? ??
''; + + switch (stage) { + case 'connecting': + _updateProgress(0.10, '正在连接...'); + break; + case 'generating': + _updateProgress(0.30, '语音生成中...'); + break; + case 'saving': + _updateProgress(0.88, '正在保存...'); + break; + case 'done': + if (data['audio_url'] != null) { + _audioUrl = '$_kServerBase/${data['audio_url']}'; + _completedStoryTitle = title; + _justCompleted = true; + _updateProgress(1.0, '生成完成'); + } + break; + case 'error': + throw Exception(message); + default: + // Progress slowly increases during generation + if (_progress < 0.85) { + _updateProgress(_progress + 0.02, message); + } + } + } catch (e) { + if (e is Exception && + e.toString().contains('语音合成失败')) { + rethrow; + } + } + } + } + + client.close(); + + _isGenerating = false; + if (_audioUrl == null) { + _error = '未获取到音频'; + _statusMessage = '生成失败'; + } + notifyListeners(); + } catch (e) { + debugPrint('TTS generation error: $e'); + _isGenerating = false; + _progress = 0.0; + _error = e.toString(); + _statusMessage = '生成失败'; + _justCompleted = false; + notifyListeners(); + } + } + + void _updateProgress(double progress, String message) { + _progress = progress.clamp(0.0, 1.0); + _statusMessage = message; + notifyListeners(); + } + + /// Reset all state (e.g. when switching stories). 
+ void reset() { + if (_isGenerating) return; // Don't reset during generation + _progress = 0.0; + _statusMessage = ''; + _currentStoryTitle = null; + _audioUrl = null; + _completedStoryTitle = null; + _justCompleted = false; + _error = null; + notifyListeners(); + } +} diff --git a/airhub_app/lib/widgets/pill_progress_button.dart b/airhub_app/lib/widgets/pill_progress_button.dart new file mode 100644 index 0000000..f76b51c --- /dev/null +++ b/airhub_app/lib/widgets/pill_progress_button.dart @@ -0,0 +1,335 @@ +import 'dart:math' as math; +import 'package:flutter/material.dart'; + +enum TTSButtonState { + idle, + ready, + generating, + completed, + playing, + paused, + error, +} + +class PillProgressButton extends StatefulWidget { + final TTSButtonState state; + final double progress; + final VoidCallback? onTap; + final double height; + + const PillProgressButton({ + super.key, + required this.state, + this.progress = 0.0, + this.onTap, + this.height = 48, + }); + + @override + State<PillProgressButton> createState() => _PillProgressButtonState(); +} + +class _PillProgressButtonState extends State<PillProgressButton> + with TickerProviderStateMixin { + late AnimationController _progressCtrl; + double _displayProgress = 0.0; + + late AnimationController _glowCtrl; + late Animation<double> _glowAnim; + + late AnimationController _waveCtrl; + + bool _wasCompleted = false; + + @override + void initState() { + super.initState(); + + _progressCtrl = AnimationController( + vsync: this, + duration: const Duration(milliseconds: 500), + ); + _progressCtrl.addListener(() => setState(() {})); + + _glowCtrl = AnimationController( + vsync: this, + duration: const Duration(milliseconds: 1000), + ); + _glowAnim = TweenSequence<double>([ + TweenSequenceItem(tween: Tween(begin: 0.0, end: 1.0), weight: 35), + TweenSequenceItem(tween: Tween(begin: 1.0, end: 0.0), weight: 65), + ]).animate(CurvedAnimation(parent: _glowCtrl, curve: Curves.easeOut)); + _glowCtrl.addListener(() => setState(() {})); + + _waveCtrl = AnimationController(
vsync: this, + duration: const Duration(milliseconds: 800), + ); + + _syncAnimations(); + } + + @override + void didUpdateWidget(PillProgressButton oldWidget) { + super.didUpdateWidget(oldWidget); + + if (widget.progress != oldWidget.progress) { + if (oldWidget.state == TTSButtonState.completed && + (widget.state == TTSButtonState.playing || widget.state == TTSButtonState.ready)) { + _displayProgress = 0.0; + } else { + _animateProgressTo(widget.progress); + } + } + + if (widget.state == TTSButtonState.completed && !_wasCompleted) { + _wasCompleted = true; + _glowCtrl.forward(from: 0); + } else if (widget.state != TTSButtonState.completed) { + _wasCompleted = false; + } + + _syncAnimations(); + } + + void _animateProgressTo(double target) { + final from = _displayProgress; + _progressCtrl.reset(); + _progressCtrl.addListener(() { + final t = Curves.easeInOut.transform(_progressCtrl.value); + _displayProgress = from + (target - from) * t; + }); + _progressCtrl.forward(); + } + + void _syncAnimations() { + if (widget.state == TTSButtonState.generating) { + if (!_waveCtrl.isAnimating) _waveCtrl.repeat(); + } else { + if (_waveCtrl.isAnimating) { + _waveCtrl.stop(); + _waveCtrl.value = 0; + } + } + } + + @override + void dispose() { + _progressCtrl.dispose(); + _glowCtrl.dispose(); + _waveCtrl.dispose(); + super.dispose(); + } + + bool get _showBorder => + widget.state == TTSButtonState.generating || + widget.state == TTSButtonState.completed || + widget.state == TTSButtonState.playing || + widget.state == TTSButtonState.paused; + + @override + Widget build(BuildContext context) { + const borderColor = Color(0xFFE5E7EB); + const progressColor = Color(0xFFECCFA8); + const bgColor = Color(0xCCFFFFFF); + + return GestureDetector( + onTap: widget.state == TTSButtonState.generating ? null : widget.onTap, + child: Container( + height: widget.height, + decoration: BoxDecoration( + borderRadius: BorderRadius.circular(widget.height / 2), + boxShadow: _glowAnim.value > 0 + ? 
[ + BoxShadow( + color: progressColor.withOpacity(0.5 * _glowAnim.value), + blurRadius: 16 * _glowAnim.value, + spreadRadius: 2 * _glowAnim.value, + ), + ] + : null, + ), + child: CustomPaint( + painter: PillBorderPainter( + progress: _showBorder ? _displayProgress.clamp(0.0, 1.0) : 0.0, + borderColor: borderColor, + progressColor: progressColor, + radius: widget.height / 2, + stroke: _showBorder ? 2.5 : 1.0, + bg: bgColor, + ), + child: Center(child: _buildContent()), + ), + ), + ); + } + + Widget _buildContent() { + switch (widget.state) { + case TTSButtonState.idle: + return _label(Icons.headphones_rounded, '\u6717\u8bfb'); + case TTSButtonState.generating: + return Row( + mainAxisAlignment: MainAxisAlignment.center, + children: [ + AnimatedBuilder( + animation: _waveCtrl, + builder: (context, _) => CustomPaint( + size: const Size(20, 18), + painter: WavePainter(t: _waveCtrl.value, color: const Color(0xFFC99672)), + ), + ), + const SizedBox(width: 6), + const Text('\u751f\u6210\u4e2d', + style: TextStyle(fontSize: 15, fontWeight: FontWeight.w600, color: Color(0xFF4B5563))), + ], + ); + case TTSButtonState.ready: + return _label(Icons.play_arrow_rounded, '\u64ad\u653e'); + case TTSButtonState.completed: + return _label(Icons.play_arrow_rounded, '\u64ad\u653e'); + case TTSButtonState.playing: + return _label(Icons.pause_rounded, '\u6682\u505c'); + case TTSButtonState.paused: + return _label(Icons.play_arrow_rounded, '\u7ee7\u7eed'); + case TTSButtonState.error: + return _label(Icons.refresh_rounded, '\u91cd\u8bd5', isError: true); + } + } + + Widget _label(IconData icon, String text, {bool isError = false}) { + final c = isError ? 
const Color(0xFFEF4444) : const Color(0xFF4B5563); + return Row( + mainAxisAlignment: MainAxisAlignment.center, + mainAxisSize: MainAxisSize.min, + children: [ + Icon(icon, size: 20, color: c), + const SizedBox(width: 4), + Text(text, style: TextStyle(fontSize: 16, fontWeight: FontWeight.w600, color: c)), + ], + ); + } +} + +class PillBorderPainter extends CustomPainter { + final double progress; + final Color borderColor; + final Color progressColor; + final double radius; + final double stroke; + final Color bg; + + PillBorderPainter({ + required this.progress, + required this.borderColor, + required this.progressColor, + required this.radius, + required this.stroke, + required this.bg, + }); + + @override + void paint(Canvas canvas, Size size) { + final r = radius.clamp(0.0, size.height / 2); + final rrect = RRect.fromRectAndRadius( + Rect.fromLTWH(0, 0, size.width, size.height), + Radius.circular(r), + ); + + canvas.drawRRect(rrect, Paint() + ..color = bg + ..style = PaintingStyle.fill); + canvas.drawRRect(rrect, Paint() + ..color = borderColor + ..style = PaintingStyle.stroke + ..strokeWidth = stroke); + + if (progress <= 0.001) return; + + final straightH = size.width - 2 * r; + final halfTop = straightH / 2; + final arcLen = math.pi * r; + final totalLen = halfTop + arcLen + straightH + arcLen + halfTop; + final target = totalLen * progress; + + final path = Path(); + double done = 0; + final cx = size.width / 2; + + path.moveTo(cx, 0); + var seg = math.min(halfTop, target - done); + path.lineTo(cx + seg, 0); + done += seg; + if (done >= target) { _drawPath(canvas, path); return; } + + seg = math.min(arcLen, target - done); + _traceArc(path, size.width - r, r, r, -math.pi / 2, seg / r); + done += seg; + if (done >= target) { _drawPath(canvas, path); return; } + + seg = math.min(straightH, target - done); + path.lineTo(size.width - r - seg, size.height); + done += seg; + if (done >= target) { _drawPath(canvas, path); return; } + + seg = math.min(arcLen, 
target - done); + _traceArc(path, r, r, r, math.pi / 2, seg / r); + done += seg; + if (done >= target) { _drawPath(canvas, path); return; } + + seg = math.min(halfTop, target - done); + path.lineTo(r + seg, 0); + _drawPath(canvas, path); + } + + void _drawPath(Canvas canvas, Path path) { + canvas.drawPath(path, Paint() + ..color = progressColor + ..style = PaintingStyle.stroke + ..strokeWidth = stroke + ..strokeCap = StrokeCap.round); + } + + void _traceArc(Path p, double cx, double cy, double r, double start, double sweep) { + const n = 24; + final step = sweep / n; + for (int i = 0; i <= n; i++) { + final a = start + step * i; + p.lineTo(cx + r * math.cos(a), cy + r * math.sin(a)); + } + } + + @override + bool shouldRepaint(PillBorderPainter old) => old.progress != progress || old.stroke != stroke; +} + +class WavePainter extends CustomPainter { + final double t; + final Color color; + WavePainter({required this.t, required this.color}); + + @override + void paint(Canvas canvas, Size size) { + final paint = Paint() + ..color = color + ..style = PaintingStyle.fill; + final bw = size.width * 0.2; + final gap = size.width * 0.1; + final tw = 3 * bw + 2 * gap; + final sx = (size.width - tw) / 2; + for (int i = 0; i < 3; i++) { + final phase = t * 2 * math.pi + i * math.pi * 0.7; + final hr = 0.3 + 0.7 * ((math.sin(phase) + 1) / 2); + final bh = size.height * hr; + final x = sx + i * (bw + gap); + final y = (size.height - bh) / 2; + canvas.drawRRect( + RRect.fromRectAndRadius(Rect.fromLTWH(x, y, bw, bh), Radius.circular(bw / 2)), + paint, + ); + } + } + + @override + bool shouldRepaint(WavePainter old) => old.t != t; +} \ No newline at end of file diff --git a/prompts/music_director.md b/prompts/music_director.md index d06828f..6c63a22 100644 --- a/prompts/music_director.md +++ b/prompts/music_director.md @@ -24,8 +24,7 @@ 1. **song_title** (歌曲名称) - 使用**中文**,简短有趣,3-8个字。 - - 体现咔咔的可爱风格。 - - 示例:"温泉咔咔乐"、"草地蹦蹦跳"、"雨夜安眠曲" + - 根据用户描述的场景自由发挥,不要套用固定模板。 2. 
**style** (风格描述)
    - 使用**英文**描述音乐风格、乐器、节奏、情绪。
diff --git a/server.py b/server.py
index 0280680..45b7489 100644
--- a/server.py
+++ b/server.py
@@ -2,10 +2,14 @@ import os
 import re
 import sys
 import time
+import uuid
+import struct
+import asyncio
 import uvicorn
 import requests
 import json
-from fastapi import FastAPI, HTTPException
+import websockets
+from fastapi import FastAPI, HTTPException, Query
 from fastapi.responses import StreamingResponse
 from fastapi.middleware.cors import CORSMiddleware
 from pydantic import BaseModel
@@ -20,11 +24,15 @@ if sys.platform == "win32":
 load_dotenv()
 MINIMAX_API_KEY = os.getenv("MINIMAX_API_KEY")
 VOLCENGINE_API_KEY = os.getenv("VOLCENGINE_API_KEY")
+TTS_APP_ID = os.getenv("TTS_APP_ID")
+TTS_ACCESS_TOKEN = os.getenv("TTS_ACCESS_TOKEN")
 if not MINIMAX_API_KEY:
     print("Warning: MINIMAX_API_KEY not found in .env")
 if not VOLCENGINE_API_KEY:
     print("Warning: VOLCENGINE_API_KEY not found in .env")
+if not TTS_APP_ID or not TTS_ACCESS_TOKEN:
+    print("Warning: TTS_APP_ID or TTS_ACCESS_TOKEN not found in .env")

 # Initialize FastAPI
 app = FastAPI()
@@ -606,14 +614,244 @@ def get_playlist():
     return {"playlist": playlist}


-# ── Static file serving for generated music ──
+# ═══════════════════════════════════════════════════════════════════
+# ── TTS: 豆包语音合成 WebSocket V1 二进制协议 ──
+# ═══════════════════════════════════════════════════════════════════
+
+TTS_WS_URL = "wss://openspeech.bytedance.com/api/v1/tts/ws_binary"
+TTS_CLUSTER = "volcano_tts"
+TTS_SPEAKER = "ICL_zh_female_keainvsheng_tob"
+
+_audio_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara audio")
+os.makedirs(_audio_dir, exist_ok=True)
+
+
+def _build_tts_v1_request(payload_json: dict) -> bytes:
+    """Build a V1 full-client-request binary frame.
+    Header: 0x11 0x10 0x10 0x00 (v1, 4-byte header, full-client-request, JSON, no compression)
+    Then 4-byte big-endian payload length, then JSON payload bytes.
+ """ + payload_bytes = json.dumps(payload_json, ensure_ascii=False).encode("utf-8") + header = bytes([0x11, 0x10, 0x10, 0x00]) + length = struct.pack(">I", len(payload_bytes)) + return header + length + payload_bytes + + +def _parse_tts_v1_response(data: bytes): + """Parse a V1 TTS response binary frame. + Returns (audio_bytes_or_none, is_last, is_error, error_msg). + """ + if len(data) < 4: + return None, False, True, "Frame too short" + + byte1 = data[1] + msg_type = (byte1 >> 4) & 0x0F + msg_flags = byte1 & 0x0F + + # Error frame: msg_type = 0xF + if msg_type == 0x0F: + offset = 4 + error_code = 0 + if len(data) >= offset + 4: + error_code = struct.unpack(">I", data[offset:offset + 4])[0] + offset += 4 + if len(data) >= offset + 4: + msg_len = struct.unpack(">I", data[offset:offset + 4])[0] + offset += 4 + error_msg = data[offset:offset + msg_len].decode("utf-8", errors="replace") + else: + error_msg = f"error code {error_code}" + print(f"[TTS Error] code={error_code}, msg={error_msg}", flush=True) + return None, False, True, error_msg + + # Audio-only response: msg_type = 0xB + if msg_type == 0x0B: + # flags: 0b0000=no seq, 0b0001=seq>0, 0b0010/0b0011=last (seq<0) + is_last = (msg_flags & 0x02) != 0 # bit 1 set = last message + offset = 4 + + # If flags != 0, there's a 4-byte sequence number + if msg_flags != 0: + offset += 4 # skip sequence number + + if len(data) < offset + 4: + return None, is_last, False, "" + + payload_size = struct.unpack(">I", data[offset:offset + 4])[0] + offset += 4 + audio_data = data[offset:offset + payload_size] + return audio_data, is_last, False, "" + + # Server response with JSON (msg_type = 0x9): usually contains metadata + if msg_type == 0x09: + offset = 4 + if len(data) >= offset + 4: + payload_size = struct.unpack(">I", data[offset:offset + 4])[0] + offset += 4 + json_str = data[offset:offset + payload_size].decode("utf-8", errors="replace") + print(f"[TTS] Server JSON: {json_str[:200]}", flush=True) + return None, False, 
False, "" + + return None, False, False, "" + + +async def tts_synthesize(text: str) -> bytes: + """Connect to Doubao TTS V1 WebSocket and synthesize text to MP3 bytes.""" + headers = { + "Authorization": f"Bearer;{TTS_ACCESS_TOKEN}", + } + + payload = { + "app": { + "appid": TTS_APP_ID, + "token": "placeholder", + "cluster": TTS_CLUSTER, + }, + "user": { + "uid": "airhub_user", + }, + "audio": { + "voice_type": TTS_SPEAKER, + "encoding": "mp3", + "speed_ratio": 1.0, + "rate": 24000, + }, + "request": { + "reqid": str(uuid.uuid4()), + "text": text, + "operation": "submit", # streaming mode + }, + } + + audio_buffer = bytearray() + request_frame = _build_tts_v1_request(payload) + + print(f"[TTS] Connecting to V1 WebSocket... text length={len(text)}", flush=True) + + async with websockets.connect( + TTS_WS_URL, + extra_headers=headers, + max_size=10 * 1024 * 1024, # 10MB max frame + ping_interval=None, + ) as ws: + # Send request + await ws.send(request_frame) + print("[TTS] Request sent, waiting for audio...", flush=True) + + # Receive audio chunks + chunk_count = 0 + async for message in ws: + if isinstance(message, bytes): + audio_data, is_last, is_error, error_msg = _parse_tts_v1_response(message) + + if is_error: + raise RuntimeError(f"TTS error: {error_msg}") + + if audio_data and len(audio_data) > 0: + audio_buffer.extend(audio_data) + chunk_count += 1 + + if is_last: + print(f"[TTS] Last frame received. 
chunks={chunk_count}, "
+                          f"audio size={len(audio_buffer)} bytes", flush=True)
+                    break
+
+    return bytes(audio_buffer)
+
+
+class TTSRequest(BaseModel):
+    title: str
+    content: str
+
+
+@app.get("/api/tts_check")
+def tts_check(title: str = Query(...)):
+    """Check if audio already exists for a story title."""
+    for f in os.listdir(_audio_dir):
+        if f.lower().endswith(".mp3"):
+            # Match by title prefix (before timestamp)
+            name = f[:-4]  # strip .mp3
+            name_without_ts = re.sub(r'_\d{10,}$', '', name)
+            if name_without_ts == title or name == title:
+                return {
+                    "exists": True,
+                    "audio_url": f"Capybara audio/{f}",
+                }
+    return {"exists": False, "audio_url": None}
+
+
+@app.post("/api/create_tts")
+def create_tts(req: TTSRequest):
+    """Generate TTS audio for a story. Returns SSE stream with progress."""
+
+    def event_stream():
+        yield sse_event({"stage": "connecting", "progress": 10,
+                         "message": "正在连接语音合成服务..."})
+
+        # Check if audio already exists
+        for f in os.listdir(_audio_dir):
+            if f.lower().endswith(".mp3"):
+                name = f[:-4]
+                name_without_ts = re.sub(r'_\d{10,}$', '', name)
+                if name_without_ts == req.title:
+                    yield sse_event({"stage": "done", "progress": 100,
+                                     "message": "语音已存在",
+                                     "audio_url": f"Capybara audio/{f}"})
+                    return
+
+        yield sse_event({"stage": "generating", "progress": 30,
+                         "message": "AI 正在朗读故事..."})
+
+        try:
+            # asyncio is imported at module level; asyncio.run creates and
+            # tears down its own event loop, even if tts_synthesize raises
+            audio_bytes = asyncio.run(tts_synthesize(req.content))
+
+            if not audio_bytes or len(audio_bytes) < 100:
+                yield sse_event({"stage": "error", "progress": 0,
+                                 "message": "语音合成返回了空音频"})
+                return
+
+            yield sse_event({"stage": "saving", "progress": 80,
+                             "message": "正在保存音频..."})
+
+            # Save audio file
+            timestamp = int(time.time())
+            safe_title = re.sub(r'[<>:"/\\|?*]', '', req.title)[:50]
+            filename = f"{safe_title}_{timestamp}.mp3"
+            filepath = os.path.join(_audio_dir, filename)
+
+            with open(filepath, "wb") as f:
f.write(audio_bytes)
+
+            print(f"[TTS Saved] {filepath} ({len(audio_bytes)} bytes)", flush=True)
+
+            yield sse_event({"stage": "done", "progress": 100,
+                             "message": "语音生成完成!",
+                             "audio_url": f"Capybara audio/{filename}"})
+
+        except Exception as e:
+            print(f"[TTS Error] {e}", flush=True)
+            yield sse_event({"stage": "error", "progress": 0,
+                             "message": f"语音合成失败: {str(e)}"})
+
+    return StreamingResponse(event_stream(), media_type="text/event-stream")
+
+
+# ── Static file serving ──
 from fastapi.staticfiles import StaticFiles

-# Create music directory if it doesn't exist
+# Music directory
 _music_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara music")
 os.makedirs(_music_dir, exist_ok=True)
 app.mount("/Capybara music", StaticFiles(directory=_music_dir), name="music_files")

+# Audio directory (TTS generated)
+app.mount("/Capybara audio", StaticFiles(directory=_audio_dir), name="audio_files")
+
 if __name__ == "__main__":
     print("[Server] Music Server running on http://localhost:3000")
diff --git a/阶段总结/session_progress.md b/阶段总结/session_progress.md
index 852908d..ae4fc88 100644
--- a/阶段总结/session_progress.md
+++ b/阶段总结/session_progress.md
@@ -3,7 +3,7 @@
 > **用途**:每次对话结束前 / 做完一个阶段后更新此文件。
 > 新对话开始时,AI 先读此文件恢复上下文。
 >
-> **最后更新**:2026-02-09 (第八次对话)
+> **最后更新**:2026-02-10 (第九次对话)

 ---

@@ -155,9 +155,47 @@
 - **封面区分**:预设故事显示封面图,AI 生成的故事显示淡紫渐变"暂无封面"占位
 - **乱码过滤**:API 层自动跳过无中文标题的异常文件

-### 正在做的
-- TTS 语音合成待后续接入(用户去开通火山语音服务后再做)
+### 第九次对话完成的工作(2026-02-10)
+
+#### TTS 语音合成全链路接入(上次对话完成,此处补记)
+- **后端**:`server.py` 新增 `/api/tts_check` 与 `/api/create_tts` 接口,WebSocket 流式调用豆包 TTS V1 API
+- **音色**:可爱女生(`ICL_zh_female_keainvsheng_tob`)
+- **前端组件**:`PillProgressButton`(药丸形进度按钮)替代旧 RingProgressButton
+  - 7 种状态:idle / ready / generating / completed / playing / paused / error
+  - 进度环动画 + 音波动效 + 发光效果
+- **TTSService 单例**:后台持续运行,切页面不中断生成
+- **音频保存**:生成的 TTS 音频保存到 `Capybara audio/` 目录
+- **暂停/续播修复**:显式 seek 到暂停位置再 play,解决 Web 端从头播放的 bug
+- **按钮状态修复**:新增 `ready` 状态,未播放过的音频显示"播放"而非"继续"
+-
**自动播放控制**:仅在用户停留在故事页时自动播放,切出页面不自动播 + +#### 音乐总监 Prompt 优化 +- **歌名去重复**:移除固定示例("温泉咔咔乐"等),改为"根据场景自由发挥,不要套用固定模板" +- **效果**:AI 每次为相似场景生成不同歌名,唱片架不再出现一堆同名歌曲 + +#### 唱片架播放状态可视化 +- **卡片高亮**:当前播放的歌曲整张卡片变暖金色底 + 金色边框 + 阴影 +- **标题标识**:播放中的歌曲标题前加小喇叭图标 + 金色加粗文字 +- **音波动效**:播放中的唱片中心叠加跳动音波 CustomPaint 动画 + +#### 气泡持续显示当前歌名 +- 播放期间气泡始终显示"正在播放: xxx",不再 3 秒后消失 +- 直接点播放按钮(非从唱片架选歌)也会显示歌名 +- 暂停时气泡自动隐藏,切歌时自动更新 +- 使用 `_playStickyText` 机制,即使其他临时消息弹出后也会恢复播放信息 + +#### 调研 AI 音乐生成平台 +- 对比了 MiniMax Music 2.5(现用)、Mureka(昆仑万维)、天谱乐、ACE-Step +- 发现 Mureka 有中国站 API(platform.mureka.cn),质量评测超越 Suno V4 +- 用户的朋友用的 Muse AI App 底层就是 Mureka 模型 +- MiniMax 文本模型(abab6.5s-chat)价格偏高,可考虑切豆包 +- 歌词生成费用极低(每次约 0.005 元),主要成本在音乐生成(1 元/首) + +### 正在做的 / 待办 - 故事封面方案待定(付费生成 or 免费生成) +- 考虑将音乐生成从 MiniMax 切换到 Mureka(用户在评估中) +- 考虑将歌词生成的 LLM 从 MiniMax abab6.5s-chat 切到豆包(更便宜) +- 长歌名 fallback 问题:LLM 返回空 song_title 时用了用户输入原文当歌名,后续可优化 ---
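
---

The V1 binary framing that `server.py`'s `_build_tts_v1_request` docstring describes (4-byte header `0x11 0x10 0x10 0x00`, then a 4-byte big-endian payload length, then UTF-8 JSON) can be sanity-checked offline with a minimal round-trip sketch. `build_request` and `parse_request` below are illustrative helpers written for this note, not part of the server code:

```python
import json
import struct

# v1 protocol, 4-byte header, full-client-request, JSON payload, no compression
HEADER = bytes([0x11, 0x10, 0x10, 0x00])

def build_request(payload: dict) -> bytes:
    """Mirror of the server's frame builder: header + length prefix + JSON."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return HEADER + struct.pack(">I", len(body)) + body

def parse_request(frame: bytes) -> dict:
    """Inverse helper: validate the header fields, then decode the payload."""
    version = frame[0] >> 4            # protocol version (expect 1)
    header_words = frame[0] & 0x0F     # header size in 4-byte units (expect 1)
    msg_type = (frame[1] >> 4) & 0x0F  # 0x1 = full client request
    assert (version, header_words, msg_type) == (1, 1, 0x1), "unexpected header"
    offset = header_words * 4
    (length,) = struct.unpack(">I", frame[offset:offset + 4])
    return json.loads(frame[offset + 4:offset + 4 + length].decode("utf-8"))

if __name__ == "__main__":
    req = {"request": {"text": "你好", "operation": "submit"}}
    frame = build_request(req)
    assert frame[:4] == HEADER
    assert parse_request(frame) == req
```

A mismatch between `struct.pack(">I", ...)` on the client and the server's length field is the most common cause of an immediate error frame, so a round-trip check like this is a cheap first debugging step.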