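feat: TTS speech synthesis + record-shelf playback state + persistent bubble + music prompt tuning

- Integrate the Doubao TTS V1 WebSocket API to synthesize speech for story read-aloud
- Add the PillProgressButton component (pill-shaped progress button)
- Add the TTSService singleton; generation continues in the background without interruption
- Save audio to the Capybara audio/ directory
- Highlight the currently playing song on the record shelf (gold card + sound-wave animation + speaker icon)
- Keep the bubble showing the current song title during playback; hide it on pause
- Remove the fixed template from the music-director prompt so song titles no longer repeat
- Add API reference docs (Doubao speech synthesis)

Co-authored-by: Cursor <cursoragent@cursor.com>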
This commit is contained in:
parent
8f5fb32b37
commit
84243f2be4
1205
API相关/语音合成大模型-单向流式websocket-V3-支持复刻混音mix.md
Normal file
File diff suppressed because it is too large
0
API相关/语音合成大模型音色列表.md
Normal file
627
API相关/豆包大模型语音合成API.md
Normal file
@ -0,0 +1,627 @@
<span id="60c34d72"></span>
# Websocket
> Call the API with the appid and access_token obtained during account application.
> The text is sent in a single request, and the backend returns audio data as it synthesizes.

<span id="9e6b61a2"></span>
## 1. Interface description
> V1:
> **wss://openspeech.bytedance.com/api/v1/tts/ws_binary (V1 unidirectional streaming)**
> **https://openspeech.bytedance.com/api/v1/tts (V1 HTTP, non-streaming)**
> V3:
> **wss://openspeech.bytedance.com/api/v3/tts/unidirectional/stream (V3 WSS unidirectional streaming)**
> [V3 WebSocket unidirectional streaming documentation](https://www.volcengine.com/docs/6561/1719100)
> **wss://openspeech.bytedance.com/api/v3/tts/bidirection (V3 WSS bidirectional streaming)**
> [V3 WebSocket bidirectional streaming documentation](https://www.volcengine.com/docs/6561/1329505)
> **https://openspeech.bytedance.com/api/v3/tts/unidirectional (V3 HTTP unidirectional streaming)**
> [V3 HTTP unidirectional streaming documentation](https://www.volcengine.com/docs/6561/1598757)

:::warning
For large-model voices, the V3 interfaces are recommended because they give better latency.
:::
<span id="34dcdf3a"></span>
## 2. Authentication
Authentication uses a Bearer Token: add `"Authorization": "Bearer; {token}"` to the request header, and put the corresponding appid in the request JSON.
:::warning
Bearer and the token are separated by a semicolon `;`. Do not keep the `{}` when substituting the token.
:::
For AppID/Token/Cluster and related values, see [Console FAQ Q1](/docs/6561/196768#q1:哪里可以获取到以下参数appid,cluster,token,authorization-type,secret-key-?)
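
The semicolon separator is easy to get wrong, so here is a minimal Python sketch of building the header. The function name `build_auth_header` is hypothetical; the header format follows the string quoted in this section (the HTTP section of this document writes it without the space after the semicolon).

```python
def build_auth_header(token: str) -> dict[str, str]:
    """Return the Authorization header described above.

    Note the semicolon after "Bearer": this is not the usual
    "Bearer <token>" form. The braces from the template are dropped.
    """
    if not token or "{" in token or "}" in token:
        raise ValueError("token must be non-empty and must not contain braces")
    return {"Authorization": f"Bearer; {token}"}
```

The returned dict can be passed as extra headers to whichever websocket or HTTP client is in use.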
<span id="f1d92aff"></span>
## 3. Request format
<span id="14624bd9"></span>
### 3.1 Binary protocol
<span id="7574a509"></span>
#### Message format

All fields are stored in [big-endian byte order](https://zh.wikipedia.org/wiki/%E5%AD%97%E8%8A%82%E5%BA%8F#%E5%A4%A7%E7%AB%AF%E5%BA%8F).
**Field descriptions**

|Field (size in bits) |Description |Values |
|---|---|---|
|Protocol version (4) |Different protocol versions may be used in the future; this field keeps the client and server in agreement on the version. |`0b0001` - version 1 (currently the only version) |
|Header size (4) |The actual header size is `header size value x 4` bytes.<br>The special value `0b1111` means the header is 60 bytes or larger (15 x 4 bytes), i.e. a header extension field is present. |`0b0001` - header size = 4 (1 x 4)<br>`0b0010` - header size = 8 (2 x 4)<br>`0b1010` - header size = 40 (10 x 4)<br>`0b1110` - header size = 56 (14 x 4)<br>`0b1111` - header size is 60 or larger; the actual size is defined in the header extension |
|Message type (4) |Defines the message type. |`0b0001` - full client request.<br>~~`0b1001` - full server response (deprecated).~~<br>`0b1011` - Audio-only server response (ACK).<br>`0b1111` - Error message from server (for example, a wrong message type or an unsupported serialization method) |
|Message type specific flags (4) |The meaning of the flags depends on the message type.<br>See the message type sections for details. | |
|Message serialization method (4) |Defines how the payload is serialized.<br>Note: it is only meaningful for certain message types (for example, the Audio-only server response `0b1011` needs no serialization). |`0b0000` - no serialization (raw bytes)<br>`0b0001` - JSON<br>`0b1111` - custom type, defined in the header extension |
|Message compression (4) |Defines the payload compression method.<br>The payload size field (if present, depending on the message type) is not compressed, and payload size refers to the size of the payload after compression.<br>The header is not compressed. |`0b0000` - no compression<br>`0b0001` - gzip<br>`0b1111` - custom compression method, defined in the header extension |
|Reserved (8) |Reserved field; it also serves as padding (so that the whole header is 4 bytes). |`0x00` - currently always 0 |

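
The field table above can be sketched as a pair of pack/unpack helpers in Python. This is only an illustration: the function names are hypothetical, and the layout assumes the fields appear in table order with the first field of each pair in the high nibble of its byte (the layout figure itself is not reproduced in the text).

```python
def pack_header(message_type: int, flags: int = 0b0000,
                serialization: int = 0b0001, compression: int = 0b0000) -> bytes:
    """Build the 4-byte header for a message without a header extension."""
    version = 0b0001      # protocol version 1
    header_size = 0b0001  # 1 x 4 bytes
    return bytes([
        (version << 4) | header_size,
        (message_type << 4) | flags,
        (serialization << 4) | compression,
        0x00,             # reserved
    ])


def unpack_header(data: bytes) -> dict:
    """Split the first 4 bytes of a message into the fields from the table."""
    if len(data) < 4:
        raise ValueError("a header needs at least 4 bytes")
    b0, b1, b2, b3 = data[:4]
    return {
        "version": b0 >> 4,
        "header_bytes": (b0 & 0x0F) * 4,  # header size value x 4
        "message_type": b1 >> 4,
        "flags": b1 & 0x0F,
        "serialization": b2 >> 4,
        "compression": b2 & 0x0F,
        "reserved": b3,
    }
```

Under these assumptions, `pack_header(0b0001)` (a full client request with a JSON payload and no compression) yields the bytes `0x11 0x10 0x10 0x00`.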
<span id="95a31a2c"></span>
#### Message type details
All TTS websocket requests currently use the full client request format, whether the operation is "query" or "submit".
<span id="d05f01f6"></span>
#### Full client request

* Header size is `0b0001` (4 bytes, no header extension).
* Message type is `0b0001`.
* Message type specific flags is fixed at `0b0000`.
* Message serialization method is `0b0001` (JSON). See the table above for the fields.
* If the payload is gzip-compressed, payload size is the size after compression.

<span id="6e82d7df"></span>
#### Audio-only server response

* Header size should be `0b0001`.
* Message type is `0b1011`.
* Message type specific flags can take these values:
    * `0b0000` - no sequence number.
    * `0b0001` - sequence number > 0.
    * `0b0010` or `0b0011` - sequence number < 0, which marks the last message from the server; the client should then merge all audio fragments (if there are several).
* Message serialization method is `0b0000` (raw bytes).

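
The flag values listed for the Audio-only server response can be turned into a small decision helper. The Python sketch below is illustrative only: `describe_audio_flags` and `merge_audio` are hypothetical names, and the input is assumed to be already split into `(flags, audio_bytes)` pairs, since the payload framing is not described in this section.

```python
def describe_audio_flags(flags: int) -> dict:
    """Interpret the flags nibble of an Audio-only server response."""
    if flags == 0b0000:
        return {"has_sequence": False, "is_last": False}
    if flags == 0b0001:
        return {"has_sequence": True, "is_last": False}   # sequence number > 0
    if flags in (0b0010, 0b0011):
        return {"has_sequence": True, "is_last": True}    # sequence number < 0
    raise ValueError(f"unexpected flags value: {flags:#06b}")


def merge_audio(messages) -> bytes:
    """Concatenate audio fragments up to and including the last message."""
    audio = bytearray()
    for flags, chunk in messages:
        audio += chunk
        if describe_audio_flags(flags)["is_last"]:
            break
    return bytes(audio)
```

Stopping at the first "last" message mirrors the rule that a negative sequence number ends the stream.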
<span id="4f9397bc"></span>
## 4. Notes

* The reqid parameter must be set anew for every synthesis and must be unique (generating it with uuid.V4 is recommended).
* In the websocket demo, a single connection supports only one synthesis; if you need several, you have to implement that yourself. After each websocket connection is created, send the packets serially, in order. Once one synthesis has finished, a new synthesis request can be sent.
* operation must be set to submit to get a streaming response.
* Voices of ["Doubao speech synthesis model 2.0"](https://www.volcengine.com/docs/6561/1257544) are not supported, for example "zh_female_vv_uranus_bigtts"; the V3 interface is recommended for those.
* After the websocket handshake succeeds, the following response headers are returned:

|Key |Description |Example value |
|---|---|---|
|X-Tt-Logid |The logid returned by the server; users are advised to capture and log it to make troubleshooting easier |202407261553070FACFE6D19421815D605 |

<span id="fe504ac4"></span>
## 5. Usage examples

```mixin-react
return (<Tabs>
<Tabs.TabPane title="Python调用示例" key="buVUUlzaRC"><RenderMd content={`<span id="fccb89b1"></span>
### Prerequisites

* Before making a call, you need the following information:
    * \`<appid>\`: the APP ID obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<access_token>\`: the Access Token obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<voice_type>\`: the ID of the voice you want to use; see the [large-model voice list](https://www.volcengine.com/docs/6561/1257544).

<span id="824abc9d"></span>
### Python environment

* Python: version 3.9 or later.
* Pip: version 25.1.1 or later. You can install or upgrade it with the command below.

\`\`\`Bash
python3 -m pip install --upgrade pip
\`\`\`

<span id="5cbec8af"></span>
### Download the sample code
<Attachment link="https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/90fc1f44eaac49f0b4e2cbabdaee8010~tplv-goo7wpa0wc-image.image" name="volcengine_binary_demo.tar.gz" ></Attachment>
<span id="44d95afb"></span>
### Unpack the archive and install dependencies
\`\`\`Bash
mkdir -p volcengine_binary_demo
tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo
cd volcengine_binary_demo
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
pip3 install -e .
\`\`\`

<span id="fdf69422"></span>
### Make the call
> Replace \`<appid>\` with your APP ID.
> Replace \`<access_token>\` with your Access Token.
> Replace \`<voice_type>\` with the ID of the voice you want to use, for example \`zh_female_cancan_mars_bigtts\`.

\`\`\`Bash
python3 examples/volcengine/binary.py --appid <appid> --access_token <access_token> --voice_type <voice_type> --text "你好,我是火山引擎的语音合成服务。这是一个美好的旅程。"
\`\`\`

`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Java调用示例" key="bfjarx0zlZ"><RenderMd content={`<span id="e0bca07e"></span>
### Prerequisites

* Before making a call, you need the following information:
    * \`<appid>\`: the APP ID obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<access_token>\`: the Access Token obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<voice_type>\`: the ID of the voice you want to use; see the [large-model voice list](https://www.volcengine.com/docs/6561/1257544).

<span id="5f338843"></span>
### Java environment

* Java: version 21 or later.
* Maven: version 3.9.10 or later.

<span id="96af51fa"></span>
### Download the sample code
<Attachment link="https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/ba78519b2dc0459fb7a6935b63775c66~tplv-goo7wpa0wc-image.image" name="volcengine_binary_demo.tar.gz" ></Attachment>
<span id="8e0ecd00"></span>
### Unpack the archive and install dependencies
\`\`\`Bash
mkdir -p volcengine_binary_demo
tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo
cd volcengine_binary_demo
\`\`\`

<span id="fa0a6230"></span>
### Make the call
> Replace \`<appid>\` with your APP ID.
> Replace \`<access_token>\` with your Access Token.
> Replace \`<voice_type>\` with the ID of the voice you want to use, for example \`zh_female_cancan_mars_bigtts\`.

\`\`\`Bash
mvn compile exec:java -Dexec.mainClass=com.speech.volcengine.Binary -DappId=<appid> -DaccessToken=<access_token> -Dvoice=<voice_type> -Dtext="**你好**,我是豆包语音助手,很高兴认识你。这是一个愉快的旅程。"
\`\`\`

`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Go调用示例" key="s8zQJ7cCr3"><RenderMd content={`<span id="2733f4d4"></span>
### Prerequisites

* Before making a call, you need the following information:
    * \`<appid>\`: the APP ID obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<access_token>\`: the Access Token obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<voice_type>\`: the ID of the voice you want to use; see the [large-model voice list](https://www.volcengine.com/docs/6561/1257544).

<span id="ee9617a6"></span>
### Go environment

* Go: version 1.21.0 or later.

<span id="cf9bb2bf"></span>
### Download the sample code
<Attachment link="https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/c553a4a4373840d4a4870a1ef2a4e494~tplv-goo7wpa0wc-image.image" name="volcengine_binary_demo.tar.gz" ></Attachment>
<span id="363963c4"></span>
### Unpack the archive and install dependencies
\`\`\`Bash
mkdir -p volcengine_binary_demo
tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo
cd volcengine_binary_demo
\`\`\`

<span id="f0acb02c"></span>
### Make the call
> Replace \`<appid>\` with your APP ID.
> Replace \`<access_token>\` with your Access Token.
> Replace \`<voice_type>\` with the ID of the voice you want to use, for example \`zh_female_cancan_mars_bigtts\`.

\`\`\`Bash
go run volcengine/binary/main.go --appid <appid> --access_token <access_token> --voice_type <voice_type> --text "**你好**,我是火山引擎的语音合成服务。"
\`\`\`

`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="C#调用示例" key="Thg5rLaSjq"><RenderMd content={`<span id="c60c1d5f"></span>
### Prerequisites

* Before making a call, you need the following information:
    * \`<appid>\`: the APP ID obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<access_token>\`: the Access Token obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<voice_type>\`: the ID of the voice you want to use; see the [large-model voice list](https://www.volcengine.com/docs/6561/1257544).

<span id="cf2199fe"></span>
### C# environment

* .NET 9.0.

<span id="f7e91692"></span>
### Download the sample code
<Attachment link="https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/a95ff3e7604d4bb4ade8fb49e110fef5~tplv-goo7wpa0wc-image.image" name="volcengine_binary_demo.tar.gz" ></Attachment>
<span id="f9131897"></span>
### Unpack the archive and install dependencies
\`\`\`Bash
mkdir -p volcengine_binary_demo
tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo
cd volcengine_binary_demo
\`\`\`

<span id="5834585b"></span>
### Make the call
> Replace \`<appid>\` with your APP ID.
> Replace \`<access_token>\` with your Access Token.
> Replace \`<voice_type>\` with the ID of the voice you want to use, for example \`zh_female_cancan_mars_bigtts\`.

\`\`\`Bash
dotnet run --project Volcengine/Binary/Volcengine.Speech.Binary.csproj -- --appid <appid> --access_token <access_token> --voice_type <voice_type> --text "**你好**,这是一个测试文本。我们正在测试文本转语音功能。"
\`\`\`

`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="TypeScript调用示例" key="p1GEs3rWU7"><RenderMd content={`<span id="8b865031"></span>
### Prerequisites

* Before making a call, you need the following information:
    * \`<appid>\`: the APP ID obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<access_token>\`: the Access Token obtained from the console; see [Console FAQ Q1](https://www.volcengine.com/docs/6561/196768#q1%EF%BC%9A%E5%93%AA%E9%87%8C%E5%8F%AF%E4%BB%A5%E8%8E%B7%E5%8F%96%E5%88%B0%E4%BB%A5%E4%B8%8B%E5%8F%82%E6%95%B0appid%EF%BC%8Ccluster%EF%BC%8Ctoken%EF%BC%8Cauthorization-type%EF%BC%8Csecret-key-%EF%BC%9F).
    * \`<voice_type>\`: the ID of the voice you want to use; see the [large-model voice list](https://www.volcengine.com/docs/6561/1257544).

<span id="e7697c4e"></span>
### node environment

* node: v24.0 or later.

<span id="03fe45f1"></span>
### Download the sample code
<Attachment link="https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/12ef1b1188a84f0c8883a0114da741ad~tplv-goo7wpa0wc-image.image" name="volcengine_binary_demo.tar.gz" ></Attachment>
<span id="13e8a71a"></span>
### Unpack the archive and install dependencies
\`\`\`Bash
mkdir -p volcengine_binary_demo
tar xvzf volcengine_binary_demo.tar.gz -C ./volcengine_binary_demo
cd volcengine_binary_demo
npm install
npm install -g typescript
npm install -g ts-node
\`\`\`

<span id="0c57973f"></span>
### Make the call
> Replace \`<appid>\` with your APP ID.
> Replace \`<access_token>\` with your Access Token.
> Replace \`<voice_type>\` with the ID of the voice you want to use, for example \`zh_female_cancan_mars_bigtts\`.

\`\`\`Bash
npx ts-node src/volcengine/binary.ts --appid <appid> --access_token <access_token> --voice_type <voice_type> --text "**你好**,我是火山引擎的语音合成服务。"
\`\`\`

`}></RenderMd></Tabs.TabPane></Tabs>);
```

<span id="9ea45813"></span>
# HTTP
> Call the API with the appid and access_token obtained during account application.
> All audio data is returned in one response after the whole text has been synthesized.

<span id="4d23f0f6"></span>
## 1. Interface description
The endpoint is **https://openspeech.bytedance.com/api/v1/tts**
<span id="6f96a6fa"></span>
## 2. Authentication
Authentication uses a Bearer Token.
1) Add "Authorization":"Bearer;${token}" to the request header.
:::warning
Bearer and the token are separated by a semicolon `;`. Do not keep the `${}` when substituting the token.
:::
For AppID/Token/Cluster and related values, see [Console FAQ Q1](/docs/6561/196768#q1:哪里可以获取到以下参数appid,cluster,token,authorization-type,secret-key-?)
<span id="a8c19c9a"></span>
## 3. Notes

* Send the request with HTTP POST; the result is returned as JSON and needs to be parsed.
* Because JSON cannot carry binary audio directly, the audio is base64-encoded. Decoding the base64 yields the binary audio.
* The reqid parameter must be set anew for every synthesis and must be unique (generating it with a UUID/GUID is recommended).
* Voices of ["Doubao speech synthesis model 2.0"](https://www.volcengine.com/docs/6561/1257544) are not supported, for example "zh_female_vv_uranus_bigtts"; the V3 interface is recommended for those.

<span id="参数列表"></span>
# Parameter list
> The Websocket and HTTP calls take the same parameters.

<span id="931a7b76"></span>
## Request parameters

|Field |Meaning |Level |Type |Required |Notes |
|---|---|---|---|---|---|
|app |Application settings |1 |dict |✓ | |
|appid |Application ID |2 |string |✓ |Must be applied for |
|token |Application token |2 |string |✓ |A fake token with no actual authentication role; any non-empty string works |
|cluster |Business cluster |2 |string |✓ |volcano_tts |
|user |User settings |1 |dict |✓ | |
|uid |User ID |2 |string |✓ |Any non-empty string; the value can be traced in server-side logs |
|audio |Audio settings |1 |dict |✓ | |
|voice_type |Voice type |2 |string |✓ | |
|emotion |Voice emotion |2 |string | |Sets the emotion of the voice. Example: "emotion": "angry"<br>Note: only some voices currently support emotions, and the supported emotions differ between voices.<br>See: [Large-model speech synthesis API, voice list, multi-emotion voices](https://www.volcengine.com/docs/6561/1257544) |
|enable_emotion |Enable voice emotion |2 |bool | |Whether the voice emotion can be set; enable_emotion must be true for emotion to take effect<br>Example: "enable_emotion": True |
|emotion_scale |Emotion intensity |2 |float | |After setting emotion, emotion_scale further adjusts the intensity; range 1~5, default 4.<br>Note: in theory, a larger value gives a more pronounced emotion. However, growth across 1~5 is non-linear, so beyond some value the increase may be barely noticeable; for example, 3 and 5 may sound similar. |
|encoding |Audio encoding |2 |string | |wav / pcm / ogg_opus / mp3; default pcm<br><span style="background-color: rgba(255,246,122, 0.8)">Note: wav does not support streaming</span> |
|speed_ratio |Speech rate |2 |float | |[0.1,2], default 1; one decimal place is usually enough |
|rate |Audio sample rate |2 |int | |Default 24000; 8000 and 16000 are also available |
|bitrate |Bitrate |2 |int | |Unit kb/s, default 160 kb/s<br>**Note:**<br>bitrate applies only to MP3. For wav, as for pcm, bitrate (bps) = sample rate × bit depth × number of channels<br>Large-model TTS currently only lets you change the sample rate, so for wav the only way to change the audio bitrate is to change the sample rate |
|explicit_language |Explicit language |2 |string | |Read only text in the specified language<br>Premium voices and ICL voice-cloning scenarios:<br>* parameter omitted: normal mixed Chinese/English<br>* `crosslingual`: enable the multilingual front end (covers `zh/en/ja/es-mx/id/pt-br`)<br>* `zh-cn`: mainly Chinese, mixed Chinese/English supported<br>* `en`: English only<br>* `ja`: Japanese only<br>* `es-mx`: Mexican Spanish only<br>* `id`: Indonesian only<br>* `pt-br`: Brazilian Portuguese only<br><br>DiT voice-cloning scenarios:<br>When the voice was trained with model_type=2 (the DiT standard effect), specifying the language explicitly is recommended. Currently supported:<br>* parameter omitted: enable the multilingual front end `zh,en,ja,es-mx,id,pt-br,de,fr`<br>* `zh,en,ja,es-mx,id,pt-br,de,fr`: enable the multilingual front end<br>* `zh-cn`: mainly Chinese, mixed Chinese/English supported<br>* `en`: English only<br>* `ja`: Japanese only<br>* `es-mx`: Mexican Spanish only<br>* `id`: Indonesian only<br>* `pt-br`: Brazilian Portuguese only<br>* `de`: German only<br>* `fr`: French only<br><br>When the voice was trained with model_type=3 (the DiT restoration effect), the language must be specified explicitly. Currently supported:<br>* parameter omitted: normal mixed Chinese/English<br>* `zh-cn`: mainly Chinese, mixed Chinese/English supported<br>* `en`: English only |
|context_language |Reference language |2 |string | |The language given to the model as a reference<br>* omitted: Western European languages are treated as English<br>* id: Western European languages are treated as Indonesian<br>* es: Western European languages are treated as Mexican Spanish<br>* pt: Western European languages are treated as Brazilian Portuguese |
|loudness_ratio |Volume adjustment |2 |float | |[0.5,2], default 1; one decimal place is usually enough. 0.5 means 0.5x the original volume, 2 means 2x the original volume |
|request |Request settings |1 |dict |✓ | |
|reqid |Request ID |2 |string |✓ |Must be unique for every call; a UUID is recommended |
|text |Text |2 |string |✓ |The text to synthesize; limited to 1024 bytes (UTF-8 encoded). Fewer than 300 characters is recommended; longer input raises the likelihood of bad cases or errors |
|model |Model version |2 |string |No |Model version. Passing `seed-tts-1.1` improves audio quality over the default version and gives better latency; omit it for the default behavior.<br>Note: with the 1.1 model, voice-cloning scenarios amplify the characteristics of the training audio prompt, so the requirements on the prompt are higher; high-quality training audio gives better results. |
|text_type |Text type |2 |string | |Required when using SSML; the value is "ssml" |
|silence_duration |Trailing silence |2 |float | |Adds silence at the end of the sentence, range 0~30000 ms. (Note: the added silence applies to the end of the whole input text, not to the end of each sentence.) To use this parameter, enable_trailing_silence_audio = true must first be set under request |
|with_timestamp |Timestamps |2 |int / string | |Passing 1 enables it; timestamps for the text after TN (text normalization) are returned. For example, for 2025 the text after TN is "两千零二十五" or "二零二五", depending on the semantics.<br>Note: runs of punctuation or spaces in the original text are still processed, but this does not affect the continuity of the timestamps (large-model scenarios only).<br>Additional note (how small-model and large-model timestamps differ):<br>1. Small models generate timestamps from the front-end model and then synthesize the audio. The text before and after TN is mapped during timestamp processing, so small models can return timestamps for the original text before TN, keeping Arabic numerals, special symbols and so on.<br>2. Large models synthesize the audio after understanding the semantics of the input text, then align the synthesized audio against the text after TN to produce timestamps. Without the post-TN text the timestamps could not be aligned with the synthesized audio, so the timestamps returned by large models correspond to the text after TN. |
|operation |Operation |2 |string |✓ |query (non-streaming; HTTP supports only query) / submit (streaming) |
|extra_param |Extra parameters |2 |jsonstring | | |
|disable_markdown_filter | |3 |bool | |Whether to enable markdown parsing and filtering.<br>When true, markdown syntax is parsed and filtered; for example, `**你好**` is read as "你好".<br>When false, it is neither parsed nor filtered; for example, `**你好**` is read as "星星'你好'星星" (the asterisks are read aloud).<br>Example: "disable_markdown_filter": True |
|enable_latex_tn | |3 |bool | |Whether LaTeX formulas can be read aloud; requires disable_markdown_filter to be true<br>Example: "enable_latex_tn": True |
|mute_cut_remain_ms |Leading-silence parameter |3 |string | |Must be used together with mute_cut_threshold, where:<br>`"mute_cut_threshold": "400"` is the silence threshold (volume below this value counts as silence)<br>`"mute_cut_remain_ms": "50"` is the length of silence to keep<br>Note: both the parameter names and the values are strings<br>Python example: `"extra_param":("{\"mute_cut_threshold\":\"400\", \"mute_cut_remain_ms\": \"0\"}")`<br><br>Special reminder:<br>* Because of the nature of the MP3 format, up to 100 ms of leading silence always remains and cannot be removed; with WAV the leading silence can be removed entirely. Choose according to your own business needs |
|disable_emoji_filter |Do not filter emoji |3 |bool | |Keeps emoji in the text instead of filtering them out; default False. Using it together with the timestamp parameter is recommended.<br>Python example: `"extra_param": json.dumps({"disable_emoji_filter": True})` |
|unsupported_char_ratio_thresh |Threshold for the share of unsupported-language text |3 |float | |Default: 0.3, maximum: 1.0<br>If the share of text detected as unsupported for synthesis exceeds this ratio, an error is returned.<br>Python example: `"extra_param": json.dumps({"unsupported_char_ratio_thresh": 0.3})` |
|aigc_watermark |Whether to add an audio rhythm marker at the end of the synthesis |3 |bool | |Default: false<br>Python example: `"extra_param": json.dumps({"aigc_watermark": True})` |
|cache_config |Cache settings |3 |dict | |Enables caching. Once enabled, synthesizing the same text makes the service read the cache and return the audio from the previous synthesis of that text, which noticeably speeds up repeated text. Cached data is kept for 1 hour.<br>(Data returned from the cache does not include timestamps)<br>Python example: `"extra_param": json.dumps({"cache_config": {"text_type": 1,"use_cache": True}})` |
|text_type |Cache settings |4 |int | |Used together with use_cache; pass 1 to enable caching |
|use_cache |Cache settings |4 |bool | |Used together with text_type; pass true to enable caching |

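
Since `extra_param` is typed as a JSON string rather than a nested object, a short Python sketch may help show the serialization step. The helper name `build_extra_param` and its defaults are hypothetical; the keys and the "values are strings" rule for the mute-cut pair come from the table above.

```python
import json


def build_extra_param(mute_cut_threshold: int = 400,
                      mute_cut_remain_ms: int = 50,
                      use_cache: bool = False) -> str:
    """Serialize level-3 options into the JSON string expected by extra_param."""
    params = {
        # The table states that both mute-cut values are passed as strings.
        "mute_cut_threshold": str(mute_cut_threshold),
        "mute_cut_remain_ms": str(mute_cut_remain_ms),
    }
    if use_cache:
        # text_type=1 and use_cache=True are set together to enable caching.
        params["cache_config"] = {"text_type": 1, "use_cache": True}
    return json.dumps(params)
```

The result is placed under `request.extra_param` as a plain string, so the request body ends up containing JSON nested inside JSON.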
Notes:

1. Character-level timestamps are supported (not for the ssml text type).
2. SSML is supported; see [SSML markup language - Doubao Speech - Volcengine (volcengine.com)](https://www.volcengine.com/docs/6561/1330194).
3. Pitch adjustment is not supported for now.
4. Large-model voices support mixed Chinese and English.
5. Large-model non-bidirectional streaming supports LaTeX formulas.
6. After the websocket/http handshake succeeds, the following response headers are returned:

|Key |Description |Example value |
|---|---|---|
|X-Tt-Logid |The logid returned by the server; users are advised to capture and log it to make troubleshooting easier. Use the default format and do not customize it |202407261553070FACFE6D19421815D605 |

Request example:
```json
{
    "app": {
        "appid": "appid123",
        "token": "access_token",
        "cluster": "volcano_tts"
    },
    "user": {
        "uid": "uid123"
    },
    "audio": {
        "voice_type": "zh_male_M392_conversation_wvae_bigtts",
        "encoding": "mp3",
        "speed_ratio": 1.0
    },
    "request": {
        "reqid": "uuid",
        "text": "字节跳动语音合成",
        "operation": "query"
    }
}
```

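
The request example can be assembled programmatically. The Python sketch below is one possible shape, not the official client: `build_request` is a hypothetical helper, the placeholder values (`"access_token"`, `"uid123"`) are taken from the example, and the length check applies the 1024-byte UTF-8 limit from the parameter table.

```python
import uuid

MAX_TEXT_BYTES = 1024  # limit from the parameter table, measured in UTF-8 bytes


def build_request(appid: str, voice_type: str, text: str,
                  operation: str = "query", encoding: str = "mp3") -> dict:
    """Build a request body with a fresh reqid for every call."""
    if operation not in ("query", "submit"):
        raise ValueError("operation must be 'query' or 'submit'")
    if len(text.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("text exceeds 1024 bytes when UTF-8 encoded")
    return {
        # token is described as a placeholder; any non-empty string works
        "app": {"appid": appid, "token": "access_token", "cluster": "volcano_tts"},
        "user": {"uid": "uid123"},
        "audio": {"voice_type": voice_type, "encoding": encoding, "speed_ratio": 1.0},
        "request": {"reqid": str(uuid.uuid4()), "text": text, "operation": operation},
    }
```

Generating `reqid` inside the builder makes it hard to reuse an ID by accident, which the return-code table lists as a cause of error 3006.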
<span id="返回参数"></span>
## Response parameters

|Field |Meaning |Level |Type |Notes |
|---|---|---|---|---|
|reqid |Request ID |1 |string |Request ID; matches the reqid passed in the request |
|code |Status code |1 |int |Error code; see the explanation below |
|message |Status message |1 |string |Error message |
|sequence |Audio segment sequence number |1 |int |A negative value means synthesis is complete |
|data |Synthesized audio |1 |string |The returned audio data, base64-encoded |
|addition |Additional information |1 |string |Parent node for additional information |
|duration |Audio duration |2 |string |Length of the returned audio, in ms |

Response example:
```json
{
    "reqid": "reqid",
    "code": 3000,
    "operation": "query",
    "message": "Success",
    "sequence": -1,
    "data": "base64 encoded binary data",
    "addition": {
        "duration": "1960"
    }
}
```

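
As a rough illustration of consuming a response shaped like the example above, the following Python sketch checks the status code and decodes the audio. `extract_audio` is a hypothetical helper, and it assumes the response has already been parsed from JSON into a dict.

```python
import base64


def extract_audio(response: dict) -> tuple[bytes, int]:
    """Return (audio_bytes, duration_ms) from a parsed response dict."""
    if response.get("code") != 3000:  # 3000 is the success code
        raise RuntimeError(
            f"TTS failed: {response.get('code')} {response.get('message')}"
        )
    audio = base64.b64decode(response["data"])
    # duration arrives as a string in the example, so convert it explicitly
    duration_ms = int(response.get("addition", {}).get("duration", "0"))
    return audio, duration_ms
```

The decoded bytes can be written straight to a file whose extension matches the requested `encoding`.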
<span id="ca57b94d"></span>
## Notes

* A single websocket connection supports only one synthesis; to synthesize several times, establish a new connection each time.
* The reqid parameter must be set anew for every synthesis and must be unique (generating it with uuid.V4 is recommended).
* operation must be set to submit.

<span id="返回码说明"></span>
# Return codes

|Code |Description |Example |Suggested action |
|---|---|---|---|
|3000 |Request succeeded |Normal synthesis |Handle normally |
|3001 |Invalid request |Some parameter value is illegal, for example operation is misconfigured |Check the parameters |
|3003 |Concurrency limit exceeded |Exceeds the concurrency threshold configured online |Retry; when using the SDK, switch to offline |
|3005 |Backend service busy |High load on the backend servers |Retry; when using the SDK, switch to offline |
|3006 |Service interrupted |The same reqid is sent again after the request has completed or failed |Check the parameters |
|3010 |Text length exceeded |A single request exceeds the configured text length threshold |Check the parameters |
|3011 |Invalid text |Wrong parameters, empty text, text that does not match the language, or text containing only punctuation |Check the parameters |
|3030 |Processing timeout |A single request exceeds the service's maximum time limit |Retry or check the text |
|3031 |Processing error |Backend exception |Retry; when using the SDK, switch to offline |
|3032 |Timed out waiting for audio |Backend network error |Retry; when using the SDK, switch to offline |
|3040 |Backend link connection error |Backend network error |Retry |
|3050 |Voice does not exist |Check the voice_type code in use |Check the parameters |

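
The "suggested action" column lends itself to a small lookup. This Python sketch groups the codes from the table into retryable and parameter errors; the function name and the returned labels are hypothetical, and 3030 is treated as retryable even though the table also suggests checking the text.

```python
RETRYABLE = {3003, 3005, 3030, 3031, 3032, 3040}
CHECK_PARAMETERS = {3001, 3006, 3010, 3011, 3050}


def suggested_action(code: int) -> str:
    """Map a return code to the action suggested in the table above."""
    if code == 3000:
        return "ok"
    if code in RETRYABLE:
        return "retry"
    if code in CHECK_PARAMETERS:
        return "check_parameters"
    return "unknown"
```

A caller would typically retry with a fresh reqid and a backoff for "retry", and surface the server's `message` for "check_parameters".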
<span id="常见错误返回说明"></span>
# Common error responses

1. Error returned:
"message": "quota exceeded for types: xxxxxxxxx_lifetime"
**Cause: the trial quota is used up; the paid version must be activated to continue.**
2. Error returned:
"message": "quota exceeded for types: concurrency"
**Cause: concurrency exceeded the limit; reduce concurrent calls or purchase additional concurrency.**
3. Error returned:
"message": "Fail to feed text, reason Init Engine Instance failed"
**Cause: voice_type / cluster was passed incorrectly.**
4. Error returned:
"message": "illegal input text!"
**Cause: the text passed in is invalid and contains nothing that can be synthesized, for example it consists entirely of punctuation or emoji, or Japanese is passed to a Chinese voice, and so on. Multilingual voices also need language to specify the matching language.**
5. Error returned:
"message": "authenticate request: load grant: requested grant not found"
**Cause: authentication failed; check that the appid and token values are set correctly. The correct authentication format is**
**headers["Authorization"] = "Bearer;${token}"**
6. Error returned:
"message": "extract request resource id: get resource id: access denied"
**Cause: speech synthesis has been upgraded to the paid version but the current voice is not licensed; the voice must be purchased in the console before it can be called. Voices marked as free, other than BV001_streaming and BV002_streaming, still require placing an order in the console (paying 0 yuan).**

BIN
Capybara audio/勇敢的小裁缝_1770727373.mp3
Normal file
Binary file not shown.
BIN
Capybara audio/卡皮巴拉的奇幻漂流_1770727390.mp3
Normal file
Binary file not shown.
BIN
Capybara audio/小红帽与大灰狼_1770723087.mp3
Normal file
Binary file not shown.
BIN
Capybara audio/杰克与魔豆_1770727355.mp3
Normal file
Binary file not shown.
BIN
Capybara audio/海盗找朋友_1770718270.mp3
Normal file
Binary file not shown.
BIN
Capybara audio/糖果屋历险记_1770721395.mp3
Normal file
Binary file not shown.
17
Capybara music/lyrics/书房咔咔茶_1770634690.txt
Normal file
@ -0,0 +1,17 @@
|
|||||||
|
在书房角落 沏上一杯茶
|
||||||
|
窗外微风轻拂 摇曳着树梢
|
||||||
|
咔咔坐在椅上 沉浸在思考
|
||||||
|
书页轻轻翻动 世界变得渺小
|
||||||
|
咔咔咔咔 书房里的我
|
||||||
|
静享时光 悠然自得
|
||||||
|
茶香飘散 心灵得到慰藉
|
||||||
|
咔咔咔咔 享受这刻
|
||||||
|
阳光透过窗帘 柔和又温暖
|
||||||
|
每个字每个句 都是心灵的食粮
|
||||||
|
咔咔轻轻点头 感受着文字的力量
|
||||||
|
在这安静的角落 找到了自我方向
|
||||||
|
咔咔咔咔 书房里的我
|
||||||
|
静享时光 悠然自得
|
||||||
|
茶香飘散 心灵得到慰藉
|
||||||
|
咔咔咔咔 享受这刻
|
||||||
|
(茶杯轻放的声音...)
|
||||||
17
Capybara music/lyrics/书房咔咔茶_1770637242.txt
Normal file
@ -0,0 +1,17 @@
|
|||||||
|
在书房角落里,我找到了安静
|
||||||
|
一杯茶香飘来,思绪开始飞腾
|
||||||
|
书页轻轻翻动,知识在心间
|
||||||
|
咔咔我在这里,享受这宁静
|
||||||
|
咔咔咔咔,独自享受
|
||||||
|
书中的世界,如此美妙
|
||||||
|
咔咔咔咔,心无旁骛
|
||||||
|
沉浸在知识的海洋,自在飞翔
|
||||||
|
窗外微风轻拂,阳光洒满书桌
|
||||||
|
咔咔我在这里,与文字共舞
|
||||||
|
每个字每个句,都像是音符
|
||||||
|
奏出心灵的乐章,如此动听
|
||||||
|
咔咔咔咔,独自享受
|
||||||
|
书中的世界,如此美妙
|
||||||
|
咔咔咔咔,心无旁骛
|
||||||
|
沉浸在知识的海洋,自在飞翔
|
||||||
|
(翻书声...风铃声...咔咔的呼吸声...)
|
||||||
8
Capybara music/lyrics/夜深了窗外下着小雨盖着被子准备入睡_1770627405.txt
Normal file
@ -0,0 +1,8 @@
|
|||||||
|
[verse]
|
||||||
|
窗外细雨轻敲窗,
|
||||||
|
被窝里温暖如常。
|
||||||
|
[chorus]
|
||||||
|
咔咔咔咔,梦乡近了,
|
||||||
|
小雨伴我入眠床。
|
||||||
|
[outro]
|
||||||
|
(雨声和咔咔的呼吸声...)
|
||||||
@ -0,0 +1 @@
|
|||||||
|
[Inst]
|
||||||
20
Capybara music/lyrics/洗脑咔咔舞_1770631313.txt
Normal file
@ -0,0 +1,20 @@
|
|||||||
|
咔咔咔咔来跳舞,魔性旋律不停步
|
||||||
|
跟着节奏摇摆身,洗脑神曲不放手
|
||||||
|
重复的旋律像魔法,让人听了就上瘾
|
||||||
|
咔咔咔咔的魔力,谁也挡不住
|
||||||
|
洗脑咔咔舞,洗脑咔咔舞
|
||||||
|
魔性的旋律,让人停不下来
|
||||||
|
洗脑咔咔舞,洗脑咔咔舞
|
||||||
|
跟着咔咔一起跳,快乐无边
|
||||||
|
每个节拍都精准,咔咔的舞步最迷人
|
||||||
|
不管走到哪里去,都能听到这魔音
|
||||||
|
咔咔的舞蹈最独特,让人看了就想学
|
||||||
|
洗脑神曲的魅力,就是让人忘不掉
|
||||||
|
洗脑咔咔舞,洗脑咔咔舞
|
||||||
|
魔性的旋律,让人停不下来
|
||||||
|
洗脑咔咔舞,洗脑咔咔舞
|
||||||
|
跟着咔咔一起跳,快乐无边
|
||||||
|
咔咔咔咔,魔性洗脑舞
|
||||||
|
重复的节奏,快乐的旋律
|
||||||
|
洗脑咔咔舞,洗脑咔咔舞
|
||||||
|
让快乐无限循环,直到永远
|
||||||
26
Capybara music/lyrics/温泉发呆曲_1770628235.txt
Normal file
@ -0,0 +1,26 @@
|
|||||||
|
[verse 1]\n
|
||||||
|
懒懒的午后阳光暖,\n
|
||||||
|
温泉里我泡得欢。\n
|
||||||
|
水声潺潺耳边响,\n
|
||||||
|
什么都不想干。\n
|
||||||
|
\n
|
||||||
|
[chorus]\n
|
||||||
|
咔咔咔咔,悠然自得,\n
|
||||||
|
水波轻摇,心情舒畅。\n
|
||||||
|
咔咔咔咔,享受此刻,\n
|
||||||
|
懒懒午后,最是惬意。\n
|
||||||
|
\n
|
||||||
|
[verse 2]\n
|
||||||
|
看着云朵慢慢飘,\n
|
||||||
|
心思像水一样柔。\n
|
||||||
|
闭上眼,世界都静了,\n
|
||||||
|
只有我和这温泉。\n
|
||||||
|
\n
|
||||||
|
[chorus]\n
|
||||||
|
咔咔咔咔,悠然自得,\n
|
||||||
|
水波轻摇,心情舒畅。\n
|
||||||
|
咔咔咔咔,享受此刻,\n
|
||||||
|
懒懒午后,最是惬意。\n
|
||||||
|
\n
|
||||||
|
[outro]\n
|
||||||
|
(水声渐渐远去...)
|
||||||
21
Capybara music/lyrics/温泉发呆曲_1770630396.txt
Normal file
@ -0,0 +1,21 @@
|
|||||||
|
慵懒午后阳光暖,温泉里我发呆
|
||||||
|
|
||||||
|
水声潺潺耳边响,思绪飘向云外
|
||||||
|
|
||||||
|
咔咔咔咔,泡在温泉
|
||||||
|
|
||||||
|
心无杂念,享受此刻安宁
|
||||||
|
|
||||||
|
什么都不想去做,只想静静享受
|
||||||
|
|
||||||
|
水波轻抚我的背,世界变得温柔
|
||||||
|
|
||||||
|
咔咔咔咔,泡在温泉
|
||||||
|
|
||||||
|
心无杂念,享受此刻安宁
|
||||||
|
|
||||||
|
(水花声...)
|
||||||
|
|
||||||
|
咔咔的午后,慵懒又自在
|
||||||
|
|
||||||
|
温泉里的世界,只有我和水声
|
||||||
33
Capybara music/lyrics/温泉发呆曲_1770630635.txt
Normal file
@ -0,0 +1,33 @@
|
|||||||
|
懒懒的午后阳光暖,
|
||||||
|
|
||||||
|
温泉里我泡得欢。
|
||||||
|
|
||||||
|
水声潺潺耳边响,
|
||||||
|
|
||||||
|
什么都不想干。
|
||||||
|
|
||||||
|
咔咔咔咔,发呆好时光,
|
||||||
|
|
||||||
|
懒懒的我,享受这阳光。
|
||||||
|
|
||||||
|
咔咔咔咔,让思绪飘扬,
|
||||||
|
|
||||||
|
在温泉里,找到我的天堂。
|
||||||
|
|
||||||
|
想法像泡泡一样浮上来,
|
||||||
|
|
||||||
|
又慢慢沉下去,消失在水里。
|
||||||
|
|
||||||
|
时间仿佛静止,我自在如鱼,
|
||||||
|
|
||||||
|
在这温暖的怀抱里。
|
||||||
|
|
||||||
|
咔咔咔咔,发呆好时光,
|
||||||
|
|
||||||
|
懒懒的我,享受这阳光。
|
||||||
|
|
||||||
|
咔咔咔咔,让思绪飘扬,
|
||||||
|
|
||||||
|
在温泉里,找到我的天堂。
|
||||||
|
|
||||||
|
(水声渐渐远去...)
|
||||||
33
Capybara music/lyrics/温泉发呆曲_1770639509.txt
Normal file
@ -0,0 +1,33 @@
|
|||||||
|
懒懒的午后阳光暖,
|
||||||
|
|
||||||
|
温泉里我泡得欢。
|
||||||
|
|
||||||
|
水声潺潺耳边响,
|
||||||
|
|
||||||
|
什么都不想干。
|
||||||
|
|
||||||
|
咔咔咔咔,发呆真好,
|
||||||
|
|
||||||
|
懒懒的我,享受这秒。
|
||||||
|
|
||||||
|
水波轻摇,心也飘,
|
||||||
|
|
||||||
|
咔咔世界,别来无恙。
|
||||||
|
|
||||||
|
想着云卷云又舒,
|
||||||
|
|
||||||
|
温泉里的我多舒服。
|
||||||
|
|
||||||
|
时间慢慢流,不急不徐,
|
||||||
|
|
||||||
|
咔咔的梦,轻轻浮。
|
||||||
|
|
||||||
|
咔咔咔咔,发呆真好,
|
||||||
|
|
||||||
|
懒懒的我,享受这秒。
|
||||||
|
|
||||||
|
水波轻摇,心也飘,
|
||||||
|
|
||||||
|
咔咔世界,别来无恙。
|
||||||
|
|
||||||
|
(水声渐渐远去...)
|
||||||
37
Capybara music/lyrics/温泉里的咔咔_1770730481.txt
Normal file
@ -0,0 +1,37 @@
|
|||||||
|
懒懒的午后阳光暖,
|
||||||
|
|
||||||
|
温泉里我泡得欢。
|
||||||
|
|
||||||
|
水声潺潺耳边响,
|
||||||
|
|
||||||
|
什么都不想干。
|
||||||
|
|
||||||
|
咔咔咔咔,悠然自得,
|
||||||
|
|
||||||
|
水波荡漾心情悦。
|
||||||
|
|
||||||
|
咔咔咔咔,闭上眼,
|
||||||
|
|
||||||
|
享受这刻的宁静。
|
||||||
|
|
||||||
|
想象自己是条鱼,
|
||||||
|
|
||||||
|
在水里自由游来游去。
|
||||||
|
|
||||||
|
没有烦恼没有压力,
|
||||||
|
|
||||||
|
只有我和这温泉池。
|
||||||
|
|
||||||
|
咔咔咔咔,悠然自得,
|
||||||
|
|
||||||
|
水波荡漾心情悦。
|
||||||
|
|
||||||
|
咔咔咔咔,闭上眼,
|
||||||
|
|
||||||
|
享受这刻的宁静。
|
||||||
|
|
||||||
|
(水花声...)
|
||||||
|
|
||||||
|
咔咔,慵懒午后,
|
||||||
|
|
||||||
|
水中世界最逍遥。
|
||||||
26
Capybara music/lyrics/草地上的咔咔_1770628910.txt
Normal file
@ -0,0 +1,26 @@
|
|||||||
|
[verse 1]\n"
|
||||||
|
"阳光洒满草地绿\n"
|
||||||
|
"咔咔奔跑心情舒畅\n"
|
||||||
|
"风儿轻拂过脸庞\n"
|
||||||
|
"快乐就像泡泡糖\n"
|
||||||
|
"\n"
|
||||||
|
"[chorus]\n"
|
||||||
|
"咔咔咔咔 快乐无边\n"
|
||||||
|
"草地上的我自由自在\n"
|
||||||
|
"阳光下的影子拉得好长\n"
|
||||||
|
"咔咔咔咔 快乐无边\n"
|
||||||
|
"\n"
|
||||||
|
"[verse 2]\n"
|
||||||
|
"蝴蝶飞舞花儿笑\n"
|
||||||
|
"咔咔摇摆尾巴摇\n"
|
||||||
|
"每一步都跳着舞\n"
|
||||||
|
"生活就像一首歌\n"
|
||||||
|
"\n"
|
||||||
|
"[chorus]\n"
|
||||||
|
"咔咔咔咔 快乐无边\n"
|
||||||
|
"草地上的我自由自在\n"
|
||||||
|
"阳光下的影子拉得好长\n"
|
||||||
|
"咔咔咔咔 快乐无边\n"
|
||||||
|
"\n"
|
||||||
|
"[outro]\n"
|
||||||
|
"(草地上咔咔的笑声...)
|
||||||
17
Capybara music/lyrics/草地上的咔咔_1770629673.txt
Normal file
@ -0,0 +1,17 @@
|
|||||||
|
阳光洒满地 草香扑鼻来
|
||||||
|
咔咔在草地上 跑得飞快
|
||||||
|
风儿轻轻吹 摇曳着花海
|
||||||
|
心情像彩虹 七彩斑斓开
|
||||||
|
咔咔咔咔 快乐无边
|
||||||
|
草地上的我 自由自在
|
||||||
|
阳光下的梦 美好无限
|
||||||
|
咔咔咔咔 快乐无边
|
||||||
|
蝴蝶在飞舞 蜜蜂在歌唱
|
||||||
|
咔咔跟着它们 一起欢唱
|
||||||
|
天空蓝得像画 没有一丝阴霾
|
||||||
|
咔咔的心里 只有满满的爱
|
||||||
|
咔咔咔咔 快乐无边
|
||||||
|
草地上的我 自由自在
|
||||||
|
阳光下的梦 美好无限
|
||||||
|
咔咔咔咔 快乐无边
|
||||||
|
(草地上咔咔的笑声...)
|
||||||
19
Capybara music/lyrics/草地上的咔咔_1770640911.txt
Normal file
@ -0,0 +1,19 @@
|
|||||||
|
阳光洒满地 绿草如茵间
|
||||||
|
咔咔跑起来 心情像飞燕
|
||||||
|
风儿轻拂过 花香满径边
|
||||||
|
快乐如此简单 每一步都新鲜
|
||||||
|
咔咔咔咔 快乐咔咔
|
||||||
|
草地上的我 自由自在
|
||||||
|
阳光下的舞 轻松又欢快
|
||||||
|
咔咔咔咔 快乐咔咔
|
||||||
|
无忧无虑的我 最爱这蓝天
|
||||||
|
蝴蝶翩翩起 蜜蜂忙采蜜
|
||||||
|
咔咔我最棒 每个瞬间都美丽
|
||||||
|
朋友在旁边 笑声传千里
|
||||||
|
这世界多美好 有你有我有草地
|
||||||
|
咔咔咔咔 快乐咔咔
|
||||||
|
草地上的我 自由自在
|
||||||
|
阳光下的舞 轻松又欢快
|
||||||
|
咔咔咔咔 快乐咔咔
|
||||||
|
无忧无虑的我 最爱这蓝天
|
||||||
|
(草地上咔咔的笑声...)
|
||||||
@ -0,0 +1,8 @@
|
|||||||
|
[verse]
|
||||||
|
阳光洒满草地,我跑得飞快
|
||||||
|
心情像彩虹,七彩斑斓真美
|
||||||
|
[chorus]
|
||||||
|
咔咔咔咔,快乐无边
|
||||||
|
在阳光下,自由自在
|
||||||
|
[outro]
|
||||||
|
(风吹草低见水豚)
|
||||||
@ -0,0 +1 @@
|
|||||||
|
[Inst]
|
||||||
BIN
Capybara music/书房咔咔茶_1770634690.mp3
Normal file
Binary file not shown.
BIN
Capybara music/书房咔咔茶_1770637242.mp3
Normal file
Binary file not shown.
BIN
Capybara music/夜深了窗外下着小雨盖着被子准备入睡_1770627405.mp3
Normal file
Binary file not shown.
BIN
Capybara music/惊喜咔咔派_1770642290.mp3
Normal file
Binary file not shown.
BIN
Capybara music/慵懒的午后泡在温泉里听水声发呆什么都不想_1770627905.mp3
Normal file
Binary file not shown.
BIN
Capybara music/洗脑咔咔舞_1770631313.mp3
Normal file
Binary file not shown.
BIN
Capybara music/温泉发呆曲_1770628235.mp3
Normal file
Binary file not shown.
BIN
Capybara music/温泉发呆曲_1770630396.mp3
Normal file
Binary file not shown.
BIN
Capybara music/温泉发呆曲_1770630635.mp3
Normal file
Binary file not shown.
BIN
Capybara music/温泉发呆曲_1770639509.mp3
Normal file
Binary file not shown.
BIN
Capybara music/温泉里的咔咔_1770730481.mp3
Normal file
Binary file not shown.
BIN
Capybara music/草地上的咔咔_1770628910.mp3
Normal file
Binary file not shown.
BIN
Capybara music/草地上的咔咔_1770629673.mp3
Normal file
Binary file not shown.
BIN
Capybara music/草地上的咔咔_1770640911.mp3
Normal file
Binary file not shown.
BIN
Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770626906.mp3
Normal file
Binary file not shown.
BIN
Capybara music/阳光灿烂的日子在草地上奔跑撒欢心情超级好_1770639287.mp3
Normal file
Binary file not shown.
11
Capybara stories/海盗找朋友_1770647563.txt
Normal file
@ -0,0 +1,11 @@
|
|||||||
|
# 海盗找朋友
|
||||||
|
|
||||||
|
在蓝色的大海上,有一艘小小的海盗船,船上只有一个小海盗。他戴着歪歪的海盗帽,举着塑料做的小钩子手,每天对着海浪喊:“谁来和我玩呀?”
|
||||||
|
|
||||||
|
这天,小海盗的船被海浪冲到了一座彩虹岛。岛上的沙滩上,躺着一个会发光的贝壳。小海盗刚捡起贝壳,贝壳突然“叮咚”响了一声,跳出一只圆滚滚的小海豚!
|
||||||
|
|
||||||
|
“哇!你是我的宝藏吗?”小海盗举着贝壳问。小海豚摇摇头,用尾巴拍了拍海水:“我带你去找真正的宝藏!”它驮着小海盗游向海底,那里有一个藏着星星的洞穴。
|
||||||
|
|
||||||
|
洞穴里,小海豚拿出了一个会唱歌的海螺:“这是友谊海螺,对着它喊朋友的名字,就会有惊喜哦!”小海盗对着海螺喊:“我的朋友!”突然,从海螺里钻出一群小螃蟹,举着彩色的小旗子,还有一只会吹泡泡的章鱼!
|
||||||
|
|
||||||
|
原来,小海豚早就听说小海盗很孤单,特意用友谊海螺召集了伙伴们。现在,小海盗的船上每天都飘着笑声,他再也不是孤单的小海盗啦!
|
||||||
@ -417,9 +417,18 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
// Actually play or pause audio
|
// Actually play or pause audio
|
||||||
try {
|
try {
|
||||||
if (_isPlaying) {
|
if (_isPlaying) {
|
||||||
|
// Show now-playing bubble immediately (before await)
|
||||||
|
_playStickyText = '正在播放: ${_playlist[_currentTrackIndex].title}';
|
||||||
|
setState(() {
|
||||||
|
_speechText = _playStickyText;
|
||||||
|
_speechVisible = true;
|
||||||
|
});
|
||||||
await _audioPlayer.play();
|
await _audioPlayer.play();
|
||||||
} else {
|
} else {
|
||||||
await _audioPlayer.pause();
|
await _audioPlayer.pause();
|
||||||
|
// Hide bubble on pause
|
||||||
|
_playStickyText = null;
|
||||||
|
setState(() => _speechVisible = false);
|
||||||
}
|
}
|
||||||
} catch (e) {
|
} catch (e) {
|
||||||
debugPrint('Playback error: $e');
|
debugPrint('Playback error: $e');
|
||||||
@ -428,6 +437,7 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
// Revert UI state on error
|
// Revert UI state on error
|
||||||
setState(() {
|
setState(() {
|
||||||
_isPlaying = false;
|
_isPlaying = false;
|
||||||
|
_playStickyText = null;
|
||||||
_vinylSpinController.stop();
|
_vinylSpinController.stop();
|
||||||
_tonearmController.reverse();
|
_tonearmController.reverse();
|
||||||
});
|
});
|
||||||
@ -474,7 +484,8 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
_showSpeech('正在播放: ${_playlist[index].title}');
|
_playStickyText = '正在播放: ${_playlist[index].title}';
|
||||||
|
_showSpeech(_playStickyText!, duration: 0);
|
||||||
}
|
}
|
||||||
|
|
||||||
// ── Mood Selection ──
|
// ── Mood Selection ──
|
||||||
@ -646,6 +657,7 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
|
|
||||||
// ── Speech Bubble ──
|
// ── Speech Bubble ──
|
||||||
String? _genStickyText; // Persistent text during generation
|
String? _genStickyText; // Persistent text during generation
|
||||||
|
String? _playStickyText; // Persistent text during playback
|
||||||
|
|
||||||
void _showSpeech(String text, {int duration = 3000}) {
|
void _showSpeech(String text, {int duration = 3000}) {
|
||||||
// If this is a generation-related message (duration == 0), save it as sticky
|
// If this is a generation-related message (duration == 0), save it as sticky
|
||||||
@ -667,6 +679,12 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
_speechText = _genStickyText;
|
_speechText = _genStickyText;
|
||||||
_speechVisible = true;
|
_speechVisible = true;
|
||||||
});
|
});
|
||||||
|
} else if (_isPlaying && _playStickyText != null) {
|
||||||
|
// If playing, restore the now-playing message
|
||||||
|
setState(() {
|
||||||
|
_speechText = _playStickyText;
|
||||||
|
_speechVisible = true;
|
||||||
|
});
|
||||||
} else {
|
} else {
|
||||||
setState(() => _speechVisible = false);
|
setState(() => _speechVisible = false);
|
||||||
}
|
}
|
||||||
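
The fallback order that this hunk adds to `_showSpeech` can be sketched as a pure function. This is a minimal sketch, not the commit's code: `resolveBubbleText` and its parameter names are hypothetical, and the check guarding the generation text is assumed to be a null check, since that line lies outside the hunk.

```dart
/// Hypothetical helper: decides what the speech bubble shows once a
/// temporary message has expired. Returns null when the bubble should hide.
String? resolveBubbleText({
  required String? genStickyText,
  required String? playStickyText,
  required bool isPlaying,
}) {
  // Assumed first priority: a sticky generation message, if one is set.
  if (genStickyText != null) return genStickyText;
  // Second priority: the now-playing message, only while audio is playing.
  if (isPlaying && playStickyText != null) return playStickyText;
  // Nothing sticky applies, so the bubble is hidden.
  return null;
}

void main() {
  print(resolveBubbleText(
    genStickyText: null,
    playStickyText: 'now playing',
    isPlaying: true,
  ));
  print(resolveBubbleText(
    genStickyText: null,
    playStickyText: 'now playing',
    isPlaying: false,
  ));
}
```

Keeping the decision in one place like this would avoid repeating the `_isPlaying && _playStickyText != null` condition in both the `build` guard and `_buildSpeechBubble`, though the commit itself repeats the check inline.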
@ -800,7 +818,9 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
child: _buildVinylWrapper(),
|
child: _buildVinylWrapper(),
|
||||||
),
|
),
|
||||||
// Speech bubble — positioned top-right
|
// Speech bubble — positioned top-right
|
||||||
if (_speechVisible && _speechText != null)
|
// Always show during playback; otherwise use _speechVisible
|
||||||
|
if ((_speechVisible && _speechText != null) ||
|
||||||
|
(_isPlaying && _playStickyText != null))
|
||||||
Positioned(
|
Positioned(
|
||||||
top: 0,
|
top: 0,
|
||||||
right: -24, // HTML: right: -24px
|
right: -24, // HTML: right: -24px
|
||||||
@ -1067,12 +1087,18 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
Widget _buildSpeechBubble() {
|
Widget _buildSpeechBubble() {
|
||||||
// HTML: .capy-speech-bubble with clip-path iMessage-style tail at bottom-left
|
// HTML: .capy-speech-bubble with clip-path iMessage-style tail at bottom-left
|
||||||
const tailH = 8.0;
|
const tailH = 8.0;
|
||||||
|
// During playback, always show the playing text even if _speechVisible is false
|
||||||
|
final bool showBubble = _speechVisible || (_isPlaying && _playStickyText != null);
|
||||||
|
final String bubbleText = (_isPlaying && _playStickyText != null && !_speechVisible)
|
||||||
|
? _playStickyText!
|
||||||
|
: (_speechText ?? '');
|
||||||
|
|
||||||
return AnimatedOpacity(
|
return AnimatedOpacity(
|
||||||
duration: const Duration(milliseconds: 200),
|
duration: const Duration(milliseconds: 200),
|
||||||
opacity: _speechVisible ? 1.0 : 0.0,
|
opacity: showBubble ? 1.0 : 0.0,
|
||||||
child: AnimatedScale(
|
child: AnimatedScale(
|
||||||
duration: const Duration(milliseconds: 350),
|
duration: const Duration(milliseconds: 350),
|
||||||
scale: _speechVisible ? 1.0 : 0.7,
|
scale: showBubble ? 1.0 : 0.7,
|
||||||
curve: const Cubic(0.34, 1.56, 0.64, 1.0),
|
curve: const Cubic(0.34, 1.56, 0.64, 1.0),
|
||||||
alignment: Alignment.bottomLeft,
|
alignment: Alignment.bottomLeft,
|
||||||
child: Column(
|
child: Column(
|
||||||
@ -1098,7 +1124,7 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
],
|
],
|
||||||
),
|
),
|
||||||
child: Text(
|
child: Text(
|
||||||
_speechText ?? '',
|
bubbleText,
|
||||||
style: GoogleFonts.dmSans(
|
style: GoogleFonts.dmSans(
|
||||||
fontSize: 12.5,
|
fontSize: 12.5,
|
||||||
fontWeight: FontWeight.w500,
|
fontWeight: FontWeight.w500,
|
||||||
@ -1485,6 +1511,7 @@ class _MusicCreationPageState extends State<MusicCreationPage>
|
|||||||
builder: (ctx) => _PlaylistModalContent(
|
builder: (ctx) => _PlaylistModalContent(
|
||||||
tracks: _playlist,
|
tracks: _playlist,
|
||||||
currentIndex: _currentTrackIndex,
|
currentIndex: _currentTrackIndex,
|
||||||
|
isPlaying: _isPlaying,
|
||||||
onSelect: (index) {
|
onSelect: (index) {
|
||||||
Navigator.pop(ctx);
|
Navigator.pop(ctx);
|
||||||
_playTrack(index);
|
_playTrack(index);
|
||||||
@ -1921,17 +1948,53 @@ class _InputModalContent extends StatelessWidget {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/// Playlist Modal — HTML: .playlist-container
|
/// Playlist Modal — HTML: .playlist-container
|
||||||
class _PlaylistModalContent extends StatelessWidget {
|
class _PlaylistModalContent extends StatefulWidget {
|
||||||
final List<_Track> tracks;
|
final List<_Track> tracks;
|
||||||
final int currentIndex;
|
final int currentIndex;
|
||||||
|
final bool isPlaying;
|
||||||
final ValueChanged<int> onSelect;
|
final ValueChanged<int> onSelect;
|
||||||
|
|
||||||
const _PlaylistModalContent({
|
const _PlaylistModalContent({
|
||||||
required this.tracks,
|
required this.tracks,
|
||||||
required this.currentIndex,
|
required this.currentIndex,
|
||||||
|
required this.isPlaying,
|
||||||
required this.onSelect,
|
required this.onSelect,
|
||||||
});
|
});
|
||||||
|
|
||||||
|
@override
|
||||||
|
State<_PlaylistModalContent> createState() => _PlaylistModalContentState();
|
||||||
|
}
|
||||||
|
|
||||||
|
class _PlaylistModalContentState extends State<_PlaylistModalContent>
|
||||||
|
with SingleTickerProviderStateMixin {
|
||||||
|
late AnimationController _waveController;
|
||||||
|
|
||||||
|
@override
|
||||||
|
void initState() {
|
||||||
|
super.initState();
|
||||||
|
_waveController = AnimationController(
|
||||||
|
vsync: this,
|
||||||
|
duration: const Duration(milliseconds: 800),
|
||||||
|
);
|
||||||
|
if (widget.isPlaying) _waveController.repeat(reverse: true);
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
void didUpdateWidget(covariant _PlaylistModalContent oldWidget) {
|
||||||
|
super.didUpdateWidget(oldWidget);
|
||||||
|
if (widget.isPlaying && !_waveController.isAnimating) {
|
||||||
|
_waveController.repeat(reverse: true);
|
||||||
|
} else if (!widget.isPlaying && _waveController.isAnimating) {
|
||||||
|
_waveController.stop();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
void dispose() {
|
||||||
|
_waveController.dispose();
|
||||||
|
super.dispose();
|
||||||
|
}
|
||||||
|
|
||||||
@override
|
@override
|
||||||
Widget build(BuildContext context) {
|
Widget build(BuildContext context) {
|
||||||
final screenWidth = MediaQuery.of(context).size.width;
|
final screenWidth = MediaQuery.of(context).size.width;
|
||||||
@ -2015,23 +2078,39 @@ class _PlaylistModalContent extends StatelessWidget {
|
|||||||
mainAxisSpacing: 8,
|
mainAxisSpacing: 8,
|
||||||
childAspectRatio: 0.75,
|
childAspectRatio: 0.75,
|
||||||
),
|
),
|
||||||
itemCount: tracks.length,
|
itemCount: widget.tracks.length,
|
||||||
itemBuilder: (context, index) {
|
itemBuilder: (context, index) {
|
||||||
final track = tracks[index];
|
final track = widget.tracks[index];
|
||||||
final isPlaying = index == currentIndex;
|
final isCurrent = index == widget.currentIndex;
|
||||||
|
final isPlaying = isCurrent && widget.isPlaying;
|
||||||
|
|
||||||
// HTML: .record-slot { background: rgba(0,0,0,0.03); border-radius: 12px;
|
// HTML: .record-slot { background: rgba(0,0,0,0.03); border-radius: 12px;
|
||||||
// padding: 10px 4px; border: 1px solid rgba(0,0,0,0.02); }
|
// padding: 10px 4px; border: 1px solid rgba(0,0,0,0.02); }
|
||||||
return GestureDetector(
|
return GestureDetector(
|
||||||
onTap: () => onSelect(index),
|
onTap: () => widget.onSelect(index),
|
||||||
child: Container(
|
child: Container(
|
||||||
padding:
|
padding:
|
||||||
const EdgeInsets.symmetric(horizontal: 4, vertical: 10),
|
const EdgeInsets.symmetric(horizontal: 4, vertical: 10),
|
||||||
decoration: BoxDecoration(
|
decoration: BoxDecoration(
|
||||||
color: Colors.black.withOpacity(0.03),
|
// Current track: warm golden background; others: subtle grey
|
||||||
|
color: isCurrent
|
||||||
|
? const Color(0xFFFDF3E3)
|
||||||
|
: Colors.black.withOpacity(0.03),
|
||||||
borderRadius: BorderRadius.circular(12),
|
borderRadius: BorderRadius.circular(12),
|
||||||
border: Border.all(
|
border: Border.all(
|
||||||
color: Colors.black.withOpacity(0.02)),
|
color: isCurrent
|
||||||
|
? const Color(0xFFECCFA8).withOpacity(0.6)
|
||||||
|
: Colors.black.withOpacity(0.02),
|
||||||
|
width: isCurrent ? 1.5 : 1.0),
|
||||||
|
boxShadow: isCurrent
|
||||||
|
? [
|
||||||
|
BoxShadow(
|
||||||
|
color: const Color(0xFFECCFA8).withOpacity(0.25),
|
||||||
|
blurRadius: 8,
|
||||||
|
offset: const Offset(0, 2),
|
||||||
|
),
|
||||||
|
]
|
||||||
|
: null,
|
||||||
),
|
),
|
||||||
child: Column(
|
child: Column(
|
||||||
children: [
|
children: [
|
||||||
@ -2043,10 +2122,8 @@ class _PlaylistModalContent extends StatelessWidget {
|
|||||||
decoration: BoxDecoration(
|
decoration: BoxDecoration(
|
||||||
shape: BoxShape.circle,
|
shape: BoxShape.circle,
|
||||||
color: const Color(0xFF18181B),
|
color: const Color(0xFF18181B),
|
||||||
// HTML: .record-item.playing .record-cover-wrapper
|
|
||||||
// { box-shadow: 0 0 0 2px #ECCFA8, ... }
|
|
||||||
boxShadow: [
|
boxShadow: [
|
||||||
if (isPlaying)
|
if (isCurrent)
|
||||||
const BoxShadow(
|
const BoxShadow(
|
||||||
color: Color(0xFFECCFA8),
|
color: Color(0xFFECCFA8),
|
||||||
spreadRadius: 2,
|
spreadRadius: 2,
|
||||||
@ -2096,23 +2173,57 @@ class _PlaylistModalContent extends StatelessWidget {
|
|||||||
),
|
),
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
|
// Sound wave overlay for playing track
|
||||||
|
if (isPlaying)
|
||||||
|
Center(
|
||||||
|
child: AnimatedBuilder(
|
||||||
|
animation: _waveController,
|
||||||
|
builder: (context, child) {
|
||||||
|
return CustomPaint(
|
||||||
|
painter: _MiniWavePainter(
|
||||||
|
progress: _waveController.value,
|
||||||
|
),
|
||||||
|
size: const Size(28, 20),
|
||||||
|
);
|
||||||
|
},
|
||||||
|
),
|
||||||
|
),
|
||||||
],
|
],
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
const SizedBox(height: 8),
|
const SizedBox(height: 8),
|
||||||
// HTML: .record-title { font-size: 12px; font-weight: 500; }
|
// Title with playing indicator
|
||||||
Text(
|
Row(
|
||||||
track.title,
|
mainAxisAlignment: MainAxisAlignment.center,
|
||||||
style: GoogleFonts.dmSans(
|
mainAxisSize: MainAxisSize.min,
|
||||||
fontSize: 12,
|
children: [
|
||||||
fontWeight: FontWeight.w500,
|
if (isCurrent)
|
||||||
color: const Color(0xFF374151),
|
Padding(
|
||||||
),
|
padding: const EdgeInsets.only(right: 3),
|
||||||
textAlign: TextAlign.center,
|
child: Icon(
|
||||||
maxLines: 1,
|
isPlaying ? Icons.volume_up_rounded : Icons.volume_off_rounded,
|
||||||
overflow: TextOverflow.ellipsis,
|
size: 12,
|
||||||
|
color: const Color(0xFFECCFA8),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
Flexible(
|
||||||
|
child: Text(
|
||||||
|
track.title,
|
||||||
|
style: GoogleFonts.dmSans(
|
||||||
|
fontSize: 12,
|
||||||
|
fontWeight: isCurrent ? FontWeight.w600 : FontWeight.w500,
|
||||||
|
color: isCurrent
|
||||||
|
? const Color(0xFFB8860B)
|
||||||
|
: const Color(0xFF374151),
|
||||||
|
),
|
||||||
|
textAlign: TextAlign.center,
|
||||||
|
maxLines: 1,
|
||||||
|
overflow: TextOverflow.ellipsis,
|
||||||
|
),
|
||||||
|
),
|
||||||
|
],
|
||||||
),
|
),
|
||||||
],
|
],
|
||||||
),
|
),
|
||||||
@ -2127,3 +2238,39 @@ class _PlaylistModalContent extends StatelessWidget {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Mini sound wave painter for playlist playing indicator
|
||||||
|
class _MiniWavePainter extends CustomPainter {
|
||||||
|
final double progress;
|
||||||
|
|
||||||
|
_MiniWavePainter({required this.progress});
|
||||||
|
|
||||||
|
@override
|
||||||
|
void paint(Canvas canvas, Size size) {
|
||||||
|
final paint = Paint()
|
||||||
|
..color = const Color(0xFFECCFA8)
|
||||||
|
..strokeWidth = 2.5
|
||||||
|
..strokeCap = StrokeCap.round;
|
||||||
|
|
||||||
|
const barCount = 4;
|
||||||
|
final barWidth = size.width / (barCount * 2 - 1);
|
||||||
|
final centerY = size.height / 2;
|
||||||
|
|
||||||
|
for (int i = 0; i < barCount; i++) {
|
||||||
|
// Each bar has a different phase offset for wave effect
|
||||||
|
final phase = (progress + i * 0.25) % 1.0;
|
||||||
|
final height = size.height * (0.3 + 0.7 * (0.5 + 0.5 * sin(phase * 3.14159 * 2)));
|
||||||
|
final x = i * barWidth * 2 + barWidth / 2;
|
||||||
|
|
||||||
|
canvas.drawLine(
|
||||||
|
Offset(x, centerY - height / 2),
|
||||||
|
Offset(x, centerY + height / 2),
|
||||||
|
paint,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
bool shouldRepaint(covariant _MiniWavePainter oldDelegate) =>
|
||||||
|
oldDelegate.progress != progress;
|
||||||
|
}
|
||||||
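
The bar-height formula in `_MiniWavePainter.paint` can be checked in isolation. The sketch below restates it as a standalone function; `barHeight` is a hypothetical name, and it uses `math.pi` from `dart:math` where the commit writes the literal `3.14159`, so results may differ in the last decimal places.

```dart
import 'dart:math' as math;

/// Hypothetical helper mirroring the painter's formula: each bar is offset
/// by a quarter of a cycle, and its height stays between 30% and 100% of
/// [maxHeight].
double barHeight(double progress, int index, double maxHeight) {
  final phase = (progress + index * 0.25) % 1.0;
  final wave = 0.5 + 0.5 * math.sin(phase * 2 * math.pi); // in [0, 1]
  return maxHeight * (0.3 + 0.7 * wave);
}

void main() {
  // Heights of the four bars at the start of the animation, for a 20 px box.
  for (var i = 0; i < 4; i++) {
    print(barHeight(0.0, i, 20.0).toStringAsFixed(2));
  }
}
```

Because the controller runs with `repeat(reverse: true)`, `progress` sweeps from 0 to 1 and back, so the bars retrace their motion instead of jumping at the end of each cycle.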
|
|
||||||
|
|||||||
@ -1,9 +1,12 @@
|
|||||||
|
import 'dart:async';
|
||||||
import 'dart:ui' as ui;
|
import 'dart:ui' as ui;
|
||||||
|
|
||||||
import 'package:flutter/material.dart';
|
import 'package:flutter/material.dart';
|
||||||
import 'package:flutter_svg/flutter_svg.dart';
|
import 'package:just_audio/just_audio.dart';
|
||||||
import '../theme/design_tokens.dart';
|
import '../theme/design_tokens.dart';
|
||||||
import '../widgets/gradient_button.dart';
|
import '../widgets/gradient_button.dart';
|
||||||
|
import '../widgets/pill_progress_button.dart';
|
||||||
|
import '../services/tts_service.dart';
|
||||||
import 'story_loading_page.dart';
|
import 'story_loading_page.dart';
|
||||||
|
|
||||||
enum StoryMode { generated, read }
|
enum StoryMode { generated, read }
|
||||||
@ -30,6 +33,14 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
bool _hasGeneratedVideo = false;
|
bool _hasGeneratedVideo = false;
|
||||||
bool _isLoadingVideo = false;
|
bool _isLoadingVideo = false;
|
||||||
|
|
||||||
|
// TTS — uses global TTSService singleton
|
||||||
|
final TTSService _ttsService = TTSService.instance;
|
||||||
|
final AudioPlayer _audioPlayer = AudioPlayer();
|
||||||
|
StreamSubscription<Duration>? _positionSub;
|
||||||
|
StreamSubscription<PlayerState>? _playerStateSub;
|
||||||
|
Duration _audioDuration = Duration.zero;
|
||||||
|
Duration _audioPosition = Duration.zero;
|
||||||
|
|
||||||
// Genie Suck Animation
|
// Genie Suck Animation
|
||||||
bool _isSaving = false;
|
bool _isSaving = false;
|
||||||
AnimationController? _genieController;
|
AnimationController? _genieController;
|
||||||
@ -41,9 +52,9 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
'content': """
|
'content': """
|
||||||
在遥远的银河系边缘,有一个被星云包裹的神秘茶馆。今天,这里迎来了两位特殊的客人:刚执行完火星探测任务的宇航员波波,和正在追捕暗影怪兽的忍者小次郎。
|
在遥远的银河系边缘,有一个被星云包裹的神秘茶馆。今天,这里迎来了两位特殊的客人:刚执行完火星探测任务的宇航员波波,和正在追捕暗影怪兽的忍者小次郎。
|
||||||
|
|
||||||
“这儿的重力好像有点不对劲?”波波飘在半空中,试图抓住飞来飞去的茶杯。小次郎则冷静地倒挂在天花板上,手里紧握着一枚手里剑——其实那是用来切月饼的。
|
"这儿的重力好像有点不对劲?"波波飘在半空中,试图抓住飞来飞去的茶杯。小次郎则冷静地倒挂在天花板上,手里紧握着一枚手里剑——其实那是用来切月饼的。
|
||||||
|
|
||||||
突然,桌上的魔法茶壶“噗”地一声喷出了七彩烟雾,一只会说话的卡皮巴拉钻了出来:“别打架,别打架,喝了这杯‘银河气泡茶’,我们都是好朋友!”
|
突然,桌上的魔法茶壶"噗"地一声喷出了七彩烟雾,一只会说话的卡皮巴拉钻了出来:"别打架,别打架,喝了这杯'银河气泡茶',我们都是好朋友!"
|
||||||
|
|
||||||
于是,宇宙中最奇怪的组合诞生了。他们决定,下一站,去黑洞边缘钓星星。
|
于是,宇宙中最奇怪的组合诞生了。他们决定,下一站,去黑洞边缘钓星星。
|
||||||
""",
|
""",
|
||||||
@ -54,7 +65,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
Map<String, dynamic> _initStory() {
|
Map<String, dynamic> _initStory() {
|
||||||
final source = widget.story ?? _defaultStory;
|
final source = widget.story ?? _defaultStory;
|
||||||
final result = Map<String, dynamic>.from(source);
|
final result = Map<String, dynamic>.from(source);
|
||||||
// Fallback: use the default story content if there is no content
|
|
||||||
result['content'] ??= _defaultStory['content'];
|
result['content'] ??= _defaultStory['content'];
|
||||||
result['title'] ??= _defaultStory['title'];
|
result['title'] ??= _defaultStory['title'];
|
||||||
return result;
|
return result;
|
||||||
@ -64,18 +74,171 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
void initState() {
|
void initState() {
|
||||||
super.initState();
|
super.initState();
|
||||||
_currentStory = _initStory();
|
_currentStory = _initStory();
|
||||||
|
|
||||||
|
// Subscribe to TTSService changes
|
||||||
|
_ttsService.addListener(_onTTSChanged);
|
||||||
|
|
||||||
|
// Listen to audio player state
|
||||||
|
_playerStateSub = _audioPlayer.playerStateStream.listen((state) {
|
||||||
|
if (!mounted) return;
|
||||||
|
if (state.processingState == ProcessingState.completed) {
|
||||||
|
setState(() {
|
||||||
|
_isPlaying = false;
|
||||||
|
_audioPosition = Duration.zero;
|
||||||
|
});
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
// Listen to playback position for ring progress
|
||||||
|
_positionSub = _audioPlayer.positionStream.listen((pos) {
|
||||||
|
if (!mounted) return;
|
||||||
|
setState(() => _audioPosition = pos);
|
||||||
|
});
|
||||||
|
|
||||||
|
// Listen to duration changes
|
||||||
|
_audioPlayer.durationStream.listen((dur) {
|
||||||
|
if (!mounted || dur == null) return;
|
||||||
|
setState(() => _audioDuration = dur);
|
||||||
|
});
|
||||||
|
|
||||||
|
// Check if audio already exists (via TTSService)
|
||||||
|
final title = _currentStory['title'] as String? ?? '';
|
||||||
|
_ttsService.checkExistingAudio(title);
|
||||||
|
}
|
||||||
|
|
||||||
|
void _onTTSChanged() {
|
||||||
|
if (!mounted) return;
|
||||||
|
|
||||||
|
// Auto-play when generation completes
|
||||||
|
if (_ttsService.justCompleted &&
|
||||||
|
_ttsService.hasAudioFor(_currentStory['title'] ?? '')) {
|
||||||
|
// Delay slightly to let the completion flash play
|
||||||
|
Future.delayed(const Duration(milliseconds: 1500), () {
|
||||||
|
if (mounted) {
|
||||||
|
_ttsService.clearJustCompleted();
|
||||||
|
final route = ModalRoute.of(context);
|
||||||
|
if (route != null && route.isCurrent) {
|
||||||
|
_playAudio();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
setState(() {});
|
||||||
}
|
}
|
||||||
|
|
||||||
@override
|
@override
|
||||||
void dispose() {
|
void dispose() {
|
||||||
|
_ttsService.removeListener(_onTTSChanged);
|
||||||
|
_positionSub?.cancel();
|
||||||
|
_playerStateSub?.cancel();
|
||||||
|
_audioPlayer.dispose();
|
||||||
_genieController?.dispose();
|
_genieController?.dispose();
|
||||||
super.dispose();
|
super.dispose();
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Trigger Genie Suck animation matching HTML:
|
// ── TTS button logic ──
|
||||||
/// CSS: animation: genieSuck 0.8s cubic-bezier(0.6, -0.28, 0.735, 0.045) forwards
|
|
||||||
/// Phase 1 (0→15%): card scales up to 1.05 (tension)
|
bool _audioLoaded = false; // Track if audio URL is loaded in player
|
||||||
/// Phase 2 (15%→100%): card shrinks to 0.05, moves toward bottom, blurs & fades
|
String? _loadedUrl; // Which URL is currently loaded
|
||||||
|
|
||||||
|
TTSButtonState get _ttsState {
|
||||||
|
final title = _currentStory['title'] as String? ?? '';
|
||||||
|
|
||||||
|
if (_ttsService.error != null &&
|
||||||
|
!_ttsService.isGenerating &&
|
||||||
|
_ttsService.audioUrl == null) {
|
||||||
|
return TTSButtonState.error;
|
||||||
|
}
|
||||||
|
if (_ttsService.isGeneratingFor(title)) {
|
||||||
|
return TTSButtonState.generating;
|
||||||
|
}
|
||||||
|
if (_ttsService.justCompleted && _ttsService.hasAudioFor(title)) {
|
||||||
|
return TTSButtonState.completed;
|
||||||
|
}
|
||||||
|
if (_isPlaying) {
|
||||||
|
return TTSButtonState.playing;
|
||||||
|
}
|
||||||
|
if (_ttsService.hasAudioFor(title) && !_audioLoaded) {
|
||||||
|
return TTSButtonState.ready; // audio ready, not yet played -> show "播放"
|
||||||
|
}
|
||||||
|
if (_audioLoaded) {
|
||||||
|
return TTSButtonState.paused; // was playing, now paused -> show "继续"
|
||||||
|
}
|
||||||
|
return TTSButtonState.idle;
|
||||||
|
}
|
||||||
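
The priority order in the `_ttsState` getter can be sketched as a pure function over plain booleans, which makes the ordering easier to test. This is an illustration under stated assumptions: `ButtonState` stands in for the commit's `TTSButtonState` (its definition is not shown in this diff), and `resolveButtonState` with its parameter names is hypothetical.

```dart
/// Stand-in for the commit's TTSButtonState; the value list is assumed from
/// the cases used in this diff.
enum ButtonState { idle, ready, generating, completed, playing, paused, error }

/// Hypothetical helper: applies the same checks, in the same order, as the
/// getter above.
ButtonState resolveButtonState({
  required bool hasError, // service reports an error
  required bool serviceBusy, // service is generating anything
  required bool hasAnyAudioUrl, // service holds some audio URL
  required bool generatingThisStory,
  required bool justCompleted,
  required bool hasAudioForStory,
  required bool isPlaying,
  required bool audioLoaded,
}) {
  if (hasError && !serviceBusy && !hasAnyAudioUrl) return ButtonState.error;
  if (generatingThisStory) return ButtonState.generating;
  if (justCompleted && hasAudioForStory) return ButtonState.completed;
  if (isPlaying) return ButtonState.playing;
  if (hasAudioForStory && !audioLoaded) return ButtonState.ready;
  if (audioLoaded) return ButtonState.paused;
  return ButtonState.idle;
}

void main() {
  // Audio exists for the story but has never been loaded into the player.
  print(resolveButtonState(
    hasError: false,
    serviceBusy: false,
    hasAnyAudioUrl: true,
    generatingThisStory: false,
    justCompleted: false,
    hasAudioForStory: true,
    isPlaying: false,
    audioLoaded: false,
  ));
}
```

One consequence of this ordering is that `completed` takes precedence over `playing`, so the button keeps its completion state until `clearJustCompleted()` is called.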
|
|
||||||
|
double get _ttsProgress {
|
||||||
|
final state = _ttsState;
|
||||||
|
switch (state) {
|
||||||
|
case TTSButtonState.generating:
|
||||||
|
return _ttsService.progress;
|
||||||
|
case TTSButtonState.ready:
|
||||||
|
return 0.0;
|
||||||
|
case TTSButtonState.completed:
|
||||||
|
return 1.0;
|
||||||
|
case TTSButtonState.playing:
|
||||||
|
case TTSButtonState.paused:
|
||||||
|
if (_audioDuration.inMilliseconds > 0) {
|
||||||
|
return (_audioPosition.inMilliseconds / _audioDuration.inMilliseconds)
|
||||||
|
.clamp(0.0, 1.0);
|
||||||
|
}
|
||||||
|
return 0.0;
|
||||||
|
default:
|
||||||
|
return 0.0;
|
||||||
|
}
|
||||||
|
}
|
||||||
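
The playback branch of `_ttsProgress` reduces to a guarded ratio. A minimal sketch, assuming a hypothetical standalone function `playbackProgress` and adding an explicit `toDouble()` so the return type does not depend on how `clamp` is typed:

```dart
/// Hypothetical helper: fraction of the audio already played, in [0, 1].
/// Returns 0 when the duration is not known yet, which avoids dividing by
/// zero.
double playbackProgress(Duration position, Duration duration) {
  if (duration.inMilliseconds <= 0) return 0.0;
  final ratio = position.inMilliseconds / duration.inMilliseconds;
  return ratio.clamp(0.0, 1.0).toDouble();
}

void main() {
  print(playbackProgress(
    const Duration(seconds: 30),
    const Duration(minutes: 1),
  ));
}
```

The clamp matters because the position stream can briefly report a value past the reported duration near the end of a track, which would otherwise push the progress above 1.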
|
|
||||||
|
void _handleTTSTap() {
|
||||||
|
final state = _ttsState;
|
||||||
|
switch (state) {
|
||||||
|
case TTSButtonState.idle:
|
||||||
|
case TTSButtonState.error:
|
||||||
|
final title = _currentStory['title'] as String? ?? '';
|
||||||
|
final content = _currentStory['content'] as String? ?? '';
|
||||||
|
_ttsService.generate(title: title, content: content);
|
||||||
|
break;
|
||||||
|
case TTSButtonState.generating:
|
||||||
|
break;
|
||||||
|
case TTSButtonState.ready:
|
||||||
|
case TTSButtonState.completed:
|
||||||
|
case TTSButtonState.paused:
|
||||||
|
_playAudio();
|
||||||
|
break;
|
||||||
|
case TTSButtonState.playing:
|
||||||
|
_audioPlayer.pause();
|
||||||
|
setState(() => _isPlaying = false);
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Future<void> _playAudio() async {
|
||||||
|
final title = _currentStory['title'] as String? ?? '';
|
||||||
|
final url = _ttsService.hasAudioFor(title) ? _ttsService.audioUrl : null;
|
||||||
|
if (url == null) return;
|
||||||
|
|
||||||
|
try {
|
||||||
|
// If already loaded the same URL, seek to saved position and resume
|
||||||
|
if (_audioLoaded && _loadedUrl == url) {
|
||||||
|
await _audioPlayer.seek(_audioPosition);
|
||||||
|
_audioPlayer.play();
|
||||||
|
} else {
|
||||||
|
// Load new URL and play from start
|
||||||
|
await _audioPlayer.setUrl(url);
|
||||||
|
_audioLoaded = true;
|
||||||
|
_loadedUrl = url;
|
||||||
|
_audioPlayer.play();
|
||||||
|
}
|
||||||
|
if (mounted) {
|
||||||
|
setState(() => _isPlaying = true);
|
||||||
|
}
|
||||||
|
} catch (e) {
|
||||||
|
debugPrint('Audio play error: $e');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Genie Suck Animation ──
|
||||||
|
|
||||||
void _triggerGenieSuck() {
|
void _triggerGenieSuck() {
|
||||||
if (_isSaving) return;
|
if (_isSaving) return;
|
||||||
|
|
||||||
@ -84,7 +247,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
duration: const Duration(milliseconds: 800),
|
duration: const Duration(milliseconds: 800),
|
||||||
);
|
);
|
||||||
|
|
||||||
// Calculate how far the card should travel downward (toward the save button)
|
|
||||||
final screenHeight = MediaQuery.of(context).size.height;
|
final screenHeight = MediaQuery.of(context).size.height;
|
||||||
_targetDY = screenHeight * 0.35;
|
_targetDY = screenHeight * 0.35;
|
||||||
|
|
||||||
@ -94,23 +256,20 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
setState(() {
|
setState(() => _isSaving = true);
|
||||||
_isSaving = true;
|
|
||||||
});
|
|
||||||
_genieController!.forward();
|
_genieController!.forward();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ── Build ──
|
||||||
|
|
||||||
@override
|
@override
|
||||||
Widget build(BuildContext context) {
|
Widget build(BuildContext context) {
|
||||||
return Scaffold(
|
return Scaffold(
|
||||||
backgroundColor: AppColors.storyBackground, // #FDF9F3
|
backgroundColor: AppColors.storyBackground,
|
||||||
body: SafeArea(
|
body: SafeArea(
|
||||||
child: Column(
|
child: Column(
|
||||||
children: [
|
children: [
|
||||||
// Header + Content Card — animated together during genie suck
|
|
||||||
Expanded(child: _buildAnimatedBody()),
|
Expanded(child: _buildAnimatedBody()),
|
||||||
|
|
||||||
// Footer
|
|
||||||
_buildFooter(),
|
_buildFooter(),
|
||||||
],
|
],
|
||||||
),
|
),
|
||||||
@ -118,7 +277,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Wraps header + content card in genie suck animation
|
|
||||||
Widget _buildAnimatedBody() {
|
Widget _buildAnimatedBody() {
|
||||||
Widget body = Column(
|
Widget body = Column(
|
||||||
children: [
|
children: [
|
||||||
@ -132,7 +290,7 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
return AnimatedBuilder(
|
return AnimatedBuilder(
|
||||||
animation: _genieController!,
|
animation: _genieController!,
|
||||||
builder: (context, child) {
|
builder: (context, child) {
|
||||||
final t = _genieController!.value; // linear 0→1
|
final t = _genieController!.value;
|
||||||
|
|
||||||
double scale;
|
double scale;
|
||||||
double translateY;
|
double translateY;
|
||||||
@ -140,14 +298,12 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
double blur;
|
double blur;
|
||||||
|
|
||||||
if (t <= 0.15) {
|
if (t <= 0.15) {
|
||||||
// Phase 1: tension — whole area scales up slightly
|
|
||||||
final p = t / 0.15;
|
final p = t / 0.15;
|
||||||
scale = 1.0 + 0.05 * Curves.easeOut.transform(p);
|
scale = 1.0 + 0.05 * Curves.easeOut.transform(p);
|
||||||
translateY = 0;
|
translateY = 0;
|
||||||
opacity = 1.0;
|
opacity = 1.0;
|
||||||
blur = 0;
|
blur = 0;
|
||||||
} else {
|
} else {
|
||||||
// Phase 2: suck — shrinks, moves down, fades and blurs
|
|
||||||
final p = ((t - 0.15) / 0.85).clamp(0.0, 1.0);
|
final p = ((t - 0.15) / 0.85).clamp(0.0, 1.0);
|
||||||
final curved =
|
final curved =
|
||||||
const Cubic(0.6, -0.28, 0.735, 0.045).transform(p);
|
const Cubic(0.6, -0.28, 0.735, 0.045).transform(p);
|
||||||
@ -209,7 +365,7 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
),
|
),
|
||||||
),
|
),
|
||||||
Text(
|
Text(
|
||||||
_currentStory['title'],
|
_currentStory['title'] ?? '',
|
||||||
style: const TextStyle(
|
style: const TextStyle(
|
||||||
fontSize: 17,
|
fontSize: 17,
|
||||||
fontWeight: FontWeight.w600,
|
fontWeight: FontWeight.w600,
|
||||||
@ -227,9 +383,9 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
child: Row(
|
child: Row(
|
||||||
mainAxisAlignment: MainAxisAlignment.center,
|
mainAxisAlignment: MainAxisAlignment.center,
|
||||||
children: [
|
children: [
|
||||||
_buildTabBtn('📄 故事', 'text'),
|
_buildTabBtn('故事', 'text'),
|
||||||
const SizedBox(width: 8),
|
const SizedBox(width: 8),
|
||||||
_buildTabBtn('🎬 绘本', 'video'),
|
_buildTabBtn('绘本', 'video'),
|
||||||
],
|
],
|
||||||
),
|
),
|
||||||
);
|
);
|
||||||
@ -238,11 +394,7 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
Widget _buildTabBtn(String label, String key) {
|
Widget _buildTabBtn(String label, String key) {
|
||||||
bool isActive = _activeTab == key;
|
bool isActive = _activeTab == key;
|
||||||
return GestureDetector(
|
return GestureDetector(
|
||||||
onTap: () {
|
onTap: () => setState(() => _activeTab = key),
|
||||||
setState(() {
|
|
||||||
_activeTab = key;
|
|
||||||
});
|
|
||||||
},
|
|
||||||
child: Container(
|
child: Container(
|
||||||
padding: const EdgeInsets.symmetric(horizontal: 16, vertical: 8),
|
padding: const EdgeInsets.symmetric(horizontal: 16, vertical: 8),
|
||||||
decoration: BoxDecoration(
|
decoration: BoxDecoration(
|
||||||
@ -271,7 +423,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
}
|
}
|
||||||
|
|
||||||
Widget _buildContentCard() {
|
Widget _buildContentCard() {
|
||||||
// HTML: .story-paper
|
|
||||||
bool isVideoMode = _activeTab == 'video';
|
bool isVideoMode = _activeTab == 'video';
|
||||||
|
|
||||||
return Container(
|
return Container(
|
||||||
@ -292,11 +443,11 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
_currentStory['content']
|
_currentStory['content']
|
||||||
.toString()
|
.toString()
|
||||||
.replaceAll(RegExp(r'\n+'), '\n\n')
|
.replaceAll(RegExp(r'\n+'), '\n\n')
|
||||||
.trim(), // Simple paragraph spacing
|
.trim(),
|
||||||
style: const TextStyle(
|
style: const TextStyle(
|
||||||
fontSize: 16, // HTML: 16px
|
fontSize: 16,
|
||||||
height: 2.0, // HTML: line-height 2.0
|
height: 2.0,
|
||||||
color: AppColors.storyText, // #374151
|
color: AppColors.storyText,
|
||||||
),
|
),
|
||||||
textAlign: TextAlign.justify,
|
textAlign: TextAlign.justify,
|
||||||
),
|
),
|
||||||
@ -313,7 +464,7 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
width: 40,
|
width: 40,
|
||||||
height: 40,
|
height: 40,
|
||||||
child: CircularProgressIndicator(
|
child: CircularProgressIndicator(
|
||||||
color: Color(0xFFF43F5E), // HTML: #F43F5E
|
color: Color(0xFFF43F5E),
|
||||||
strokeWidth: 3,
|
strokeWidth: 3,
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
@ -339,15 +490,14 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
alignment: Alignment.center,
|
alignment: Alignment.center,
|
||||||
children: [
|
children: [
|
||||||
AspectRatio(
|
AspectRatio(
|
||||||
aspectRatio: 16 / 9, // Assume landscape video
|
aspectRatio: 16 / 9,
|
||||||
child: Container(
|
child: Container(
|
||||||
color: Colors.black,
|
color: Colors.black,
|
||||||
child: const Center(
|
child: const Center(
|
||||||
child: Icon(Icons.videocam, color: Colors.white54, size: 48),
|
child: Icon(Icons.videocam, color: Colors.white54, size: 48),
|
||||||
), // Placeholder for Video Player
|
),
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
// Play Button Overlay
|
|
||||||
Container(
|
Container(
|
||||||
width: 48,
|
width: 48,
|
||||||
height: 48,
|
height: 48,
|
||||||
@ -372,7 +522,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
child: _activeTab == 'text' ? _buildTextFooter() : _buildVideoFooter(),
|
child: _activeTab == 'text' ? _buildTextFooter() : _buildVideoFooter(),
|
||||||
);
|
);
|
||||||
|
|
||||||
// Fade out footer during genie suck animation
|
|
||||||
if (_isSaving) {
|
if (_isSaving) {
|
||||||
return IgnorePointer(
|
return IgnorePointer(
|
||||||
child: AnimatedOpacity(
|
child: AnimatedOpacity(
|
||||||
@ -387,12 +536,9 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
}
|
}
|
||||||
|
|
||||||
void _handleRewrite() async {
|
void _handleRewrite() async {
|
||||||
// Jump to the loading page to regenerate the story
|
|
||||||
final result = await Navigator.of(context).push<String>(
|
final result = await Navigator.of(context).push<String>(
|
||||||
MaterialPageRoute(builder: (context) => const StoryLoadingPage()),
|
MaterialPageRoute(builder: (context) => const StoryLoadingPage()),
|
||||||
);
|
);
|
||||||
|
|
||||||
// Return the result once loading completes
|
|
||||||
if (mounted && result == 'saved') {
|
if (mounted && result == 'saved') {
|
||||||
Navigator.of(context).pop('saved');
|
Navigator.of(context).pop('saved');
|
||||||
}
|
}
|
||||||
@ -403,7 +549,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
// Generator Mode: Rewrite + Save
|
// Generator Mode: Rewrite + Save
|
||||||
return Row(
|
return Row(
|
||||||
children: [
|
children: [
|
||||||
// Rewrite (Secondary)
|
|
||||||
Expanded(
|
Expanded(
|
||||||
child: GestureDetector(
|
child: GestureDetector(
|
||||||
onTap: _handleRewrite,
|
onTap: _handleRewrite,
|
||||||
@ -415,19 +560,25 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
color: Colors.white.withOpacity(0.8),
|
color: Colors.white.withOpacity(0.8),
|
||||||
),
|
),
|
||||||
alignment: Alignment.center,
|
alignment: Alignment.center,
|
||||||
child: const Text(
|
child: const Row(
|
||||||
'↻ 重写',
|
mainAxisAlignment: MainAxisAlignment.center,
|
||||||
style: TextStyle(
|
children: [
|
||||||
fontSize: 16,
|
Icon(Icons.refresh_rounded, size: 18, color: Color(0xFF4B5563)),
|
||||||
fontWeight: FontWeight.w600,
|
SizedBox(width: 4),
|
||||||
color: Color(0xFF4B5563),
|
Text(
|
||||||
),
|
'重写',
|
||||||
|
style: TextStyle(
|
||||||
|
fontSize: 16,
|
||||||
|
fontWeight: FontWeight.w600,
|
||||||
|
color: Color(0xFF4B5563),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
],
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
const SizedBox(width: 16),
|
const SizedBox(width: 16),
|
||||||
// Save (Primary) - Returns 'saved' to trigger add book animation
|
|
||||||
Expanded(
|
Expanded(
|
||||||
child: GradientButton(
|
child: GradientButton(
|
||||||
text: '保存故事',
|
text: '保存故事',
|
||||||
@ -441,41 +592,14 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
],
|
],
|
||||||
);
|
);
|
||||||
} else {
|
} else {
|
||||||
// Read Mode: TTS + Make Picture Book
|
// Read Mode: TTS pill button + Make Picture Book
|
||||||
return Row(
|
return Row(
|
||||||
children: [
|
children: [
|
||||||
// TTS
|
|
||||||
Expanded(
|
Expanded(
|
||||||
child: GestureDetector(
|
child: PillProgressButton(
|
||||||
onTap: () => setState(() => _isPlaying = !_isPlaying),
|
state: _ttsState,
|
||||||
child: Container(
|
progress: _ttsProgress,
|
||||||
height: 48,
|
onTap: _handleTTSTap,
|
||||||
decoration: BoxDecoration(
|
|
||||||
border: Border.all(color: const Color(0xFFE5E7EB)),
|
|
||||||
borderRadius: BorderRadius.circular(24),
|
|
||||||
color: Colors.white.withOpacity(0.8),
|
|
||||||
),
|
|
||||||
alignment: Alignment.center,
|
|
||||||
child: Row(
|
|
||||||
mainAxisAlignment: MainAxisAlignment.center,
|
|
||||||
children: [
|
|
||||||
Icon(
|
|
||||||
_isPlaying ? Icons.pause : Icons.headphones,
|
|
||||||
size: 20,
|
|
||||||
color: const Color(0xFF4B5563),
|
|
||||||
),
|
|
||||||
const SizedBox(width: 6),
|
|
||||||
Text(
|
|
||||||
_isPlaying ? '暂停' : '朗读',
|
|
||||||
style: const TextStyle(
|
|
||||||
fontSize: 16,
|
|
||||||
fontWeight: FontWeight.w600,
|
|
||||||
color: Color(0xFF4B5563),
|
|
||||||
),
|
|
||||||
),
|
|
||||||
],
|
|
||||||
),
|
|
||||||
),
|
|
||||||
),
|
),
|
||||||
),
|
),
|
||||||
const SizedBox(width: 16),
|
const SizedBox(width: 16),
|
||||||
@ -500,7 +624,7 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
children: [
|
children: [
|
||||||
Expanded(
|
Expanded(
|
||||||
child: GradientButton(
|
child: GradientButton(
|
||||||
text: '↻ 重新生成',
|
text: '重新生成',
|
||||||
onPressed: _startVideoGeneration,
|
onPressed: _startVideoGeneration,
|
||||||
gradient: const LinearGradient(
|
gradient: const LinearGradient(
|
||||||
colors: AppColors.btnCapybaraGradient,
|
colors: AppColors.btnCapybaraGradient,
|
||||||
@ -517,7 +641,6 @@ class _StoryDetailPageState extends State<StoryDetailPage>
|
|||||||
_isLoadingVideo = true;
|
_isLoadingVideo = true;
|
||||||
_activeTab = 'video';
|
_activeTab = 'video';
|
||||||
});
|
});
|
||||||
// Mock delay
|
|
||||||
Future.delayed(const Duration(seconds: 2), () {
|
Future.delayed(const Duration(seconds: 2), () {
|
||||||
if (mounted) {
|
if (mounted) {
|
||||||
setState(() {
|
setState(() {
|
||||||
|
|||||||
190
airhub_app/lib/services/tts_service.dart
Normal file
@ -0,0 +1,190 @@
|
|||||||
|
import 'dart:convert';
|
||||||
|
import 'package:flutter/foundation.dart';
|
||||||
|
import 'package:http/http.dart' as http;
|
||||||
|
|
||||||
|
/// Singleton service that manages TTS generation in the background.
|
||||||
|
/// Survives page navigation — when user leaves and comes back,
|
||||||
|
/// generation continues and result is available.
|
||||||
|
class TTSService extends ChangeNotifier {
|
||||||
|
TTSService._();
|
||||||
|
static final TTSService instance = TTSService._();
|
||||||
|
|
||||||
|
static const String _kServerBase = 'http://localhost:3000';
|
||||||
|
|
||||||
|
// ── Current task state ──
|
||||||
|
bool _isGenerating = false;
|
||||||
|
double _progress = 0.0; // 0.0 ~ 1.0
|
||||||
|
String _statusMessage = '';
|
||||||
|
String? _currentStoryTitle; // Which story is being generated
|
||||||
|
|
||||||
|
// ── Result ──
|
||||||
|
String? _audioUrl;
|
||||||
|
String? _completedStoryTitle; // Which story the audio belongs to
|
||||||
|
bool _justCompleted = false; // Flash animation trigger
|
||||||
|
|
||||||
|
// ── Error ──
|
||||||
|
String? _error;
|
||||||
|
|
||||||
|
// ── Getters ──
|
||||||
|
bool get isGenerating => _isGenerating;
|
||||||
|
double get progress => _progress;
|
||||||
|
String get statusMessage => _statusMessage;
|
||||||
|
String? get currentStoryTitle => _currentStoryTitle;
|
||||||
|
String? get audioUrl => _audioUrl;
|
||||||
|
String? get completedStoryTitle => _completedStoryTitle;
|
||||||
|
bool get justCompleted => _justCompleted;
|
||||||
|
String? get error => _error;
|
||||||
|
|
||||||
|
/// Check if audio is ready for a specific story.
|
||||||
|
bool hasAudioFor(String title) {
|
||||||
|
return _completedStoryTitle == title && _audioUrl != null;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Check if currently generating for a specific story.
|
||||||
|
bool isGeneratingFor(String title) {
|
||||||
|
return _isGenerating && _currentStoryTitle == title;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Clear the "just completed" flag (after flash animation plays).
|
||||||
|
void clearJustCompleted() {
|
||||||
|
_justCompleted = false;
|
||||||
|
notifyListeners();
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Set audio URL directly (e.g. from pre-check).
|
||||||
|
void setExistingAudio(String title, String url) {
|
||||||
|
_completedStoryTitle = title;
|
||||||
|
_audioUrl = url;
|
||||||
|
_justCompleted = false;
|
||||||
|
notifyListeners();
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Check server for existing audio file.
|
||||||
|
Future<void> checkExistingAudio(String title) async {
|
||||||
|
if (title.isEmpty) return;
|
||||||
|
try {
|
||||||
|
final resp = await http.get(
|
||||||
|
Uri.parse(
|
||||||
|
'$_kServerBase/api/tts_check?title=${Uri.encodeComponent(title)}',
|
||||||
|
),
|
||||||
|
);
|
||||||
|
if (resp.statusCode == 200) {
|
||||||
|
final data = jsonDecode(resp.body);
|
||||||
|
if (data['exists'] == true && data['audio_url'] != null) {
|
||||||
|
_completedStoryTitle = title;
|
||||||
|
_audioUrl = '$_kServerBase/${data['audio_url']}';
|
||||||
|
notifyListeners();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
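} catch (e) {
// Best-effort pre-check: log the failure instead of swallowing it silently.
debugPrint('TTS existing-audio check failed: $e');
}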
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Start TTS generation. Safe to call even if page navigates away.
|
||||||
|
Future<void> generate({
|
||||||
|
required String title,
|
||||||
|
required String content,
|
||||||
|
}) async {
|
||||||
|
if (_isGenerating) return;
|
||||||
|
|
||||||
|
_isGenerating = true;
|
||||||
|
_progress = 0.0;
|
||||||
|
_statusMessage = '正在连接...';
|
||||||
|
_currentStoryTitle = title;
|
||||||
|
_audioUrl = null;
|
||||||
|
_completedStoryTitle = null;
|
||||||
|
_justCompleted = false;
|
||||||
|
_error = null;
|
||||||
|
notifyListeners();
|
||||||
|
|
||||||
|
try {
|
||||||
|
final client = http.Client();
|
||||||
|
final request = http.Request(
|
||||||
|
'POST',
|
||||||
|
Uri.parse('$_kServerBase/api/create_tts'),
|
||||||
|
);
|
||||||
|
request.headers['Content-Type'] = 'application/json';
|
||||||
|
request.body = jsonEncode({'title': title, 'content': content});
|
||||||
|
|
||||||
|
final streamed = await client.send(request);
|
||||||
|
|
||||||
|
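// Split on line boundaries first, so a `data: ` line that is cut across
// two network chunks is reassembled before it is parsed.
await for (final chunk in streamed.stream
.transform(utf8.decoder)
.transform(const LineSplitter())) {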
|
||||||
|
for (final line in chunk.split('\n')) {
|
||||||
|
if (!line.startsWith('data: ')) continue;
|
||||||
|
try {
|
||||||
|
final data = jsonDecode(line.substring(6));
|
||||||
|
final stage = data['stage'] as String? ?? '';
|
||||||
|
final message = data['message'] as String? ?? '';
|
||||||
|
|
||||||
|
switch (stage) {
|
||||||
|
case 'connecting':
|
||||||
|
_updateProgress(0.10, '正在连接...');
|
||||||
|
break;
|
||||||
|
case 'generating':
|
||||||
|
_updateProgress(0.30, '语音生成中...');
|
||||||
|
break;
|
||||||
|
case 'saving':
|
||||||
|
_updateProgress(0.88, '正在保存...');
|
||||||
|
break;
|
||||||
|
case 'done':
|
||||||
|
if (data['audio_url'] != null) {
|
||||||
|
_audioUrl = '$_kServerBase/${data['audio_url']}';
|
||||||
|
_completedStoryTitle = title;
|
||||||
|
_justCompleted = true;
|
||||||
|
_updateProgress(1.0, '生成完成');
|
||||||
|
}
|
||||||
|
break;
|
||||||
|
case 'error':
|
||||||
|
throw Exception(message);
|
||||||
|
default:
|
||||||
|
// Progress slowly increases during generation
|
||||||
|
if (_progress < 0.85) {
|
||||||
|
_updateProgress(_progress + 0.02, message);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch (e) {
|
||||||
|
if (e is Exception &&
|
||||||
|
e.toString().contains('语音合成失败')) {
|
||||||
|
rethrow;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
client.close();
|
||||||
|
|
||||||
|
_isGenerating = false;
|
||||||
|
if (_audioUrl == null) {
|
||||||
|
_error = '未获取到音频';
|
||||||
|
_statusMessage = '生成失败';
|
||||||
|
}
|
||||||
|
notifyListeners();
|
||||||
|
} catch (e) {
|
||||||
|
debugPrint('TTS generation error: $e');
|
||||||
|
_isGenerating = false;
|
||||||
|
_progress = 0.0;
|
||||||
|
_error = e.toString();
|
||||||
|
_statusMessage = '生成失败';
|
||||||
|
_justCompleted = false;
|
||||||
|
notifyListeners();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void _updateProgress(double progress, String message) {
|
||||||
|
_progress = progress.clamp(0.0, 1.0);
|
||||||
|
_statusMessage = message;
|
||||||
|
notifyListeners();
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Reset all state (e.g. when switching stories).
|
||||||
|
void reset() {
|
||||||
|
if (_isGenerating) return; // Don't reset during generation
|
||||||
|
_progress = 0.0;
|
||||||
|
_statusMessage = '';
|
||||||
|
_currentStoryTitle = null;
|
||||||
|
_audioUrl = null;
|
||||||
|
_completedStoryTitle = null;
|
||||||
|
_justCompleted = false;
|
||||||
|
_error = null;
|
||||||
|
notifyListeners();
|
||||||
|
}
|
||||||
|
}
|
||||||
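The `generate` method above consumes a stream of `data: {...}` lines from `/api/create_tts`. Network chunks do not have to end on line boundaries, so a parser should buffer text until a full line is available. The sketch below illustrates that buffering in Python under stated assumptions: `iter_sse_events` is a hypothetical helper, the input is any iterable of already-decoded text chunks, and malformed JSON lines are skipped, much as the Dart code ignores most parse errors.

```python
import json


def iter_sse_events(chunks):
    """Yield one parsed JSON payload per complete `data: ` line.

    `chunks` is any iterable of text fragments; a line may be split
    across two or more fragments.
    """
    prefix = "data: "
    buffer = ""

    def parse(line):
        line = line.rstrip("\r")
        if not line.startswith(prefix):
            return None  # blank separators, comments, other fields
        try:
            return json.loads(line[len(prefix):])
        except json.JSONDecodeError:
            return None  # skip malformed payloads

    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently in the buffer.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            event = parse(line)
            if event is not None:
                yield event

    # A final line without a trailing newline is still a complete event.
    event = parse(buffer)
    if event is not None:
        yield event
```

Each yielded dictionary can then be dispatched on its `stage` field (`connecting`, `generating`, `saving`, `done`, `error`), as the Dart `switch` does.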
335
airhub_app/lib/widgets/pill_progress_button.dart
Normal file
@ -0,0 +1,335 @@
|
|||||||
|
import 'dart:math' as math;
|
||||||
|
import 'package:flutter/material.dart';
|
||||||
|
|
||||||
|
enum TTSButtonState {
|
||||||
|
idle,
|
||||||
|
ready,
|
||||||
|
generating,
|
||||||
|
completed,
|
||||||
|
playing,
|
||||||
|
paused,
|
||||||
|
error,
|
||||||
|
}
|
||||||
|
|
||||||
|
class PillProgressButton extends StatefulWidget {
|
||||||
|
final TTSButtonState state;
|
||||||
|
final double progress;
|
||||||
|
final VoidCallback? onTap;
|
||||||
|
final double height;
|
||||||
|
|
||||||
|
const PillProgressButton({
|
||||||
|
super.key,
|
||||||
|
required this.state,
|
||||||
|
this.progress = 0.0,
|
||||||
|
this.onTap,
|
||||||
|
this.height = 48,
|
||||||
|
});
|
||||||
|
|
||||||
|
@override
|
||||||
|
State<PillProgressButton> createState() => _PillProgressButtonState();
|
||||||
|
}
|
||||||
|
|
||||||
|
class _PillProgressButtonState extends State<PillProgressButton>
|
||||||
|
with TickerProviderStateMixin {
|
||||||
|
late AnimationController _progressCtrl;
|
||||||
|
double _displayProgress = 0.0;
|
||||||
|
|
||||||
|
late AnimationController _glowCtrl;
|
||||||
|
late Animation<double> _glowAnim;
|
||||||
|
|
||||||
|
late AnimationController _waveCtrl;
|
||||||
|
|
||||||
|
bool _wasCompleted = false;
|
||||||
|
|
||||||
|
@override
|
||||||
|
void initState() {
|
||||||
|
super.initState();
|
||||||
|
|
||||||
|
_progressCtrl = AnimationController(
|
||||||
|
vsync: this,
|
||||||
|
duration: const Duration(milliseconds: 500),
|
||||||
|
);
|
||||||
|
_progressCtrl.addListener(() => setState(() {}));
|
||||||
|
|
||||||
|
_glowCtrl = AnimationController(
|
||||||
|
vsync: this,
|
||||||
|
duration: const Duration(milliseconds: 1000),
|
||||||
|
);
|
||||||
|
_glowAnim = TweenSequence<double>([
|
||||||
|
TweenSequenceItem(tween: Tween(begin: 0.0, end: 1.0), weight: 35),
|
||||||
|
TweenSequenceItem(tween: Tween(begin: 1.0, end: 0.0), weight: 65),
|
||||||
|
]).animate(CurvedAnimation(parent: _glowCtrl, curve: Curves.easeOut));
|
||||||
|
_glowCtrl.addListener(() => setState(() {}));
|
||||||
|
|
||||||
|
_waveCtrl = AnimationController(
|
||||||
|
vsync: this,
|
||||||
|
duration: const Duration(milliseconds: 800),
|
||||||
|
);
|
||||||
|
|
||||||
|
_syncAnimations();
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
void didUpdateWidget(PillProgressButton oldWidget) {
|
||||||
|
super.didUpdateWidget(oldWidget);
|
||||||
|
|
||||||
|
if (widget.progress != oldWidget.progress) {
|
||||||
|
if (oldWidget.state == TTSButtonState.completed &&
|
||||||
|
(widget.state == TTSButtonState.playing || widget.state == TTSButtonState.ready)) {
|
||||||
|
_displayProgress = 0.0;
|
||||||
|
} else {
|
||||||
|
_animateProgressTo(widget.progress);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (widget.state == TTSButtonState.completed && !_wasCompleted) {
|
||||||
|
_wasCompleted = true;
|
||||||
|
_glowCtrl.forward(from: 0);
|
||||||
|
} else if (widget.state != TTSButtonState.completed) {
|
||||||
|
_wasCompleted = false;
|
||||||
|
}
|
||||||
|
|
||||||
|
_syncAnimations();
|
||||||
|
}
|
||||||
|
|
||||||
|
void _animateProgressTo(double target) {
|
||||||
|
final from = _displayProgress;
|
||||||
|
_progressCtrl.reset();
|
||||||
|
_progressCtrl.addListener(() {
|
||||||
|
final t = Curves.easeInOut.transform(_progressCtrl.value);
|
||||||
|
_displayProgress = from + (target - from) * t;
|
||||||
|
});
|
||||||
|
_progressCtrl.forward();
|
||||||
|
}
|
||||||
|
|
||||||
|
void _syncAnimations() {
|
||||||
|
if (widget.state == TTSButtonState.generating) {
|
||||||
|
if (!_waveCtrl.isAnimating) _waveCtrl.repeat();
|
||||||
|
} else {
|
||||||
|
if (_waveCtrl.isAnimating) {
|
||||||
|
_waveCtrl.stop();
|
||||||
|
_waveCtrl.value = 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
void dispose() {
|
||||||
|
_progressCtrl.dispose();
|
||||||
|
_glowCtrl.dispose();
|
||||||
|
_waveCtrl.dispose();
|
||||||
|
super.dispose();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool get _showBorder =>
|
||||||
|
widget.state == TTSButtonState.generating ||
|
||||||
|
widget.state == TTSButtonState.completed ||
|
||||||
|
widget.state == TTSButtonState.playing ||
|
||||||
|
widget.state == TTSButtonState.paused;
|
||||||
|
|
||||||
|
@override
|
||||||
|
Widget build(BuildContext context) {
|
||||||
|
const borderColor = Color(0xFFE5E7EB);
|
||||||
|
const progressColor = Color(0xFFECCFA8);
|
||||||
|
const bgColor = Color(0xCCFFFFFF);
|
||||||
|
|
||||||
|
return GestureDetector(
|
||||||
|
onTap: widget.state == TTSButtonState.generating ? null : widget.onTap,
|
||||||
|
child: Container(
|
||||||
|
height: widget.height,
|
||||||
|
decoration: BoxDecoration(
|
||||||
|
borderRadius: BorderRadius.circular(widget.height / 2),
|
||||||
|
boxShadow: _glowAnim.value > 0
|
||||||
|
? [
|
||||||
|
BoxShadow(
|
||||||
|
color: progressColor.withOpacity(0.5 * _glowAnim.value),
|
||||||
|
blurRadius: 16 * _glowAnim.value,
|
||||||
|
spreadRadius: 2 * _glowAnim.value,
|
||||||
|
),
|
||||||
|
]
|
||||||
|
: null,
|
||||||
|
),
|
||||||
|
child: CustomPaint(
|
||||||
|
painter: PillBorderPainter(
|
||||||
|
progress: _showBorder ? _displayProgress.clamp(0.0, 1.0) : 0.0,
|
||||||
|
borderColor: borderColor,
|
||||||
|
progressColor: progressColor,
|
||||||
|
radius: widget.height / 2,
|
||||||
|
stroke: _showBorder ? 2.5 : 1.0,
|
||||||
|
bg: bgColor,
|
||||||
|
),
|
||||||
|
child: Center(child: _buildContent()),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
Widget _buildContent() {
|
||||||
|
switch (widget.state) {
|
||||||
|
case TTSButtonState.idle:
|
||||||
|
return _label(Icons.headphones_rounded, '\u6717\u8bfb');
|
||||||
|
case TTSButtonState.generating:
|
||||||
|
return Row(
|
||||||
|
mainAxisAlignment: MainAxisAlignment.center,
|
||||||
|
children: [
|
||||||
|
AnimatedBuilder(
|
||||||
|
animation: _waveCtrl,
|
||||||
|
builder: (context, _) => CustomPaint(
|
||||||
|
size: const Size(20, 18),
|
||||||
|
painter: WavePainter(t: _waveCtrl.value, color: const Color(0xFFC99672)),
|
||||||
|
),
|
||||||
|
),
|
||||||
|
const SizedBox(width: 6),
|
||||||
|
const Text('\u751f\u6210\u4e2d',
|
||||||
|
style: TextStyle(fontSize: 15, fontWeight: FontWeight.w600, color: Color(0xFF4B5563))),
|
||||||
|
],
|
||||||
|
);
|
||||||
|
case TTSButtonState.ready:
|
||||||
|
return _label(Icons.play_arrow_rounded, '\u64ad\u653e');
|
||||||
|
case TTSButtonState.completed:
|
||||||
|
return _label(Icons.play_arrow_rounded, '\u64ad\u653e');
|
||||||
|
case TTSButtonState.playing:
|
||||||
|
return _label(Icons.pause_rounded, '\u6682\u505c');
|
||||||
|
case TTSButtonState.paused:
|
||||||
|
return _label(Icons.play_arrow_rounded, '\u7ee7\u7eed');
|
||||||
|
case TTSButtonState.error:
|
||||||
|
return _label(Icons.refresh_rounded, '\u91cd\u8bd5', isError: true);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Widget _label(IconData icon, String text, {bool isError = false}) {
|
||||||
|
final c = isError ? const Color(0xFFEF4444) : const Color(0xFF4B5563);
|
||||||
|
return Row(
|
||||||
|
mainAxisAlignment: MainAxisAlignment.center,
|
||||||
|
mainAxisSize: MainAxisSize.min,
|
||||||
|
children: [
|
||||||
|
Icon(icon, size: 20, color: c),
|
||||||
|
const SizedBox(width: 4),
|
||||||
|
Text(text, style: TextStyle(fontSize: 16, fontWeight: FontWeight.w600, color: c)),
|
||||||
|
],
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
class PillBorderPainter extends CustomPainter {
|
||||||
|
final double progress;
|
||||||
|
final Color borderColor;
|
||||||
|
final Color progressColor;
|
||||||
|
final double radius;
|
||||||
|
final double stroke;
|
||||||
|
final Color bg;
|
||||||
|
|
||||||
|
PillBorderPainter({
|
||||||
|
required this.progress,
|
||||||
|
required this.borderColor,
|
||||||
|
required this.progressColor,
|
||||||
|
required this.radius,
|
||||||
|
required this.stroke,
|
||||||
|
required this.bg,
|
||||||
|
});
|
||||||
|
|
||||||
|
@override
|
||||||
|
void paint(Canvas canvas, Size size) {
|
||||||
|
final r = radius.clamp(0.0, size.height / 2);
|
||||||
|
final rrect = RRect.fromRectAndRadius(
|
||||||
|
Rect.fromLTWH(0, 0, size.width, size.height),
|
||||||
|
Radius.circular(r),
|
||||||
|
);
|
||||||
|
|
||||||
|
canvas.drawRRect(rrect, Paint()
|
||||||
|
..color = bg
|
||||||
|
..style = PaintingStyle.fill);
|
||||||
|
canvas.drawRRect(rrect, Paint()
|
||||||
|
..color = borderColor
|
||||||
|
..style = PaintingStyle.stroke
|
||||||
|
..strokeWidth = stroke);
|
||||||
|
|
||||||
|
if (progress <= 0.001) return;
|
||||||
|
|
||||||
|
final straightH = size.width - 2 * r;
|
||||||
|
final halfTop = straightH / 2;
|
||||||
|
final arcLen = math.pi * r;
|
||||||
|
final totalLen = halfTop + arcLen + straightH + arcLen + halfTop;
|
||||||
|
final target = totalLen * progress;
|
||||||
|
|
||||||
|
final path = Path();
|
||||||
|
double done = 0;
|
||||||
|
final cx = size.width / 2;
|
||||||
|
|
||||||
|
path.moveTo(cx, 0);
|
||||||
|
var seg = math.min(halfTop, target - done);
|
||||||
|
path.lineTo(cx + seg, 0);
|
||||||
|
done += seg;
|
||||||
|
if (done >= target) { _drawPath(canvas, path); return; }
|
||||||
|
|
||||||
|
seg = math.min(arcLen, target - done);
|
||||||
|
_traceArc(path, size.width - r, r, r, -math.pi / 2, seg / r);
|
||||||
|
done += seg;
|
||||||
|
if (done >= target) { _drawPath(canvas, path); return; }
|
||||||
|
|
||||||
|
seg = math.min(straightH, target - done);
|
||||||
|
path.lineTo(size.width - r - seg, size.height);
|
||||||
|
done += seg;
|
||||||
|
if (done >= target) { _drawPath(canvas, path); return; }
|
||||||
|
|
||||||
|
seg = math.min(arcLen, target - done);
|
||||||
|
_traceArc(path, r, r, r, math.pi / 2, seg / r);
|
||||||
|
done += seg;
|
||||||
|
if (done >= target) { _drawPath(canvas, path); return; }
|
||||||
|
|
||||||
|
seg = math.min(halfTop, target - done);
|
||||||
|
path.lineTo(r + seg, 0);
|
||||||
|
_drawPath(canvas, path);
|
||||||
|
}
|
||||||
|
|
||||||
|
void _drawPath(Canvas canvas, Path path) {
|
||||||
|
canvas.drawPath(path, Paint()
|
||||||
|
..color = progressColor
|
||||||
|
..style = PaintingStyle.stroke
|
||||||
|
..strokeWidth = stroke
|
||||||
|
..strokeCap = StrokeCap.round);
|
||||||
|
}
|
||||||
|
|
||||||
|
void _traceArc(Path p, double cx, double cy, double r, double start, double sweep) {
|
||||||
|
const n = 24;
|
||||||
|
final step = sweep / n;
|
||||||
|
for (int i = 0; i <= n; i++) {
|
||||||
|
final a = start + step * i;
|
||||||
|
p.lineTo(cx + r * math.cos(a), cy + r * math.sin(a));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
bool shouldRepaint(PillBorderPainter old) => old.progress != progress || old.stroke != stroke;
|
||||||
|
}
|
||||||
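`PillBorderPainter.paint` above walks the pill outline in five segments, starting at the top centre: half of the top edge, the right arc, the bottom edge, the left arc, and the remaining half of the top edge. The arithmetic can be checked with a small Python sketch; `locate_on_pill` and the segment names are hypothetical, and the radius is assumed to be `height / 2`, as the widget passes it.

```python
import math


def locate_on_pill(width, height, progress):
    """Return (segment_name, distance_into_segment) for a progress in [0, 1]."""
    r = height / 2
    straight = width - 2 * r          # length of each flat edge
    arc = math.pi * r                 # length of each half-circle end
    segments = [
        ("top-right", straight / 2),  # top centre -> start of right arc
        ("right-arc", arc),
        ("bottom", straight),         # drawn right to left
        ("left-arc", arc),
        ("top-left", straight / 2),   # end of left arc -> top centre
    ]
    total = sum(length for _, length in segments)
    target = total * min(max(progress, 0.0), 1.0)

    for name, length in segments:
        if target <= length:
            return name, target
        target -= length
    # Floating-point rounding at progress == 1.0: report the full last segment.
    return segments[-1]
```

For a 200 x 48 button the outline is about 454.8 logical pixels long, so a progress of 0.5 lands at the midpoint of the bottom edge.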
|
|
||||||
|
class WavePainter extends CustomPainter {
|
||||||
|
final double t;
|
||||||
|
final Color color;
|
||||||
|
WavePainter({required this.t, required this.color});
|
||||||
|
|
||||||
|
@override
|
||||||
|
void paint(Canvas canvas, Size size) {
|
||||||
|
final paint = Paint()
|
||||||
|
..color = color
|
||||||
|
..style = PaintingStyle.fill;
|
||||||
|
final bw = size.width * 0.2;
|
||||||
|
final gap = size.width * 0.1;
|
||||||
|
final tw = 3 * bw + 2 * gap;
|
||||||
|
final sx = (size.width - tw) / 2;
|
||||||
|
for (int i = 0; i < 3; i++) {
|
||||||
|
final phase = t * 2 * math.pi + i * math.pi * 0.7;
|
||||||
|
final hr = 0.3 + 0.7 * ((math.sin(phase) + 1) / 2);
|
||||||
|
final bh = size.height * hr;
|
||||||
|
final x = sx + i * (bw + gap);
|
||||||
|
final y = (size.height - bh) / 2;
|
||||||
|
canvas.drawRRect(
|
||||||
|
RRect.fromRectAndRadius(Rect.fromLTWH(x, y, bw, bh), Radius.circular(bw / 2)),
|
||||||
|
paint,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@override
|
||||||
|
bool shouldRepaint(WavePainter old) => old.t != t || old.color != color;
|
||||||
|
}
|
||||||
@@ -24,8 +24,7 @@
|
|||||||
|
|
||||||
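1. **song_title** (song title)
|
1. **song_title** (song title)
|
||||||
- Use **Chinese**; short and fun, 3-8 characters.
|
- Use **Chinese**; short and fun, 3-8 characters.
|
||||||
- Reflect Kaka's cute style.
|
- Improvise freely from the scene the user describes; do not reuse a fixed template.
|
||||||
- Examples: "温泉咔咔乐", "草地蹦蹦跳", "雨夜安眠曲"
|
|
||||||
|
|
||||||
2. **style** (style description)
|
2. **style** (style description)
|
||||||
- Describe the music style, instruments, rhythm, and mood in **English**.
|
- Describe the music style, instruments, rhythm, and mood in **English**.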
|
||||||
|
|||||||
244
server.py
@@ -2,10 +2,14 @@ import os
|
|||||||
import re
|
import re
|
||||||
import sys
|
import sys
|
||||||
import time
|
import time
|
||||||
|
import uuid
|
||||||
|
import struct
|
||||||
|
import asyncio
|
||||||
import uvicorn
|
import uvicorn
|
||||||
import requests
|
import requests
|
||||||
import json
|
import json
|
||||||
from fastapi import FastAPI, HTTPException
|
import websockets
|
||||||
|
from fastapi import FastAPI, HTTPException, Query
|
||||||
from fastapi.responses import StreamingResponse
|
from fastapi.responses import StreamingResponse
|
||||||
from fastapi.middleware.cors import CORSMiddleware
|
from fastapi.middleware.cors import CORSMiddleware
|
||||||
from pydantic import BaseModel
|
from pydantic import BaseModel
|
||||||
@@ -20,11 +24,15 @@ if sys.platform == "win32":
|
|||||||
load_dotenv()
|
load_dotenv()
|
||||||
MINIMAX_API_KEY = os.getenv("MINIMAX_API_KEY")
|
MINIMAX_API_KEY = os.getenv("MINIMAX_API_KEY")
|
||||||
VOLCENGINE_API_KEY = os.getenv("VOLCENGINE_API_KEY")
|
VOLCENGINE_API_KEY = os.getenv("VOLCENGINE_API_KEY")
|
||||||
|
TTS_APP_ID = os.getenv("TTS_APP_ID")
|
||||||
|
TTS_ACCESS_TOKEN = os.getenv("TTS_ACCESS_TOKEN")
|
||||||
|
|
||||||
if not MINIMAX_API_KEY:
|
if not MINIMAX_API_KEY:
|
||||||
print("Warning: MINIMAX_API_KEY not found in .env")
|
print("Warning: MINIMAX_API_KEY not found in .env")
|
||||||
if not VOLCENGINE_API_KEY:
|
if not VOLCENGINE_API_KEY:
|
||||||
print("Warning: VOLCENGINE_API_KEY not found in .env")
|
print("Warning: VOLCENGINE_API_KEY not found in .env")
|
||||||
|
if not TTS_APP_ID or not TTS_ACCESS_TOKEN:
|
||||||
|
print("Warning: TTS_APP_ID or TTS_ACCESS_TOKEN not found in .env")
|
||||||
|
|
||||||
# Initialize FastAPI
|
# Initialize FastAPI
|
||||||
app = FastAPI()
|
app = FastAPI()
|
||||||
@@ -606,14 +614,244 @@ def get_playlist():
|
|||||||
return {"playlist": playlist}
|
return {"playlist": playlist}
|
||||||
|
|
||||||
|
|
||||||
# ── Static file serving for generated music ──
|
# ═══════════════════════════════════════════════════════════════════
|
||||||
|
# ── TTS: Doubao speech synthesis, WebSocket V1 binary protocol ──
|
||||||
|
# ═══════════════════════════════════════════════════════════════════
|
||||||
|
|
||||||
|
TTS_WS_URL = "wss://openspeech.bytedance.com/api/v1/tts/ws_binary"
|
||||||
|
TTS_CLUSTER = "volcano_tts"
|
||||||
|
TTS_SPEAKER = "ICL_zh_female_keainvsheng_tob"
|
||||||
|
|
||||||
|
_audio_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara audio")
|
||||||
|
os.makedirs(_audio_dir, exist_ok=True)
|
||||||
|
|
||||||
|
|
||||||
|
def _build_tts_v1_request(payload_json: dict) -> bytes:
|
||||||
|
"""Build a V1 full-client-request binary frame.
|
||||||
|
Header: 0x11 0x10 0x10 0x00 (v1, 4-byte header, full-client-request, JSON, no compression)
|
||||||
|
Then 4-byte big-endian payload length, then JSON payload bytes.
|
||||||
|
"""
|
||||||
|
payload_bytes = json.dumps(payload_json, ensure_ascii=False).encode("utf-8")
|
||||||
|
header = bytes([0x11, 0x10, 0x10, 0x00])
|
||||||
|
length = struct.pack(">I", len(payload_bytes))
|
||||||
|
return header + length + payload_bytes
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_tts_v1_response(data: bytes):
|
||||||
|
"""Parse a V1 TTS response binary frame.
|
||||||
|
Returns (audio_bytes_or_none, is_last, is_error, error_msg).
|
||||||
|
"""
|
||||||
|
if len(data) < 4:
|
||||||
|
return None, False, True, "Frame too short"
|
||||||
|
|
||||||
|
byte1 = data[1]
|
||||||
|
msg_type = (byte1 >> 4) & 0x0F
|
||||||
|
msg_flags = byte1 & 0x0F
|
||||||
|
|
||||||
|
# Error frame: msg_type = 0xF
|
||||||
|
if msg_type == 0x0F:
|
||||||
|
offset = 4
|
||||||
|
error_code = 0
|
||||||
|
if len(data) >= offset + 4:
|
||||||
|
error_code = struct.unpack(">I", data[offset:offset + 4])[0]
|
||||||
|
offset += 4
|
||||||
|
if len(data) >= offset + 4:
|
||||||
|
msg_len = struct.unpack(">I", data[offset:offset + 4])[0]
|
||||||
|
offset += 4
|
||||||
|
error_msg = data[offset:offset + msg_len].decode("utf-8", errors="replace")
|
||||||
|
else:
|
||||||
|
error_msg = f"error code {error_code}"
|
||||||
|
print(f"[TTS Error] code={error_code}, msg={error_msg}", flush=True)
|
||||||
|
return None, False, True, error_msg
|
||||||
|
|
||||||
|
# Audio-only response: msg_type = 0xB
|
||||||
|
if msg_type == 0x0B:
|
||||||
|
# flags: 0b0000=no seq, 0b0001=seq>0, 0b0010/0b0011=last (seq<0)
|
||||||
|
is_last = (msg_flags & 0x02) != 0 # bit 1 set = last message
|
||||||
|
offset = 4
|
||||||
|
|
||||||
|
# If flags != 0, there's a 4-byte sequence number
|
||||||
|
if msg_flags != 0:
|
||||||
|
offset += 4 # skip sequence number
|
||||||
|
|
||||||
|
if len(data) < offset + 4:
|
||||||
|
return None, is_last, False, ""
|
||||||
|
|
||||||
|
payload_size = struct.unpack(">I", data[offset:offset + 4])[0]
|
||||||
|
offset += 4
|
||||||
|
audio_data = data[offset:offset + payload_size]
|
||||||
|
return audio_data, is_last, False, ""
|
||||||
|
|
||||||
|
# Server response with JSON (msg_type = 0x9): usually contains metadata
|
||||||
|
if msg_type == 0x09:
|
||||||
|
offset = 4
|
||||||
|
if len(data) >= offset + 4:
|
||||||
|
payload_size = struct.unpack(">I", data[offset:offset + 4])[0]
|
||||||
|
offset += 4
|
||||||
|
json_str = data[offset:offset + payload_size].decode("utf-8", errors="replace")
|
||||||
|
print(f"[TTS] Server JSON: {json_str[:200]}", flush=True)
|
||||||
|
return None, False, False, ""
|
||||||
|
|
||||||
|
return None, False, False, ""
|
||||||
|
|
||||||
|
|
||||||
|
async def tts_synthesize(text: str) -> bytes:
|
||||||
|
"""Connect to Doubao TTS V1 WebSocket and synthesize text to MP3 bytes."""
|
||||||
|
headers = {
|
||||||
|
"Authorization": f"Bearer;{TTS_ACCESS_TOKEN}",
|
||||||
|
}
|
||||||
|
|
||||||
|
payload = {
|
||||||
|
"app": {
|
||||||
|
"appid": TTS_APP_ID,
|
||||||
|
"token": "placeholder",
|
||||||
|
"cluster": TTS_CLUSTER,
|
||||||
|
},
|
||||||
|
"user": {
|
||||||
|
"uid": "airhub_user",
|
||||||
|
},
|
||||||
|
"audio": {
|
||||||
|
"voice_type": TTS_SPEAKER,
|
||||||
|
"encoding": "mp3",
|
||||||
|
"speed_ratio": 1.0,
|
||||||
|
"rate": 24000,
|
||||||
|
},
|
||||||
|
"request": {
|
||||||
|
"reqid": str(uuid.uuid4()),
|
||||||
|
"text": text,
|
||||||
|
"operation": "submit", # streaming mode
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
audio_buffer = bytearray()
|
||||||
|
request_frame = _build_tts_v1_request(payload)
|
||||||
|
|
||||||
|
print(f"[TTS] Connecting to V1 WebSocket... text length={len(text)}", flush=True)
|
||||||
|
|
||||||
|
async with websockets.connect(
|
||||||
|
TTS_WS_URL,
|
||||||
|
extra_headers=headers,
|
||||||
|
max_size=10 * 1024 * 1024, # 10MB max frame
|
||||||
|
ping_interval=None,
|
||||||
|
) as ws:
|
||||||
|
# Send request
|
||||||
|
await ws.send(request_frame)
|
||||||
|
print("[TTS] Request sent, waiting for audio...", flush=True)
|
||||||
|
|
||||||
|
# Receive audio chunks
|
||||||
|
chunk_count = 0
|
||||||
|
async for message in ws:
|
||||||
|
if isinstance(message, bytes):
|
||||||
|
audio_data, is_last, is_error, error_msg = _parse_tts_v1_response(message)
|
||||||
|
|
||||||
|
if is_error:
|
||||||
|
raise RuntimeError(f"TTS error: {error_msg}")
|
||||||
|
|
||||||
|
if audio_data and len(audio_data) > 0:
|
||||||
|
audio_buffer.extend(audio_data)
|
||||||
|
chunk_count += 1
|
||||||
|
|
||||||
|
if is_last:
|
||||||
|
print(f"[TTS] Last frame received. chunks={chunk_count}, "
|
||||||
|
f"audio size={len(audio_buffer)} bytes", flush=True)
|
||||||
|
break
|
||||||
|
|
||||||
|
return bytes(audio_buffer)
|
||||||
|
|
||||||
|
|
||||||
|
class TTSRequest(BaseModel):
|
||||||
|
title: str
|
||||||
|
content: str
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/tts_check")
|
||||||
|
def tts_check(title: str = Query(...)):
|
||||||
|
"""Check if audio already exists for a story title."""
|
||||||
|
for f in os.listdir(_audio_dir):
|
||||||
|
if f.lower().endswith(".mp3"):
|
||||||
|
# Match by title prefix (before timestamp)
|
||||||
|
name = f[:-4] # strip .mp3
|
||||||
|
name_without_ts = re.sub(r'_\d{10,}$', '', name)
|
||||||
|
if name_without_ts == re.sub(r'[<>:"/\\|?*]', '', title)[:50] or name == title:  # compare against the sanitized title used when saving
|
||||||
|
return {
|
||||||
|
"exists": True,
|
||||||
|
"audio_url": f"Capybara audio/{f}",
|
||||||
|
}
|
||||||
|
return {"exists": False, "audio_url": None}
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/api/create_tts")
|
||||||
|
def create_tts(req: TTSRequest):
|
||||||
|
"""Generate TTS audio for a story. Returns SSE stream with progress."""
|
||||||
|
|
||||||
|
def event_stream():
|
||||||
|
|
||||||
|
yield sse_event({"stage": "connecting", "progress": 10,
|
||||||
|
"message": "正在连接语音合成服务..."})
|
||||||
|
|
||||||
|
# Check if audio already exists
|
||||||
|
for f in os.listdir(_audio_dir):
|
||||||
|
if f.lower().endswith(".mp3"):
|
||||||
|
name = f[:-4]
|
||||||
|
name_without_ts = re.sub(r'_\d{10,}$', '', name)
|
||||||
|
if name_without_ts == re.sub(r'[<>:"/\\|?*]', '', req.title)[:50]:  # same sanitization as safe_title below
|
||||||
|
yield sse_event({"stage": "done", "progress": 100,
|
||||||
|
"message": "语音已存在",
|
||||||
|
"audio_url": f"Capybara audio/{f}"})
|
||||||
|
return
|
||||||
|
|
||||||
|
yield sse_event({"stage": "generating", "progress": 30,
|
||||||
|
"message": "AI 正在朗读故事..."})
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Run async TTS; asyncio.run() creates the event loop and closes it even if synthesis fails
|
||||||
|
audio_bytes = asyncio.run(tts_synthesize(req.content))
|
||||||
|
|
||||||
|
if not audio_bytes or len(audio_bytes) < 100:
|
||||||
|
yield sse_event({"stage": "error", "progress": 0,
|
||||||
|
"message": "语音合成返回了空音频"})
|
||||||
|
return
|
||||||
|
|
||||||
|
yield sse_event({"stage": "saving", "progress": 80,
|
||||||
|
"message": "正在保存音频..."})
|
||||||
|
|
||||||
|
# Save audio file
|
||||||
|
timestamp = int(time.time())
|
||||||
|
safe_title = re.sub(r'[<>:"/\\|?*]', '', req.title)[:50]
|
||||||
|
filename = f"{safe_title}_{timestamp}.mp3"
|
||||||
|
filepath = os.path.join(_audio_dir, filename)
|
||||||
|
|
||||||
|
with open(filepath, "wb") as f:
|
||||||
|
f.write(audio_bytes)
|
||||||
|
|
||||||
|
print(f"[TTS Saved] {filepath} ({len(audio_bytes)} bytes)", flush=True)
|
||||||
|
|
||||||
|
yield sse_event({"stage": "done", "progress": 100,
|
||||||
|
"message": "语音生成完成!",
|
||||||
|
"audio_url": f"Capybara audio/{filename}"})
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
print(f"[TTS Error] {e}", flush=True)
|
||||||
|
yield sse_event({"stage": "error", "progress": 0,
|
||||||
|
"message": f"语音合成失败: {str(e)}"})
|
||||||
|
|
||||||
|
return StreamingResponse(event_stream(), media_type="text/event-stream")
|
||||||
|
|
||||||
|
|
||||||
|
# ── Static file serving ──
|
||||||
from fastapi.staticfiles import StaticFiles
|
from fastapi.staticfiles import StaticFiles
|
||||||
|
|
||||||
# Create music directory if it doesn't exist
|
# Music directory
|
||||||
_music_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara music")
|
_music_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara music")
|
||||||
os.makedirs(_music_dir, exist_ok=True)
|
os.makedirs(_music_dir, exist_ok=True)
|
||||||
app.mount("/Capybara music", StaticFiles(directory=_music_dir), name="music_files")
|
app.mount("/Capybara music", StaticFiles(directory=_music_dir), name="music_files")
|
||||||
|
|
||||||
|
# Audio directory (TTS generated)
|
||||||
|
app.mount("/Capybara audio", StaticFiles(directory=_audio_dir), name="audio_files")
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
print("[Server] Music Server running on http://localhost:3000")
|
print("[Server] Music Server running on http://localhost:3000")
|
||||||
|
|||||||
@@ -3,7 +3,7 @@
|
|||||||
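> **Purpose**: Update this file before each conversation ends or after finishing a stage.
|
> **Purpose**: Update this file before each conversation ends or after finishing a stage.
|
||||||
> When a new conversation starts, the AI reads this file first to restore context.
|
> When a new conversation starts, the AI reads this file first to restore context.
|
||||||
>
|
>
|
||||||
> **Last updated**: 2026-02-09 (eighth conversation)
|
> **Last updated**: 2026-02-10 (ninth conversation)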
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -155,9 +155,47 @@
|
|||||||
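- **Cover distinction**: preset stories show a cover image; AI-generated stories show a light-purple gradient "no cover yet" placeholder
|
- **Cover distinction**: preset stories show a cover image; AI-generated stories show a light-purple gradient "no cover yet" placeholder
|
||||||
- **Garbled-name filtering**: the API layer automatically skips abnormal files whose titles contain no Chinese characters
|
- **Garbled-name filtering**: the API layer automatically skips abnormal files whose titles contain no Chinese characters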
|
||||||
|
|
||||||
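### In progress
|
### Work completed in the ninth conversation (2026-02-10)
|
||||||
- TTS speech synthesis to be integrated later (after the user enables the Volcengine speech service)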
|
|
||||||
|
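#### End-to-end TTS speech synthesis integration (completed in the previous conversation, recorded here retroactively)
|
||||||
|
- **Backend**: `server.py` adds the `/api/create_tts` and `/api/tts_check` endpoints; synthesis streams from the Doubao TTS V1 API over WebSocket
|
||||||
|
- **Voice**: cute girl (`ICL_zh_female_keainvsheng_tob`)
|
||||||
|
- **Frontend component**: `PillProgressButton` (pill-shaped progress button) replaces the old RingProgressButton
|
||||||
|
- 7 states: idle / ready / generating / completed / playing / paused / error
|
||||||
|
- Progress-ring animation + sound-wave animation + glow effect
|
||||||
|
- **TTSService singleton**: keeps running in the background; switching pages does not interrupt generation
|
||||||
|
- **Audio storage**: generated TTS audio is saved to the `Capybara audio/` directory
|
||||||
|
- **Pause/resume fix**: explicitly seek to the paused position before calling play, which fixes the bug where playback restarted from the beginning on Web
|
||||||
|
- **Button state fix**: added a `ready` state so that audio that has never been played shows "Play" instead of "Resume"
|
||||||
|
- **Autoplay control**: autoplay only while the user is still on the story page; no autoplay after leaving the page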
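
The V1 binary framing used by the backend can be sketched as follows. This is a minimal illustration, not the code in `server.py`: the names `build_request_frame` and `parse_audio_frame` are hypothetical, the header bytes follow the docstring in `server.py`, and the demo frame is hand-built rather than captured from the real service.

```python
import json
import struct

# Header bytes as described in the server.py docstring: protocol v1, 4-byte header,
# full-client-request, JSON serialization, no compression.
V1_REQUEST_HEADER = bytes([0x11, 0x10, 0x10, 0x00])


def build_request_frame(payload: dict) -> bytes:
    """Header + 4-byte big-endian payload length + UTF-8 JSON payload."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return V1_REQUEST_HEADER + struct.pack(">I", len(body)) + body


def parse_audio_frame(frame: bytes):
    """Return (audio_bytes, is_last) for an audio-only frame, else (None, False)."""
    if len(frame) < 4:
        return None, False
    msg_type = (frame[1] >> 4) & 0x0F
    flags = frame[1] & 0x0F
    if msg_type != 0x0B:  # 0xB marks an audio-only server response
        return None, False
    is_last = bool(flags & 0x02)  # bit 1 set means this is the last chunk
    offset = 4
    if flags != 0:  # a 4-byte sequence number follows the header
        offset += 4
    if len(frame) < offset + 4:
        return None, is_last
    (size,) = struct.unpack(">I", frame[offset:offset + 4])
    offset += 4
    return frame[offset:offset + size], is_last


# Hand-built demo frame (an assumption for illustration): final chunk, sequence -1.
demo_audio = b"\x01\x02\x03"
demo_frame = (
    bytes([0x11, 0xB3, 0x00, 0x00])
    + struct.pack(">i", -1)
    + struct.pack(">I", len(demo_audio))
    + demo_audio
)
audio, is_last = parse_audio_frame(demo_frame)
```

Keeping the builder and parser as pure functions over `bytes` makes them easy to check without opening a WebSocket connection.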
|
||||||
|
|
||||||
|
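#### Music director prompt optimization
|
||||||
|
- **No more repeated song titles**: removed the fixed examples ("温泉咔咔乐" etc.) and replaced them with "improvise based on the scene; do not reuse a fixed template"
|
||||||
|
- **Effect**: the AI generates a different song title each time for similar scenes, so the record shelf no longer fills up with identically named songs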
|
||||||
|
|
||||||
|
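#### Record shelf playback-state visualization
|
||||||
|
- **Card highlight**: the card of the currently playing song gets a warm gold background + gold border + shadow
|
||||||
|
- **Title marker**: the playing song's title gets a small speaker icon in front and bold gold text
|
||||||
|
- **Sound-wave animation**: a bouncing sound-wave CustomPaint animation is overlaid on the center of the playing record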
|
||||||
|
|
||||||
|
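#### Bubble keeps showing the current song title
|
||||||
|
- During playback the bubble always shows "Now playing: xxx" instead of disappearing after 3 seconds
|
||||||
|
- Tapping the play button directly (rather than picking a song from the record shelf) also shows the song title
|
||||||
|
- The bubble hides automatically on pause and updates automatically when the song changes
|
||||||
|
- Uses the `_playStickyText` mechanism, so the playback info is restored even after other temporary messages pop up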
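
The sticky-text behaviour described in this list can be modelled as a small state holder. The sketch below is written in Python for illustration only; the app itself is Flutter, and `BubbleState` and its method names are hypothetical. Only the idea of a sticky text that survives temporary messages comes from the notes.

```python
from typing import Optional


class BubbleState:
    """Tracks a sticky 'now playing' text plus an optional temporary message."""

    def __init__(self) -> None:
        self.sticky_text: Optional[str] = None  # plays the role of _playStickyText
        self.temp_text: Optional[str] = None

    def on_play(self, song_title: str) -> None:
        # Starting or switching a song updates the sticky text.
        self.sticky_text = f"Now playing: {song_title}"

    def on_pause(self) -> None:
        # Pausing clears the sticky text, so the bubble hides.
        self.sticky_text = None

    def show_temporary(self, message: str) -> None:
        self.temp_text = message

    def on_temporary_expired(self) -> None:
        self.temp_text = None

    @property
    def visible_text(self) -> Optional[str]:
        # A temporary message wins while active; otherwise fall back to the sticky text.
        return self.temp_text if self.temp_text is not None else self.sticky_text
```

A `visible_text` of `None` corresponds to the hidden bubble.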
|
||||||
|
|
||||||
|
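#### Research on AI music generation platforms
|
||||||
|
- Compared MiniMax Music 2.5 (currently used), Mureka (Kunlun Wanwei), Tianpuyue (天谱乐), and ACE-Step
|
||||||
|
- Found that Mureka has a China-site API (platform.mureka.cn); it is reportedly rated above Suno V4 in quality evaluations
|
||||||
|
- The Muse AI app used by the user's friend is built on the Mureka model
|
||||||
|
- The MiniMax text model (abab6.5s-chat) is relatively expensive; switching to Doubao is worth considering
|
||||||
|
- Lyric generation is very cheap (about 0.005 yuan per call); the main cost is music generation (1 yuan per song)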
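
The cost split in the last bullet works out as below. This is a rough sketch using only the two figures quoted in the notes; the assumption of one lyric call per song and the function names are illustrative.

```python
LYRICS_COST_YUAN = 0.005  # per lyric-generation call (figure from the notes)
MUSIC_COST_YUAN = 1.0     # per generated song (figure from the notes)


def cost_per_song(lyric_calls: int = 1) -> float:
    """Total cost of one song: lyric calls plus one music generation."""
    return lyric_calls * LYRICS_COST_YUAN + MUSIC_COST_YUAN


def music_share(lyric_calls: int = 1) -> float:
    """Fraction of the per-song cost that goes to music generation."""
    return MUSIC_COST_YUAN / cost_per_song(lyric_calls)
```

With one lyric call per song, music generation accounts for well over 99% of the cost, so a cheaper lyric LLM barely changes the total.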
|
||||||
|
|
||||||
|
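### In progress / to do
|
||||||
- Story cover approach still undecided (paid generation or free generation)
|
- Story cover approach still undecided (paid generation or free generation)
|
||||||
|
- Considering switching music generation from MiniMax to Mureka (the user is evaluating)
|
||||||
|
- Considering switching the lyric-generation LLM from MiniMax abab6.5s-chat to Doubao (cheaper)
|
||||||
|
- Long song-title fallback issue: when the LLM returns an empty song_title, the user's raw input is used as the title; can be improved later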
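
One possible direction for the song-title fallback is to truncate the user's input instead of using it whole. This is a hedged sketch, not the current behaviour of `server.py`: `pick_song_title`, `MAX_TITLE_CHARS`, and the `"Untitled"` default are hypothetical, and the 8-character limit is taken from the 3-8 character rule in the prompt.

```python
from typing import Optional

MAX_TITLE_CHARS = 8  # the prompt asks for titles of 3-8 characters


def pick_song_title(llm_title: Optional[str], user_input: Optional[str],
                    default: str = "Untitled") -> str:
    """Prefer the LLM title; otherwise fall back to a truncated user input."""
    title = (llm_title or "").strip()
    if title:
        return title
    fallback = (user_input or "").strip()
    if not fallback:
        return default
    # Cut the raw input down so the record shelf never shows a sentence-long title.
    return fallback[:MAX_TITLE_CHARS]
```

For example, `pick_song_title("", "咔咔在温泉里泡澡听着雨声睡着了")` keeps only the first eight characters of the input.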
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||