# Volcengine Websocket Bidirectional Streaming Chat

## Example Overview

This example implements the Coze intelligent voice chat Websocket OpenAPI. You talk to the agent intercom-style, using a button on the board.

## Creating the Example

### IDF Default Branch

This example supports the IDF release/v5.4 branch and later.

### Prerequisites

First, apply for an `Access token` and a `BOT ID` by following the [Coze documentation](https://bytedance.larkoffice.com/docx/Da6qd87pQodvNrxdFYrcnzMxnsh).
For more details on the Websocket API, see [Bidirectional streaming chat events](https://www.coze.cn/open/docs/developer_guides/streaming_chat_event).

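As a rough illustration of how the two credentials are used, the sketch below builds the chat URL in the same shape as the `wss_url` line printed in this example's startup log, plus a header line carrying the token. The helper names `coze_build_chat_url` and `coze_build_auth_header` are hypothetical, and sending the token as `Authorization: Bearer <token>` is an assumption to verify against the Coze documentation; the example's own source may do this differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Base endpoint, copied from the wss_url line in this example's log. */
#define COZE_CHAT_ENDPOINT "ws://ws.coze.cn/v1/chat"

/* Hypothetical helper: writes "<endpoint>?bot_id=<bot_id>" into out.
 * Returns 0 on success, -1 if bot_id is empty or the buffer is too small. */
int coze_build_chat_url(char *out, size_t out_len, const char *bot_id)
{
    assert(out != NULL && bot_id != NULL);
    if (strlen(bot_id) == 0) {
        return -1;
    }
    int n = snprintf(out, out_len, "%s?bot_id=%s", COZE_CHAT_ENDPOINT, bot_id);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Hypothetical helper: formats one CRLF-terminated header line.
 * Assumption: the access token is sent as a Bearer token. */
int coze_build_auth_header(char *out, size_t out_len, const char *access_token)
{
    assert(out != NULL && access_token != NULL);
    if (strlen(access_token) == 0) {
        return -1;
    }
    int n = snprintf(out, out_len, "Authorization: Bearer %s\r\n", access_token);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Both helpers report truncation instead of silently producing a cut-off URL or header, which is easier to debug than a failed handshake.
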
### Configuration

1. Enter the `Access token` and `BOT ID` you obtained under `Menuconfig->Example Configuration`.
2. Enter your Wi-Fi credentials under `Menuconfig->Example Configuration`.

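Values entered in menuconfig reach the firmware as `CONFIG_*` macros generated into `sdkconfig.h`. The sketch below shows one way to collect them and check for missing entries before connecting. The macro names are hypothetical placeholders, not the names defined by this example's Kconfig, and the empty fallbacks exist only so the sketch compiles outside an ESP-IDF build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option names, NOT taken from this example's Kconfig.
 * In a real build sdkconfig.h defines the actual macros. */
#ifndef CONFIG_EXAMPLE_ACCESS_TOKEN
#define CONFIG_EXAMPLE_ACCESS_TOKEN ""
#endif
#ifndef CONFIG_EXAMPLE_BOT_ID
#define CONFIG_EXAMPLE_BOT_ID ""
#endif
#ifndef CONFIG_EXAMPLE_WIFI_SSID
#define CONFIG_EXAMPLE_WIFI_SSID ""
#endif
#ifndef CONFIG_EXAMPLE_WIFI_PASSWORD
#define CONFIG_EXAMPLE_WIFI_PASSWORD ""
#endif

typedef struct {
    const char *access_token;
    const char *bot_id;
    const char *wifi_ssid;
    const char *wifi_password; /* may be empty for an open network */
} example_settings_t;

/* Gathers the compile-time configuration into one struct. */
example_settings_t example_settings_from_kconfig(void)
{
    example_settings_t s = {
        .access_token = CONFIG_EXAMPLE_ACCESS_TOKEN,
        .bot_id = CONFIG_EXAMPLE_BOT_ID,
        .wifi_ssid = CONFIG_EXAMPLE_WIFI_SSID,
        .wifi_password = CONFIG_EXAMPLE_WIFI_PASSWORD,
    };
    return s;
}

static int is_blank(const char *v)
{
    return v == NULL || strlen(v) == 0;
}

/* Counts required settings that were left empty (password is optional). */
int example_settings_missing(const example_settings_t *s)
{
    assert(s != NULL);
    return is_blank(s->access_token) + is_blank(s->bot_id) + is_blank(s->wifi_ssid);
}
```

Calling `example_settings_missing()` early in `app_main()` and logging a clear error is one way to catch a forgotten menuconfig step before the board tries to join Wi-Fi.
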
### Build and Flash

Before building this example, make sure the ESP-IDF environment is set up. If it is, skip to the next step. If not, run the following scripts from the ESP-IDF root directory to set up the build environment. For the complete steps to configure and use ESP-IDF, see the [ESP-IDF Programming Guide](https://docs.espressif.com/projects/esp-idf/zh_CN/latest/esp32s3/index.html).

```
./install.sh
. ./export.sh
```

- Select the target chip, using esp32s3 as an example:

```
idf.py set-target esp32s3
```

- Build the example:

```
idf.py build
```

- Flash the program and run the monitor tool to view the serial output (replace PORT with your port name):

```
idf.py -p PORT flash monitor
```

- To exit the monitor, press ``Ctrl-]``

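## How to Use the Example

### Features and Usage

- After the example starts running, a log like the following indicates that the connection to the server has been established and you can start talking:
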
```
I (931) main: Initialize board peripherals
W (932) i2c_bus_v2: I2C master handle is NULL, will create new one
I (941) DRV8311: ES8311 in Slave mode
I (958) ES7210: ES7210 in Slave mode
I (967) ES7210: Enable ES7210_INPUT_MIC1
I (970) ES7210: Enable ES7210_INPUT_MIC2
I (972) ES7210: Enable ES7210_INPUT_MIC3
W (976) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (980) ES7210: config fmt 60
I (982) AUDIO_HAL: Codec mode is 3, Ctrl:1
I (990) pp: pp rom version: e7ae62f
I (990) net80211: net80211 rom version: e7ae62f
I (990) AUDIO_THREAD: The esp_periph task allocate stack on internal memory
I (991) wifi:wifi driver task: 3fcee994, prio:23, stack:6656, core=0
I (1001) wifi:wifi firmware version: 21fc8af6de
I (1003) wifi:wifi certification version: v7.0
I (1007) wifi:config NVS flash: enabled
I (1011) wifi:config nano formatting: disabled
I (1015) wifi:Init data frame dynamic rx buffer num: 32
I (1020) wifi:Init static rx mgmt buffer num: 5
I (1024) wifi:Init management short buffer num: 32
I (1029) wifi:Init dynamic tx buffer num: 32
I (1033) wifi:Init static tx FG buffer num: 2
I (1037) wifi:Init static rx buffer size: 1600
I (1041) wifi:Init static rx buffer num: 16
I (1045) wifi:Init dynamic rx buffer num: 32
I (1049) wifi_init: rx ba win: 16
I (1052) wifi_init: accept mbox: 32
I (1055) wifi_init: tcpip mbox: 32
I (1058) wifi_init: udp mbox: 6
I (1061) wifi_init: tcp mbox: 32
I (1064) wifi_init: tcp tx win: 65535
I (1067) wifi_init: tcp rx win: 32768
I (1071) wifi_init: tcp mss: 1440
I (1074) wifi_init: WiFi IRAM OP enabled
I (1078) wifi_init: WiFi RX IRAM OP enabled
W (1082) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
I (1090) wifi:Set ps type: 1, coexist: 0

I (1094) phy_init: phy_version 701,f4f1da3a,Mar 3 2025,15:50:10
I (1155) wifi:mode : sta (74:4d:bd:9d:b6:30)
I (1155) wifi:enable tsf
W (1155) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2446) wifi:new:<11,0>, old:<1,0>, ap:<255,255>, sta:<11,0>, prof:1, snd_ch_cfg:0x0
I (2446) wifi:state: init -> auth (0xb0)
W (2447) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2462) wifi:state: auth -> assoc (0x0)
I (2468) wifi:state: assoc -> run (0x10)
I (2485) wifi:connected with xtworks, aid = 88, channel 11, BW20, bssid = ec:56:23:e9:7e:f0
I (2486) wifi:security: WPA2-PSK, phy: bgn, rssi: -40
I (2488) wifi:pm start, type: 1

I (2490) wifi:dp: 1, bi: 102400, li: 3, scale listen interval from 307200 us to 307200 us
I (2498) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 25000, mt_pti: 0, mt_time: 10000
W (2508) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (2532) wifi:AP's beacon interval = 102400 us, DTIM period = 1
I (3571) esp_netif_handlers: sta ip: 192.168.3.7, mask: 255.255.255.0, gw: 192.168.3.1
I (3571) PERIPH_WIFI: Got ip:192.168.3.7
I (3571) coze_chat: wss_url: ws://ws.coze.cn/v1/chat?bot_id=74800678********
I (3579) AUDIO_THREAD: The coze_data_pull_task task allocate stack on external memory
I (3586) AUDIO_THREAD: The coze_data_push_task task allocate stack on external memory
I (3594) coze_chat: WEBSOCKET_EVENT_BEGIN
I (3598) websocket_client: Started
I (3600) coze_chat: Wait for websocket connected
I (4104) coze_chat: Wait for websocket connected
I (4192) coze_chat: WEBSOCKET_EVENT_CONNECTED
I (5215) wifi:<ba-add>idx:0 (ifx:0, ec:56:23:e9:7e:f0), tid:6, ssn:2665, winSize:64
I (5265) coze_chat: Request conversation_id : 7491159*********
I (5266) coze_chat: WS connected
I (5266) coze_chat: WS updata chat
I (5268) coze_chat: Update chat: {
    "id": "15deeac7-bc4c-731a-c81e-2585128c3557",
    "event_type": "chat.update",
    "data": {
        "chat_config": {
            "auto_save_history": true,
            "conversation_id": "7491159*********",
            "user_id": "userid_123",
            "meta_data": {
            },
            "custom_variables": {
            },
            "extra_params": {
            },
            "parameters": {
                "custom_var_1": "测试"
            }
        },
        "input_audio": {
            "format": "pcm",
            "codec": "pcm",
            "sample_rate": 16000,
            "channel": 1,
            "bit_depth": 16
        },
        "turn_detection": {
            "type": "server_vad",
            "prefix_padding_ms": 600,
            "silence_duration_ms": 500
        },
        "output_audio": {
            "codec": "opus",
            "opus_config": {
                "bitrate": 16000,
                "frame_size_ms": 60,
                "limit_config": {
                    "period": 1,
                    "max_frame_num": 18
                }
            },
            "speech_rate": 20,
            "voice_id": "7426720361733144585"
        },
        "event_subscriptions": ["conversation.audio.delta", "conversation.chat.completed", "input_audio_buffer.speech_started", "input_audio_buffer.speech_stopped", "chat.created", "error"]
    }
}
I (5365) MODEL_LOADER: The storage free size is 23104 KB
I (5367) MODEL_LOADER: The partition size is 5168 KB
I (5372) MODEL_LOADER: Successfully load srmodels
I (5376) ALGORITHM_STREAM: Load: wn9_hilexin
I (5381) AUDIO_PIPELINE: link el->rb, el:0x3c17e144, tag:algo_stream, rb:0x3c17e3d0
I (5389) AUDIO_PIPELINE: link el->rb, el:0x3c185108, tag:raw_stream, rb:0x3c185538
I (5395) AUDIO_PIPELINE: link el->rb, el:0x3c185250, tag:raw_opus, rb:0x3c187580
I (5402) AUDIO_THREAD: The algo_stream task allocate stack on external memory
I (5409) AUDIO_ELEMENT: [algo_stream-0x3c17e144] Element task created
I (5415) AUDIO_ELEMENT: [raw_read-0x3c17e274] Element task created
I (5421) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8447912 Bytes, Inter:201343 Bytes, Dram:201343 Bytes, Dram largest free:102400Bytes

I (5435) AUDIO_ELEMENT: [algo_stream] AEL_MSG_CMD_RESUME,state:1
I (5440) AUDIO_PIPELINE: Pipeline started
I (5440) AFE_CONFIG: Set WakeNet Model: wn9_hilexin
I (5444) AUDIO_ELEMENT: [raw_stream-0x3c185108] Element task created
W (5450) AFE_CONFIG: For single microphone channel, SE is deactivated.
I (5456) AUDIO_THREAD: The raw_opus task allocate stack on external memory
I (5485) AFE: AFE Version: (1MIC_V250121)
I (5485) AFE: Input PCM Config: total 2 channels(1 microphone, 1 playback), sample rate:16000
I (5488) AFE: AFE Pipeline: [input] -> |AEC(VOIP_LOW_COST)| -> |NS(WebRTC)| -> [output]
I (5497) AUDIO_THREAD: The algo_fetch task allocate stack on external memory
I (5522) AUDIO_ELEMENT: [raw_opus-0x3c185250] Element task created
I (5522) AUDIO_THREAD: The filter task allocate stack on external memory
I (5523) AUDIO_ELEMENT: [filter-0x3c1853c8] Element task created
I (5529) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8219944 Bytes, Inter:172823 Bytes, Dram:172823 Bytes, Dram largest free:98304Bytes

I (5542) AUDIO_ELEMENT: [raw_opus] AEL_MSG_CMD_RESUME,state:1
I (5557) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1
I (5557) AUDIO_PIPELINE: Pipeline started
I (5558) AUDIO_THREAD: The audio_data_read_task task allocate stack on external memory
I (5566) main: Func:app_main, Line:161, MEM Total:8187216 Bytes, Inter:171507 Bytes, Dram:171507 Bytes, Dram largest free:98304Bytes

I (5582) main_task: Returned from app_main()
```
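
The `chat.update` payload in the log above declares 16 kHz, 16-bit, mono PCM input and Opus output at a bitrate of 16000 with 60 ms frames. The sketch below works out what those numbers mean for buffer sizes. The function names are illustrative only. The encoded size is an average that assumes a roughly constant bitrate, and reading `limit_config` as "at most `max_frame_num` frames per `period`" is an assumption, not something the log confirms.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one uncompressed PCM frame of frame_ms milliseconds. */
uint32_t pcm_frame_bytes(uint32_t sample_rate_hz, uint32_t channels,
                         uint32_t bit_depth, uint32_t frame_ms)
{
    assert(bit_depth % 8 == 0);
    /* 64-bit intermediate avoids overflow and keeps rates like 44100 exact. */
    uint64_t samples = (uint64_t)sample_rate_hz * frame_ms / 1000;
    return (uint32_t)(samples * channels * (bit_depth / 8));
}

/* Average size in bytes of one encoded frame at the given bitrate.
 * Assumes a roughly constant bitrate; real Opus frames vary in size. */
uint32_t encoded_frame_bytes(uint32_t bitrate_bps, uint32_t frame_ms)
{
    return (uint32_t)((uint64_t)bitrate_bps * frame_ms / 1000 / 8);
}

/* Milliseconds of audio carried by frame_count frames of frame_ms each. */
uint32_t frames_to_ms(uint32_t frame_count, uint32_t frame_ms)
{
    return frame_count * frame_ms;
}
```

With the logged values, `pcm_frame_bytes(16000, 1, 16, 60)` gives 1920 bytes per 60 ms of microphone audio, `encoded_frame_bytes(16000, 60)` gives about 120 bytes per Opus frame, and `frames_to_ms(18, 60)` gives 1080 ms of audio for 18 frames.
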