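# Volcano Engine Websocket Bidirectional Streaming Conversation
## Example Overview
This example implements the Coze intelligent voice conversation Websocket OpenAPI. It uses a button to hold a walkie-talkie-style conversation with the agent.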
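## Creating the Example
### IDF Default Branch
This example supports IDF release/v5.4 and later branches.
### Prerequisites
First, obtain an `Access token` and a `BOT ID` by following the [Coze documentation](https://bytedance.larkoffice.com/docx/Da6qd87pQodvNrxdFYrcnzMxnsh).
For more on the Websocket interface, see [Bidirectional streaming conversation events](https://www.coze.cn/open/docs/developer_guides/streaming_chat_event).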
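Each message on the Websocket carries an `event_type` string. As a minimal sketch of how a client could route incoming messages, the function below maps the event names that this example subscribes to (taken from the `event_subscriptions` list in the log further down) onto an enum. The enum and function names are hypothetical and are not part of the example's source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum; the event names come from the example's log output. */
typedef enum {
    EVT_AUDIO_DELTA,     /* conversation.audio.delta */
    EVT_CHAT_COMPLETED,  /* conversation.chat.completed */
    EVT_SPEECH_STARTED,  /* input_audio_buffer.speech_started */
    EVT_SPEECH_STOPPED,  /* input_audio_buffer.speech_stopped */
    EVT_CHAT_CREATED,    /* chat.created */
    EVT_ERROR,           /* error */
    EVT_UNKNOWN          /* anything not listed above */
} coze_event_kind_t;

/* Map an event_type string to an enum value; NULL or unlisted names give EVT_UNKNOWN. */
coze_event_kind_t coze_event_kind_from_type(const char *event_type)
{
    static const struct {
        const char *name;
        coze_event_kind_t kind;
    } table[] = {
        {"conversation.audio.delta", EVT_AUDIO_DELTA},
        {"conversation.chat.completed", EVT_CHAT_COMPLETED},
        {"input_audio_buffer.speech_started", EVT_SPEECH_STARTED},
        {"input_audio_buffer.speech_stopped", EVT_SPEECH_STOPPED},
        {"chat.created", EVT_CHAT_CREATED},
        {"error", EVT_ERROR},
    };

    if (event_type == NULL) {
        return EVT_UNKNOWN;
    }
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(event_type, table[i].name) == 0) {
            return table[i].kind;
        }
    }
    return EVT_UNKNOWN;
}
```

A receive handler could call `coze_event_kind_from_type()` on the parsed `event_type` field and `switch` on the result, for example feeding `EVT_AUDIO_DELTA` payloads to the player and logging `EVT_ERROR`.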
### Configuration
1. Enter the `Access token` and `BOT ID` you obtained under `Menuconfig->Example Configuration`.
2. Enter your Wi-Fi credentials under `Menuconfig->Example Configuration`.
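To illustrate where these two values end up, here is a minimal sketch that builds the connection URL and an authorization header from them. The URL layout matches the `wss_url` line printed in the log below. The `Bearer` header format and all `example_` names are assumptions for illustration and are not taken from the example's source.

```c
#include <stddef.h>
#include <stdio.h>

/* Base URL as printed in the example's log output. */
#define EXAMPLE_COZE_WS_BASE "ws://ws.coze.cn/v1/chat"

/* Write "<base>?bot_id=<bot_id>" into out.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
int example_build_ws_url(char *out, size_t out_len, const char *bot_id)
{
    int n = snprintf(out, out_len, "%s?bot_id=%s", EXAMPLE_COZE_WS_BASE, bot_id);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Write an HTTP header line carrying the access token.
 * Assumption: the server expects "Authorization: Bearer <token>". */
int example_build_auth_header(char *out, size_t out_len, const char *access_token)
{
    int n = snprintf(out, out_len, "Authorization: Bearer %s\r\n", access_token);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Both helpers return `-1` instead of silently truncating, so a token or bot ID that is longer than expected shows up as a clear error rather than a failed handshake.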
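### Build and Flash
Before building this example, make sure the ESP-IDF environment is set up. If it is, skip to the next step. If not, run the following scripts in the ESP-IDF root directory to set up the build environment. For the complete steps to configure and use ESP-IDF, see the [ESP-IDF Programming Guide](https://docs.espressif.com/projects/esp-idf/zh_CN/latest/esp32s3/index.html).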
```
./install.sh
. ./export.sh
```
- Select the target chip, using esp32s3 as an example:
```
idf.py set-target esp32s3
```
- Build the example:
```
idf.py build
```
- Flash the program and run the monitor tool to view the serial output (replace PORT with your port name):
```
idf.py -p PORT flash monitor
```
- To exit the monitor, press ``Ctrl-]``.
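## How to Use the Example
### Features and Usage
- After the example starts running, the following log indicates that a connection to the server has been established and you can start talking: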
```
I (931) main: Initialize board peripherals
W (932) i2c_bus_v2: I2C master handle is NULL, will create new one
I (941) DRV8311: ES8311 in Slave mode
I (958) ES7210: ES7210 in Slave mode
I (967) ES7210: Enable ES7210_INPUT_MIC1
I (970) ES7210: Enable ES7210_INPUT_MIC2
I (972) ES7210: Enable ES7210_INPUT_MIC3
W (976) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (980) ES7210: config fmt 60
I (982) AUDIO_HAL: Codec mode is 3, Ctrl:1
I (990) pp: pp rom version: e7ae62f
I (990) net80211: net80211 rom version: e7ae62f
I (990) AUDIO_THREAD: The esp_periph task allocate stack on internal memory
I (991) wifi:wifi driver task: 3fcee994, prio:23, stack:6656, core=0
I (1001) wifi:wifi firmware version: 21fc8af6de
I (1003) wifi:wifi certification version: v7.0
I (1007) wifi:config NVS flash: enabled
I (1011) wifi:config nano formatting: disabled
I (1015) wifi:Init data frame dynamic rx buffer num: 32
I (1020) wifi:Init static rx mgmt buffer num: 5
I (1024) wifi:Init management short buffer num: 32
I (1029) wifi:Init dynamic tx buffer num: 32
I (1033) wifi:Init static tx FG buffer num: 2
I (1037) wifi:Init static rx buffer size: 1600
I (1041) wifi:Init static rx buffer num: 16
I (1045) wifi:Init dynamic rx buffer num: 32
I (1049) wifi_init: rx ba win: 16
I (1052) wifi_init: accept mbox: 32
I (1055) wifi_init: tcpip mbox: 32
I (1058) wifi_init: udp mbox: 6
I (1061) wifi_init: tcp mbox: 32
I (1064) wifi_init: tcp tx win: 65535
I (1067) wifi_init: tcp rx win: 32768
I (1071) wifi_init: tcp mss: 1440
I (1074) wifi_init: WiFi IRAM OP enabled
I (1078) wifi_init: WiFi RX IRAM OP enabled
W (1082) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
I (1090) wifi:Set ps type: 1, coexist: 0
I (1094) phy_init: phy_version 701,f4f1da3a,Mar 3 2025,15:50:10
I (1155) wifi:mode : sta (74:4d:bd:9d:b6:30)
I (1155) wifi:enable tsf
W (1155) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2446) wifi:new:<11,0>, old:<1,0>, ap:<255,255>, sta:<11,0>, prof:1, snd_ch_cfg:0x0
I (2446) wifi:state: init -> auth (0xb0)
W (2447) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2462) wifi:state: auth -> assoc (0x0)
I (2468) wifi:state: assoc -> run (0x10)
I (2485) wifi:connected with xtworks, aid = 88, channel 11, BW20, bssid = ec:56:23:e9:7e:f0
I (2486) wifi:security: WPA2-PSK, phy: bgn, rssi: -40
I (2488) wifi:pm start, type: 1
I (2490) wifi:dp: 1, bi: 102400, li: 3, scale listen interval from 307200 us to 307200 us
I (2498) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 25000, mt_pti: 0, mt_time: 10000
W (2508) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (2532) wifi:AP's beacon interval = 102400 us, DTIM period = 1
I (3571) esp_netif_handlers: sta ip: 192.168.3.7, mask: 255.255.255.0, gw: 192.168.3.1
I (3571) PERIPH_WIFI: Got ip:192.168.3.7
I (3571) coze_chat: wss_url: ws://ws.coze.cn/v1/chat?bot_id=74800678********
I (3579) AUDIO_THREAD: The coze_data_pull_task task allocate stack on external memory
I (3586) AUDIO_THREAD: The coze_data_push_task task allocate stack on external memory
I (3594) coze_chat: WEBSOCKET_EVENT_BEGIN
I (3598) websocket_client: Started
I (3600) coze_chat: Wait for websocket connected
I (4104) coze_chat: Wait for websocket connected
I (4192) coze_chat: WEBSOCKET_EVENT_CONNECTED
I (5215) wifi:<ba-add>idx:0 (ifx:0, ec:56:23:e9:7e:f0), tid:6, ssn:2665, winSize:64
I (5265) coze_chat: Request conversation_id : 7491159*********
I (5266) coze_chat: WS connected
I (5266) coze_chat: WS updata chat
I (5268) coze_chat: Update chat: {
"id": "15deeac7-bc4c-731a-c81e-2585128c3557",
"event_type": "chat.update",
"data": {
"chat_config": {
"auto_save_history": true,
"conversation_id": "7491159*********",
"user_id": "userid_123",
"meta_data": {
},
"custom_variables": {
},
"extra_params": {
},
"parameters": {
"custom_var_1": "测试"
}
},
"input_audio": {
"format": "pcm",
"codec": "pcm",
"sample_rate": 16000,
"channel": 1,
"bit_depth": 16
},
"turn_detection": {
"type": "server_vad",
"prefix_padding_ms": 600,
"silence_duration_ms": 500
},
"output_audio": {
"codec": "opus",
"opus_config": {
"bitrate": 16000,
"frame_size_ms": 60,
"limit_config": {
"period": 1,
"max_frame_num": 18
}
},
"speech_rate": 20,
"voice_id": "7426720361733144585"
},
"event_subscriptions": ["conversation.audio.delta", "conversation.chat.completed", "input_audio_buffer.speech_started", "input_audio_buffer.speech_stopped", "chat.created", "error"]
}
}
I (5365) MODEL_LOADER: The storage free size is 23104 KB
I (5367) MODEL_LOADER: The partition size is 5168 KB
I (5372) MODEL_LOADER: Successfully load srmodels
I (5376) ALGORITHM_STREAM: Load: wn9_hilexin
I (5381) AUDIO_PIPELINE: link el->rb, el:0x3c17e144, tag:algo_stream, rb:0x3c17e3d0
I (5389) AUDIO_PIPELINE: link el->rb, el:0x3c185108, tag:raw_stream, rb:0x3c185538
I (5395) AUDIO_PIPELINE: link el->rb, el:0x3c185250, tag:raw_opus, rb:0x3c187580
I (5402) AUDIO_THREAD: The algo_stream task allocate stack on external memory
I (5409) AUDIO_ELEMENT: [algo_stream-0x3c17e144] Element task created
I (5415) AUDIO_ELEMENT: [raw_read-0x3c17e274] Element task created
I (5421) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8447912 Bytes, Inter:201343 Bytes, Dram:201343 Bytes, Dram largest free:102400Bytes
I (5435) AUDIO_ELEMENT: [algo_stream] AEL_MSG_CMD_RESUME,state:1
I (5440) AUDIO_PIPELINE: Pipeline started
I (5440) AFE_CONFIG: Set WakeNet Model: wn9_hilexin
I (5444) AUDIO_ELEMENT: [raw_stream-0x3c185108] Element task created
W (5450) AFE_CONFIG: For single microphone channel, SE is deactivated.
I (5456) AUDIO_THREAD: The raw_opus task allocate stack on external memory
I (5485) AFE: AFE Version: (1MIC_V250121)
I (5485) AFE: Input PCM Config: total 2 channels(1 microphone, 1 playback), sample rate:16000
I (5488) AFE: AFE Pipeline: [input] -> |AEC(VOIP_LOW_COST)| -> |NS(WebRTC)| -> [output]
I (5497) AUDIO_THREAD: The algo_fetch task allocate stack on external memory
I (5522) AUDIO_ELEMENT: [raw_opus-0x3c185250] Element task created
I (5522) AUDIO_THREAD: The filter task allocate stack on external memory
I (5523) AUDIO_ELEMENT: [filter-0x3c1853c8] Element task created
I (5529) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8219944 Bytes, Inter:172823 Bytes, Dram:172823 Bytes, Dram largest free:98304Bytes
I (5542) AUDIO_ELEMENT: [raw_opus] AEL_MSG_CMD_RESUME,state:1
I (5557) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1
I (5557) AUDIO_PIPELINE: Pipeline started
I (5558) AUDIO_THREAD: The audio_data_read_task task allocate stack on external memory
I (5566) main: Func:app_main, Line:161, MEM Total:8187216 Bytes, Inter:171507 Bytes, Dram:171507 Bytes, Dram largest free:98304Bytes
I (5582) main_task: Returned from app_main()
```
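The `chat.update` message in the log configures 16 kHz, mono, 16-bit PCM input and Opus output at 16000 bit/s with 60 ms frames. As a rough sketch of what those numbers mean for buffer sizing, the helpers below compute the raw PCM size for a given duration and the nominal size of one encoded frame. The function names are hypothetical, and real Opus frames vary in size around the nominal value.

```c
#include <stdint.h>

/* Bytes of raw PCM covering duration_ms of audio. */
uint32_t pcm_bytes_for_ms(uint32_t sample_rate, uint32_t channels,
                          uint32_t bit_depth, uint32_t duration_ms)
{
    uint64_t bytes_per_second = (uint64_t)sample_rate * channels * (bit_depth / 8);
    return (uint32_t)(bytes_per_second * duration_ms / 1000);
}

/* Nominal (average) size in bytes of one encoded frame at a constant bitrate. */
uint32_t nominal_frame_bytes(uint32_t bitrate_bps, uint32_t frame_ms)
{
    return (uint32_t)((uint64_t)bitrate_bps * frame_ms / 1000 / 8);
}
```

With the values from the log, `pcm_bytes_for_ms(16000, 1, 16, 1000)` gives 32000 bytes per second of microphone input, 60 ms of input is 1920 bytes, and `nominal_frame_bytes(16000, 60)` gives about 120 bytes per Opus output frame.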