
Volcano Engine WebSocket Bidirectional Streaming Conversation

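Example Overview

This example implements the Coze intelligent voice conversation WebSocket OpenAPI. You talk to the agent in an intercom style, mainly by pressing a button.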

Creating the Example

Default IDF Branch

This example supports the IDF release/v5.4 branch and later.

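Prerequisites

First, follow the Coze documentation to apply for an Access token and a BOT ID for your account. For more about the WebSocket API, see the Coze document on bidirectional streaming conversation events (双向流式对话事件).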

Configuration

  1. Enter the Access token and BOT ID you obtained in Menuconfig->Example Configuration.
  2. Enter your Wi-Fi credentials in Menuconfig->Example Configuration.
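
The two credentials end up in the WebSocket connection request. The following is a minimal sketch of that mapping, not the example's actual source. It assumes the URL shape printed in this example's startup log (`ws://ws.coze.cn/v1/chat?bot_id=...`) and assumes the Access token is sent as an `Authorization: Bearer` header. The function name `build_connection_params` and the placeholder values are hypothetical.

```python
from urllib.parse import urlencode

# Base URL copied from the "wss_url" line in this example's startup log.
COZE_CHAT_WS_BASE = "ws://ws.coze.cn/v1/chat"


def build_connection_params(access_token: str, bot_id: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for the chat WebSocket (hypothetical helper)."""
    if not access_token or not bot_id:
        raise ValueError("both access_token and bot_id must be set")
    # The BOT ID is passed as a query parameter, as seen in the log.
    url = f"{COZE_CHAT_WS_BASE}?{urlencode({'bot_id': bot_id})}"
    # Assumption: the token travels as a Bearer header on the upgrade request.
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers


if __name__ == "__main__":
    url, headers = build_connection_params("YOUR_ACCESS_TOKEN", "YOUR_BOT_ID")
    print(url)
```

Validating both values before connecting gives a clear error when either Menuconfig field was left empty, instead of a failed handshake later.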

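Build and Flash

Before building this example, make sure the ESP-IDF environment is set up. If it is, skip to the next step. If not, run the scripts below from the ESP-IDF root directory to set up the build environment. For the complete steps for configuring and using ESP-IDF, see the ESP-IDF Programming Guide.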

./install.sh
. ./export.sh
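  • Select the target chip, using esp32s3 as an example:
idf.py set-target esp32s3
  • Build the example:
idf.py build
  • Flash the program and run the monitor tool to view the serial output (replace PORT with your port name):
idf.py -p PORT flash monitor
  • Press Ctrl-] to exit the monitor.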

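How to Use the Example

Features and Usage

  • After the example starts running, the following log indicates that the connection to the server has been established and you can start talking: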
I (931) main: Initialize board peripherals
W (932) i2c_bus_v2: I2C master handle is NULL, will create new one
I (941) DRV8311: ES8311 in Slave mode
I (958) ES7210: ES7210 in Slave mode
I (967) ES7210: Enable ES7210_INPUT_MIC1
I (970) ES7210: Enable ES7210_INPUT_MIC2
I (972) ES7210: Enable ES7210_INPUT_MIC3
W (976) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (980) ES7210: config fmt 60
I (982) AUDIO_HAL: Codec mode is 3, Ctrl:1
I (990) pp: pp rom version: e7ae62f
I (990) net80211: net80211 rom version: e7ae62f
I (990) AUDIO_THREAD: The esp_periph task allocate stack on internal memory
I (991) wifi:wifi driver task: 3fcee994, prio:23, stack:6656, core=0
I (1001) wifi:wifi firmware version: 21fc8af6de
I (1003) wifi:wifi certification version: v7.0
I (1007) wifi:config NVS flash: enabled
I (1011) wifi:config nano formatting: disabled
I (1015) wifi:Init data frame dynamic rx buffer num: 32
I (1020) wifi:Init static rx mgmt buffer num: 5
I (1024) wifi:Init management short buffer num: 32
I (1029) wifi:Init dynamic tx buffer num: 32
I (1033) wifi:Init static tx FG buffer num: 2
I (1037) wifi:Init static rx buffer size: 1600
I (1041) wifi:Init static rx buffer num: 16
I (1045) wifi:Init dynamic rx buffer num: 32
I (1049) wifi_init: rx ba win: 16
I (1052) wifi_init: accept mbox: 32
I (1055) wifi_init: tcpip mbox: 32
I (1058) wifi_init: udp mbox: 6
I (1061) wifi_init: tcp mbox: 32
I (1064) wifi_init: tcp tx win: 65535
I (1067) wifi_init: tcp rx win: 32768
I (1071) wifi_init: tcp mss: 1440
I (1074) wifi_init: WiFi IRAM OP enabled
I (1078) wifi_init: WiFi RX IRAM OP enabled
W (1082) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
I (1090) wifi:Set ps type: 1, coexist: 0

I (1094) phy_init: phy_version 701,f4f1da3a,Mar  3 2025,15:50:10
I (1155) wifi:mode : sta (74:4d:bd:9d:b6:30)
I (1155) wifi:enable tsf
W (1155) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2446) wifi:new:<11,0>, old:<1,0>, ap:<255,255>, sta:<11,0>, prof:1, snd_ch_cfg:0x0
I (2446) wifi:state: init -> auth (0xb0)
W (2447) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2462) wifi:state: auth -> assoc (0x0)
I (2468) wifi:state: assoc -> run (0x10)
I (2485) wifi:connected with xtworks, aid = 88, channel 11, BW20, bssid = ec:56:23:e9:7e:f0
I (2486) wifi:security: WPA2-PSK, phy: bgn, rssi: -40
I (2488) wifi:pm start, type: 1

I (2490) wifi:dp: 1, bi: 102400, li: 3, scale listen interval from 307200 us to 307200 us
I (2498) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 25000, mt_pti: 0, mt_time: 10000
W (2508) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (2532) wifi:AP's beacon interval = 102400 us, DTIM period = 1
I (3571) esp_netif_handlers: sta ip: 192.168.3.7, mask: 255.255.255.0, gw: 192.168.3.1
I (3571) PERIPH_WIFI: Got ip:192.168.3.7
I (3571) coze_chat: wss_url: ws://ws.coze.cn/v1/chat?bot_id=74800678********
I (3579) AUDIO_THREAD: The coze_data_pull_task task allocate stack on external memory
I (3586) AUDIO_THREAD: The coze_data_push_task task allocate stack on external memory
I (3594) coze_chat: WEBSOCKET_EVENT_BEGIN
I (3598) websocket_client: Started
I (3600) coze_chat: Wait for websocket connected
I (4104) coze_chat: Wait for websocket connected
I (4192) coze_chat: WEBSOCKET_EVENT_CONNECTED
I (5215) wifi:<ba-add>idx:0 (ifx:0, ec:56:23:e9:7e:f0), tid:6, ssn:2665, winSize:64
I (5265) coze_chat: Request conversation_id : 7491159*********
I (5266) coze_chat: WS connected
I (5266) coze_chat: WS updata chat
I (5268) coze_chat: Update chat: {
        "id":   "15deeac7-bc4c-731a-c81e-2585128c3557",
        "event_type":   "chat.update",
        "data": {
                "chat_config":  {
                        "auto_save_history":    true,
                        "conversation_id":      "7491159*********",
                        "user_id":      "userid_123",
                        "meta_data":    {
                        },
                        "custom_variables":     {
                        },
                        "extra_params": {
                        },
                        "parameters":   {
                                "custom_var_1": "测试"
                        }
                },
                "input_audio":  {
                        "format":       "pcm",
                        "codec":        "pcm",
                        "sample_rate":  16000,
                        "channel":      1,
                        "bit_depth":    16
                },
                "turn_detection":       {
                        "type": "server_vad",
                        "prefix_padding_ms":    600,
                        "silence_duration_ms":  500
                },
                "output_audio": {
                        "codec":        "opus",
                        "opus_config":  {
                                "bitrate":      16000,
                                "frame_size_ms":        60,
                                "limit_config": {
                                        "period":       1,
                                        "max_frame_num":        18
                                }
                        },
                        "speech_rate":  20,
                        "voice_id":     "7426720361733144585"
                },
                "event_subscriptions":  ["conversation.audio.delta", "conversation.chat.completed", "input_audio_buffer.speech_started", "input_audio_buffer.speech_stopped", "chat.created", "error"]
        }
}
I (5365) MODEL_LOADER: The storage free size is 23104 KB
I (5367) MODEL_LOADER: The partition size is 5168 KB
I (5372) MODEL_LOADER: Successfully load srmodels
I (5376) ALGORITHM_STREAM: Load: wn9_hilexin
I (5381) AUDIO_PIPELINE: link el->rb, el:0x3c17e144, tag:algo_stream, rb:0x3c17e3d0
I (5389) AUDIO_PIPELINE: link el->rb, el:0x3c185108, tag:raw_stream, rb:0x3c185538
I (5395) AUDIO_PIPELINE: link el->rb, el:0x3c185250, tag:raw_opus, rb:0x3c187580
I (5402) AUDIO_THREAD: The algo_stream task allocate stack on external memory
I (5409) AUDIO_ELEMENT: [algo_stream-0x3c17e144] Element task created
I (5415) AUDIO_ELEMENT: [raw_read-0x3c17e274] Element task created
I (5421) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8447912 Bytes, Inter:201343 Bytes, Dram:201343 Bytes, Dram largest free:102400Bytes

I (5435) AUDIO_ELEMENT: [algo_stream] AEL_MSG_CMD_RESUME,state:1
I (5440) AUDIO_PIPELINE: Pipeline started
I (5440) AFE_CONFIG: Set WakeNet Model: wn9_hilexin
I (5444) AUDIO_ELEMENT: [raw_stream-0x3c185108] Element task created
W (5450) AFE_CONFIG: For single microphone channel, SE is deactivated.
I (5456) AUDIO_THREAD: The raw_opus task allocate stack on external memory
I (5485) AFE: AFE Version: (1MIC_V250121)
I (5485) AFE: Input PCM Config: total 2 channels(1 microphone, 1 playback), sample rate:16000
I (5488) AFE: AFE Pipeline: [input] -> |AEC(VOIP_LOW_COST)| -> |NS(WebRTC)| -> [output]
I (5497) AUDIO_THREAD: The algo_fetch task allocate stack on external memory
I (5522) AUDIO_ELEMENT: [raw_opus-0x3c185250] Element task created
I (5522) AUDIO_THREAD: The filter task allocate stack on external memory
I (5523) AUDIO_ELEMENT: [filter-0x3c1853c8] Element task created
I (5529) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8219944 Bytes, Inter:172823 Bytes, Dram:172823 Bytes, Dram largest free:98304Bytes

I (5542) AUDIO_ELEMENT: [raw_opus] AEL_MSG_CMD_RESUME,state:1
I (5557) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1
I (5557) AUDIO_PIPELINE: Pipeline started
I (5558) AUDIO_THREAD: The audio_data_read_task task allocate stack on external memory
I (5566) main: Func:app_main, Line:161, MEM Total:8187216 Bytes, Inter:171507 Bytes, Dram:171507 Bytes, Dram largest free:98304Bytes

I (5582) main_task: Returned from app_main()
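
The `chat.update` event printed in the log above is what configures the session (input PCM format, server-side voice activity detection, Opus output). The following is a minimal sketch of how such a payload could be assembled, with field names and values copied from that log. The helper names, the use of a random UUID for `id`, and the argument values are assumptions for illustration, not the example's actual implementation.

```python
import json
import uuid


def build_chat_update(conversation_id: str, user_id: str, voice_id: str) -> dict:
    """Build a chat.update event shaped like the one in the log (hypothetical helper)."""
    return {
        "id": str(uuid.uuid4()),  # assumption: any unique string is acceptable here
        "event_type": "chat.update",
        "data": {
            "chat_config": {
                "auto_save_history": True,
                "conversation_id": conversation_id,
                "user_id": user_id,
            },
            # Microphone audio sent to the server: 16 kHz, 16-bit, mono PCM.
            "input_audio": {"format": "pcm", "codec": "pcm",
                            "sample_rate": 16000, "channel": 1, "bit_depth": 16},
            # The server decides when the user has started and stopped speaking.
            "turn_detection": {"type": "server_vad",
                               "prefix_padding_ms": 600, "silence_duration_ms": 500},
            # Audio returned by the server: Opus in 60 ms frames.
            "output_audio": {
                "codec": "opus",
                "opus_config": {"bitrate": 16000, "frame_size_ms": 60,
                                "limit_config": {"period": 1, "max_frame_num": 18}},
                "speech_rate": 20,
                "voice_id": voice_id,
            },
            "event_subscriptions": [
                "conversation.audio.delta", "conversation.chat.completed",
                "input_audio_buffer.speech_started", "input_audio_buffer.speech_stopped",
                "chat.created", "error",
            ],
        },
    }


def pcm_bytes_per_second(input_audio: dict) -> int:
    """Raw uplink data rate implied by the input_audio settings."""
    return input_audio["sample_rate"] * input_audio["channel"] * input_audio["bit_depth"] // 8


if __name__ == "__main__":
    event = build_chat_update("CONVERSATION_ID", "userid_123", "VOICE_ID")
    print(json.dumps(event, ensure_ascii=False, indent=2))
    print(pcm_bytes_per_second(event["data"]["input_audio"]))
```

With the settings from the log, the uplink PCM stream works out to 16000 samples/s × 2 bytes × 1 channel = 32000 bytes per second.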