[ESP32-S3-BOX-3B] [No sound when using the VOLC RTC voice interaction example to connect to Volcano Engine for voice interaction] (AUD-6524) #1494

@liuxingxing1104

Description

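I am using the example at https://gitee.com/EspressifSystems/esp-adf/tree/master/examples/ai_agent/volc_rtc, configured and built according to the documentation. With the coze solution and the default test account, I can talk to the AI on the device. With a Volcano Engine agent (火山智能体), there is no sound and no conversation is possible; there is also no audio output when joining the room. The browser-based "run the real-time conversational AI demo without code" (无代码跑通实时对话式 AI Demo) does work. The boot log and the start-agent request and response are below. Please help find the cause.
Request data: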
{
"query": {},
"body": {
"AppId": "686******8d3d",
"RoomId": "ChatRoom03",
"TaskId": "ChatTask03",
"Config": {
"ASRConfig": {
"Provider": "volcano",
"ProviderParams": {
"Mode": "smallmodel",
"AppId": "8640329948",
"AccessToken": "oCXibbIGXBr78xU62NJEBKuaMq7gObl5",
"ApiResourceId": "volc.bigasr.sauc.duration",
"Cluster": "volcengine_streaming_common",
"StreamMode": 0
}
},
"TTSConfig": {
"Provider": "volcano",
"ProviderParams": {
"app": {
"appid": "8640329948",
"cluster": "volcano_tts",
"token": "oCXibbIGXBr78xU62NJEBKuaMq7gObl5"
},
"audio": {
"speed_ratio": 1,
"voice_type": "BV001_streaming",
"volume_ratio": 2,
"pitch_ratio": 1,
"pitch_rate": 0,
"speech_rate": 0
}
}
},
"LLMConfig": {
"Mode": "ArkV3",
"EndPointId": "ep-20250708110436-bgvqc"
}
},
"AgentConfig": {
"TargetUserId": [
"liuxing123"
],
"WelcomeMessage": "你好呀,我是乐鑫AI助手",
"UserId": "lexin"
}
}
}
Response:
{
"Result": "The task has been started. Please do not call the startup task interface repeatedly.",
"ResponseMetadata": {
"RequestId": "20250709150439368CAA87789AFE039341",
"Action": "StartVoiceChat",
"Version": "2024-12-01",
"Service": "rtc",
"Region": "cn-beijing"
}
}
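
The `Result` text in the response above suggests the service treated this `StartVoiceChat` call as a repeat of a task that is already running. A minimal sketch of flagging that case when scripting the request; the function name `is_repeat_start` and the substring match are assumptions made for illustration, not a documented contract of the service.

```python
import json


def is_repeat_start(response_text: str) -> bool:
    """Return True if a StartVoiceChat response looks like a repeated start.

    Assumption: we match on the literal Result text quoted in this issue;
    the service may report this condition differently elsewhere.
    """
    body = json.loads(response_text)
    result = body.get("Result", "")
    marker = "do not call the startup task interface repeatedly"
    return isinstance(result, str) and marker in result


# Response copied from this issue (ResponseMetadata shortened to Action only).
sample = json.dumps({
    "Result": "The task has been started. Please do not call the "
              "startup task interface repeatedly.",
    "ResponseMetadata": {"Action": "StartVoiceChat"},
})

if is_repeat_start(sample):
    print("StartVoiceChat reported that the task was already started")
```

If this fires, it may be worth stopping the existing task or using a fresh `TaskId` before retrying, so the agent is started with the configuration actually intended.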

Device boot log:
I (407) esp_image: segment 4: paddr=001f3fe0 vaddr=4037b090 size
I (982) esp_psram: SPI SRAM memory test OK
I (990) cpu_start: Pro cpu start user code
I (990) cpu_start: cpu freq: 240000000 Hz
I (991) app_init: Application information:
I (991) app_init: Project name: volc_rtc
I (994) app_init: App version: v2.7-113-g1e839afb-dirty
I (1000) app_init: Compile time: Jul 9 2025 11:37:45
I (1005) app_init: ELF file SHA256: 6bb64914b...
I (1009) app_init: ESP-IDF: v5.4.1-dirty
I (1014) efuse_init: Min chip rev: v0.0
I (1018) efuse_init: Max chip rev: v0.99
I (1022) efuse_init: Chip rev: v0.2
I (1026) heap_init: Initializing. RAM available for dynamic allocation:
I (1032) heap_init: At 3FCBD2F0 len 0002C420 (177 KiB): RAM
I (1037) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (1043) heap_init: At 600FE11C len 00001ECC (7 KiB): RTCRAM
I (1048) esp_psram: Adding pool of 14400K of PSRAM memory to heap allocator
I (1055) spi_flash: detected chip: gd
I (1058) spi_flash: flash io: qio
I (1061) sleep_gpio: Configure to isolate all GPIO pins in sleep state
I (1067) sleep_gpio: Enable automatic switching of GPIO sleep configuration
I (1074) main_task: Started on CPU0
I (1077) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (1085) main_task: Calling app_main()
I (1091) main: Initialize board peripherals
I (1094) PERIPH_SPIFFS: Partition size: total: 52961, used: 12299
I (1098) AUDIO_THREAD: The esp_periph task allocate stack on internal memory
I (1108) pp: pp rom version: e7ae62f
I (1108) net80211: net80211 rom version: e7ae62f
I (1114) wifi:wifi driver task: 3fcd2108, prio:23, stack:6656, core=0
I (1120) wifi:wifi firmware version: 79fa3f41ba
I (1123) wifi:wifi certification version: v7.0
I (1127) wifi:config NVS flash: enabled
I (1131) wifi:config nano formatting: disabled
I (1135) wifi:Init data frame dynamic rx buffer num: 32
I (1140) wifi:Init static rx mgmt buffer num: 5
I (1144) wifi:Init management short buffer num: 32
I (1149) wifi:Init static tx buffer num: 16
I (1153) wifi:Init tx cache buffer num: 32
I (1156) wifi:Init static tx FG buffer num: 2
I (1160) wifi:Init static rx buffer size: 1600
I (1165) wifi:Init static rx buffer num: 16
I (1168) wifi:Init dynamic rx buffer num: 32
I (1173) wifi_init: rx ba win: 16
I (1175) wifi_init: accept mbox: 6
I (1179) wifi_init: tcpip mbox: 32
I (1182) wifi_init: udp mbox: 6
I (1185) wifi_init: tcp mbox: 6
I (1187) wifi_init: tcp tx win: 5760
I (1191) wifi_init: tcp rx win: 5760
I (1194) wifi_init: tcp mss: 1440
I (1197) wifi_init: WiFi/LWIP prefer SPIRAM
I (1201) wifi_init: WiFi IRAM OP enabled
I (1205) wifi_init: WiFi RX IRAM OP enabled
W (1209) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
I (1217) wifi:Set ps type: 1, coexist: 0

I (1221) phy_init: phy_version 700,8582a7fd,Feb 10 2025,20:13:11
I (1262) wifi:mode : sta (b4:3a:45:0b:77:cc)
I (1263) wifi:enable tsf
W (1263) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2141) wifi:new:<7,0>, old:<1,0>, ap:<255,255>, sta:<7,0>, prof:1, snd_ch_cfg:0x0
I (2141) wifi:state: init -> auth (0xb0)
W (2141) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
I (2145) wifi:state: auth -> assoc (0x0)
I (2156) wifi:state: assoc -> run (0x10)
I (2368) wifi:connected with liuxing_2.4G, aid = 1, channel 7, BW20, bssid = 0c:4b:54:34:53:60
I (2369) wifi:security: WPA2-PSK, phy: bg, rssi: -42
I (2370) wifi:pm start, type: 1

I (2373) wifi:dp: 1, bi: 102400, li: 3, scale listen interval from 307200 us to 307200 us
I (2381) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 25000, mt_pti: 0, mt_time: 10000
W (2390) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (2470) wifi:AP's beacon interval = 102400 us, DTIM period = 1
I (3413) esp_netif_handlers: sta ip: 192.168.1.103, mask: 255.255.255.0, gw: 192.168.1.1
I (3413) PERIPH_WIFI: Got ip:192.168.1.103
W (3737) i2c_bus_v2: I2C master handle is NULL, will create new one
E (3738) i2c.master: I2C transaction unexpected nack detected
E (3738) i2c.master: s_i2c_synchronous_transaction(924): I2C transaction failed
E (3745) i2c.master: i2c_master_multi_buffer_transmit(1186): I2C transaction failed
I (3757) DRV8311: ES8311 in Slave mode
I (3767) gpio: GPIO[46]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
I (3774) ES7210: ES7210 in Slave mode
I (3782) ES7210: Enable ES7210_INPUT_MIC1
I (3785) ES7210: Enable ES7210_INPUT_MIC2
I (3788) ES7210: Enable ES7210_INPUT_MIC3
W (3791) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (3796) ES7210: config fmt 60
I (3798) AUDIO_HAL: Codec mode is 3, Ctrl:1
I (3804) AUDIO_PROCESSOR: Create audio pipeline for audio player
I (3804) AUDIO_PROCESSOR: Create audio player audio stream
I (3806) AUDIO_PROCESSOR: Register all elements to playback pipeline
I (3812) AUDIO_PROCESSOR: Link playback element together raw-->audio_decoder-->i2s_stream-->[codec_chip]
E (3821) gpio: gpio_install_isr_service(502): GPIO isr service already installed
I (3828) DISPATCHER: exe first list: 0x0
I (3832) DISPATCHER: dispatcher_event_task is running...
I (3837) volc_rtc: app_id: 6868d3d, room_id: liuxing, uid: liuxing123, token: 0016****qbU=

2025-07-09 15:04:44.130 [E] VolcEngineRTCLite.c:112 ****************** HELLO BOOKA (686c81ae6790700172578d3d)(1.56.002.09500)(c48915dc01ca27f5a98be735dfef3138d4beaac7) ********************
2025-07-09 15:04:44.147 [E] Cache.c:270 operation returned status code: 0x00000009
I (3890) AUDIO_PROCESSOR: recorder_pipeline_open
I (3890) AUDIO_PROCESSOR: Create audio pipeline for recording
I (3894) AUDIO_PROCESSOR: Create player audio stream
I (3902) AUDIO_PROCESSOR: Register all player elements to audio pipeline
I (3905) AUDIO_PROCESSOR: Link all player elements to audio pipeline
I (3911) AUDIO_PROCESSOR: player_pipeline_open
I (3915) AUDIO_PROCESSOR: Create audio pipeline for playback
I (3921) AUDIO_PROCESSOR: Create playback audio stream
I (3925) DUAL_MICROPHONES: Create opus decoder
I (3930) AUDIO_PROCESSOR: Register all elements to playback pipeline
I (3936) AUDIO_PROCESSOR: ENBALE_AUDIO_STREAM_DUAL_MIC
I (3941) AUDIO_PROCESSOR: Link playback element together raw-->audio_decoder-->rsp-->i2s_stream-->[codec_chip]
I (3969) AUDIO_PROCESSOR: player pipe start running
I (3969) volc_rtc: start join room

2025-07-09 15:04:44.385 [E] RoomImplX.c:167 operation returned status code: 0x52000057
2025-07-09 15:04:44.705 [E] Cache.c:309 operation returned status code: 0x00000009
2025-07-09 15:04:44.708 [E] RoomImplX.c:167 operation returned status code: 0x52000057
2025-07-09 15:04:44.711 [E] LiteHttp.c:497 ID 590364354 E_LOGIC : NO need keepAlive
2025-07-09 15:04:44.719 [E] RoomImplX.c:167 operation returned status code: 0x52000057
2025-07-09 15:04:44.811 [E] RoomImplX.c:167 operation returned status code: 0x52000057
I (4744) volc_rtc: join channel success liuxing elapsed 162 ms now 162 ms

I (4745) volc_rtc: join room success

I (4745) MODEL_LOADER: The storage free size is 16384 KB
I (4749) MODEL_LOADER: The partition size is 4152 KB
I (4753) MODEL_LOADER: Successfully load srmodels
I (4758) AFE_CONFIG: Set WakeNet Model: wn9s_nihaoxiaozhi
I (4763) AFE_CONFIG: Set Second WakeNet Model: wn9s_nihaoxiaozhi
I (4769) RAW_OPUS_ENC: Raw Opus encoder init
I (4773) RECORDER_SR: The first wakenet model: wn9s_nihaoxiaozhi

I (4774) volc_rtc: remote user mute audio liuxing:liuxing123 0
I (4779) RECORDER_SR: The second wakenet model: wn9_hilexin

I (4785) volc_rtc: remote user mute video liuxing:liuxing123 1

/********** General AFE (Audio Front End) Parameter **********/
pcm_config.total_ch_num: 4
pcm_config.mic_num: 2: [ ch1, ch3 ]
pcm_config.ref_num: 1: [ ch0 ]
pcm_config.sample_rate: 16000
afe_type: SR
afe_mode: LOW COST
afe_perferred_core: 0
afe_perferred_priority: 5
afe_ringbuf_size: 50
memory_alloc_mode: 3
afe_linear_gain: 1.0
debug_init: false
fixed_first_channel: false

/********** AEC (Acoustic Echo Cancellation) **********/
aec_init: true
aec mode: SR_LOW_COST
aec_filter_length: 4

/********** SE (Speech Enhancement, Microphone Array Processing) **********/
se_init: false, model: BSS

/********** NS (Noise Suppression) **********/
ns_init: false
ns model name: WEBRTC

/********** VAD (Voice Activity Detection) **********/
vad_init: false
vad_mode: 3
vad_model_name: NULL
vad_min_speech_ms: 128
vad_min_noise_ms: 1000
vad_delay_ms: 128
vad_mute_playback: false
vad_enable_channel_trigger: false

/********** WakeNet (Wake Word Engine) **********/
wakenet_init: false
wakenet_model_name: wn9s_nihaoxiaozhi
wakenet_model_name_2: wn9_hilexin
wakenet_mode: 0

/********** AGC (Automatic Gain Control) **********/
agc_init: false
agc_mode: WEBRTC
agc_compression_gain_db: 9
agc_target_level_dbfs: 9

/**************************************************/
I (4955) AFE: AFE Version: (2MIC_V250113)
I (4955) AFE: Input PCM Config: total 4 channels(2 microphone, 1 playback), sample rate:16000
I (4956) AFE: AFE Pipeline: [input] -> |AEC(SR_LOW_COST)| -> [output]
I (5163) AUDIO_RECORDER: RECORDER_CMD_TRIGGER_START
I (5368) main_task: Returned from app_main()
2025-07-09 15:04:46.834 [E] rx_net_lite_cc_bandwidth_estimation.c:190 lite-cc bandwidth up bandwidth = 4630000
2025-07-09 15:04:48.854 [E] rx_net_lite_cc_bandwidth_estimation.c:190 lite-cc bandwidth up bandwidth = 5000000
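
One thing that can be cross-checked from the data in this report is whether the room and user in the `StartVoiceChat` request match what the device actually joined. The sketch below parses the `volc_rtc` log line and compares it with the request body. It assumes the agent is expected to join the same room as the device; the helper names are hypothetical, and a mismatch is only a lead to investigate, not a confirmed cause of the missing audio.

```python
import re


def parse_device_session(log_line: str) -> dict:
    """Extract room_id and uid from the volc_rtc startup log line."""
    m = re.search(r"room_id:\s*([^,\s]+),\s*uid:\s*([^,\s]+)", log_line)
    if m is None:
        raise ValueError("no room_id/uid found in log line")
    return {"room_id": m.group(1), "uid": m.group(2)}


def find_mismatches(request_body: dict, device: dict) -> list:
    """List differences between the request and the device session."""
    problems = []
    room = request_body.get("RoomId")
    if room != device["room_id"]:
        problems.append(
            f"RoomId {room!r} != device room_id {device['room_id']!r}"
        )
    targets = request_body.get("AgentConfig", {}).get("TargetUserId", [])
    if device["uid"] not in targets:
        problems.append(
            f"device uid {device['uid']!r} not in TargetUserId {targets!r}"
        )
    return problems


# Values copied from the request and the boot log in this issue.
log_line = ("I (3837) volc_rtc: app_id: 6868d3d, room_id: liuxing, "
            "uid: liuxing123, token: 0016****qbU=")
request_body = {
    "RoomId": "ChatRoom03",
    "AgentConfig": {"TargetUserId": ["liuxing123"], "UserId": "lexin"},
}

device = parse_device_session(log_line)
mismatches = find_mismatches(request_body, device)
for problem in mismatches:
    print(problem)
```

With the values from this report, the user ID matches `TargetUserId`, but the request uses `RoomId` `ChatRoom03` while the device log shows `room_id: liuxing`.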
