❓ Questions / Help / Support #692
Hi, do we have a comparison of speech-to-non-speech transition latency between v6 and ten-vad? Are there any experimental results? Thanks.
Replies: 1 comment
Our window size is 30 ms. Most likely it takes several windows to detect the end of speech. But even 50-100 ms is not a big deal, since this is as accurate as a VAD can get anyway. We benchmarked ten-vad, and it does not behave very well on real-life audio; it is more akin to WebRTC, hence the need to highlight this non-issue.
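To make the arithmetic concrete, here is a rough sketch of how end-of-speech latency could be estimated from per-window speech probabilities. Only the 30 ms window comes from the reply above. The function name, the 0.5 threshold, the two-window silence requirement, and the probability values are all hypothetical choices for illustration, not any library's API and not measured results.

```python
WINDOW_MS = 30  # window size stated in the reply


def end_of_speech_latency_ms(probs, true_end_ms, threshold=0.5,
                             min_silence_windows=2, window_ms=WINDOW_MS):
    """Return the delay in ms between the labelled end of speech and the
    first window at which silence is confirmed, or None if never confirmed.

    threshold and min_silence_windows are assumed values for this sketch.
    """
    silent_run = 0
    for i, p in enumerate(probs):
        window_end_ms = (i + 1) * window_ms
        # Count consecutive windows whose speech probability is below threshold.
        silent_run = silent_run + 1 if p < threshold else 0
        # Silence is confirmed once enough silent windows have accumulated
        # and we are past the labelled end of speech.
        if silent_run >= min_silence_windows and window_end_ms > true_end_ms:
            return window_end_ms - true_end_ms
    return None


# Made-up probabilities: 10 speech windows (300 ms), then a gradual decay.
probs = [0.9] * 10 + [0.6, 0.3, 0.1, 0.05]
latency = end_of_speech_latency_ms(probs, true_end_ms=300)
print(latency)
```

With these made-up inputs the detector needs three windows after the labelled end of speech, so the latency is a small multiple of the window size, which is the "several windows" situation described above.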