Is a Voice Embedding Better With or Without Trailing Silence? #1936

EthanEpp · 2025-10-20T22:30:45Z


I'm working with voice embeddings. If I have a brief captured sample of a user's speech (a few seconds), is it better to trim the audio used to generate the embedding so that it starts right at the onset of speech, or is a half to a full second of silence before the speech ideal? Same question for the end of speech.

I'd lean towards as little silence as possible without risking clipping the target speech, but I'm wondering if there is any guidance here.

Thanks!

hbredin · 2025-10-21T06:39:57Z

Maintainer

I confirm you'll get better results by removing non-speech.
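
For readers who want to try this, here is a minimal sketch of trimming leading and trailing non-speech before computing an embedding. It is not part of the library's API: `frame_rms` and `trim_silence` are hypothetical helper names, and the energy threshold, 20 ms frame size, and 50 ms safety margin are assumed values to tune for your recordings. It uses a simple RMS energy gate on samples normalized to [-1, 1]; a proper voice activity detector will likely be more robust on noisy audio.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_len: int) -> List[float]:
    """RMS energy of consecutive non-overlapping frames (last frame may be shorter)."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies


def trim_silence(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,         # assumed analysis frame length
    threshold: float = 0.01,    # assumed RMS gate for "speech-like" energy
    margin_ms: int = 50,        # assumed margin kept on each side to avoid clipping speech
) -> List[float]:
    """Drop leading and trailing low-energy audio, keeping a small safety margin."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    energies = frame_rms(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e >= threshold]
    if not active:
        return []  # no frame exceeded the threshold, so there is nothing to embed

    margin = int(sample_rate * margin_ms / 1000)
    start = max(0, active[0] * frame_len - margin)
    end = min(len(samples), (active[-1] + 1) * frame_len + margin)
    return list(samples[start:end])
```

The trimmed samples would then be passed to whatever embedding model you use. The margin reflects the concern raised in the question: remove as much silence as possible while leaving a little room so the first and last phonemes are not cut off.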
