For a Voice Embedding, Is It Better With or Without Trailing Silence? #1936
EthanEpp started this conversation in Development
Replies: 1 comment
I confirm you'll get better results by removing non-speech.
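For readers who want to try this, here is a minimal sketch of trimming leading and trailing non-speech while keeping a small safety margin so the target speech is not clipped. It is not taken from this thread or from any particular library: `trim_silence` is a hypothetical helper, the simple frame-energy test stands in for a real voice-activity detector, and the threshold, frame size, and margin values are assumptions to tune on your own recordings.

```python
import numpy as np


def trim_silence(samples, sample_rate, threshold_db=-40.0, frame_ms=20, margin_ms=50):
    """Crop leading/trailing low-energy audio, keeping a small margin.

    Hypothetical helper: a frame counts as "active" when its RMS level is
    within `threshold_db` of the loudest frame. All defaults are assumptions.
    """
    samples = np.asarray(samples, dtype=np.float64)
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return samples  # shorter than one frame: nothing to analyse

    # Per-frame RMS energy over the whole-frame portion of the signal.
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    peak = rms.max()
    if peak == 0.0:
        return samples  # all-silent input: no speech to anchor on

    # Level of each frame relative to the loudest frame, in dB.
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-12) / peak)
    active = np.flatnonzero(level_db > threshold_db)

    # Keep a margin on both sides so speech onsets/offsets are not clipped.
    margin = int(sample_rate * margin_ms / 1000)
    start = max(0, active[0] * frame_len - margin)
    end = min(len(samples), (active[-1] + 1) * frame_len + margin)
    return samples[start:end]
```

The cropped array would then be passed to whatever embedding model you are using. An energy threshold like this will also keep loud non-speech noise, so a dedicated voice-activity detector is likely the better choice on noisy recordings.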
I'm working with the voice embedding. If I have a brief captured sample of a user's speech (a few seconds), is it better to trim the audio used to generate the embedding so that it starts right at the onset of speech, or is a half to a full second of silence before the speech ideal? Same question for the end of speech.
I'd be inclined towards as little silence as possible without risking clipping the target speech, but I'm wondering whether anyone has input here.
Thanks!