Replies: 5 comments 4 replies
I would love to see a tutorial on this very subject too! Transfer learning seems to be generally regarded as a faster, more accurate way of achieving great results, but it's tough to find detailed instructions on how to do it. In addition to the Italian TTS model mentioned above, another model to include in this potential tutorial would be one based on the LJSpeech dataset, where all we want to do is change the voice using our own .wav and transcription files. How would this be done? Has anyone tried it with success yet?
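For the "our own .wav and transcription files" part, one common first step is to lay the data out like LJSpeech: a `wavs/` folder plus a pipe-delimited `metadata.csv` with three fields per line (clip id, transcription, normalized transcription). Below is a minimal sketch of that step only. The `transcripts/<id>.txt` layout and the `build_metadata` helper are assumptions made for illustration, and reusing the raw text as the "normalized" field is a simplification.

```python
from pathlib import Path


def build_metadata(dataset_dir):
    """Write an LJSpeech-style metadata.csv for the clips in dataset_dir/wavs.

    Assumed (hypothetical) layout: each wavs/<id>.wav has its transcript in
    transcripts/<id>.txt. Returns the number of rows written.
    """
    root = Path(dataset_dir)
    rows = []
    for wav in sorted((root / "wavs").glob("*.wav")):
        txt = root / "transcripts" / (wav.stem + ".txt")
        if not txt.exists():
            continue  # skip clips that have no transcript
        # Collapse whitespace and drop "|" so the pipe-delimited format stays valid.
        text = " ".join(txt.read_text(encoding="utf-8").replace("|", " ").split())
        # No separate text normalization here: the same string fills both fields.
        rows.append((wav.stem, text, text))
    with open(root / "metadata.csv", "w", encoding="utf-8", newline="") as f:
        for row in rows:
            f.write("|".join(row) + "\n")
    return len(rows)
```

Calling `build_metadata("my_voice")` on a folder laid out this way writes `my_voice/metadata.csv`. Whether a given training recipe also expects a specific sample rate or clip length is not covered here.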
You want to recreate LJSpeech but with your own voice? That will take time, even if you have the cash and use Descript to clone your voice so you can easily export the rest of the data.
For voice style transfer (take two audio clips and get the sentence from the first spoken in the voice of the second), I tried this code: https://github.com/nicolalandro/autovc. It works well on English but very badly on other languages 😅. You can try to replicate that work with an Italian pretrained model; otherwise you must train your model from scratch with your custom dataset.
Thanks! I will definitely check out that repo and see how it does. I was thinking that, conceptually, if the model was trained well on a big dataset and sounded natural, you could "freeze" the model except for the vocoder and encoder/decoder, so that only those parts adapt to the new voice, even if the new voice's dataset is not as large as the original training set. All of the inflections and other subtle nuances of spoken language would theoretically have been learned during training, and we just want to change the vocoder to the new voice?
This is all assuming the same language is used, yes. Sorry for the confusion. So, hypothetically, let's say we have trained a model on 100 hours of English dialogue and we want to use our own voice, for which we have, say, 3 hours of dialogue. Could I use autovc to transfer-learn the base model to my new voice? Is that the correct way to think of it?
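The "freeze everything except part of the model" idea from the comments above can be sketched in PyTorch by turning off `requires_grad` on the parameters that should stay fixed and handing only the remaining ones to the optimizer. This is a toy illustration, not the layout of any real TTS model: `ToyTTS`, its submodule names, and the `freeze_all_but` helper are all hypothetical, and which parts are worth unfreezing for a new voice is an open question in this thread.

```python
import torch
from torch import nn


class ToyTTS(nn.Module):
    """Hypothetical stand-in for a pretrained TTS model (not a real architecture)."""

    def __init__(self):
        super().__init__()
        self.text_encoder = nn.Linear(16, 32)  # pretend: learned on the big dataset
        self.decoder = nn.Linear(32, 80)       # pretend: the voice-specific part

    def forward(self, x):
        return self.decoder(torch.relu(self.text_encoder(x)))


def freeze_all_but(model, trainable_prefixes):
    """Freeze every parameter whose name does not start with one of the prefixes."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(p) for p in trainable_prefixes)
    # Return only the parameters that will still receive gradient updates.
    return [p for p in model.parameters() if p.requires_grad]


model = ToyTTS()  # in practice, pretrained weights would be loaded here
trainable = freeze_all_but(model, ["decoder."])
optimizer = torch.optim.Adam(trainable, lr=1e-4)  # learning rate is an arbitrary choice
```

With this setup, a training loop on the smaller new-voice dataset updates only `decoder`, while `text_encoder` keeps what it learned from the larger dataset.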
Hi, I'm fairly new to the AI world in general and I have a pretty basic question.
Would it be possible to fine-tune the Italian TTS model released by @nicolalandro in #1148, which is based on a single-speaker dataset, to transfer my own voice onto it? And how could I achieve that?
Thanks in advance! :)