Release Improved Controllable Multilingual · DigitalPhonetics/IMS-Toucan

This release extends the toolkit's functionality and provides new checkpoints.

New sampling rate for the vocoder: using 24 kHz instead of 48 kHz lowers the theoretical upper bound for quality, but produces fewer artifacts in practice.
The flow-based PostNet from PortaSpeech is included in the new TTS model, which brings cleaner results at essentially no extra cost.
New controllability options through artificial speaker generation in a lower-dimensional space with a better embedding function.
Quality-of-life changes, such as an integrated finetuning example, an arbiter that selects which train loop to use, and vocoder finetuning (although that should really not be necessary).
Diverse bugfixes and speed increases.

This release breaks backwards compatibility. Please download the new models, or stick to a prior release if you rely on your old models.

Future releases will include one more change to the vocoder (BigVGAN generator) and many changes to scale up the multilingual capabilities of a single model.
