Replies: 1 comment
Festival and HTS are really hard to beat on latency since they use very simple algorithms compared to the neural models. However, there are many ways to optimize the models that we plan to explore soon.
I am not sure when I can start working on these, but if someone is willing to initiate the work, I'd be delighted to help. That being said, most of our non-Tacotron models are already real-time capable on low-end devices even without optimization.
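One optimization commonly applied to CPU inference (not specific to this repo, shown here only as an illustrative sketch) is post-training dynamic quantization, which PyTorch supports out of the box. The toy model below is a stand-in for a real TTS network:

```python
import torch
import torch.nn as nn

# Stand-in for a TTS decoder; in practice this would be a trained
# checkpoint (e.g. Tacotron2 or Glow-TTS) loaded from disk.
model = nn.Sequential(
    nn.Linear(80, 512),
    nn.ReLU(),
    nn.Linear(512, 80),
).eval()

# Dynamic quantization converts Linear weights to int8 and quantizes
# activations on the fly, which typically reduces CPU inference
# latency with little quality loss.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    dummy_frames = torch.randn(1, 100, 80)  # fake mel-spectrogram input
    print(quantized(dummy_frames).shape)
```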
This is just a question on the state-of-the-art.
I've been experimenting with TTS systems provided by coqui-ai and other open-source solutions for the past few months. While the voice quality provided by these solutions is amazing, there's still an issue of latency that old-school solutions such as HTS and Festival do not suffer from.
Is it realistic to expect that modern neural TTS systems can beat these old solutions in terms of latency (even on small devices)?
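For comparing systems, latency is often quantified as the real-time factor (synthesis time divided by audio duration). A minimal measurement sketch, assuming the Coqui TTS Python API (`TTS.api.TTS`); the model name and the `output_sample_rate` attribute are assumptions that may differ across versions:

```python
import time

from TTS.api import TTS  # Coqui TTS Python API

# Example model name from the released model zoo; substitute any
# available model.
tts = TTS(model_name="tts_models/en/ljspeech/glow-tts")

text = "The quick brown fox jumps over the lazy dog."

start = time.perf_counter()
wav = tts.tts(text)  # list of audio samples
elapsed = time.perf_counter() - start

# Assumed attribute; check your installed version for the exact name.
sample_rate = tts.synthesizer.output_sample_rate
audio_seconds = len(wav) / sample_rate

# RTF < 1.0 means synthesis is faster than playback.
print(f"synthesis: {elapsed:.2f}s, audio: {audio_seconds:.2f}s, "
      f"RTF: {elapsed / audio_seconds:.2f}")
```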