Replies: 1 comment
Festival and HTS are really hard to beat on latency since they use very simple algorithms compared to the neural models. However, there are many ways to optimize the models that we plan to explore soon.
I am not sure when I can start working on these, but if someone is willing to initiate the work, I'd be delighted to help. That being said, most of our non-Tacotron models are already real-time capable on low-end devices even without optimization.
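One optimization commonly applied to CPU inference (not specific to this repo, shown here only as an illustrative sketch) is post-training dynamic quantization, which PyTorch supports out of the box. The toy model below is a stand-in for a real TTS network:

```python
import torch
import torch.nn as nn

# Stand-in for a TTS decoder; in practice this would be a trained
# checkpoint (e.g. Tacotron2 or Glow-TTS) loaded from disk.
model = nn.Sequential(
    nn.Linear(80, 512),
    nn.ReLU(),
    nn.Linear(512, 80),
).eval()

# Dynamic quantization converts Linear weights to int8 and quantizes
# activations on the fly, which typically reduces CPU inference
# latency with little quality loss.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    dummy_frames = torch.randn(1, 100, 80)  # fake mel-spectrogram input
    print(quantized(dummy_frames).shape)
```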
This is just a question on the state-of-the-art.
I've been experimenting with TTS systems provided by coqui-ai and other open-source solutions for the past few months. While the voice quality provided by these solutions is amazing, there's still an issue of latency that old-school solutions such as HTS and Festival do not suffer from.
Is it realistic to expect that modern neural TTS systems can beat these old solutions in terms of latency (even on small devices)?
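For comparing systems, latency is often quantified as the real-time factor (synthesis time divided by audio duration). A minimal measurement sketch, assuming the Coqui TTS Python API (`TTS.api.TTS`); the model name and the `output_sample_rate` attribute are assumptions that may differ across versions:

```python
import time

from TTS.api import TTS  # Coqui TTS Python API

# Example model name from the released model zoo; substitute any
# available model.
tts = TTS(model_name="tts_models/en/ljspeech/glow-tts")

text = "The quick brown fox jumps over the lazy dog."

start = time.perf_counter()
wav = tts.tts(text)  # list of audio samples
elapsed = time.perf_counter() - start

# Assumed attribute; check your installed version for the exact name.
sample_rate = tts.synthesizer.output_sample_rate
audio_seconds = len(wav) / sample_rate

# RTF < 1.0 means synthesis is faster than playback.
print(f"synthesis: {elapsed:.2f}s, audio: {audio_seconds:.2f}s, "
      f"RTF: {elapsed / audio_seconds:.2f}")
```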