Experimental Released Models
Eren Gölge edited this page May 25, 2021 · 4 revisions
| TTS Models | Dataset | Commit | Audio Sample | Details |
|---|---|---|---|---|
| Tacotron2 | LJSpeech | branch | --- | Details |
| Tacotron2 DDC | LJSpeech | 72a6ac5 | voice samples | Trained with DDC; includes PyTorch, TensorFlow and TFLite models. Check the Colab notebooks or the notebooks folder. |
| Glow-TTS | LJSpeech | 08394e4 | --- | Details. Sample notebook |
| Multi-Speaker-Tacotron2 | VCTK | 4873601 | Colab notebook | Multi-Speaker TTS model with Tacotron2. |
| Multi-Speaker-Tacotron2 DDC | VCTK | 2136433 | Colab notebook | Multi-Speaker TTS model with Tacotron2 and Double Decoder Consistency. |
| Tacotron2 with Dynamic Conv Attention | LJSpeech | 4132240 | Colab notebook | Tacotron2 with Dynamic Convolutional Attention. |
| Glow-TTS | LJSpeech | 4132240 | Colab notebook | Glow-TTS as in the paper. |
| Speaker Encoder Models | Dataset | Commit |
|---|---|---|
| Speaker-Encoder-iter25k | LibriSpeech | ... |
| Speaker-Encoder by @mueller91 | LibriTTS + VCTK + VoxCeleb + CommonVoice | ... |
| Vocoder Models | Dataset | Commit | Details |
|---|---|---|---|
| ParallelWaveGAN | LJSpeech | 72a6ac5 | Trained using TTS.vocoder. It produces better results than the MelGAN model but is slightly slower. Check notebooks for testing. |
| Multi-Band MelGAN | LJSpeech | 72a6ac5 | Trained using TTS.vocoder. It is the fastest vocoder model. Check notebooks for testing. |
| WaveRNN models | Go to the repo for the models. (Soon to be deprecated) | | |
| Full-Band MelGAN | LibriTTS | c514628 | Trained using TTS.vocoder. Generic vocoder that can sample any voice. Sampling rate 24 kHz. To use with a different sampling rate, follow this issue. |
| Universal WaveGrad | LibriTTS | 2136433 | Trained using TTS.vocoder. Generic vocoder that can sample any voice. Original sampling rate 24 kHz. To use with a different sampling rate, follow this issue. |
| Universal HifiGAN | LibriTTS | - | |
How to use:
- Create a fresh virtual environment with Python 3.6
- `$ apt-get install espeak libsndfile1`
- `$ pip install python_package_url_from_table_below`
- `$ python -m TTS.server.server`
- Open http://localhost:5002
| Model | Dataset | Python package | nginx/uWSGI config files |
|---|---|---|---|
| Tacotron 2 + Forward Attention + PWGAN | LJSpeech | TTS-0.0.1+92aea2a-py3-none-any.whl | tts-nginx-uwsgi.zip |
The server is a Flask application. For deployment with multiple workers, see the nginx/uWSGI config files linked in the table above. Pass `--use_cuda 1` to use GPUs when available.
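Once the server is running, it can also be queried from a script instead of the browser. The following is a minimal sketch, not the project's documented client: it assumes the server answers `GET /api/tts?text=...` on port 5002 with audio bytes, which may not hold for every package version, so check the routes in your installed `TTS.server.server` first. The helper names `build_tts_url` and `synthesize_to_file` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_tts_url(text, host="localhost", port=5002):
    """Return the synthesis URL for `text`."""
    # Assumption: the server exposes GET /api/tts with a `text` query parameter.
    return f"http://{host}:{port}/api/tts?{urlencode({'text': text})}"


def synthesize_to_file(text, out_path="output.wav", timeout=60):
    """Request speech for `text` and write the response body to `out_path`."""
    # The response body is assumed to be the synthesized audio (e.g. WAV bytes).
    with urlopen(build_tts_url(text), timeout=timeout) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return out_path


# Example (requires the server to be running locally):
# synthesize_to_file("Hello world.", "hello.wav")
```

The URL is built with `urlencode` so that spaces and punctuation in the input sentence are escaped correctly before the request is sent.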
| TTS Models | Dataset | Commit | Audio Sample | Details |
|---|---|---|---|---|
| Tacotron2 DDC | MAI-Labs | 48a40c4 | --- | Model Details and Colab Notebook. |
| TTS Models | Dataset | Commit | Audio Sample | Details |
|---|---|---|---|---|
| Tacotron2 DDC | MAI-Labs | f09defa | --- | Model Details and Colab Notebook. |
model details by @Edresson