-IMS Toucan is a toolkit for teaching, training and using state-of-the-art Speech Synthesis models, developed at the
+IMS Toucan is a toolkit for training, using, and teaching state-of-the-art Text-to-Speech Synthesis models, developed at the
 **Institute for Natural Language Processing (IMS), University of Stuttgart, Germany**. Everything is pure Python and
-PyTorch based to keep it as simple and beginner-friendly, yet powerful as possible.
+PyTorch based to keep it as simple and beginner-friendly, yet powerful as possible.
 
----
+<br>
+
+
+
+---
+<br>
 
 ## Links 🦚
 
@@ -32,7 +45,8 @@ PyTorch based to keep it as simple and beginner-friendly, yet powerful as possible.
 
 [We have also published a massively multilingual TTS dataset on Huggingface🤗](https://huggingface.co/datasets/Flux9665/BibleMMS)
 
----
+---
+<br>
 
 ## Installation 🦉
 
@@ -128,7 +142,8 @@ However, the espeak-ng installation file you need to set this variable to is a .
 Mac. In order to locate the espeak-ng library file, you can run `port contents espeak-ng`. The specific file you are
 looking for is named `libespeak-ng.dylib`.
 
----
+---
+<br>
 
 ## Inference 🦢
 
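For the MacPorts step in the hunk above, the configuration might look like the following sketch. The `PHONEMIZER_ESPEAK_LIBRARY` variable name and the `/opt/local` path are assumptions (based on the phonemizer library's conventions and the default MacPorts prefix), not taken from this diff; substitute whatever path `port contents espeak-ng` actually prints on your machine:

```python
import os

# On the Mac, first locate the library file in a terminal:
#   port contents espeak-ng | grep libespeak-ng.dylib
# Then point the phonemizer at it BEFORE importing anything that uses espeak.
# Variable name and path below are assumptions; adjust to your install.
os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = "/opt/local/lib/libespeak-ng.dylib"

print(os.environ["PHONEMIZER_ESPEAK_LIBRARY"])
```

Setting the variable from Python only affects the current process; exporting it in your shell profile makes it permanent.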
@@ -161,7 +176,8 @@ pass them to the interface when you use it in your own code.
 To change the language of the model and see which languages are available in our pretrained model,
 [have a look at the list linked here](https://github.com/DigitalPhonetics/IMS-Toucan/blob/feb573ca630823974e6ced22591ab41cdfb93674/Utility/language_list.md)
 
----
+---
+<br>
 
 ## Creating a new Recipe (Training Pipeline) 🐣
 
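The language-switching flow mentioned in the inference hunk above can be sketched with a stub; the class and method names here are assumptions standing in for the repository's actual inference interface, and the language IDs are placeholders to be checked against `Utility/language_list.md`:

```python
# Stub illustrating the call pattern only; the real inference interface
# lives in the IMS-Toucan repository and its exact API may differ.
class ToucanTTSInterfaceStub:
    def __init__(self, device="cpu", language="eng"):
        self.device = device
        self.language = language  # an ID from Utility/language_list.md

    def set_language(self, lang_id):
        # Switch the multilingual model to another language ID.
        self.language = lang_id

tts = ToucanTTSInterfaceStub(device="cpu")
tts.set_language("deu")  # hypothetical ID; consult the linked language list
print(tts.language)
```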
@@ -189,7 +205,8 @@ Once this is complete, we are almost done, now we just need to make it available
 *run* function from the pipeline you just created and give it a meaningful name. Now in the
 *pipeline_dict*, add your imported function as value and use as key a shorthand that makes sense.
 
----
+---
+<br>
 
 ## Training a Model 🦜
 
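The registry step described in the hunk above (import the recipe's *run* function, then map a shorthand key to it in *pipeline_dict*) can be sketched in plain Python; the function body and names below are illustrative stand-ins, not code from the repository:

```python
# Sketch of the pipeline_dict registry pattern. In the real repository the
# value would be a `run` function imported from your recipe module; here a
# stand-in is defined inline so the pattern is visible in isolation.

def run_my_recipe(gpu_id="cpu", resume=False):
    # Stand-in for the recipe's `run` function, which would start training.
    return f"training on {gpu_id} (resume={resume})"

# Key: a shorthand that makes sense; value: the imported run function.
pipeline_dict = {
    "my_recipe": run_my_recipe,
}

# The training script can then dispatch on the shorthand, e.g. from the CLI.
print(pipeline_dict["my_recipe"](gpu_id="cuda:0"))
```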
@@ -242,7 +259,8 @@ fuser -v /dev/nvidia*
 
 Whenever a checkpoint is saved, a compressed version that can be used for inference is also created, which is named _best.pt_
 
----
+---
+<br>
 
 ## FAQ 🐓
 
@@ -268,9 +286,10 @@ Here are a few points that were brought up by users:
 but nothing that hints at them in the text. That's why ASR corpora, which leave out punctuation, are usually difficult
 to use for TTS.
 
----
+---
+<br>
 
-## Disclaimer 🦆
+## Acknowledgements 🦆
 
 The basic PyTorch modules of FastSpeech 2 and GST are taken from
 [ESPnet](https://github.com/espnet/espnet), the PyTorch modules of