Commit 2791a5f

Update README.md
1 parent 5639949 commit 2791a5f

File tree

1 file changed: +4 -1 lines changed

README.md

Lines changed: 4 additions & 1 deletion
@@ -17,10 +17,13 @@ the TransformerTTS and MelGAN branch. They are separated to keep the code clean,
 
 - [As shown in this paper](http://festvox.org/blizzard/bc2021/BC21_DelightfulTTS.pdf) vocoders can be used to perform
 super-resolution and spectrogram inversion simultaneously. We added this to our HiFi-GAN vocoder. It now takes 16kHz
-spectrograms as input, but produces 48kHz waveforms.
+spectrograms as input, but produces 48kHz waveforms. This is however not present in this branch to stay true to the
+setup we used for the LAML paper. The vocoder module can be used interchangeably though.
 - We officially introduced IMS Toucan in
 [our contribution to the Blizzard Challenge 2021](http://festvox.org/blizzard/bc2021/BC21_IMS.pdf). Check out the
 bottom of the readme for a bibtex entry.
+- We now use articulatory representations of phonemes as the input for all models. This allows us to easily use multilingual data to benefit less resource-rich languages. For IPA representations this works flawlessly; for other input representations you'll have to either stick to the embedding lookup table approach from the older branches of this toolkit or build your own text frontend that encodes your representations into meaningful vectors and feeds those into the models. Tonal languages especially suffer from this, since there isn't a great unified phonetic representation system for them. We plan on supporting tonal languages in the future, but for now we'll stick to pitch accent and word accent languages.
+- We provide a checkpoint trained with a variant of model-agnostic meta-learning from which you should be able to fine-tune a model with very little data in almost any language (except for tonal languages, as mentioned in the previous point). These two contributions are described in our paper, which we will present at ACL 2022! Check the releases for the models. We will link a preview version of the paper here soon.
 
 ## Demonstration
 
0 commit comments
