Can someone help me with training a new language from scratch? #453
costcuttingcz asked this question in Q&A · Unanswered
Replies: 3 comments · 3 replies
- Were you able to get this working? The Gradio UI helps a lot, along with the various video tutorials in this repo's Discussions and on YouTube.
- Do you have any progress with the Czech language? I want to get started on that too; I have about 2 hours of training data (studio-quality sound).
- Hi, maybe this will help? https://huggingface.co/fav-kky/SpeechT5-base-cs-tts I would be interested in this.
- Hello,
I have been experimenting with this for a long time and I am getting frustrated. I want to train the Czech language from scratch.
I have everything, but I am not able to make it work.
I have /workspace/dataset/cs/wav with all the wav files.
/workspace/dataset/cs/data.csv contains the wav file name and, in the second field, the text spoken in that wav. The separator is | (see the example rows below).
I created a vocabulary /workspace/dataset/cs/vocab.txt which contains words.
I want to save the results to /workspace/ckpts.
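For illustration, this is the pipe-separated layout of data.csv (the file names and sentences here are invented examples, not my actual data):

```
0001.wav|Dobrý den, toto je ukázková věta.
0002.wav|Druhá nahrávka v datasetu.
```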
At the beginning it complains that there is no model loaded.
And when I try to fix the files, everything falls apart.
I modified trainer.py into my_trainer.py.
Can someone please help me make it work?
```python
from importlib.resources import files

from f5_tts.model import CFM, DiT, Trainer, UNetT
from f5_tts.model.dataset import load_dataset
from f5_tts.model.utils import get_tokenizer

# -------------------------- Dataset Settings ---------------------------

target_sample_rate = 24000
n_mel_channels = 100
hop_length = 256
win_length = 1024
n_fft = 1024
mel_spec_type = "vocos"  # 'vocos' or 'bigvgan'

tokenizer = "custom"  # 'pinyin', 'char', or 'custom' -- set to 'custom' to use your vocab.txt
tokenizer_path = "/workspace/dataset/cs/vocab.txt"  # path to your vocab.txt
dataset_name = "/workspace/dataset/cs/data.csv"  # path to your data.csv

# -------------------------- Training Settings --------------------------

exp_name = "F5TTS_Czech"  # updated experiment name
learning_rate = 7.5e-5

batch_size_per_gpu = 1000  # frames per GPU per batch (with batch_size_type = "frame")
batch_size_type = "frame"  # "frame" or "sample"
max_samples = 64  # max sequences per batch if using frame-wise batch size; 32 for small models, 64 for base models
grad_accumulation_steps = 1  # note: updates = steps / grad_accumulation_steps
max_grad_norm = 1.0

epochs = 11  # uses linear decay, so epochs control the slope
num_warmup_updates = 20000  # warmup steps
save_per_updates = 50000  # save a checkpoint every this many steps
last_per_steps = 5000  # save the last checkpoint every this many steps

# model params
if exp_name == "F5TTS_Czech":
    wandb_resume_id = None
    model_cls = DiT
    model_cfg = dict(dim=1024, depth=22, heads=16, ff_mult=2, text_dim=512, conv_layers=4)

# -----------------------------------------------------------------------


def main():
    vocab_char_map, vocab_size = get_tokenizer(tokenizer_path, tokenizer)


if __name__ == "__main__":
    main()
```
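For reference, in the upstream train.py this is based on, main() continues past get_tokenizer to build the model, the trainer, and the dataset. A sketch of those missing steps, assuming the same API as the upstream script (the keyword argument names may differ between f5_tts versions, so treat them as assumptions and check your installed train.py):

```python
# Sketch of the full main(), modeled on the upstream f5_tts train.py.
# Keyword names are assumptions taken from that script and may differ
# in other versions of the package.

def main():
    vocab_char_map, vocab_size = get_tokenizer(tokenizer_path, tokenizer)

    mel_spec_kwargs = dict(
        n_fft=n_fft,
        hop_length=hop_length,
        win_length=win_length,
        n_mel_channels=n_mel_channels,
        target_sample_rate=target_sample_rate,
        mel_spec_type=mel_spec_type,
    )

    # wrap the DiT backbone in the conditional flow-matching model
    model = CFM(
        transformer=model_cls(**model_cfg, text_num_embeds=vocab_size, mel_dim=n_mel_channels),
        mel_spec_kwargs=mel_spec_kwargs,
        vocab_char_map=vocab_char_map,
    )

    trainer = Trainer(
        model,
        epochs,
        learning_rate,
        num_warmup_updates=num_warmup_updates,
        save_per_updates=save_per_updates,
        checkpoint_path="/workspace/ckpts",  # where I want the checkpoints saved
        batch_size=batch_size_per_gpu,
        batch_size_type=batch_size_type,
        max_samples=max_samples,
        grad_accumulation_steps=grad_accumulation_steps,
        max_grad_norm=max_grad_norm,
        wandb_project="CFM-TTS",
        wandb_run_name=exp_name,
        wandb_resume_id=wandb_resume_id,
        last_per_steps=last_per_steps,
    )

    # note: in the upstream repo, load_dataset expects the name of a *prepared*
    # dataset (e.g. the output of prepare_csv_wavs.py), not a raw csv path --
    # this may be why passing /workspace/dataset/cs/data.csv fails
    train_dataset = load_dataset(dataset_name, tokenizer, mel_spec_kwargs=mel_spec_kwargs)
    trainer.train(train_dataset, resumable_with_seed=666)  # seed value from the upstream script
```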