High duration loss #40

@w11wo

Hi @p0p4k, thanks for making this repo!

I am currently trying to train a 44.1 kHz English model, but its duration loss is considerably higher than the one in your TensorBoard logs. It currently looks as follows:

[image: screenshot of the current loss curves]

It seems like the other loss terms are correct.

Also, when the generated mel-spectrogram is passed to a vocoder, the pronunciation in the resulting audio is mostly wrong, perhaps only half correct.

My P-Flow config can be found here, and the corresponding HiFi-GAN vocoder config can be found here.

Could you please let me know where I might be wrong? Thanks in advance!
