Clarifications for creating sequential batches instead of random sampling #112

@Raviteja-banda

Please correct me if I'm wrong:
In speaker_id.py, the function create_batches_rnd() selects a batch of random samples and then takes a random 200 ms chunk from each one (the chunk length is 3200 samples at a sampling rate of 16000 Hz). This means the same file can be selected in two different batches, while other files may never be selected at all. As a result, the model might never be trained on some labels. Am I correct in saying so?
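
To illustrate the concern, here is a minimal sketch. It is not the code from speaker_id.py: the function names are hypothetical, and it assumes file indices are drawn uniformly at random with replacement. It checks the 200 ms figure and estimates how many files are never drawn after a given number of batches.

```python
import random


def chunk_duration_ms(wlen_samples, sample_rate_hz):
    """Duration in milliseconds of a chunk of wlen_samples samples."""
    return 1000.0 * wlen_samples / sample_rate_hz


def expected_unseen_fraction(n_files, n_draws):
    """Expected fraction of files never drawn in n_draws uniform draws
    with replacement (assumption: each draw is independent)."""
    return (1.0 - 1.0 / n_files) ** n_draws


def simulate_unseen(n_files, batch_size, n_batches, seed=0):
    """Count files never picked when every batch draws batch_size
    random file indices with replacement."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_batches):
        for _ in range(batch_size):
            seen.add(rng.randrange(n_files))
    return n_files - len(seen)


if __name__ == "__main__":
    print(chunk_duration_ms(3200, 16000))        # chunk length in ms
    print(expected_unseen_fraction(1000, 1000))  # roughly 1/e
    print(simulate_unseen(1000, 10, 100))        # files never drawn
```

Under that assumption, drawing as many samples as there are files leaves roughly a third of the files untouched in that pass, although over many passes each file is very likely to be drawn eventually.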

What is the effect of selecting the files in sequence instead of at random? That way, each audio file is selected exactly once, and I can be sure that all the audio samples are used for training. It also decreases the number of batches and hence the training time.
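
To make the sequential alternative concrete, here is a rough sketch of what I have in mind. All names are hypothetical and nothing here comes from speaker_id.py. I assume the file list is shuffled once per epoch and then cut into consecutive batches, so every file appears exactly once, and a random chunk of wlen samples is still taken from each file.

```python
import random


def sequential_batches(file_list, batch_size, seed=None):
    """Yield batches that together cover every file exactly once.

    The list is shuffled once (assumption: once per epoch) and then
    sliced into consecutive batches; the last batch may be smaller.
    """
    rng = random.Random(seed)
    order = list(file_list)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def random_chunk_bounds(signal_len, wlen, rng=random):
    """Return (begin, end) of a random chunk of wlen samples."""
    if signal_len < wlen:
        raise ValueError("signal is shorter than the chunk length")
    begin = rng.randrange(signal_len - wlen + 1)
    return begin, begin + wlen


if __name__ == "__main__":
    files = ["utt_%d.wav" % i for i in range(10)]  # hypothetical names
    for batch in sequential_batches(files, batch_size=4, seed=1):
        print(batch)
    print(random_chunk_bounds(signal_len=48000, wlen=3200))
```

With this scheme one pass touches every file, but still only one 200 ms chunk per file, so most of each recording is not seen in that pass.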
Could someone please clarify these points?
