-
Notifications
You must be signed in to change notification settings - Fork 530
Audio statistics #3833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audio statistics #3833
Conversation
|
When I see length, I think in seconds. I like the frames approach too, and I'd like it spelled out explicitly (num_frames or whatever). I'd like to see:
Would love to hear other feedback as well while I read into it a bit more. |
|
Added seconds and sampling rates |
isaac-chung
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for adding more. Revisited some papers and maybe we should use the standard measure of audio dataset size.
isaac-chung
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| unique_audios: int | ||
|
|
||
| average_sampling_rate: float | ||
| sampling_rates: dict[int, int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be a unique set of sampling rates? OK either way.
| sampling_rates: dict[int, int] | |
| sampling_rates: list[int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to keep dict to show full distribution of different sample rates. If this became a problem, we can easily change to list of ints
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Co-authored-by: Isaac Chung <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor things - generally think this looks good (of course Isaac's comments still apply, but nothing more to add)
Co-authored-by: Kenneth Enevoldsen <[email protected]>
|
@isaac-chung Can you review this PR? |
# Conflicts: # pyproject.toml # uv.lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Judging from the size, does this incorporate changes from #3875 too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I used these changes here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lovely!

Ref #3498
I’ve started integrating audio statistics. For now, I’ve come up with this format. Do you have any suggestions?