Release v6.0 · snakers4/silero-vad

New v6 VAD Released

Improved quality;
v5 features and improvements kept;
16% fewer errors on noisy real-life data;
11% fewer errors on multi-domain validation;
Various community contributions;
Added quality comparison with TenVAD;
Changed the training algorithm, which ideally should result in higher robustness;
Metrics on a new (manually noised) community-provided dataset to be added soon;
Known persisting issues: music with human voice-like instruments, very high-pitched voices (artificial, cartoons, small children);

Improve documentation. by @EarningsCall in #553
Adamnsandle by @adamnsandle in #573
fx #576 by @adamnsandle in #579
Add cpp source based on libtorch by @NathanJHLee in #578
fx negative ths bug by @adamnsandle in #581
Add haskell example by @qwbarch in #591
Add CITATION.cff file for proper citation by @kiwamizamurai in #601
Fix/cpp vad context by @OJRYK in #605
Specify time resolution when returning speech coordinates in seconds by @b3by in #627
Use second coordinates for audio concatenation in collect_chunks and drop_chunks by @b3by in #626
Surface drop_chunks in init by @davidrs in #656
Adamnsandle by @adamnsandle in #669
fx by @adamnsandle in #670
Adamnsandle by @adamnsandle in #671
fx by @adamnsandle in #672
Adamnsandle by @adamnsandle in #673
Adamnsandle by @adamnsandle in #674
Adamnsandle by @adamnsandle in #675
Adamnsandle by @adamnsandle in #676
Adding additional params to get_speech_timestamps by @shashank14k in #664
get rid of hop_size_ratio by @adamnsandle in #677
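
For readers unfamiliar with the second-based coordinates mentioned in #627 and #626, the idea can be illustrated with a minimal, library-independent sketch. It is not the silero-vad implementation: the helpers `to_seconds` and `collect_chunks_seconds`, their parameter names, and the 16 kHz default are assumptions made for illustration only. The sketch converts sample-index timestamps to seconds rounded to a chosen number of decimals, then maps them back to sample indices to concatenate the speech regions.

```python
from typing import Dict, List, Sequence


def to_seconds(
    timestamps: List[Dict[str, int]],
    sampling_rate: int = 16000,   # assumed sampling rate, for illustration
    time_resolution: int = 1,     # hypothetical name: decimals kept in seconds
) -> List[Dict[str, float]]:
    """Convert sample-index timestamps to seconds with a fixed resolution."""
    return [
        {
            "start": round(ts["start"] / sampling_rate, time_resolution),
            "end": round(ts["end"] / sampling_rate, time_resolution),
        }
        for ts in timestamps
    ]


def collect_chunks_seconds(
    timestamps_s: List[Dict[str, float]],
    samples: Sequence[float],
    sampling_rate: int = 16000,
) -> List[float]:
    """Concatenate the audio regions described by second-based timestamps."""
    out: List[float] = []
    for ts in timestamps_s:
        # Map seconds back to sample indices before slicing.
        start = int(round(ts["start"] * sampling_rate))
        end = int(round(ts["end"] * sampling_rate))
        out.extend(samples[start:end])
    return out


# Two seconds of dummy "audio" and two speech regions given in samples.
samples = list(range(32000))
speech = [{"start": 1600, "end": 8000}, {"start": 16000, "end": 24000}]

speech_s = to_seconds(speech, sampling_rate=16000, time_resolution=1)
speech_only = collect_chunks_seconds(speech_s, samples, sampling_rate=16000)
```

A coarser resolution shortens the printed coordinates but can shift a boundary by up to half of the last kept decimal, so the resolution should match how precisely the cut audio needs to line up with the original.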

Full Changelog: v5.1.2...v6.0