Releases: SYSTRAN/faster-whisper
faster-whisper 1.2.1
What's Changed
- only merge when `clip_timestamps` are not provided by @MahmoudAshraf97 in #1345
- Fix: Prevent <|nocaptions|> tokens in `BatchedInferencePipeline` by @mmichelli in #1338
- Upgrade to Silero-VAD V6 by @MahmoudAshraf97 and @sssshhhhhh in #1373
- Offload retry logic to hf hub by @MahmoudAshraf97 in #1382
- Refinements for clip timestamps in batched inference by @MahmoudAshraf97 in #1376
New Contributors
- @mmichelli made their first contribution in #1338
- @sssshhhhhh made their first contribution in #1365
Full Changelog: v1.2.0...v1.2.1
faster-whisper 1.2.0
What's Changed
- feat: allow passing specific revision to download by @felixmosh in #1292
- Support `distil-large-v3.5` by @MahmoudAshraf97 in #1311
- Feature: Allow loading of private HF models by @r15hil in #1309
- bugfix: Get correct chunk index when restoring timestamps by @MahmoudAshraf97 in #1336
- Remove Silence in Batched transcription by @MahmoudAshraf97 in #1297
New Contributors
- @djpg made their first contribution in #1267
- @felixmosh made their first contribution in #1292
- @r15hil made their first contribution in #1309
Full Changelog: v1.1.1...v1.2.0
faster-whisper 1.1.1
What's Changed
- Brings back original VAD parameters naming by @Purfview in #1181
- Make batched `suppress_tokens` behaviour same as in sequential by @Purfview in #1194
- Fixes OOM errors - too high RAM usage by VAD by @Purfview in #1198
- Add duration of audio and VAD-removed duration to `BatchedInferencePipeline` by @greenw0lf in #1186
- Fix `neg_threshold` by @Purfview in #1191
New Contributors
- @greenw0lf made their first contribution in #1186
Full Changelog: v1.1.0...v1.1.1
faster-whisper 1.1.0
New Features
- New batched inference that is 4x faster while remaining accurate; refer to the README for usage instructions.
- Support for the new `large-v3-turbo` model.
- VAD filter is now 3x faster on CPU.
- Feature Extraction is now 3x faster.
- Added `log_progress` to `WhisperModel.transcribe` to print transcription progress.
- Added `multilingual` option to transcription to allow transcribing multilingual audio. Note that large models already have code-switching capabilities, so this is mostly beneficial to the `medium` model or smaller.
- `WhisperModel.detect_language` now has the option to use the VAD filter, and improved language detection using `language_detection_segments` and `language_detection_threshold`.
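The batched API introduced above pairs a `WhisperModel` with a `BatchedInferencePipeline`, as described in the project README. A minimal sketch (model size, device, and audio path are illustrative placeholders; the import guard only keeps the sketch loadable when the package is absent):

```python
# Sketch of batched transcription with faster-whisper >= 1.1.
try:
    from faster_whisper import WhisperModel, BatchedInferencePipeline
except ImportError:  # keep the sketch importable without the package installed
    WhisperModel = BatchedInferencePipeline = None

def transcribe_batched(audio_path, model_size="large-v3", batch_size=16):
    """Transcribe one file with the batched pipeline; returns (text, language)."""
    model = WhisperModel(model_size, device="auto")
    pipeline = BatchedInferencePipeline(model=model)
    segments, info = pipeline.transcribe(audio_path, batch_size=batch_size)
    # segments is a generator; consuming it drives the actual decoding
    return " ".join(s.text.strip() for s in segments), info.language
```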
Bug Fixes
- Use correct features padding for encoder input when `chunk_length` < 30s
- Use correct `seek` value in output
Other Changes
- Replace `NamedTuple` with `dataclass` in `Word`, `Segment`, `TranscriptionOptions`, `TranscriptionInfo`, and `VadOptions`; this allows conversion to `json` without nesting. Note that the `_asdict()` method is still available in the `Word` and `Segment` classes for backward compatibility but will be removed in the next release; use `dataclasses.asdict()` instead.
- Added new tests for development
- Updated benchmarks in the README
- Use `jiwer` instead of `evaluate` in benchmarks
- Filter out non_speech_tokens in suppressed tokens by @jordimas in #898
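With `Word` and `Segment` as dataclasses, `dataclasses.asdict()` converts them, including nested dataclasses, straight into JSON-ready dicts. A minimal sketch using simplified stand-in classes (the real ones carry more fields such as probabilities and tokens):

```python
import json
from dataclasses import dataclass, asdict

# Simplified stand-ins for faster-whisper's Word and Segment dataclasses.
@dataclass
class Word:
    start: float
    end: float
    word: str

@dataclass
class Segment:
    start: float
    end: float
    text: str
    words: list

seg = Segment(0.0, 1.2, "hello world",
              [Word(0.0, 0.5, "hello"), Word(0.5, 1.2, "world")])
payload = json.dumps(asdict(seg))  # asdict recurses into nested dataclasses
```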
New Contributors
- @Jiltseb made their first contribution in #856
- @heimoshuiyu made their first contribution in #1092
Full Changelog: v1.0.3...v1.1.0
faster-whisper 1.0.3
Upgrade Silero-VAD model to latest V5 version (#884)
Silero-VAD V5 release: https://github.com/snakers4/silero-vad/releases/tag/v5.0
- `window_size_samples` parameter is fixed at 512.
- Uses a single `state` variable instead of the previous `h` and `c` variables.
- Slightly changed internal logic: some context (part of the previous chunk) is now passed along with the current chunk.
- The dimensions of the `state` variable changed from 64 to 128.
- Replaced the ONNX file with the V5 version.
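The fixed 512-sample windowing with carried-over context described above can be sketched as follows (the context length here is illustrative, not Silero's exact value):

```python
# Sketch of Silero-VAD V5 style chunking: fixed 512-sample windows,
# each prefixed with a short context tail from the previous chunk.
WINDOW = 512   # window_size_samples is fixed at 512 in V5
CONTEXT = 64   # carried-over context length (illustrative, not the real value)

def iter_vad_chunks(samples):
    """Yield (context + window) slices the way V5 feeds its model."""
    context = [0.0] * CONTEXT  # leading context starts as silence
    for start in range(0, len(samples) - WINDOW + 1, WINDOW):
        window = samples[start:start + WINDOW]
        yield context + window
        context = window[-CONTEXT:]  # tail of this chunk seeds the next one

chunks = list(iter_vad_chunks([0.0] * 1024))  # two windows of 512 samples
```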
faster-whisper 1.0.2
- Add support for distil-large-v3 (#755)
  The latest Distil-Whisper model, distil-large-v3, is intrinsically designed to work with the OpenAI sequential algorithm.
- Benchmarks (#773)
  Introduces functionality to measure memory usage, Word Error Rate (WER), and speed in faster-whisper.
- Support initializing more whisper model args (#807)
- Small bug fix
- New feature from the original OpenAI Whisper project
faster-whisper 1.0.1
faster-whisper 1.0.0
- Support distil-whisper model (#557)
  Robust knowledge distillation of the Whisper model via large-scale pseudo-labelling. For more detail: https://github.com/huggingface/distil-whisper
- Upgrade ctranslate2 version to 4.0 to support CUDA 12 (#694)
- Upgrade PyAV version to 11.* to support Python 3.12.x (#679)
- Small bug fixes
- New improvements from the original OpenAI Whisper project
faster-whisper 0.10.1
Fix the broken tag v0.10.0
faster-whisper 0.10.0
- Support "large-v3" model with:
  - The ability to load `feature_size`/`num_mels` and other parameters from `preprocessor_config.json`
  - A new language token for Cantonese (`yue`)
- Update `CTranslate2` requirement to include the latest version 3.22.0
- Update `tokenizers` requirement to include the latest version 0.15
- Change the hub to fetch models from the Systran organization