Fix faster-whisper #53

o-alexandre-felipe · 2025-03-09T19:37:09Z

## The main issue

I can't see faster-whisper on the published leaderboard. Initially I thought it had never been added, but the code already contains a ctranslate2 folder. When I tried to run it in my environment it was broken. I assume it is also broken in the evaluation environment, which would explain why those results are missing.

## faster-whisper model naming

The existing scripts use short-hand names, e.g. tiny.en. To avoid confusion between those and other models, e.g. openai/whisper-tiny.en, I used the full names as defined here.
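As a rough illustration of the naming change, the sketch below expands a short-hand size into a fully qualified name and leaves already-qualified names untouched. The `Systran/faster-whisper-` prefix and the `to_full_name` helper are assumptions made for this example. The names actually used in this PR may differ, and some models (for example distilled variants) may not follow this pattern.

```python
# Hypothetical helper: the prefix below is an assumption, not taken from this PR.
ASSUMED_PREFIX = "Systran/faster-whisper-"


def to_full_name(model_id: str) -> str:
    """Return a fully qualified model name for a short-hand size such as 'tiny.en'.

    Names that already contain an organisation (a '/') are returned unchanged.
    """
    if "/" in model_id:
        return model_id
    return ASSUMED_PREFIX + model_id


if __name__ == "__main__":
    for name in ("tiny.en", "openai/whisper-tiny.en"):
        print(name, "->", to_full_name(name))
```

With full names, a results table can tell a faster-whisper checkpoint apart from the original OpenAI checkpoint of the same size.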

## PyTorch and CUDA versions

The README has some references indicating that it should run with PyTorch 2.4.1 and CUDA 12.6, but I can't find this combination. I can install PyTorch 2.6.0 with CUDA 12.6 or PyTorch 2.4.1 with CUDA 12.4, so this should be clarified.
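One way to see which combination an environment actually has is to read it from the installed wheel. The sketch below is a minimal example that assumes the version string carries a `+cuXYZ` local tag (as in `2.4.1+cu124`) whose last digit is the CUDA minor version. `split_torch_version` is a hypothetical helper, not part of PyTorch.

```python
import re
from typing import Optional, Tuple


def split_torch_version(version: str) -> Tuple[str, Optional[str]]:
    """Split a string such as '2.4.1+cu124' into ('2.4.1', '12.4').

    The CUDA part is None when there is no '+cuXYZ' suffix (e.g. CPU builds).
    Assumption: the last digit of the tag is the CUDA minor version.
    """
    base, _, local = str(version).partition("+")
    match = re.fullmatch(r"cu(\d+)(\d)", local)
    if match is None:
        return base, None
    return base, f"{match.group(1)}.{match.group(2)}"


if __name__ == "__main__":
    # The two combinations mentioned above, written as wheel version strings.
    for example in ("2.4.1+cu124", "2.6.0+cu126"):
        print(example, "->", split_torch_version(example))

    # Optional: inspect the local installation if torch is available.
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("installed:", split_torch_version(torch.__version__))
        print("torch.version.cuda =", torch.version.cuda)
```

Recording this pair next to each benchmark run would make it easier to tell which of the two installable combinations produced a given result.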

## The dependency files

I renamed the library-specific dependency files from requirements/requirements_${lib}.txt to ${lib}/requirements.txt. It makes more sense to keep the dependencies within the folder where they are used.

The common dependencies were moved to docker/requirements.txt because they are used to build the base image.
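The rename rule for the library-specific files can be sketched as a small path mapping. This is only an illustration of the pattern described above; `new_requirements_path` is a hypothetical helper, and the actual move in this PR was done in the commits, not by this code.

```python
from pathlib import PurePosixPath


def new_requirements_path(old_path: str) -> str:
    """Map 'requirements/requirements_<lib>.txt' to '<lib>/requirements.txt'."""
    old = PurePosixPath(old_path)
    prefix = "requirements_"
    lib = old.stem[len(prefix):] if old.stem.startswith(prefix) else ""
    # Reject anything that does not match the old layout exactly.
    if old.parent.name != "requirements" or old.suffix != ".txt" or not lib:
        raise ValueError(f"not a library-specific requirements file: {old_path}")
    return str(PurePosixPath(lib) / "requirements.txt")


if __name__ == "__main__":
    print(new_requirements_path("requirements/requirements_ctranslate2.txt"))
```

Each library folder then carries its own requirements.txt, so its dependencies can be installed from inside that folder.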

Another thing worth looking at, to keep this project future-proof and make it easier to reproduce results or troubleshoot mismatches, is specifying the library versions. Currently most of the dependencies are declared without a pinned version.
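To get a quick overview of which dependencies are not pinned, something like the following could be run over each requirements file. It is a deliberately simple sketch: it treats only `==` as a pin, ignores comments and pip options such as `-r`, and does not handle URL requirements. `unpinned_requirements` is a hypothetical helper.

```python
from typing import List


def unpinned_requirements(lines: List[str]) -> List[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like '-r'
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    example = ["torch==2.4.1", "jiwer", "librosa>=0.10  # audio loading"]
    for requirement in unpinned_requirements(example):
        print("not pinned:", requirement)
```

A range such as `>=` is reported as unpinned here because it still allows the installed version to drift between runs.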

In the hope of being helpful
Alexandre Felipe

* Update eval script for Fast Conformer NeMo models to support write and post-scoring
* Add evaluate helper
* Alias manifest utils in data utils
* Update eval script for HF models to support write and post-scoring
* Add comments

  Signed-off-by: smajumdar <[email protected]>
* Fix detection of dataset id

  Signed-off-by: smajumdar <[email protected]>
* Add checks for empty string in model filtering for eval script

  Signed-off-by: smajumdar <[email protected]>

Signed-off-by: smajumdar <[email protected]>

* Add XL and XXL RNNT and CTC models

  Signed-off-by: Nithin Rao Koluguri <nithinraok>
* update max samples

  Signed-off-by: Nithin Rao Koluguri <nithinraok>
* use single batch size

  Signed-off-by: Nithin Rao Koluguri <nithinraok>

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>

* speechbrain initial get_model fn * wav2vec / run_eval.py working * conformer.sh * add .sh * remove pycache * fix batch size * docstring * docstring * updt * speechbrain requirements * speechbrain requirements * fix wer? * manifest * gitignore / remove savedir arg * remove speechbrain/ path * gitignore * update wav2vec * cv * update scripts * fix issue composite wer

* Remove common voice from evaluation, as discussed. Pin nemo to a particular version to make sure results are reproducible. In particular, include NVIDIA-NeMo/NeMo#10054. Make sure that the optional dependency cuda-python is included, to ensure that we use CUDA graph accelerated decoder inference in RNN-T and TDT models.

* Add UsefulSensors Moonshine benchmark. Due to the trainable in-model preprocessor, and therefore the lack of a spectrogram preprocessor, we have opted against wrapping the tokenizer as a processor. Further, we must make substantial changes compared with existing transformer models, so we decided to create a separate benchmark.
* Add moonshine-specific requirements.txt. Adds the `einops` package, which our HF hub repo requires.

titu1994 and others added 30 commits July 25, 2023 13:14

Add NeMo model support + Refactor codebase (huggingface#2)

df33082

* Refactor data and normalizer * Update transformers * Update requirements * Update requirements * revert datasets for HF

updating the run_whisper script to work with recent changes

40cc09a

add jiwer & librosa to requirements

8640704

fix: load_dataset -> load_data

611b27d

up

a5f8268

up

f184963

up

5c60dbb

up

868d771

Merge pull request huggingface#6 from huggingface/loop_over_transform…

6b493e5

…ers_models inference: Loop over transformers models

up

27c4e42

add hubert models

0f9c861

add data2vec models

6063541

add wavlm models

320ed6d

remove non en models

581ec3e

update hubert models

5658abf

remove wavlm models

4fdc80f

streaming -> False

de89533

indentation fix -> evals

6707cc3

final eval configurations

cb3fa07

add MMS models:

05c3c85

initial commit - rtf calculation script

398a393

up

2594106

Initial commit - RTF script

7f11b87

update rtf evals w/ all models

fa39250

update rtf evals

d96dafc

update rtf evals w/ batching

9bdd19f

disclaimer

ecd653c

sanchit-gandhi and others added 29 commits August 7, 2024 09:31

Merge pull request huggingface#32 from sanchit-gandhi/static-kv-2

d0a5ec4

[Transformers] Enable torch compile

Merge pull request huggingface#30 from jordimas/ct2-newdataset

72d38d3

Switch to hf-audio/esb-datasets-test-only-sorted dataset

update faster whisper for rtfx

a9a8808

remove calc rtf

b5334ae

finish faster whisper

e4aeb4f

update speechbrain script

336069a

finish speechbrain

904e29f

fix indentation

db75efa

Merge pull request huggingface#34 from sanchit-gandhi/update-other-libs

ca8e7a2

Propagate RTFx updates to other libs

[transformers] from common voice from data2vec

b33bdf9

Merge pull request huggingface#35 from sanchit-gandhi/remove-cv-from-…

4ea93a2

…data2vec [transformers] from common voice from data2vec

update trmfs scripts (huggingface#37)

9079e98

[readme] update for rtfx (huggingface#36)

fe50cf0

* update readme * fix * fix fix * fix nemo note * same hps

add CrisperWhisper model (huggingface#39)

1053c19

[Ready] Add whisper TensorRT-LLM (huggingface#42)

003f525

* add whisper trt-llm * add vad module * remove vad * remove vad files * remove convert_checkpoint * code clean --------- Co-authored-by: Yuekai Zhang <[email protected]>

fix whisper .en model (huggingface#46)

b4fff33

* fix whisper * add all models --------- Co-authored-by: Yuekai Zhang <[email protected]>

Adds Phi-4-Multimodal

a471a82

Adding largescale ASR model for speechbrain (huggingface#49)

e7fba76

* best SB model * fix comments * minor changes * fix everything --------- Co-authored-by: Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics <[email protected]>

Add requirements for Phi evals

f7b9566

Merge pull request huggingface#51 from freewym/work

0123743

Adds Phi-4-Multimodal

Rearrange requirement files

16bf928

Create docker runner

ebb27b1

Fix reference to deprecated _asdict()

422defb

Failover logic

94d174c

Use model full name

e540248

Ignore cache folder

933a9be

Update readme.txt

c97c87b

Deep-unlearning force-pushed the main branch from 292901a to d2167fb on June 24, 2025 12:58
