Fix faster-whisper #53
Open
o-alexandre-felipe wants to merge 111 commits into huggingface:main from o-alexandre-felipe:main
* Refactor data and normalizer * Update transformers * Update requirements * Update requirements * revert datasets for HF
* Update eval script for Fast Conformer NeMo models to support write and post-scoring * Add evaluate helper * Alias manifest utils in data utils * Update eval script for HF models to support write and post-scoring * Add comments Signed-off-by: smajumdar <[email protected]> * Fix detection of dataset id Signed-off-by: smajumdar <[email protected]> * Add checks for empty string in model filtering for eval script Signed-off-by: smajumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]>
* Add XL and XXL RNNT and CTC models Signed-off-by: Nithin Rao Koluguri <nithinraok> * update max samples Signed-off-by: Nithin Rao Koluguri <nithinraok> * use single batch size Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok>
* speechbrain initial get_model fn * wav2vec / run_eval.py working * conformer.sh * add .sh * remove pycache * fix batch size * docstring * docstring * updt * speechbrain requirements * speechbrain requirements * fix wer? * manifest * gitignore / remove savedir arg * remove speechbrain/ path * gitignore * update wav2vec * cv * update scripts * fix issue composite wer
…ers_models inference: Loop over transformers models
[Transformers] Enable torch compile
Switch to hf-audio/esb-datasets-test-only-sorted dataset
Propagate RTFx updates to other libs
…data2vec [transformers] from common voice from data2vec
* Remove common voice from evaluation, as discussed. Pin nemo to a particular version to make sure results are reproducible. In particular, include: NVIDIA-NeMo/NeMo#10054 Make sure that optional dependency cuda-python is included to ensure that we use cuda graph accelerated decoder inference in RNN-T and TDT models.
* update readme * fix * fix fix * fix nemo note * same hps
* Add UsefulSensors Moonshine benchmark Due to trainable in-model preprocessor and therefore lack of a spectrogram preprocessor, we have opted against wrapping the tokenizer as a processor. Further, we must make substantial changes compared with existing transformer models, so we decided to create a separate benchmark. * Add moonshine-specific requirements.txt. Adds the `einops` package which our HF hub repo requires.
* add whisper trt-llm * add vad module * remove vad * remove vad files * remove convert_checkpoint * code clean --------- Co-authored-by: Yuekai Zhang <[email protected]>
* fix whisper * add all models --------- Co-authored-by: Yuekai Zhang <[email protected]>
* best SB model * fix comments * minor changes * fix everything --------- Co-authored-by: Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics <[email protected]>
Adds Phi-4-Multimodal
## The main issue

I can't see faster-whisper on the published leaderboard. Initially I thought it had never been added, but checking the code I found a `ctranslate2` folder. When I tried to run it in my environment it was broken. I assume it is also broken in the evaluation environment, and that this is why those results are missing.

## faster-whisper model naming
The existing scripts use shorthand names, e.g. `tiny.en`. To avoid confusion between those and other models, e.g. `openai/whisper-tiny.en`, I used the full names as defined here.

## PyTorch and CUDA versions
I see some references in the README indicating that it should run with PyTorch `2.4.1` and CUDA `12.6`, but I can't find this combination. I can install PyTorch `2.6.0` with CUDA `12.6`, or PyTorch `2.4.1` with CUDA `12.4`. This should be clarified.

## The dependency files
I renamed the library-specific dependency files from `requirements/requirements_${lib}.txt` to `${lib}/requirements.txt`. It makes more sense to keep the dependencies in the folder where they are used.

The common dependencies were moved to `docker/requirements.txt` because that file is used to build the base image.

Another thing worth looking at, to keep this project future-proof and make it easier to reproduce results or troubleshoot mismatches, is to specify the library versions. Currently most of the dependencies are defined without a fixed version.
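To get a quick picture of which dependencies are still unpinned, something like the sketch below could be run over the requirements files. It is only an illustration: the `find_unpinned` helper is hypothetical (not part of this PR), the example file content is made up, and anything without an exact `==` pin is treated as unpinned, which simplifies the full requirement syntax.

```python
import re

# Matches the leading package name (with optional extras) of a requirement line.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)(\[[^\]]*\])?")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return package names that have no exact `==` pin.

    Hypothetical helper: skips comments, blank lines, option lines
    (e.g. `-r other.txt`) and direct URL/VCS references.
    """
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop inline comments
        if not line or line.startswith("-") or "://" in line:
            continue
        match = _NAME.match(line)
        if match is None:
            continue
        if "==" not in line:
            unpinned.append(match.group(1))
    return unpinned


# Made-up example content, not the actual requirements of this repository.
example = """
# common dependencies
torch==2.4.1
transformers
datasets>=2.0
jiwer  # scoring
"""
print(find_unpinned(example))
```

Running it over each `${lib}/requirements.txt` and over `docker/requirements.txt` would give a list of packages to pin.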
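On the PyTorch and CUDA point above: PyTorch wheels built for CUDA carry the CUDA version as a local version suffix such as `+cu124`, so the combination that is actually installed can be read from the version string. A minimal sketch, assuming that suffix convention holds for the installed wheel; `split_torch_build` is a hypothetical helper and the sample strings are only the combinations mentioned in this description.

```python
import re
from typing import Optional, Tuple


def split_torch_build(version: str) -> Tuple[str, Optional[str]]:
    """Split e.g. '2.4.1+cu124' into ('2.4.1', '12.4').

    Returns None for the CUDA part when there is no '+cuXYZ' suffix
    (for example CPU-only builds).
    """
    base, _, local = version.partition("+")
    # The last digit is taken as the minor version, the rest as the major.
    match = re.fullmatch(r"cu(\d+)(\d)", local)
    if match is None:
        return base, None
    return base, f"{match.group(1)}.{match.group(2)}"


# In a real environment the string would come from `torch.__version__`.
for v in ("2.4.1+cu124", "2.6.0+cu126", "2.4.1+cpu"):
    print(v, "->", split_torch_build(v))
```

When torch is importable, `torch.version.cuda` reports the CUDA version directly; the string parsing is only useful for checking pinned versions without importing the library.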
In the hope of being helpful
Alexandre Felipe