Training

ASR training recipes created for ivrit.ai This is not yet properly documented - Soon to come.

Evaluation

Running the entire evaluation bench

The run_bench.py will run the suite of eval datasets using a specified engine.

Example command line is:

python run_bench.py \
    --engine engines/faster_whisper_engine.py \
    --model /path-to-your-ct2-whisper-model \
    --output-dir /path-to-eval-result-csvs \
    --overwrite

This will use the "faster-whisper" engine, so a CT2 model is required. Other engines include:

HF Transformers
Remote runpod container running faster-whisper
OpenAI Whisper API
Amazon Transcribe API
Google Speech API

The above are not yet documented but feel free to look at the code on how to run them.

Datasets in the suite

The following datasets are part of the evaluation suite:

Label	Dataset	Split	Text Column	Dataset Configuration Name	Gated
ivrit_ai_eval_d1	ivrit-ai/eval-d1	test	text	-	❌
saspeech	upai-inc/saspeech	test	text	-	❌
fleurs	google/fleurs	test	transcription	he_il	❌
common_voice_17	mozilla-foundation/common_voice_17_0	test	sentence	he	✅
hebrew_speech_kan	imvladikon/hebrew_speech_kan	validation	sentence	-	❌

Leaderboard

We publish the results on the HF Leaderboard. Contact us to add your model to the leaderboard.

Common Problems

`Unable to load any of {libcudnn_ops.so.9.1.0, libcudnn_ops.so.9.1, libcudnn_ops.so.9, libcudnn_ops.so}`

If the evaluation engine faster-whisper is used the following error may show up.

The CTranslate 2 engine depends on CUDNN to run on Nvidia GPUs. This library is actually installed already using pip but is not on the dynamic library path most likely. The following line will put it on path for the current session. If you use a virtual env - make sure it's active before running it so it can infer the correct pip folder.

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
assets		assets
engines		engines
preprocess		preprocess
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
compare_texts.py		compare_texts.py
create_dataset.py		create_dataset.py
evaluate_model.py		evaluate_model.py
merge-checkpoints-whisper.py		merge-checkpoints-whisper.py
merge-lora-whisper.py		merge-lora-whisper.py
pyproject.toml		pyproject.toml
requirements.dev.txt		requirements.dev.txt
requirements.txt		requirements.txt
run_bench.py		run_bench.py
run_whisper.py		run_whisper.py
train-whisper.py		train-whisper.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Training

Evaluation

Running the entire evaluation bench

Datasets in the suite

Leaderboard

Common Problems

`Unable to load any of {libcudnn_ops.so.9.1.0, libcudnn_ops.so.9.1, libcudnn_ops.so.9, libcudnn_ops.so}`

About

Uh oh!

Releases

Packages

Languages

License

thewh1teagle/asr-training

Folders and files

Latest commit

History

Repository files navigation

Training

Evaluation

Running the entire evaluation bench

Datasets in the suite

Leaderboard

Common Problems

Unable to load any of {libcudnn_ops.so.9.1.0, libcudnn_ops.so.9.1, libcudnn_ops.so.9, libcudnn_ops.so}

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

`Unable to load any of {libcudnn_ops.so.9.1.0, libcudnn_ops.so.9.1, libcudnn_ops.so.9, libcudnn_ops.so}`

Packages