GeForce RTX 50XX - cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED #1865

@Purfview

This is with faster-whisper. When compute_type="auto" is used (it auto-selects int8_float16), this error occurs:

With cuBLAS v12.1.3.1:

File "faster_whisper\transcribe.py", line 2179, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1471, in generate_segments
File "faster_whisper\transcribe.py", line 1719, in encode
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED

With cuBLAS v12.8.4.1:

File "faster_whisper\transcribe.py", line 2179, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1568, in generate_segments
File "faster_whisper\transcribe.py", line 1918, in add_word_timestamps
File "faster_whisper\transcribe.py", line 2037, in find_alignment
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED

All these types are reported as supported by CTranslate2:
int8 = FAIL
int8_float16 = FAIL
int8_float32 = FAIL
int8_bfloat16 = FAIL
float16 = WORKS
float32 = WORKS
bfloat16 = WORKS
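
For anyone who wants to reproduce a table like the one above on their own card, here is a rough sketch of how the check could be scripted. It is not the exact procedure used for the results above. The helper names (`probe_compute_types`, `first_working`, `try_with_faster_whisper`), the "tiny" model and the `sample.wav` path are all hypothetical placeholders. The only library calls used are `WhisperModel(...)` and `model.transcribe(...)` from faster-whisper, and they are imported lazily so the pure-Python helpers run without a GPU.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence


def probe_compute_types(
    candidates: Iterable[str],
    try_type: Callable[[str], None],
) -> Dict[str, str]:
    """Call try_type(name) for each candidate and record WORKS or FAIL."""
    results: Dict[str, str] = {}
    for name in candidates:
        try:
            try_type(name)
            results[name] = "WORKS"
        except RuntimeError as exc:
            # The cuBLAS failure in the tracebacks above surfaces as RuntimeError.
            results[name] = f"FAIL ({exc})"
    return results


def first_working(
    results: Dict[str, str],
    preferred: Sequence[str] = ("float16", "bfloat16", "float32"),
) -> Optional[str]:
    """Return the first preferred type that was recorded as WORKS, else None."""
    for name in preferred:
        if results.get(name) == "WORKS":
            return name
    return None


def try_with_faster_whisper(
    compute_type: str,
    model_name: str = "tiny",        # assumption: any small model will do
    audio_path: str = "sample.wav",  # assumption: a short local audio file
) -> None:
    """Run one short transcription; assumes a CUDA GPU and faster-whisper installed."""
    from faster_whisper import WhisperModel  # lazy import, only needed on the GPU box

    model = WhisperModel(model_name, device="cuda", compute_type=compute_type)
    # word_timestamps=True also exercises the alignment path seen in the second traceback.
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    for _ in segments:  # transcribe() returns a generator; iterate to actually run it
        pass
```

On an affected machine this could be driven with `probe_compute_types(ctranslate2.get_supported_compute_types("cuda"), try_with_faster_whisper)`, and `first_working(...)` on the result gives a type to pass explicitly instead of "auto" as a stopgap.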

Does anyone know what is going on with these GeForce RTX 50XX GPUs?

Tags: 5050, 5060, 5060Ti, 5070, 5070Ti, 5080, 5090
