Commit ffedc5c

fix: speaker embedding bug (#1178)
* fix: improve handling of speaker embeddings in transcribe_task
* chore: bump version to 3.4.1
1 parent b93e9b6 commit ffedc5c

3 files changed: +10 -3 lines changed

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 urls = { repository = "https://github.com/m-bain/whisperx" }
 authors = [{ name = "Max Bain" }]
 name = "whisperx"
-version = "3.4.0"
+version = "3.4.1"
 description = "Time-Accurate Automatic Speech Recognition using Whisper."
 readme = "README.md"
 requires-python = ">=3.9, <3.13"

uv.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

whisperx/transcribe.py

Lines changed: 8 additions & 1 deletion
@@ -213,12 +213,19 @@ def transcribe_task(args: dict, parser: argparse.ArgumentParser):
         results = []
         diarize_model = DiarizationPipeline(model_name=diarize_model_name, use_auth_token=hf_token, device=device)
         for result, input_audio_path in tmp_results:
-            diarize_segments, speaker_embeddings = diarize_model(
+            diarize_result = diarize_model(
                 input_audio_path,
                 min_speakers=min_speakers,
                 max_speakers=max_speakers,
                 return_embeddings=return_speaker_embeddings
             )
+
+            if return_speaker_embeddings:
+                diarize_segments, speaker_embeddings = diarize_result
+            else:
+                diarize_segments = diarize_result
+                speaker_embeddings = None
+
             result = assign_word_speakers(diarize_segments, result, speaker_embeddings)
             results.append((result, input_audio_path))
     # >> Write
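
Note on the change in whisperx/transcribe.py: the diff implies that the diarization call returns a (segments, embeddings) pair only when return_embeddings is true, and a single segments object otherwise, so the old unconditional two-name unpacking could not be correct for both cases. The sketch below reproduces that pattern in isolation. It is not the whisperx implementation: `fake_diarize` and `unpack_diarize_result` are hypothetical stand-ins invented for illustration, and the segment and embedding values are made up.

```python
def fake_diarize(audio_path, return_embeddings=False):
    """Hypothetical stand-in for the diarization call (not the whisperx API).

    Returns segments only, or a (segments, embeddings) tuple when
    return_embeddings is True, mirroring the behavior the diff implies.
    """
    segments = [{"start": 0.0, "end": 1.5, "speaker": "SPEAKER_00"}]
    if return_embeddings:
        return segments, {"SPEAKER_00": [0.1, 0.2, 0.3]}
    return segments


def unpack_diarize_result(diarize_result, return_embeddings):
    """Normalize both return shapes to a (segments, embeddings_or_None) pair."""
    if return_embeddings:
        segments, embeddings = diarize_result
    else:
        segments, embeddings = diarize_result, None
    return segments, embeddings


if __name__ == "__main__":
    # Without the flag check, two-name unpacking of a single return value
    # fails here because the stand-in list holds only one segment.
    try:
        segs, emb = fake_diarize("audio.wav")
    except ValueError as exc:
        print("unconditional unpacking failed:", exc)

    # With the flag check, both call styles yield the same pair shape.
    for flag in (False, True):
        segs, emb = unpack_diarize_result(fake_diarize("audio.wav", flag), flag)
        print(flag, len(segs), emb is None)
```

Branching on the same flag that was passed to the call keeps the caller's unpacking in step with the callee's return shape, and defaulting the embeddings to None lets the downstream call receive three arguments in either case.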
