Add:An example, aispeech_asr, and a dataset, speech_dataset_large, have been added and supporting multi-machine multi-GPU decoding #225

teamtee · 2025-04-17T03:55:41Z

What does this PR do?

This PR adds an example, aispeech_asr, and a large dataset, speech_dataset_large, and adds support for multi-machine multi-GPU decoding.

Feature/Issue validation/testing

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

Add an example: aispeech_asr

This example is designed for large-scale industrial data training, suitable for datasets on the order of 100,000 hours. Its main features include:

Support for multi-task training: Designed to support tasks such as ASR and ST through a unified data format.
Dynamic prompt selection: Supports random selection from multiple prompts.
Iterative dataset: Uses an iterative dataset format to reduce startup time for large datasets.
DeepSpeed training: Supports DeepSpeed training to significantly reduce memory usage.
Multi-machine multi-GPU inference: Supports distributed inference across multiple machines and GPUs to reduce evaluation time.
Dynamic frame batching: Builds each batch from the audio lengths of its samples rather than using a fixed batch size, significantly reducing training and evaluation time (reduces training time by 3/4 for 100,000 hours of data).
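
To illustrate the dynamic frame batching idea, here is a minimal sketch. It is not the code in this PR: the function name, the `(id, n_frames)` tuple format, and the `max_frames` budget are all hypothetical, and the sketch assumes the cost of a batch is its padded size (longest utterance times batch length). It consumes a stream, so it also fits an iterative dataset.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record format: (utterance id, number of audio frames).
Utterance = Tuple[str, int]


def dynamic_frame_batches(
    utterances: Iterable[Utterance], max_frames: int
) -> Iterator[List[Utterance]]:
    """Yield batches whose padded size stays within max_frames.

    Padded size = frames of the longest utterance * number of utterances.
    """
    batch: List[Utterance] = []
    longest = 0
    for utt in utterances:
        n_frames = utt[1]
        if n_frames > max_frames:
            raise ValueError(
                f"utterance {utt[0]} has {n_frames} frames, "
                f"above max_frames={max_frames}"
            )
        new_longest = max(longest, n_frames)
        # Close the current batch if adding this utterance would exceed the budget.
        if batch and new_longest * (len(batch) + 1) > max_frames:
            yield batch
            batch, new_longest = [], n_frames
        batch.append(utt)
        longest = new_longest
    if batch:
        yield batch


example = [("a", 300), ("b", 500), ("c", 900), ("d", 100), ("e", 100)]
batches = list(dynamic_frame_batches(example, max_frames=1000))
# Short utterances share a batch; the long one ends up alone.
```

With a fixed batch size, one long utterance forces padding on every other item in its batch; a frame budget lets short utterances be packed densely instead.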
Add a dataset for dynamic frame batching
Support multi-machine multi-GPU decoding.
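
One common way to spread decoding over several machines and GPUs is to give each process a disjoint slice of the test set. The sketch below shows that pattern only; it is an assumption about the general approach, not the implementation in this PR. `shard_for_rank` and the utterance list are hypothetical names, while `RANK` and `WORLD_SIZE` are the environment variables exported by `torchrun`.

```python
import os
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_for_rank(items: Sequence[T], rank: int, world_size: int) -> List[T]:
    """Return the strided slice of items that one worker should decode."""
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError(f"invalid rank {rank} for world_size {world_size}")
    # Worker r takes items r, r + world_size, r + 2 * world_size, ...
    return list(items[rank::world_size])


# Fall back to a single process when the launcher variables are absent.
rank = int(os.environ.get("RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

utterance_ids = [f"utt{i}" for i in range(10)]  # hypothetical test set
my_shard = shard_for_rank(utterance_ids, rank, world_size)
```

Each worker then decodes only `my_shard` and writes its own output file; the per-rank files are merged before scoring.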

Before submitting

Thanks for contributing 🎉!

teamtee and others added 12 commits April 17, 2025 11:48

Add:An example, aispeech_asr, and a large dataset, speech_dataset_lar…

43d9e54

…ge, have been added, supporting multi-machine multi-GPU decoding.

Fix: deepspeed_utils.py

647fcbd

The function for handling data imbalance has been renamed to "deepspeed_join," and a bug where this function was not called has been fixed.

Fix: not import datetime error

f3447a5

Update README_zh.md

34598fe

Update README.md

e3df7e2

Update speech_dataset_large.py

02011e6

Change assert(0) to raise ValueError

Add:ddp support dynamic dataset

5106d94

Update README_zh.md

e9d606b

Update README.md

3a05581

Update finetune_deepspeed.sh

6501c65

Update train_utils.py

f07183e

asr_librispeech support deepspeed and update aispeech_asr

5ab1e81

teamtee force-pushed the for-merge-aispeech-asr branch 2 times, most recently from b8016ef to 5ab1e81 April 23, 2025 09:47

teamtee added 2 commits April 23, 2025 23:47

Update finetune_whisper_large_linear_vicuna_7b_deepspeed.sh

6f92ca0

Restore file

Update finetune_deepspeed.sh

990709a

ddlBoJack merged commit 1f410bb into X-LANCE:main Apr 24, 2025
