`examples/aispeech_asr/README.md` — 2 additions, 24 deletions
```diff
@@ -66,7 +66,6 @@ dev_scp_file_path= # Path to validation data
 train_max_frame_length=1500 # Maximum frame length for training
 eval_max_frame_length=1000 # Maximum frame length for evaluation
 multitask_prompt_path= # Path to multitask.jsonl
-prompt_style="{}" # Prompt style, e.g., "<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n" or "USER: {}\n ASSISTANT:"
 projector=linear # Type of projector
 encoder_name=whisper # Name of the encoder
 llm_name=Qwen2.5-7B-Instruct # Name of the LLM
```
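The removed `prompt_style` line is a Python-style format template: the task prompt is substituted into the `{}` placeholder before being handed to the LLM. A minimal sketch of how such a template behaves, using the two example styles from the config comment (the task prompt text here is illustrative, not from the repo):

```python
# Two prompt styles from the config above; the user's task prompt replaces
# the "{}" placeholder (str.format-style templating is assumed).
qwen_chat_style = "<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n"
plain_style = "USER: {}\n ASSISTANT:"

task_prompt = "Transcribe the following speech."  # illustrative prompt

print(qwen_chat_style.format(task_prompt))
print(plain_style.format(task_prompt))
```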
````diff
@@ -86,7 +85,7 @@ For LoRA training, set (with `ckpt_path` pointing to the model saved in the prev
 ```bash
 use_peft=true
 if [[ $use_peft == "true" ]]; then
-    ckpt_path= # For DDP training, provide the path to the saved pt file; for DeepSpeed training, convert mp_rank_00_model_states.pt to model.pt using the `scripts/transcribe_deepspeed_to_pt.py` script
+    ckpt_path=
 fi
 ```
````
### DeepSpeed
````diff
@@ -113,28 +112,7 @@ When using `bf16`/`fp16` for training, deepspeed saves about 20GB of GPU memory
     }
 }
 ```
-
 Note that when using `zero-0`/`1`/`2`, the DeepSpeed model is saved in a format that requires a script to convert `mp_rank_00_model_states.pt` to `model.pt`, such as `python scripts/transcribe_deepspeed_to_pt.py mp_rank_00_model_states.pt output_dir`.
````
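The conversion that note describes — extracting the model weights from DeepSpeed's `mp_rank_00_model_states.pt` and re-saving them as `model.pt` — can be sketched as follows. This is an illustration only, not the repo's `transcribe_deepspeed_to_pt.py`: the real script presumably uses `torch.load`/`torch.save`, and plain `pickle` stands in here so the sketch stays self-contained. The `"module"` key is where DeepSpeed checkpoints conventionally keep the model state dict.

```python
import os
import pickle

def convert_deepspeed_to_pt(src_path: str, output_dir: str) -> str:
    """Sketch: load a DeepSpeed mp_rank_00_model_states.pt checkpoint,
    keep only the model weights, and write them out as model.pt."""
    with open(src_path, "rb") as f:      # real script: torch.load(src_path)
        states = pickle.load(f)
    # DeepSpeed keeps the model's state dict under the "module" key;
    # fall back to the whole dict if the layout differs.
    model_state = states.get("module", states)
    out_path = os.path.join(output_dir, "model.pt")
    with open(out_path, "wb") as f:      # real script: torch.save(...)
        pickle.dump(model_state, f)
    return out_path
```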
`examples/asr_librispeech/README.md` — 2 additions, 22 deletions
````diff
@@ -79,27 +79,7 @@ If you're interested in training with DeepSpeed, refer to the script `finetune_w
     }
 }
 ```
-Note that when using `zero-0`/`1`/`2`, the DeepSpeed model is saved in a format that requires a script to convert `mp_rank_00_model_states.pt` to `model.pt`, such as `python transcribe_deepspeed_to_pt.py mp_rank_00_model_states.pt output_dir`.
 Note that when using `zero-0`/`1`/`2`/`3`, the DeepSpeed model is saved as `pytorch_model.bin`, and you should change `++ckpt_path=$ckpt_path/model.pt` to `++ckpt_path=$ckpt_path/pytorch_model.bin` in the script to use the model during inference.
 If you use bf16/fp16 training in DeepSpeed and encounter NaN in train/eval loss, check the autocast in `src/slam_llm/utils/deepspeed_utils.py`:
 ```python
@@ -116,4 +96,4 @@ You can refer to the paper for more results.
````
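The filename switch the retained note asks for can be sketched as a small helper: prefer `pytorch_model.bin` when DeepSpeed saved it directly, otherwise fall back to the converted `model.pt`. The filenames come from the README notes; the helper itself is hypothetical, not part of the repo.

```python
import os

def pick_ckpt_file(ckpt_path: str) -> str:
    """Sketch: prefer the DeepSpeed zero-0/1/2/3 save (pytorch_model.bin);
    otherwise fall back to the DDP/converted checkpoint (model.pt)."""
    bin_path = os.path.join(ckpt_path, "pytorch_model.bin")
    if os.path.exists(bin_path):
        return bin_path
    return os.path.join(ckpt_path, "model.pt")
```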