examples/slam_aac/README.md (3 additions, 5 deletions)
@@ -54,8 +54,8 @@ You can also fine-tune the model without loading any pre-trained weights, though
### Note
-In the current version of SLAM-LLM, the `peft_ckpt` parameter is no longer required. However, if you are using the checkpoint provided by us, which was trained with an earlier version, please keep the `peft_ckpt` parameter in your configuration to ensure compatibility.
-
+- In the current version of SLAM-LLM, the `peft_ckpt` parameter is no longer required. However, if you are using the checkpoint provided by us, which was trained with an earlier version, please keep the `peft_ckpt` parameter in your configuration to ensure compatibility.
+- Due to differences in dependency versions, there may be slight variations in the performance of the SLAM-AAC model.
## Inference
To perform inference with the trained models, you can use the following commands to decode using the common beam search method:
-For improved inference results, you can use the CLAP-Refine strategy, which utilizes multiple beam search decoding. Note that this method may take longer to run, but it can provide better quality outputs. You can execute the following commands:
+For improved inference results, you can use the CLAP-Refine strategy, which utilizes multiple beam search decoding. To use this method, you need to download and use our pre-trained [CLAP](https://drive.google.com/drive/folders/1X4NYE08N-kbOy6s_Itb0wBR_3X8oZF56?usp=sharing) model. Note that CLAP-Refine may take longer to run, but it can provide better quality outputs. You can execute the following commands:
```bash
# Inference on AudioCaps (CLAP-Refine)
bash scripts/inference_audiocaps_CLAP_Refine.sh
@@ -86,5 +86,3 @@ You can refer to the paper for more results.
```
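
For readers unfamiliar with the idea behind CLAP-Refine, the following is a minimal sketch and not the code run by `scripts/inference_audiocaps_CLAP_Refine.sh`. It assumes that the captions produced by several beam search runs are re-ranked by the similarity between an audio embedding and each caption's text embedding, and that the most similar caption is kept. The function names, the toy embeddings, and the use of cosine similarity are all illustrative assumptions; in practice the embeddings would come from the pre-trained CLAP model.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_best_caption(
    audio_embedding: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
) -> str:
    """Return the caption whose text embedding is closest to the audio embedding.

    `candidates` pairs each caption (hypothetically, one per beam search run)
    with its text embedding.
    """
    if not candidates:
        raise ValueError("need at least one candidate caption")
    best_caption, _ = max(
        candidates,
        key=lambda item: cosine_similarity(audio_embedding, item[1]),
    )
    return best_caption


# Toy stand-ins for CLAP embeddings (made-up values, for illustration only).
audio_embedding = [0.9, 0.1, 0.0]
candidates = [
    ("a dog barks", [0.8, 0.2, 0.1]),
    ("music plays", [0.0, 0.1, 0.9]),
    ("a dog barks loudly", [0.9, 0.1, 0.05]),
]
best = select_best_caption(audio_embedding, candidates)
print(best)
```

Because every candidate has to be decoded and scored, a scheme like this costs roughly one extra decoding pass per beam search run, which is consistent with the note that CLAP-Refine may take longer to run.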
``` -->
-
-<!-- [CLAP](https://drive.google.com/drive/folders/1X4NYE08N-kbOy6s_Itb0wBR_3X8oZF56?usp=sharing) model for post-processing (CLAP-refine) -->