Commit 8ab989d

Revert "update README"
This reverts commit 41d6185.
1 parent 41d6185 commit 8ab989d

3 files changed: +4 -372 lines changed

README.md

Lines changed: 4 additions & 45 deletions
@@ -28,8 +28,7 @@ developers to train custom multimodal large language model (MLLM), focusing on <
 6. [Citation](#citation)
 
 # News
-- [Update Nov. 5, 2024] Recipes for [speech emotion captioning (SEC)](examples/sec_emotioncaps/README.md) with [emotion2vec](https://github.com/ddlBoJack/emotion2vec) as the encoder has been supported.
-- [Update Oct. 12, 2024] Recipes for [SLAM-AAC](examples/slam_aac/README.md) with [EAT](https://github.com/cwx-worst-one/EAT) as the encoder have been supported.
+- [Update Oct. 12, 2024] Recipes for [SLAM-AAC](examples/slam_aac/README.md) have been supported.
 - [Update Sep. 28, 2024] Recipes for [CoT-ST](examples/st_covost2/README.md) have been supported.
 - [Update Sep. 25, 2024] Recipes for [DRCap](examples/drcap_zeroshot_aac/README.md) have been supported.
 - [Update Jun. 12, 2024] Recipes for [MaLa-ASR](examples/mala_asr_slidespeech/README.md) have been supported.
@@ -91,7 +90,6 @@ We provide reference implementations of various LLM-based speech, audio, and mus
 
 - Text-to-Speech (TTS)
 - [VALL-E-X](examples/vallex/README.md)
-- [Speech Emotion Captioning (SEC)](examples/sec_emotioncaps/README.md)
 
 - **Audio Task**
 - [Automated Audio Captioning (AAC)](examples/aac_audiocaps/README.md)
@@ -120,10 +118,7 @@ command-line (shell file) > Hydra configuration (yaml file) > dataclass configur
 - We borrow code from [Fairseq](https://github.com/facebookresearch/fairseq) for deepspeed configuration.
 - We thank the contributors for providing diverse recipes.
 
-# Citation
-
-## Speech Task
-
+## Citation
 SLAM-ASR:
 ```
 @article{ma2024embarrassingly,
@@ -133,27 +128,7 @@ SLAM-ASR:
 year={2024}
 }
 ```
-Mala-ASR:
-```
-@article{yang2024mala,
-title={MaLa-ASR: Multimedia-Assisted LLM-Based ASR},
-author={Yang, Guanrou and Ma, Ziyang and Yu, Fan and Gao, Zhifu and Zhang, Shiliang and Chen, Xie},
-journal={Proc. INTERSPEECH},
-year={2024}
-}
-```
-CoT-ST:
-```
-@article{du2024cot,
-title={CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought},
-author={Du, Yexing and Ma, Ziyang and Yang, Yifan and Deng, Keqi and Chen, Xie and Yang, Bo and Xiang, Yang and Liu, Ming and Qin, Bing},
-journal={arXiv preprint arXiv:2409.19510},
-year={2024}
-}
-```
 
-
-## Audio Task
 SLAM-AAC:
 ```
 @article{chen2024slam,
@@ -163,21 +138,5 @@ SLAM-AAC:
 year={2024}
 }
 ```
-DRCap:
-```
-@article{li2024drcap,
-title={DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning},
-author={Li, Xiquan and Chen, Wenxi and Ma, Ziyang and Xu, Xuenan and Liang, Yuzhe and Zheng, Zhisheng and Kong, Qiuqiang and Chen, Xie},
-journal={arXiv preprint arXiv:2410.09472},
-year={2024}
-}
-```
-BAT:
-```
-@article{zheng2024bat,
-title={BAT: Learning to Reason about Spatial Sounds with Large Language Models},
-author={Zheng, Zhisheng and Peng, Puyuan and Ma, Ziyang and Chen, Xie and Choi, Eunsol and Harwath, David},
-journal={Proc. ICML},
-year={2024}
-}
-```
+
+

examples/s2s/scripts/utils/data_cleaning_cn.py

Lines changed: 0 additions & 257 deletions
This file was deleted.