Commit a294995

update main readme
1 parent 4830fe1 commit a294995

1 file changed: +7 -7 lines changed

README.md

Lines changed: 7 additions & 7 deletions
@@ -28,7 +28,7 @@ developers to train custom multimodal large language model (MLLM), focusing on <
 6. [Citation](#citation)
 
 # News
-- [Update Jan. 22, 2025] 🔥🔥🔥 Full reproduction for [SLAM-Omni](examples/s2s/README.md) has been supported.
+- [Update Jan. 22, 2025] 🔥🔥🔥 Full reproduction (including all data preparation, model training, and inference) for [SLAM-Omni](examples/s2s/README.md) has been supported.
 ![](docs/slam-omni-model.png)
 - SLAM-Omni is a **timbre-controllable** voice interaction system that requires only **single-stage training** and minimal resources to achieve high-quality, end-to-end speech dialogue, supporting multi-turn conversations in both Chinese and English. ([paper](https://arxiv.org/abs/2412.15649), [demo](https://slam-omni.github.io))
 - We have fully reproduced the **training and inference** processes of SLAM-Omni and open-sourced all related training datasets. The provided code framework theoretically supports all codec-based spoken dialogue models. Additionally, we offer the reproduction code for [Mini-Omni](https://github.com/gpt-omni/mini-omni).
@@ -196,20 +196,20 @@ SLAM-Omni:
 ## Audio Task
 SLAM-AAC:
 ```
-@article{chen2024slam,
+@article{chen2025slam,
 title={SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs},
 author={Chen, Wenxi and Ma, Ziyang and Li, Xiquan and Xu, Xuenan and Liang, Yuzhe and Zheng, Zhisheng and Yu, Kai and Chen, Xie},
-journal={arXiv preprint arXiv:2410.09503},
-year={2024}
+journal={Proc. ICASSP},
+year={2025}
 }
 ```
 DRCap:
 ```
-@article{li2024drcap,
+@article{li2025drcap,
 title={DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning},
 author={Li, Xiquan and Chen, Wenxi and Ma, Ziyang and Xu, Xuenan and Liang, Yuzhe and Zheng, Zhisheng and Kong, Qiuqiang and Chen, Xie},
-journal={arXiv preprint arXiv:2410.09472},
-year={2024}
+journal={Proc. ICASSP},
+year={2025}
 }
 ```
 BAT:
