Data format for MaLa-ASR (#130)

### System Info

torch 2.1


### Information

- [X] The official example scripts
- [ ] My own modified scripts

### 🐛 Describe the bug

```shell
bash decode_MaLa-ASR_withkeywords_L95.sh
```
Hi, I'm trying to reproduce the MaLa-ASR results and have downloaded the SlideSpeech dataset from https://www.openslr.org/144/. The provided decoding script expects data at `/nfs/yangguanrou.ygr/slidespeech/${split}_oracle_v1/`, which I cannot find in the downloaded data. Could you please clarify what format this data is in? Do I need to preprocess the download in some specific way, such as splitting the audio into segments based on the timestamps?
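
To make the question concrete, here is a minimal sketch of the kind of timestamp-based splitting I have in mind. It is my own guess, not something taken from the repository: it assumes the recordings have already been converted to PCM WAV and that segment boundaries are given in seconds. The helper name `cut_segment` and its arguments are hypothetical.

```python
import wave


def cut_segment(src_path, dst_path, start_sec, end_sec):
    """Copy the [start_sec, end_sec) span of a PCM WAV file into dst_path.

    Assumption: timestamps are in seconds relative to the start of the
    recording. Returns the duration of the written segment in seconds.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        start = int(start_sec * rate)
        # Clamp the end so a timestamp past the end of the file is tolerated.
        end = min(int(end_sec * rate), src.getnframes())
        if start < 0 or start >= end:
            raise ValueError("empty or invalid segment")
        src.setpos(start)
        frames = src.readframes(end - start)
        params = src.getparams()

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width, and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return (end - start) / rate
```

If the expected `${split}_oracle_v1` data is produced differently (for example as feature files or a manifest rather than per-segment audio), a pointer to the right procedure would be very helpful.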


### Error logs

no file named test_oracle_v1

### Expected behavior

Could you please provide the steps for data processing and explain the format of the data? Thanks, looking forward to your reply.
