No punctuation in result - Train more lines or preprocess data.dev.txt? #76

I followed these steps:

```
python data.py <data_dir>
python main.py <model_name> 256 0.02
cat data.dev.txt | python punctuator.py <model_path> <model_output_path>
```
I used the `europarl-v7.de-en.de` dataset and took:

```
1800 lines for ep.dev.txt
1800 lines for ep.test.txt
7200 lines for ep.train.txt
```

The input, `data.dev.txt`, is a single long line of output from Kaldi, a speech-to-text engine. It is all lowercase, contains some misrecognized words, and has no punctuation.
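
For reference, the 1800/1800/7200 line split listed above could be reproduced with something like the following. This is only a sketch: `split_corpus` is a hypothetical helper, not part of the repository, and the order in which the chunks are taken from the corpus (dev, then test, then train) is an assumption.

```python
from itertools import islice
from pathlib import Path


def split_corpus(source, out_dir, sizes):
    """Write consecutive chunks of `source` into files under `out_dir`.

    `sizes` maps an output file name to the number of lines it should get.
    Chunks are taken in the order the names appear in `sizes`.
    Returns the number of lines actually written to each file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    with open(source, encoding="utf-8") as src:
        for name, count in sizes.items():
            # islice consumes the next `count` lines from the open file
            chunk = list(islice(src, count))
            (out_dir / name).write_text("".join(chunk), encoding="utf-8")
            written[name] = len(chunk)
    return written
```

Calling `split_corpus("europarl-v7.de-en.de", "<data_dir>", {"ep.dev.txt": 1800, "ep.test.txt": 1800, "ep.train.txt": 7200})` would write the three files into `<data_dir>`; a returned count below the requested size means the corpus ran out of lines.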

I set `<model_output_path>` to `data.dev.txt`, which is the same file I pipe in as input.
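
Since the output path matches the piped input file, a quick check like the following could rule out the output overwriting the input before the punctuator runs. This is a minimal sketch that assumes nothing about how `punctuator.py` opens its output; `same_file` is a hypothetical helper, not part of the repository.

```python
import os


def same_file(input_path, output_path):
    """Return True if both paths point at the same existing file."""
    # A path that does not exist yet cannot clobber anything.
    if not (os.path.exists(input_path) and os.path.exists(output_path)):
        return False
    # samefile compares the underlying files, so relative paths,
    # absolute paths, and symlinks to one file all count as equal.
    return os.path.samefile(input_path, output_path)
```

If `same_file("data.dev.txt", "<model_output_path>")` returns `True`, writing the result to a separate file (for example a hypothetical `data.dev.punctuated.txt`) would keep the input intact.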

Is the solution to train on more lines, or do I have to preprocess `data.dev.txt`? If the latter, how?
