Why can't SentencePieceTokenizer save a vocab file? #282

@Codle

I want to use a vocab file in PairedDataloader, but the save_vocab function of SentencePieceTokenizer only saves the model file.

The model file can't be loaded by the Dataloader because of a decoding error.

In sentencepiece_tokenizer.py, I saw that you delete the vocab file.
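
One possible workaround, not confirmed by the maintainers, is to regenerate a plain-text vocab file from the trained SentencePiece model. The sketch below assumes the data loader accepts a UTF-8 file with one token per line, which this issue does not confirm. `export_vocab` is a hypothetical helper name, and `processor` is assumed to be any object exposing `get_piece_size()` and `id_to_piece()`, as `sentencepiece.SentencePieceProcessor` does.

```python
def export_vocab(processor, vocab_path):
    """Write every piece known to `processor` to `vocab_path`, one per line.

    `processor` is assumed to expose get_piece_size() and id_to_piece(),
    as sentencepiece.SentencePieceProcessor does. Pieces are written in
    id order, so the line index matches the piece id.
    Returns the number of pieces written.
    """
    size = processor.get_piece_size()
    with open(vocab_path, "w", encoding="utf-8") as f:
        for piece_id in range(size):
            f.write(processor.id_to_piece(piece_id) + "\n")
    return size
```

With the `sentencepiece` package installed, a loaded `SentencePieceProcessor` can be passed in as `processor`. The exported list also contains SentencePiece's own special pieces, which may need to be filtered out if the data loader adds its own special tokens.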

Labels: question (Further information is requested), topic: data (Issue about data loader modules)