Invalid translations when using the pre-trained NLLB model #778

@mmartin9684-sil

Description

Using the pre-trained NLLB model to translate a USFM file produces invalid translations: the output is an apparently random mix of Latin and Devanagari text instead of the expected 100% Devanagari output. The problem occurs with both the experiment script and the translate script.

An experiment demonstrating this problem can be found here: Nepal\WesternTamang\NLLB.1.3B.npi-WTNBT.en-WTEBT. This configuration is intended to translate a Nepali USFM file into English. Instead, the generated USFM file is mostly Devanagari text with some Latin text, but no English output. The infer subfolder contains two drafts, one created with the translate script and one created with the experiment script. Both drafts exhibit similar problems.
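To put a number on how mixed a draft is, something like the following could be run over the generated text. This is only a rough sketch and not part of the project's scripts: `script_of`, `script_counts`, and the USFM marker regex are hypothetical helpers written for illustration, and the marker regex is an approximation that only strips backslash markers such as `\v` or `\q1` so they are not counted as Latin.

```python
import re
import unicodedata
from collections import Counter

# Assumption: a rough pattern for USFM backslash markers (\v, \q1, \f*, \+w).
USFM_MARKER = re.compile(r"\\\+?[a-z]+\d*\*?")


def script_of(ch: str) -> str:
    """Coarsely classify one character as Devanagari, Latin, or Other."""
    if "\u0900" <= ch <= "\u097f":  # Devanagari Unicode block
        return "Devanagari"
    if ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN"):
        return "Latin"
    return "Other"


def script_counts(text: str) -> Counter:
    """Count characters per script, ignoring markers, spaces, digits, punctuation."""
    counts = Counter()
    for ch in USFM_MARKER.sub(" ", text):
        if ch.isspace() or ch.isdigit():
            continue
        if unicodedata.category(ch).startswith("P"):
            continue
        counts[script_of(ch)] += 1
    return counts


if __name__ == "__main__":
    sample = "\\v 1 \u0928\u092e\u0938\u094d\u0924\u0947 hello"
    print(script_counts(sample))  # Counter({'Devanagari': 6, 'Latin': 5})
```

For a Nepali-to-English draft, a high Devanagari count would flag the problem described above. Note that Latin characters alone do not confirm the output is English, so this check only detects the wrong script, not the wrong language.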

Metadata

Labels

blocker
bug (Something isn't working)
pipeline 6: infer (Issue related to using a trained model to translate.)

Projects

Status

✅ Done
