Create draft files using the target project's encoding #859

@mmartin9684-sil

Description

When drafts are created with the silnlp experiment/translate scripts, they are saved as utf-8, utf-8-sig, or with the source project's Encoding setting. However, if a draft is being created for import into a target project that uses a different Encoding, it needs to be saved with the target project's encoding. If the target project uses "cp1250" or "cp1252", for example, a draft saved as "utf-8" or "utf-8-sig" may not import correctly into the Paratext project.

For instance, the MAO project is configured in Paratext to use "1250" (Central European) encoding. When silnlp creates a draft for it with "utf-8" encoding, the Latin Extended and Latin Extended-A characters are not displayed correctly in Paratext. If the text is then corrected in Paratext to use one of these characters, it is not saved correctly after editing.
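
A minimal sketch of the requested behavior, not the silnlp implementation. It assumes the target project's Encoding setting is available as a Windows code page number string (such as "1250", with "65001" meaning UTF-8). The helper names `codec_for_code_page`, `encode_draft`, and `write_draft` are hypothetical and do not exist in silnlp.

```python
import codecs
from pathlib import Path


def codec_for_code_page(code_page: str) -> str:
    """Map a code page number (e.g. "1250") to a Python codec name.

    Assumption: the target project's Encoding setting is a Windows code
    page number, where 65001 is UTF-8.
    """
    code_page = code_page.strip()
    name = "utf-8" if code_page == "65001" else f"cp{code_page}"
    # codecs.lookup raises LookupError if Python has no such codec.
    return codecs.lookup(name).name


def encode_draft(text: str, code_page: str) -> bytes:
    """Encode draft text using the target project's code page."""
    # errors="strict" fails loudly on characters the target encoding
    # cannot represent, instead of silently corrupting the draft.
    return text.encode(codec_for_code_page(code_page), errors="strict")


def write_draft(path: Path, text: str, code_page: str) -> None:
    """Write the draft as bytes so no newline translation is applied."""
    path.write_bytes(encode_draft(text, code_page))


# The same character is stored differently in the two encodings, which
# is why a utf-8 draft is misread by a project configured for cp1250.
utf8_bytes = "š".encode("utf-8")
cp1250_bytes = encode_draft("š", "1250")
misread = utf8_bytes.decode("cp1250")  # what a cp1250 reader would see
```

Using strict error handling is a design choice for this sketch: a draft containing characters outside the target code page would raise `UnicodeEncodeError` rather than being written with substituted characters.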

Metadata

Assignees

Labels

bug: Something isn't working
pipeline 6: infer: Issue related to using a trained model to translate.

Type

Projects

Status

🔖 Ready

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests