Allow the onboarding script to perform language identification #782

@benjaminking

Description

At a lower priority than all the other sub-tasks, the onboarding script should perform language identification at both the project and book level for each project. This is to help catch common configuration errors that exist in Paratext projects:

  • language codes not set correctly, perhaps due to copying the settings from a different project
  • different languages used for different books

We don't expect the language identifier to know the minority languages, but it should output "unknown" for a language it doesn't recognize rather than reporting the closest language that it does know.
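A minimal sketch of how the check could be structured, assuming the identifier is wrapped as a function that returns a language code and a confidence score. The names `label_text`, `check_project`, `Identifier`, and the `min_confidence` threshold are hypothetical and not part of the existing onboarding script. The sketch also assumes the configured code and the identifier's codes use the same scheme.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: any language-ID model would be wrapped so that it
# takes text and returns (language_code, confidence between 0 and 1).
Identifier = Callable[[str], Tuple[str, float]]

UNKNOWN = "unknown"


def label_text(text: str, identify: Identifier, min_confidence: float = 0.8) -> str:
    """Return the identified code, or UNKNOWN when confidence is too low.

    Thresholding is what keeps the check from reporting the closest known
    language for text in a language the model has never seen.
    """
    if not text.strip():
        return UNKNOWN
    code, confidence = identify(text)
    return code if confidence >= min_confidence else UNKNOWN


def check_project(
    books: Dict[str, str],
    configured_code: str,
    identify: Identifier,
    min_confidence: float = 0.8,
) -> Dict[str, object]:
    """Identify the language per book and for the project as a whole."""
    per_book = {
        book: label_text(text, identify, min_confidence)
        for book, text in books.items()
    }
    # Project-level label: run the identifier over all book text together.
    project_label = label_text("\n".join(books.values()), identify, min_confidence)

    warnings: List[str] = []
    # Case 1: language code not set correctly in the project settings.
    if project_label != UNKNOWN and project_label != configured_code:
        warnings.append(
            f"configured language '{configured_code}' but text looks like '{project_label}'"
        )
    # Case 2: different languages used for different books.
    known = sorted({label for label in per_book.values() if label != UNKNOWN})
    if len(known) > 1:
        warnings.append(f"books appear to use different languages: {', '.join(known)}")

    return {"project": project_label, "books": per_book, "warnings": warnings}
```

An "unknown" result never triggers a warning on its own here, since minority languages are expected to be unrecognized. The threshold value of 0.8 is a placeholder and would need tuning against whichever identifier is chosen.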

Metadata

Labels

enhancement (New feature or request)
pipeline 2: extract (Issue related to extracting parallel corpora)

Projects

Status

🆕 New
