Description
(From Michael)
Serval and silnlp rely on different implementations of the key terms logic. Differences have been identified in a number of areas, including:
- Pairing KT glosses (from Paratext datasets) and project-specific KT renderings
- Adding key term pairs to the training data (silnlp adds each KT twice; Serval adds each KT once).
- Extracting key term renderings from the project data.
- Supported gloss languages (Serval supports English, Spanish, French, Indonesian and Portuguese, while silnlp does not support Portuguese).
A single shared implementation would simplify maintenance and enhancement of this feature as well as user support.
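To make the pairing and duplication differences concrete, here is a minimal sketch. It does not reproduce either code base: the function name `pair_key_terms`, the `copies` parameter, the term-ID keyed dictionaries, and the (gloss, rendering) ordering are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def pair_key_terms(
    glosses: Dict[str, List[str]],
    renderings: Dict[str, List[str]],
    copies: int = 1,
) -> List[Tuple[str, str]]:
    """Pair each gloss with each project rendering that shares a term ID.

    Hypothetical helper: `copies` models how many times each pair is added
    to the training data (1 for Serval-like behavior, 2 for silnlp-like
    behavior, per the description above).
    """
    pairs: List[Tuple[str, str]] = []
    for term_id, term_renderings in renderings.items():
        # Terms with no gloss in the chosen gloss language produce no pairs.
        for gloss in glosses.get(term_id, []):
            for rendering in term_renderings:
                pairs.extend([(gloss, rendering)] * copies)
    return pairs


# Placeholder data, not taken from any real project.
glosses = {"TERM_1": ["gloss_a"], "TERM_2": ["gloss_b"]}
renderings = {"TERM_1": ["rendering_a"], "TERM_3": ["rendering_c"]}

once = pair_key_terms(glosses, renderings, copies=1)
twice = pair_key_terms(glosses, renderings, copies=2)
```

Only `TERM_1` has both a gloss and a rendering, so it is the only term that contributes pairs. A shared implementation would need to settle on one value of `copies`, or expose it as an option.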
The most natural place for a common implementation is Machine. This will require updates to machine.py, Machine, silnlp, and Serval. Broadly, we need to:
- Expand machine.py's key term handling (extraction and alignment) so that it is consistent with silnlp's current behavior and, where needed, as flexible.
- Replace silnlp's key term handling with machine.py's
- Port machine.py updates to Machine (C#)
- Make updates to Serval and machine.py's build_job classes as needed to use those updates
We should establish a small collection of key term scenarios and projects that we can use to test consistency throughout this process.
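One way such a consistency check could be organized is sketched below. The names `compare_extractors`, `Extractor`, and the fixture identifiers are hypothetical, and the two extractors are stand-ins; in practice they would wrap the silnlp and machine.py key term code paths.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, str]
# Assumed interface: an extractor maps a fixture/project name to key term pairs.
Extractor = Callable[[str], Iterable[Pair]]


def compare_extractors(
    fixtures: List[str],
    reference: Extractor,
    candidate: Extractor,
) -> Dict[str, Dict[str, Counter]]:
    """Report, per fixture, the pairs on which two implementations disagree.

    Pairs are compared as multisets so that a pair added twice by one
    implementation and once by the other is reported as a difference.
    """
    report: Dict[str, Dict[str, Counter]] = {}
    for fixture in fixtures:
        ref = Counter(reference(fixture))
        cand = Counter(candidate(fixture))
        if ref != cand:
            report[fixture] = {
                "missing": ref - cand,  # in reference, absent from candidate
                "extra": cand - ref,    # in candidate, absent from reference
            }
    return report


# Stand-in extractors with placeholder data.
def reference_impl(fixture: str) -> List[Pair]:
    return [("gloss_a", "rendering_a"), ("gloss_a", "rendering_a")]


def candidate_impl(fixture: str) -> List[Pair]:
    return [("gloss_a", "rendering_a")]


report = compare_extractors(["fixture_1"], reference_impl, candidate_impl)
```

Comparing multisets rather than sets keeps the "twice versus once" difference visible; an empty report means the two implementations agree on every fixture.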