This repository contains the TEI data that represents the Bodleian Library's catalogue of manuscripts from the Middle Ages, Medieval Manuscripts in Oxford Libraries.
It also contains several scripts and tools for processing this data into a Solr instance for use with our Blacklight search service.
For some additional information see the Wiki.
For the TEI schema and guidelines, see the msDesc repository.
For information on the collections themselves, see the LibGuide.
The Python scripts in this project use uv for dependency management.
XML model validation uses the same Java validator as the GitHub Action:
    sh processing/validate_xml.sh
    sh processing/validate_xml.sh collections/e_Mus/MS_e_Mus_229.xml

In VS Code, the Validate Current XML File task validates the active editor file and surfaces matching diagnostics in the Problems panel.
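To validate a handful of files in one go, the single-file invocation of `processing/validate_xml.sh` could be wrapped in a short Python loop. This is a minimal sketch, not part of the repository: the names `build_validate_command` and `validate_files` are hypothetical, and it assumes the script exits with a non-zero status when a file fails validation, which is not confirmed here.

```python
import subprocess
from pathlib import Path

# Script path as used in the validation commands in this README.
VALIDATE_SCRIPT = "processing/validate_xml.sh"


def build_validate_command(xml_path):
    """Return the argument list for validating a single XML file."""
    return ["sh", VALIDATE_SCRIPT, str(xml_path)]


def validate_files(xml_paths, runner=subprocess.run):
    """Run the validator on each file and return the paths that failed.

    Assumption: a non-zero exit status means the file failed validation.
    """
    failed = []
    for path in xml_paths:
        result = runner(build_validate_command(path))
        if result.returncode != 0:
            failed.append(Path(path))
    return failed


if __name__ == "__main__":
    # Example: check every record in one collection directory.
    files = sorted(Path("collections/e_Mus").glob("*.xml"))
    for path in validate_files(files):
        print(f"failed: {path}")
```

The `runner` parameter exists only so the loop can be exercised without invoking the Java validator; in normal use it defaults to `subprocess.run`.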
To check entity key consistency locally, run:
    uv run python processing/check_entity_keys.py -d collections

The shared authority tooling is intended to come from the public msDesc/tei-msdesc-authorities repository via uv.
To install development dependencies, including pytest, run:
    uv sync --dev

To run the Python regression tests, use:

    uv run pytest