xlsx-to-mods-xml conversion tool for Tibetan newspaper metadata

This repository contains two scripts for converting our project-internal metadata spreadsheet, Diverge_Tibetan Newspaper Metadata for XML v2.xlsx, to MODS-formatted XML.

Note that these scripts are for project use only and will not work outside the Diverge project, because the conversion from Excel to XML relies on hard-coded column names. Generally, columns and column names cannot be modified. New newspaper records (rows) can be added, however, and existing information can be updated. To add, alter, or delete columns, get in touch with the developers.
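
Because the column names are hard-coded, it can help to compare a spreadsheet's header row against the expected names before converting. The following is a minimal sketch of that idea, not the project's code: the names in EXPECTED_COLUMNS and the function check_columns are hypothetical placeholders, and the actual scripts may validate differently or not at all.

```python
# Hypothetical column names for illustration only; the real headers are
# defined by the Diverge spreadsheet and are not reproduced here.
EXPECTED_COLUMNS = ["title", "place_of_publication", "date_issued"]


def check_columns(header_row):
    """Return (missing, unexpected) column names for a header row."""
    # Normalise cell values: drop empty cells and surrounding whitespace.
    found = [str(name).strip() for name in header_row if name is not None]
    missing = [name for name in EXPECTED_COLUMNS if name not in found]
    unexpected = [name for name in found if name not in EXPECTED_COLUMNS]
    return missing, unexpected


if __name__ == "__main__":
    missing, unexpected = check_columns(["title", "date_issued", "notes"])
    print("missing:", missing)
    print("unexpected:", unexpected)
```

A non-empty result in either list would be a signal to contact the developers before running the conversion.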

The spreadsheet Diverge_Tibetan Newspaper Metadata for XML v2.xlsx contains the metadata for the newspapers in the Diverge corpus. It includes links to the original holdings and aggregates information from various sources, including the newspapers themselves, the catalogue entries of the libraries where original copies are held, and previous research:

Schubert, J. 1935. "Tibetische Literatur in modernem Gewande (Mit einem Exkurs über tibetische Zeitungen)." Artibus Asiae 5(1), 95–98.
Schubert, J. 1958. Publikationen des modernen chinesisch-tibetischen Schrifttums. Veröffentlichung / Deutsche Akademie der Wissenschaften, Institut für Orientforschung 39. Berlin: Akademie-Verlag.
Kolmaš, J. 1962. "Tibetan Literature in China." Archív Orientální 30, 638–44.
Kolmaš, J. 1978. Tibetan Books and Newspapers (Chinese collection): With Bibliographical Notes. Asiatische Forschungen 62. Wiesbaden: Harrassowitz.
Erhard, F.X. 2015. "Tibetan Mass Media. A Preliminary Survey of Tibetan Language Newspapers." In O. Czaja and G. Hazod (eds.) The Illuminating Mirror: Tibetan Studies in Honour of Per K. Sørensen on the Occasion of his 65th Birthday, 155–171. Contributions to Tibetan Studies 12. Wiesbaden: Reichert.
Erhard, F.X. and H. Hou 2018. "The Melong in Context. A Survey of the Earliest Tibetan Language Newspapers 1904–1960." In F. Wang-Toutain and M. Preziosi (eds.) Cahiers du Mirror, 1–40. Paris: Collège de France.

How to use

The workflow runs in two steps, one script per step. Before running them, make sure the Python scripts are stored at or above the level of the raw data and the target directory, within the same directory hierarchy. Then, on the command line, run the first script:

python flat-xml.py

You will then be prompted for the relevant file path inputs. This script generates “flat” XML for each record in the spreadsheet, where each column corresponds to a unique field in the XML.
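
To illustrate what “flat” means here, the sketch below turns one spreadsheet row into an XML record with one child element per column. It is an assumption-laden illustration rather than the code in flat-xml.py: the function row_to_flat_xml, the root element name record, and the example column names are all hypothetical.

```python
import xml.etree.ElementTree as ET


def row_to_flat_xml(row):
    """Build a flat <record> element with one child per spreadsheet column."""
    record = ET.Element("record")  # hypothetical root element name
    for column, value in row.items():
        if value is None or str(value).strip() == "":
            continue  # skip empty cells
        # XML element names cannot contain spaces, so replace them.
        field = ET.SubElement(record, column.strip().replace(" ", "_"))
        field.text = str(value)
    return record


if __name__ == "__main__":
    # Hypothetical row; real column names come from the project spreadsheet.
    example_row = {"title": "Example Newspaper", "date issued": 1925, "notes": None}
    print(ET.tostring(row_to_flat_xml(example_row), encoding="unicode"))
```

There is no nesting in the output: every column becomes a direct child of the record, which is what the second step then restructures into MODS.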

Once the “flat” XML files have been generated, run the second script, which converts the “flat” XML to MODS format in compliance with the Berlin State Library’s metadata storage standards:

python mods-from-flat-xml.py

As before, you will receive prompts to input your paths to files or folders for conversion. The MODS converter can handle single records or a directory containing only XML files.
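
The second step can be pictured as a mapping from flat fields to nested MODS elements, applied either to one file or to every XML file in a directory. The sketch below is a simplified illustration under stated assumptions, not the logic of mods-from-flat-xml.py: the flat field names title and date_issued and the helper names are hypothetical, and the Berlin State Library's requirements will involve many more elements than the two shown.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

MODS_NS = "http://www.loc.gov/mods/v3"  # MODS version 3 namespace
ET.register_namespace("mods", MODS_NS)


def flat_to_mods(flat_record):
    """Map a flat <record> element to a small MODS record."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")

    title = flat_record.findtext("title")  # hypothetical flat field name
    if title:
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title

    date = flat_record.findtext("date_issued")  # hypothetical flat field name
    if date:
        origin = ET.SubElement(mods, f"{{{MODS_NS}}}originInfo")
        ET.SubElement(origin, f"{{{MODS_NS}}}dateIssued").text = date

    return mods


def collect_xml_files(path):
    """Return the XML files to convert: one file, or all *.xml in a directory."""
    path = Path(path)
    if path.is_dir():
        return sorted(path.glob("*.xml"))
    return [path]


if __name__ == "__main__":
    flat = ET.fromstring(
        "<record><title>Example Newspaper</title><date_issued>1925</date_issued></record>"
    )
    print(ET.tostring(flat_to_mods(flat), encoding="unicode"))
```

Handling the single-file and directory cases through one helper keeps the conversion function itself unaware of where its input came from.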

StabiBerlin/xlsx-to-xml-mods-processing
