This repository accompanies our LREC-COLING 2024 paper, which follows this work.
The authorship of the Homeric poems has been a matter of debate for centuries. Computational approaches such as language modeling exist that can aid experts in making crucial headway. We observe, however, that such work has, thus far, only been carried out at the level of lengthier excerpts, but not individual verses, the level at which most suspected interpolations lie. We address this weakness by presenting a corpus of Homeric verses, each complemented with a score quantifying linguistic unexpectedness. We assess the nature of these scores by exploring their correlation with named entities and the frequency of character n-grams and words, revealing robust correlations with the latter. This apparent bias can be partly overcome by simply dividing scores for unexpectedness by the maximum term frequency per verse.
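The normalisation mentioned at the end of the abstract can be sketched as follows. This is a minimal illustration, not the code used in the paper: it assumes that "term frequency" means the corpus-wide count of a whitespace-separated token, and the function names, placeholder verses, and scores are all hypothetical.

```python
from collections import Counter


def term_counts(verses):
    """Count how often each whitespace-separated token occurs in the corpus."""
    counts = Counter()
    for verse in verses:
        counts.update(verse.split())
    return counts


def normalised_score(score, verse, counts):
    """Divide a verse's unexpectedness score by its maximum term frequency.

    Assumption: the maximum is taken over the corpus-wide counts of the
    tokens that appear in the verse.
    """
    tokens = verse.split()
    if not tokens:
        raise ValueError("verse has no tokens")
    max_tf = max(counts[token] for token in tokens)
    return score / max_tf


# Toy data: placeholder tokens and made-up scores, not values from the corpus.
verses = ["alpha beta gamma", "alpha delta", "epsilon zeta"]
raw_scores = [6.0, 4.0, 3.0]

counts = term_counts(verses)
adjusted = [normalised_score(s, v, counts) for s, v in zip(raw_scores, verses)]
print(adjusted)
```

Under this reading, a verse containing a very frequent word has its score scaled down more strongly than a verse made up only of rare words.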
The resources are available in this repository as two spreadsheets, but they can be accessed more easily with this service.
Please cite us as:
@inproceedings{holm2024,
title={{HoLM}: {A}nalysing {L}inguistic {U}nexpectedness in {H}omeric {P}oetry},
author={Pavlopoulos, John and Sandell, Ryan and Konstantinidou, Maria and Bozzone, Chiara},
booktitle={LREC-COLING},
year={2024},
}
- John Pavlopoulos
- Ryan Sandell
- Maria Konstantinidou
- Chiara Bozzone