Like Two Pis in a Pod: Author Similarity in the Ancient Greek Corpus
Storey, Grant Justin
One commonly recognized feature of the Ancient Greek corpus is that some later texts imitate and allude to model texts from earlier time periods, but analysis of this phenomenon is mostly done for specific author pairs based on close reading and highly visible instances of imitation. In this work, we use computational techniques to examine the similarity of a wide range of Ancient Greek authors, with a particular focus on similarity between authors writing many centuries apart. We represent texts and authors based on their usage of high-frequency words to capture author signatures rather than document topics. We propose the Jensen-Shannon Similarity metric for measuring similarity between authors and show that it outperforms other common metrics for vector comparison. We then use this similarity metric to analyze author similarity across distances in time, finding high similarity between specific authors and across the corpus that is not common to all languages. We analyze these similar author pairs more closely and find the similarity is the result of similar usage of many different words rather than just a few.
Classical literature; Computer science; Stylometry; Ancient languages; digital humanities; Ancient Greek
Rusten, Jeffrey S.
M.S., Computer Science
Master of Science
dissertation or thesis