eCommons

 

Like Two Pis in a Pod: Author Similarity in the Ancient Greek Corpus

Abstract

One commonly recognized feature of the Ancient Greek corpus is that some later texts imitate and allude to model texts from earlier time periods, but analysis of this phenomenon is mostly limited to specific author pairs and relies on close reading and highly visible instances of imitation. In this work, we use computational techniques to examine the similarity of a wide range of Ancient Greek authors, with a particular focus on similarity between authors writing many centuries apart. We represent texts and authors by their usage of high-frequency words in order to capture author signatures rather than document topics. We propose the Jensen-Shannon Similarity metric for measuring similarity between authors and show that it outperforms other common metrics for vector comparison. We then use this similarity metric to analyze author similarity across distances in time, finding high similarity both between specific authors and across the corpus as a whole, a pattern that is not common to all languages. We analyze these similar author pairs more closely and find that the similarity results from similar usage of many different words rather than just a few.
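
The approach summarized in the abstract can be illustrated with a minimal sketch. This record does not give the thesis's exact definition of Jensen-Shannon Similarity, so the code below assumes it is one minus the base-2 Jensen-Shannon divergence between two authors' relative-frequency vectors over a shared list of high-frequency words. The function names, the tiny word list, and the token samples are all hypothetical and serve only to show the shape of the computation.

```python
import math
from collections import Counter


def word_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary word, ignoring all other tokens."""
    vocab_set = set(vocabulary)
    counts = Counter(t for t in tokens if t in vocab_set)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no vocabulary words found in tokens")
    return [counts[w] / total for w in vocabulary]


def kl_divergence(p, q):
    """Base-2 KL divergence; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_similarity(p, q):
    """ASSUMPTION: similarity = 1 - JSD(p, q) with log base 2, so it lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return 1.0 - jsd


# Hypothetical high-frequency word list and token samples (not from the thesis).
vocabulary = ["καί", "δέ", "τό", "μέν"]
author_a = ["καί", "δέ", "καί", "τό", "μέν", "καί"]
author_b = ["καί", "τό", "τό", "δέ", "καί", "μέν"]

p = word_distribution(author_a, vocabulary)
q = word_distribution(author_b, vocabulary)
print(jensen_shannon_similarity(p, q))
```

Under this assumed definition, identical distributions score 1 and distributions with no words in common score 0. Restricting the vocabulary to function words is what lets the comparison reflect authorial habit rather than subject matter.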

Date Issued

2019-05-30

Keywords

Classical literature; Computer science; Stylometry; Ancient languages; Digital humanities; Ancient Greek

Committee Chair

Mimno, David

Committee Member

Rusten, Jeffrey S.

Degree Discipline

Computer Science

Degree Name

M.S., Computer Science

Degree Level

Master of Science

Types

dissertation or thesis
