What Would Cicero Write?

by: Todd G. Cook, TGC

Format: Article
Published: Università degli Studi di Torino 2021-12-01

Description

Recent developments in Transformer language models now allow users to estimate the probability of different sentences and to predict missing words more accurately than before. This information can be used to form judgments on novel textual emendations and to further quantify existing historical editorial judgments. We examine the importance of analyzing an author’s corpus, and the impact of the Good-Turing theory of frequency estimation when predicting missing words. We also outline some of the limits of what Transformer language models can do, and how to evaluate them in practice.
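
As an illustration of the Good-Turing idea mentioned above, the following minimal Python sketch estimates how much probability mass belongs to words that never occur in a sample, using the standard estimate N1 / N (the number of word types seen exactly once, divided by the total number of tokens). The function name `good_turing_unseen_mass` and the toy token list are assumptions made for this example; they are not taken from the article, and the article's own procedure may differ.

```python
from collections import Counter


def good_turing_unseen_mass(tokens):
    """Good-Turing estimate of the total probability of unseen word types.

    Returns N1 / N, where N1 is the number of types observed exactly once
    and N is the total number of tokens in the sample.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("tokens must be non-empty")
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


# Toy sample (hypothetical, for illustration only): "o" occurs twice,
# "tempora" and "mores" once each, so N1 = 2 and N = 4.
sample = ["o", "tempora", "o", "mores"]
print(good_turing_unseen_mass(sample))  # 0.5
```

The larger this value is for an author's corpus, the more likely a missing word is one the author never used elsewhere in the surviving text, which is one reason corpus size and composition matter when predicting missing words.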