Vocabulary Richness of Novels and Their Adaptations


Maria Csernoch

A first-order statistical model was implemented to follow the vocabulary changes in literary works. The model only takes into account the frequency of the words of a text; it is thus independent of syntactic and semantic constraints. Using this model, a characteristic pattern was found that describes these changes in vocabulary and is unique to each text. The pattern not only identifies the vocabulary-rich segments but also enables the user to compare the vocabulary of different texts. The work focuses on comparing foreign language translations, condensation, and lemmatization of texts. It provides evidence that lemmatization alone does not alter the characteristic pattern. On the other hand, changes in this pattern in translated texts reveal how faithful translators are with respect to the vocabulary they use.
Contents: First-order statistical model – Newly introduced words – Vocabulary-rich segments of novels – Foreign language translation – Condensation – Lemmatization – Hapax legomena – The Jungle Books – The Da Vinci Code – The Adventures of Tom Sawyer – The Adventures of Robinson Crusoe – Alice’s Adventures in Wonderland – Fateless.
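
As a rough illustration of the kind of frequency-only counting the description refers to, the following Python sketch counts newly introduced words in consecutive fixed-size blocks of a text and lists its hapax legomena (words occurring exactly once). It is not the book's model: the tokenizer, the fixed block size, and the function names `tokenize`, `new_words_per_block`, and `hapax_legomena` are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of ASCII letters and apostrophes.

    This is a deliberately crude tokenizer, assumed for illustration only.
    """
    return re.findall(r"[a-z']+", text.lower())


def new_words_per_block(tokens, block_size):
    """Return, for each consecutive block of tokens, the number of distinct
    words that did not occur in any earlier block."""
    seen = set()
    counts = []
    for start in range(0, len(tokens), block_size):
        block = tokens[start:start + block_size]
        fresh = set(block) - seen  # words appearing for the first time
        counts.append(len(fresh))
        seen |= fresh
    return counts


def hapax_legomena(tokens):
    """Return the words that occur exactly once, in alphabetical order."""
    freq = Counter(tokens)
    return sorted(word for word, n in freq.items() if n == 1)


if __name__ == "__main__":
    sample = "the cat sat on the mat the dog sat on the log"
    tokens = tokenize(sample)
    print(new_words_per_block(tokens, block_size=4))
    print(hapax_legomena(tokens))
```

Because only word counts are used, blocks with an unusually high number of new words would mark candidate vocabulary-rich segments. Whether a text is lemmatized before counting is a separate preprocessing choice that this sketch leaves out.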