A first-order statistical model was implemented to track vocabulary changes in literary works. The model takes into account only the frequency of the words in a text; it is thus independent of syntactic and semantic constraints. Using this model, a characteristic pattern was found that describes these changes in vocabulary and is unique to each text. The pattern not only identifies the vocabulary-rich segments but also enables the user to compare the vocabulary of different texts. The work focuses on comparing texts with their foreign-language translations, condensed versions, and lemmatized forms. It provides evidence that lemmatization alone does not alter the characteristic pattern. Changes in this pattern in translated texts, on the other hand, reveal how faithful translators are to the original in the vocabulary they use.
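The description above does not spell out the model itself. As a rough illustration of the underlying idea (following how a text's vocabulary grows using word occurrences alone), the sketch below counts the words that appear for the first time in each fixed-length block of a text. The function name new_words_per_block, the block size, and the tokenization rule are all assumptions made for this example; they are not taken from the book.

```python
import re


def new_words_per_block(text, block_size=100):
    """Count word types that occur for the first time in each block.

    Hypothetical helper for illustration only: the text is lowercased,
    split into alphabetic tokens, and cut into consecutive blocks of
    `block_size` tokens (the last block may be shorter).
    """
    # Letters only, with an optional internal apostrophe (e.g. "don't").
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())

    seen = set()   # every word type encountered so far
    counts = []    # number of newly introduced types per block

    for start in range(0, len(tokens), block_size):
        new_in_block = 0
        for token in tokens[start:start + block_size]:
            if token not in seen:
                seen.add(token)
                new_in_block += 1
        counts.append(new_in_block)

    return counts


if __name__ == "__main__":
    sample = "the cat sat on the mat the cat ran far away now"
    print(new_words_per_block(sample, block_size=4))
```

Blocks with unusually high counts late in a text would be candidates for vocabulary-rich segments. Comparing the count sequences of an original and its translation is one simple way to set two texts side by side, although the book's own pattern may be defined differently.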
Frankfurt am Main, Berlin, Bern, Bruxelles, New York, Oxford, Wien, 2011. 197 pp., num. fig. and tables
Contents: First-order statistical model – Newly introduced words – Vocabulary-rich segments of novels – Foreign language translation – Condensation – Lemmatization – Hapax legomena – The Jungle Books – The Da Vinci Code – The Adventures of Tom Sawyer – The Adventures of Robinson Crusoe – Alice’s Adventures in Wonderland – Fateless.