Proceedings of the 14th Norddeutsches Linguistisches Kolloquium 2013 in Halle an der Saale
The Morphilo Toolset. Handling the Diversity of English Historical Texts
This paper introduces a new toolset for the quantitative diachronic analysis of English derivational morphology. It addresses the question of why quantitative diachronic approaches deserve more attention in today’s corpus linguistics. It also offers a way to reduce the workload that annotating historical texts naturally entails. The Morphilo toolset consists of three components: Morextractor, Morphilizer, and Morquery. Morextractor applies a reductionist logic, matching a set of affix strings against the input word by means of a simple rule set of English morphology. Since this algorithm overgeneralizes heavily, Morphilizer assists in correcting the overgeneralizations and storing the correct entries in a database. Finally, Morquery is a tool for conveniently querying the database for all common features encountered in English derivational morphology. In sum, the Morphilo tools assist in filling and querying the database. The toolset can be integrated into a web-based repository that grants other researchers in the field access and, at the same time, gives users an incentive to contribute to the data stock by uploading their own annotated texts.
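To make the idea of affix-string matching and its tendency to overgeneralize concrete, the following is a minimal sketch in Python. It is not the Morextractor implementation: the affix inventories, the minimum stem length, and the function name `extract_affixes` are assumptions introduced purely for illustration.

```python
# Hypothetical affix inventories; the actual rule set of the toolset
# is not specified in this abstract.
PREFIXES = ("un", "re", "dis", "over")
SUFFIXES = ("ness", "ment", "able", "ly", "er")

MIN_STEM = 3  # assumed minimum length of the stem left after stripping


def extract_affixes(word):
    """Greedily strip known prefix and suffix strings from a word.

    Returns a tuple (prefixes, stem, suffixes). The matching is purely
    string-based, so it overgeneralizes: any word that happens to begin
    or end with an affix string is segmented, whether or not the
    segmentation is morphologically correct.
    """
    stem = word.lower()
    prefixes, suffixes = [], []
    changed = True
    while changed:
        changed = False
        # Try to strip one prefix from the left edge.
        for p in PREFIXES:
            if stem.startswith(p) and len(stem) - len(p) >= MIN_STEM:
                prefixes.append(p)
                stem = stem[len(p):]
                changed = True
                break
        # Try to strip one suffix from the right edge.
        for s in SUFFIXES:
            if stem.endswith(s) and len(stem) - len(s) >= MIN_STEM:
                suffixes.insert(0, s)  # keep suffixes in surface order
                stem = stem[:-len(s)]
                changed = True
                break
    return prefixes, stem, suffixes


print(extract_affixes("unhappiness"))  # plausible segmentation
print(extract_affixes("under"))        # overgeneralization: "un" is not a prefix here
```

In this sketch, "under" is wrongly split into a prefix "un" and a stem "der". Errors of this kind are what a manual correction step, such as the one the abstract assigns to Morphilizer, would need to catch before entries are stored.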