Proceedings of the 14th Norddeutsches Linguistisches Kolloquium 2013 in Halle an der Saale
Edited by Anne Ammermann, Alexander Brock, Jana Pflaeging and Peter Schildhauer
The Morphilo Toolset. Handling the Diversity of English Historical Texts
This paper introduces a new toolset for the quantitative diachronic analysis of English derivational morphology. It addresses the question of why quantitative diachronic approaches deserve more attention in today’s corpus linguistics. It also offers a way to reduce the workload that annotating historical texts entails. The Morphilo toolset consists of three components: Morextractor, Morphilizer, and Morquery. Morextractor applies a reductionist logic that matches a set of affix strings against the given input word, using a simple rule set of English morphology. Since this algorithm overgeneralizes heavily, Morphilizer assists in correcting the overgeneralizations and storing the correct entries in a database. Finally, Morquery is a tool for conveniently querying the database for all common features encountered in English derivational morphology. In sum, the Morphilo tools assist in filling and querying the database. The toolset can be integrated into a web-based repository that grants access to other researchers in the field and, at the same time, gives users an incentive to contribute to the data stock by uploading their own annotated texts.
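The affix-matching step described for Morextractor can be illustrated with a minimal sketch. It is not the toolset's actual implementation: the affix inventories, the function name extract_affixes, and the min_stem parameter are assumptions made for illustration only, since the abstract does not specify the rule set.

```python
# Hypothetical affix inventories; the toolset's real lists are not given in the abstract.
PREFIXES = ("un", "re", "dis", "over")
SUFFIXES = ("ness", "less", "ment", "able", "ly", "er")


def extract_affixes(word, min_stem=3):
    """Greedily strip known prefix and suffix strings from a word.

    Returns a tuple (prefixes, stem, suffixes). The matching is purely
    string-based, so it overgeneralizes by design.
    """
    stem = word.lower()
    prefixes, suffixes = [], []
    changed = True
    while changed:
        changed = False
        # Strip at most one prefix per pass, keeping a minimal stem length.
        for p in PREFIXES:
            if stem.startswith(p) and len(stem) - len(p) >= min_stem:
                prefixes.append(p)
                stem = stem[len(p):]
                changed = True
                break
        # Strip at most one suffix per pass, keeping a minimal stem length.
        for s in SUFFIXES:
            if stem.endswith(s) and len(stem) - len(s) >= min_stem:
                suffixes.insert(0, s)
                stem = stem[:-len(s)]
                changed = True
                break
    return prefixes, stem, suffixes


print(extract_affixes("unhappiness"))  # (['un'], 'happi', ['ness'])
print(extract_affixes("under"))        # (['un'], 'der', [])
```

The second call shows the kind of overgeneralization the abstract mentions: "under" is wrongly split into a prefix "un" and a stem "der". Such false analyses are what a manual correction step, like the one attributed to Morphilizer, would need to catch before entries are stored.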
Diachronic resources for quantitative corpus analysis suffer from a major weakness of today’s data structures: they lack an adequate representation of the dynamics of time. Most of the work in quantitative historical linguistics has been carried out in the phylogenetic paradigm in typological research (cf. e.g. Cusouw 2013; Steiner et al. 2011; Wichmann & Saunders 2007) or on a...