Proceedings of the Fifth International Conference of the Society of Historical English Language and Linguistics
Edited by Michiko Ogura and Hans Sauer
This volume is a collection of papers read at the International Medieval Congress at Leeds in 2017, in two sessions organized by the Institute of English Studies at the University of London and four sessions organized by the Society of Historical English Language and Linguistics. The contributions address poetry, prose, interlinear glosses, syntax, semantics, lexicology, and medievalism, and the contributors employ a wealth of different approaches. The general theme of the IMC 2017 was ‘otherness’, and some papers fit this theme very well. Even where two researchers deal with a similar topic and arrive at different conclusions, the editors have not tried to harmonize their findings but present the papers as they are for further discussion.
6 The design and implementation of a pilot parallel corpus of Old English
Abstract: This article presents the pilot corpus on the basis of which the Parallel Corpus of Old English Prose will be compiled. Some conclusions drawn from the pilot corpus may guide the choice of sources, method, and design for the final version. The most important conclusion is that the core database has to be organised by textual form so as to enhance the retrievability of information.
This article discusses the principles that guide the design of a pilot parallel corpus of Old English and presents the preliminary version of the corpus, which is implemented on database software. The relevance of the undertaking lies in the lack of a large collection of texts with parallel translation for the study of Old English. On the theoretical side, the concept of a parallel corpus is based on Aijmer and Altenberg (1996, in McEnery and Xiao 2007), while the idea that a pilot corpus should be compiled before the final corpus draws on Biber (1993). These questions are addressed in Section 2, which reviews previous research and sets the standards of the parallel corpus on the basis of the state of the art in parallel corpus design and compilation. On the applied side, the article focuses on selecting the sources that allow for a maximal degree of information retrieval and automatisation. Two types of knowledge bases are distinguished (Section 3), lexicographical and textual, depending on whether they are lemmatised or not. The pilot corpus is...