DAVID MORENO-OLALLA, ANTONIO MIRANDA-GARCÍA
An Annotated Corpus of Middle English Scientific Prose: Aims and Features¹

1. Introduction

Tagged corpora can be said to fall roughly into two main classes: labelled and annotated ones. Some are labelled in the sense that tags are added to the items of the corpus (normally as headers to particular sections of the text) to indicate just a few very general features: information such as title, author, date and genre, and/or macrotextual information such as line and page. This permits easy identification and reference, which is particularly useful when dealing with mammoth corpora containing several million words. When this system is employed, the customary procedure is for a string of items to share the same tag. For example, the words comprising the sentence To be, or not to be, that is the question could be tagged together under a single label indicating author, title of the work, act, scene and line. The following line of the prince’s famous soliloquy, Whether ’tis nobler in the mind to suffer, would then receive a similar label. Further information regarding date of composition, dramatis persona and the like can be added to this simple tag as needed. The actual syntax of the tag may vary considerably, depending on the mark-up language used; for example, ...

¹ The present research has been funded by the Spanish Ministry of Science and Technology (grant number HUM2004–01075FILO) and by the Autonomous Government of Andalusia (grant number HUM–2609). These grants are hereby gratefully acknowledged.
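The labelling scheme described in the introduction, in which a string of items shares one tag, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Label` class, its field names, the `label_line` helper and the reference format are hypothetical and do not reproduce the tag syntax used by the authors, and the line number 56 is edition-dependent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Label:
    """Hypothetical label holding the general features named in the text."""
    author: str
    title: str
    act: int
    scene: int
    line: int

    def ref(self) -> str:
        # Illustrative reference format only, not the authors' tag syntax.
        return f"{self.author}, {self.title} {self.act}.{self.scene}.{self.line}"


def label_line(text: str, label: Label) -> list[tuple[str, Label]]:
    """Attach the same label to every whitespace-separated item of a line."""
    return [(item, label) for item in text.split()]


# Line number 56 is an assumption; lineation differs between editions.
first_label = Label("Shakespeare", "Hamlet", act=3, scene=1, line=56)
first_line = label_line("To be, or not to be, that is the question", first_label)

# The following line of the soliloquy gets the same label, one line further on.
second_label = replace(first_label, line=first_label.line + 1)
second_line = label_line("Whether 'tis nobler in the mind to suffer", second_label)

for item, label in first_line:
    print(item, "->", label.ref())
```

Every item of a line points to the same label object, which is the sense in which a labelled corpus records only very general features: nothing in this structure says anything about an individual word.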