Edited by Piotr Pezik
In Quest of a Multi-Purpose Multi-Corpus Service-Based Corpus Research Tool: Karlheinz Moerth and Matej Durco
Abstract

Over the past decades, sizeable amounts of authentic digital language data have been collected for many languages in many parts of the world. While the available data are continuously increasing in size, tools to access them and glean information from them have remained comparatively scarce. One such prototypical corpus access application has been developed at the Institute of Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences. The so-called corpusBrowser is a program that not only allows researchers to analyze the corpus at hand linguistically, but also provides navigational tools that offer random access to the texts under investigation. This paper discusses the concepts underlying the design of the central components of this freely available corpus front end and highlights recently developed additional features. The second part of the paper outlines an enhanced requirement specification for comparable tools and presents plans to transform this standalone application into a multi-purpose corpus front end that operates in a platform-independent manner in any internet browser. After discussing the corpusBrowser application, we present our plans for a generic web-based application, a front end for heterogeneous text resources available across different locations and domains on the Internet, and discuss the first implementation steps, which the ICLTT began working on in early 2011. This content search system is being built largely on previous ICLTT activities (Korpus C4, CLARIN CMDI and CLARIN European Demo...