Corpus Data across Languages and Disciplines

Edited By Piotr Pezik

In recent years, corpus tools and methodologies have gained widespread recognition in various areas of theoretical and applied linguistics. Data stored in corpora is explored and exploited across languages and in disciplines as distinct as historical linguistics, language didactics, discourse analysis, machine translation and search engine development, to name but a few. This volume contains a selection of papers presented at the 8th edition of the Practical Applications in Language and Computers conference. It is aimed at helping a wide community of researchers, language professionals and practitioners keep up to date with new corpus theories and methodologies as well as language-related applications of computational tools and resources.

In Quest of a Multi-Purpose Multi-Corpus Service-Based Corpus Research Tool: Karlheinz Moerth and Matej Durco

Extract

Abstract

Over the past decades, sizeable amounts of authentic digital language data have been collected for many languages in many parts of the world. While the available data is continuously increasing in size, tools to access it and glean information from it have remained comparatively scarce. One such prototypical corpus access application has been developed at the Institute of Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences. The so-called corpusBrowser is a program that not only allows researchers to analyze the corpus at hand linguistically, but also provides navigational tools that offer random access to the texts under investigation. This paper discusses the concepts underlying the design of the central components of this freely available corpus front end and highlights recently developed additional features.

The second part of the paper outlines an enhanced requirement specification for comparable tools and presents plans to transform this standalone application into a multi-purpose corpus front end that operates in a platform-independent manner in any internet browser. After discussing the corpusBrowser application, we will present our plans for a generic web-based application, a front end for heterogeneous text resources available across different locations and domains on the Internet, and discuss the first implementation steps, which the ICLTT began working on in early 2011. This content search system is being built largely on previous ICLTT activities (Korpus C4, CLARIN CMDI and CLARIN European Demo...
