Specialisation and Variation in Language Corpora

Edited By Ana Diaz-Negrillo and Francisco Javier Diaz-Pérez

Corpus linguistics began with the compilation and exploitation of native English reference corpora. In recent years, the field has expanded and specialised to such an extent that a variety of languages, registers, text types and speakers are now represented in language corpora. This volume documents that expansion. It focuses on emerging types of corpora and corpus techniques, and also presents corpus-based studies in areas that have benefited from recent developments in corpus linguistics methods and techniques, including foreign language teaching, language acquisition, translation and terminology, dialectology, lexicography and language variation. The volume comprises 11 papers on technical aspects of corpus data processing, on corpus-based linguistic research, and on emerging corpora. It is structured in three main sections, one for each of these three areas.
Conjunctive relations across languages, registers and modes: semi-automatic extraction and annotation: Ekaterina Lapshinova-Koltunski, Kerstin Kunz

Abstract

In this paper, we analyse conjunctions as intra- and intersentential links, which play an important role in text organisation. Our research goal is to explore a broad range of cohesive conjunctive relations across languages, registers and modes of discourse (spoken vs. written). We aim to analyse how the resources that the English and German language systems provide for establishing cohesive conjunctive relations are instantiated in naturally occurring English and German texts. More specifically, we intend to explore contrasts between the two languages in form, frequency, function and relation. For this purpose, we develop semi-automatic extraction and annotation procedures. In a first step, we extract instances of selected conjunction types on the basis of available lexico-grammatical knowledge (lexical lists, context restrictions, etc.). In a second step, we annotate the extracted results with the types that they instantiate (e.g. semantic: additive, adversative, causal, etc., or syntactic: connects, subjuncts and adverbs). This methodology facilitates our corpus-based contrastive analyses, as it provides easy access to information on comparable types of cohesive conjunctions for both languages under investigation. We can then query our corpus for different aspects or properties of conjunctive relations encoded as abstract categories, on both a general and a more fine-grained level, or in a specific context. The frequency lists extracted are used to analyse the distribution of conjunctive relations...
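
The two-step procedure described in the abstract (lexicon-based extraction followed by type annotation) might look roughly like the following minimal Python sketch. It is not the authors' implementation: the tiny `LEXICON`, the function names and the example sentence are all hypothetical, and the context restrictions mentioned in the abstract are left out, so ambiguous forms are simply taken at face value.

```python
import re
from collections import Counter

# Hypothetical toy lexical lists (form -> semantic type), for illustration only.
# A real study would use far larger, linguistically motivated lists.
LEXICON = {
    "en": {"and": "additive", "but": "adversative", "however": "adversative",
           "because": "causal", "therefore": "causal"},
    "de": {"und": "additive", "aber": "adversative", "jedoch": "adversative",
           "weil": "causal", "deshalb": "causal"},
}


def extract(text, lang):
    """Step 1: find every whole-word match of a listed form in the text."""
    forms = LEXICON[lang]
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


def annotate(hits, lang):
    """Step 2: label each extracted form with the semantic type it instantiates."""
    forms = LEXICON[lang]
    return [
        {"offset": offset, "form": form, "type": forms[form.lower()]}
        for offset, form in hits
    ]


def type_frequencies(annotations):
    """Build a frequency list of semantic types from the annotations."""
    return Counter(a["type"] for a in annotations)


if __name__ == "__main__":
    sample = ("It rained, but we went out because we had to. "
              "However, we came back early and slept.")
    annotations = annotate(extract(sample, "en"), "en")
    for a in annotations:
        print(a)
    print(type_frequencies(annotations))
```

Keeping extraction and annotation as separate functions mirrors the two steps named in the abstract, and leaves room for a manual correction pass between them, which is what would make such a procedure semi-automatic rather than fully automatic.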
