Multidisciplinary Approaches to Multilingualism

Proceedings from the CALS conference 2014

Edited By Kristina Cergol Kovačević and Sanda Lucija Udier

This volume offers a selection of twenty papers presented at the 28th International Annual Conference of the Croatian Applied Linguistics Society, held in 2014. The authors’ reflections on Multidisciplinary Approaches to Multilingualism fall into four areas of investigation: 1) bilingual and multilingual studies focusing on research in foreign, second and lingua franca issues, 2) language policy and planning, 3) translation studies, lexis and lexical relations and 4) experimental research into language processing. The volume addresses an international audience and places a number of Croatian-based considerations onto the international applied linguistics scene.

Corpus-based bilingual terminology extraction

Abstract

The paper describes a methodology for bilingual terminology extraction and termbase building based on terminological, lexical and pragmatic criteria, along with the translator’s knowledge and experience. The research is conducted on a sentence-aligned, one-million-word Croatian-English parallel corpus of legislative texts, the first larger corpus designed for this language pair so far. In order to assess hybrid, statistical and linguistic approaches, as well as the tools for automatic term extraction, automatically obtained lists of term candidates are compared to a manually created reference list. Term extraction includes multi-word units and single-word units corresponding to multi-word ones. The tools used in this research are SDL Trados WinAlign (sentence alignment), SDL MultiTerm Extract and WordSmith (statistically based term extraction) and NooJ (linguistically based environment). The evaluation is reported using the statistical measures of precision, recall and F-measure. Language resources covering a specific domain speed up the translation process, reduce cost and time, and enable communication across different languages and cultures. Their application also greatly facilitates machine translation and computer-assisted translation, information retrieval, and the building of multilingual termbases, glossaries and other resources that are increasingly in demand for the development of any less-resourced language, such as Croatian.
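As an illustration of the evaluation step, the following minimal sketch shows how an automatically obtained list of term candidates could be scored against a manually created reference list using precision, recall and F-measure. It is not the authors' implementation: the function name `evaluate_term_extraction`, the case-insensitive exact matching, the balanced F-measure (F1) and the sample terms are all assumptions made for the example.

```python
def evaluate_term_extraction(candidates, reference):
    """Score extracted term candidates against a reference term list.

    Assumption: a candidate counts as correct only if it matches a
    reference term exactly after trimming whitespace and lowercasing.
    Returns a (precision, recall, f_measure) tuple.
    """
    cand = {c.strip().lower() for c in candidates}
    ref = {r.strip().lower() for r in reference}

    true_positives = len(cand & ref)

    # Precision: share of extracted candidates that are real terms.
    precision = true_positives / len(cand) if cand else 0.0
    # Recall: share of reference terms that were actually extracted.
    recall = true_positives / len(ref) if ref else 0.0

    # Balanced F-measure (harmonic mean of precision and recall).
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data, invented for illustration only.
extracted = ["member state", "legal act", "European Union", "the"]
manual_reference = [
    "member state", "European Union", "internal market",
    "legal act", "customs duty",
]

p, r, f = evaluate_term_extraction(extracted, manual_reference)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

With these invented lists, three of the four candidates appear in the reference list of five terms, so precision is 0.75, recall is 0.6 and the F-measure is about 0.667. Stricter or looser matching (for example, lemmatized comparison) would change the scores, which is one reason the matching rule should be stated alongside any reported figures.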

1 Introduction

A globalized environment and growing national language awareness emphasize the need to create up-to-date terminology resources, which are extremely important for easy, fast and reliable communication and also for the development of any less-resourced language such as Croatian. Technological and economic development lead to the ongoing...
