Proceedings from the CALS conference 2014
Edited By Kristina Cergol Kovačević and Sanda Lucija Udier
Corpus-based bilingual terminology extraction
The paper describes a methodology for bilingual terminology extraction and termbase building based on terminological, lexical and pragmatic criteria, along with the translator’s knowledge and experience. The research is conducted on a sentence-aligned, million-word Croatian-English parallel corpus of legislative texts, the first larger corpus designed for this language pair so far. In order to assess hybrid, statistical and linguistic approaches, as well as the tools for automatic term extraction, automatically obtained lists of term candidates are compared to a manually created reference list. Term extraction covers multi-word units and single-word units corresponding to multi-word ones. The tools used in this research are SDL Trados WinAlign (sentence alignment), SDL MultiTermExtract and WordSmith (statistically based term extraction) and NooJ (a linguistically based environment). The evaluation is reported using the statistical measures of precision, recall and F-measure. Language resources covering a specific domain speed up the translation process, reduce cost and time, and enable communication across different languages and cultures. Their application also greatly facilitates machine translation and computer-assisted translation, information retrieval, and the building of multilingual termbases, glossaries and other resources, which are increasingly in demand for the development of any less-resourced language, such as Croatian.
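The comparison of an automatically obtained candidate list with a manually created reference list can be illustrated with a minimal sketch. It is not the authors' evaluation procedure: the function names, the normalization step (lowercasing and whitespace collapsing), the exact-match criterion and the sample terms are all assumptions made for illustration, and the F-measure is taken here to be the balanced F1 score, which the abstract does not specify.

```python
def normalize(term):
    """Lowercase a term and collapse internal whitespace (an assumed matching rule)."""
    return " ".join(term.lower().split())


def evaluate_term_extraction(candidates, reference):
    """Return (precision, recall, f_measure) for a candidate list against a reference list.

    A candidate counts as correct only if its normalized form occurs in the
    normalized reference list (exact match; an assumption, not the paper's rule).
    """
    candidate_set = {normalize(t) for t in candidates}
    reference_set = {normalize(t) for t in reference}
    true_positives = len(candidate_set & reference_set)

    precision = true_positives / len(candidate_set) if candidate_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        # Balanced F-measure (F1): harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical sample data, not taken from the corpus described in the paper.
sample_candidates = ["member state", "European Union", "the following", "legal act"]
sample_reference = ["Member State", "european union", "legal act",
                    "customs duty", "internal market"]

precision, recall, f_measure = evaluate_term_extraction(sample_candidates, sample_reference)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

In this sketch three of the four candidates match the reference list, so precision is higher than recall; a tool that proposes many candidates would typically show the opposite pattern, which is why the two measures are reported together with their harmonic mean.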
A globalized environment and growing national language awareness emphasize the need for up-to-date terminology resources, which are extremely important for easy, fast and reliable communication but also for the development of any less-resourced language such as Croatian. Technological and economic development lead to the ongoing...