Grammar in Cross-Linguistic Perspective

The Syntax, Semantics, and Pragmatics of Japanese and Chinese

Edited by Teruhiro Ishiguro and Kang Kwong Luke

In this collection of papers on syntax, semantics and pragmatics, linguists specialising in the study of Japanese and Chinese offer fresh ideas and insights on the theme of grammatical categories and structure from a comparative perspective. Against the background of theoretical developments in recent years and individual studies of Japanese, Chinese and English grammar, the papers in this volume are devoted to new in-depth treatments of distinctive aspects of Chinese and Japanese grammar informed by influential theoretical frameworks of the day, including cognitive grammar, construction grammar, information structure, grammaticalization theory, and linguistic typology. Topics of investigation include compounding, verb complementation, tense and aspect, as well as a range of word order phenomena, such as passive constructions, focus-fronting, and right dislocation.


WONG PING WAI: Semantic Annotation of Chinese Texts with Message Structures Based on HowNet

Extract

1. Introduction

Corpus annotation is not just a practical task of incorporating linguistic information into plain texts; it also sheds new light on the nature of language and the most effective means of analyzing it. This chapter reports on the task of using a knowledge base called HowNet to annotate Chinese texts with semantic information. The annotation method is Message Structure, which provides an effective way to analyze Chinese word senses and semantic dependency between words.

2. Corpus and Corpus Annotation

A corpus is a collection of texts, usually in an electronic form, which may be processed by computers for various purposes, such as linguistic research and information technology. A corpus is useful only if we can extract information from it. However, limited information can be retrieved directly from a raw corpus since linguistic information is always implicit in plain texts. That is why we need to make such implicit information explicit by building interpretative, linguistic information into the corpus. This process is called corpus annotation.

3. Annotated Chinese Corpora

Efforts to annotate Chinese corpora began in the 1990s. Examples include tokenized corpora, e.g., the PH Corpus (Guo 1993); part-of-speech tagged corpora, e.g., the Sinica Corpus (CKIP 1995) and the PKU corpus (Yu et al. 2003); and syntactically annotated corpora, e.g., the Sinica Treebank (Huang et al. 2000) and the Penn Chinese Treebank...
