Document Type

Conference Paper

Department/Unit

Language Centre

Abstract

In this study, we propose two corpus-driven linguistic approaches to sense prediction. We will concentrate on the character similarity clustering approach and the concept similarity clustering approach to predict the senses of non-assigned words, using corpora and tools such as the Chinese Gigaword Corpus and HowNet. We will then evaluate their predictions against the sense divisions of Chinese Wordnet (CWN) and Xiandai Hanyu Cidian (Xian Han). Using these resources, we will determine the clusters of our four target words, chi1 “eat”, wan2 “play”, huan4 “change” and shao1 “burn”, in order to predict all of their possible senses and evaluate them. This will demonstrate the viability of the corpus-based approaches.

Conference Date

5-2010

Included in

Linguistics Commons
