2013 8th ChinaGrid Annual Conference (2011)
Dalian, Liaoning China
Aug. 22, 2011 to Aug. 23, 2011
ISBN: 978-0-7695-4472-4
pp: 60-67
ABSTRACT
Domain terms play a crucial role in many research areas, which has led to rising demand for automatic domain term extraction. In this paper, we present a two-level evaluation approach based on termhood and unithood to extract Chinese domain compound terms automatically, taking both character-level and word-level information into account. To achieve this, we incorporate semantic features by using word segmentation to recognize single-word terms, then apply an improved C-value together with heuristic measures, such as word formation pattern and word formation power, to evaluate candidates at both levels. Validation against several existing dictionaries shows a significant improvement in compound term detection. Experiments on a legal corpus show that our method is superior to the other methods compared.
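The abstract does not define the improved C-value, so the following is only a minimal sketch of the classic C-value termhood score that such a variant would presumably start from. It assumes candidates are tuples of segmented words with known corpus frequencies; the function names and the toy frequencies are hypothetical and are not taken from the paper.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous sub-sequence of a strictly longer term."""
    n, m = len(longer), len(shorter)
    return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Classic C-value for multi-word candidates (not the paper's improved variant).

    freq maps each candidate term (a tuple of words) to its corpus frequency.
    A candidate nested inside longer candidates has its frequency reduced by
    the mean frequency of those longer candidates.
    """
    scores = {}
    for term, f in freq.items():
        nesting = [f_b for b, f_b in freq.items() if contains(b, term)]
        if nesting:
            f = f - sum(nesting) / len(nesting)
        # log2 of the term length in words; single-word terms would score 0
        scores[term] = math.log2(len(term)) * f
    return scores


# Hypothetical frequencies for three legal-domain candidates.
toy_freq = {
    ("民事", "诉讼"): 10,
    ("民事", "诉讼", "法"): 4,
    ("诉讼", "法"): 6,
}
toy_scores = c_value(toy_freq)
```

Because the classic score is zero for single-word candidates, it can only rank compounds, which is consistent with the abstract handling single-word terms separately through word segmentation.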
INDEX TERMS
Domain Term Extraction, Compound Term, Chinese Word Segmentation, CCT C-value
CITATION
Jingjing Kang, Tao Liu, He Hu, Xiaoyong Du, "Discovering Chinese Compound Term Using Termhood and Unithood Measures", 2013 8th ChinaGrid Annual Conference, pp. 60-67, 2011, doi:10.1109/ChinaGrid.2011.41