2013 8th ChinaGrid Annual Conference (2011)
Dalian, Liaoning China
Aug. 22, 2011 to Aug. 23, 2011
Domain terms play a crucial role in many research areas, which has led to a rise in demand for automatic domain term extraction. In this paper, we present a two-level evaluation approach based on termhood and unithood to extract Chinese domain compound terms automatically, taking both character-level and word-level information into account. To achieve this, we incorporate semantic features by using word segmentation to recognize single-word terms, then leverage an improved C-value together with heuristic measures such as word formation pattern and word formation power to evaluate candidates at both levels. Validating our approach against several existing dictionaries shows a significant improvement in compound term detection. Experiments on a legal corpus show that our method outperforms the compared methods.
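For readers unfamiliar with the C-value measure that the abstract builds on, the following is a minimal sketch of the classic C-value score (Frantzi et al.), not the improved variant proposed in the paper, whose details the abstract does not give. It assumes candidates are already segmented into word tuples with raw corpus frequencies, and it simplifies the original procedure by using raw rather than iteratively adjusted frequencies for the nesting terms. The names `c_value` and `contains` and the toy frequencies are hypothetical.

```python
import math
from typing import Dict, Tuple

Term = Tuple[str, ...]  # a candidate term as a tuple of segmented words


def contains(longer: Term, shorter: Term) -> bool:
    """Return True if `shorter` occurs as a contiguous sub-sequence of a strictly longer term."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq: Dict[Term, int]) -> Dict[Term, float]:
    """Classic C-value (simplified): log2(length) times nesting-adjusted frequency.

    Single-word candidates score 0 because log2(1) == 0; the classic
    measure is meant for multi-word candidates.
    """
    scores: Dict[Term, float] = {}
    for a, f_a in freq.items():
        # Frequencies of the longer candidates that contain `a`.
        nesting = [f_b for b, f_b in freq.items() if contains(b, a)]
        if nesting:
            # Discount occurrences explained by longer terms.
            adjusted = f_a - sum(nesting) / len(nesting)
        else:
            adjusted = float(f_a)
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Toy, made-up frequencies for illustration only.
toy_freq = {
    ("civil", "law"): 10,
    ("law", "court"): 6,
    ("civil", "law", "court"): 4,
}
scores = c_value(toy_freq)
```

In this toy input, the two-word candidates are discounted by the frequency of the three-word candidate that contains them, while the three-word candidate keeps its full frequency and gains from the length factor.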
Domain Term Extraction, Compound Term, Chinese Word Segmentation, CCT C-value
Jingjing Kang, Tao Liu, He Hu, Xiaoyong Du, "Discovering Chinese Compound Term Using Termhood and Unithood Measures", 2013 8th ChinaGrid Annual Conference, pp. 60-67, 2011, doi:10.1109/ChinaGrid.2011.41