2009 WRI World Congress on Computer Science and Information Engineering
Automatic Term Recognition Based on Data-Mining Techniques
Los Angeles, California USA
March 31-April 02
ISBN: 978-0-7695-3507-4
We present a new method for automatic term extraction based on training datasets that are used to build inductive models for term identification. Existing approaches employ simple statistical and linguistic rules that are designed ad hoc and cannot exploit complex relations between linguistic units. In contrast, our method does not require such manually specified extraction rules. The data for our research are taken from the Czech National Corpus, which is lemmatised and morphologically tagged. Statistical information (frequency, distribution, etc.) is generated automatically, so the only expert contribution needed is labelling terms in the training dataset. The data-mining software then creates models that perform the extraction without any further human input. Additionally, feature ranking can serve as a valuable aid for understanding the extraction process, for its future development, and for terminology research.
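The workflow summarised above (automatically computed statistics, expert-labelled training items, an induced model, and feature ranking) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy documents, the labels, the two feature names `frequency` and `doc_share`, and the use of information gain with a one-split decision stump. The abstract does not specify the actual features, software, or model type used by the authors.

```python
from math import log2

# Hypothetical toy corpus: each document is a list of lemmas.
# This is NOT data from the Czech National Corpus.
docs = [
    ["corpus", "corpus", "lemma", "the", "be"],
    ["the", "be", "house"],
    ["the", "be", "house", "dog"],
    ["the", "be", "dog", "house"],
]

# Hypothetical expert labels for the training set: True = term.
labels = {"corpus": True, "lemma": True, "the": False,
          "be": False, "house": False, "dog": False}


def features(lemma):
    """Statistics generated automatically from the corpus (assumed feature set)."""
    frequency = sum(doc.count(lemma) for doc in docs)
    doc_share = sum(lemma in doc for doc in docs) / len(docs)
    return {"frequency": frequency, "doc_share": doc_share}


def entropy(ys):
    """Binary entropy of a list of True/False labels."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def best_split(name, rows):
    """Return (information gain, threshold) of the best 'value <= threshold' split."""
    ys = [y for _, y in rows]
    base = entropy(ys)
    best = (0.0, None)
    values = sorted({x[name] for x, _ in rows})
    for t in values[:-1]:
        left = [y for x, y in rows if x[name] <= t]
        right = [y for x, y in rows if x[name] > t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        if base - remainder > best[0]:
            best = (base - remainder, t)
    return best


rows = [(features(lemma), y) for lemma, y in labels.items()]

# Feature ranking: order features by the information gain of their best split.
ranking = sorted(((best_split(n, rows)[0], n) for n in ("frequency", "doc_share")),
                 reverse=True)

# Induced model: a decision stump on the top-ranked feature.
top = ranking[0][1]
_, threshold = best_split(top, rows)
left_labels = [y for x, y in rows if x[top] <= threshold]
left_is_term = 2 * sum(left_labels) > len(left_labels)


def is_term(lemma):
    """Classify a lemma with the induced stump; no hand-written rule is involved."""
    return (features(lemma)[top] <= threshold) == left_is_term


for gain, name in ranking:
    print(f"{name}: information gain {gain:.3f}")
print("corpus ->", is_term("corpus"), "| house ->", is_term("house"))
```

In this toy setting the threshold is learned from the labelled items rather than written by hand, and the ranking shows which statistic carried the decision, which mirrors the two roles the abstract assigns to the induced models and to feature ranking.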
Index Terms:
automatic term extraction, data-mining, feature-ranking, corpus linguistics
Citation:
Dominika Šrajerová, Oleg Kovářík, Václav Cvrček, "Automatic Term Recognition Based on Data-Mining Techniques," csie, vol. 4, pp. 453-457, 2009 WRI World Congress on Computer Science and Information Engineering, 2009