Third IEEE International Conference on Data Mining (ICDM'03)
Enhancing Techniques for Efficient Topic Hierarchy Integration
Melbourne, Florida
November 19-November 22
ISBN: 0-7695-1978-4
Jyh-Jong Tsay, National Chung Cheng University
Hsuan-Yu Chen, National Chung Cheng University
Chi-Feng Chang, National Chung Cheng University
Ching-Han Lin, National Chung Cheng University
In this paper, we study the problem of integrating documents from different sources into a comprehensive topic hierarchy. Our objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating categorization information provided by the data sources into the categorization process. On the World-Wide Web, such categorization information is often available from information sources. We present several enhancing techniques that use this information to improve traditional methods such as naive Bayes and support vector machines (SVM). Experiments on collections from Openfind and Yam, and from Google and Yahoo!, popular web sites in Taiwan and the USA, respectively, show that our techniques significantly improve classification accuracy: for example, from 55% to 66% for naive Bayes and from 57% to 67% for SVM on the data set collected from Yam and Openfind.
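The abstract does not spell out the enhancing techniques themselves, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one plausible way to feed a source's category label into a naive Bayes classifier: treat the label as one extra piece of evidence, multiplying in a smoothed estimate of P(source category | target category). The function names, the toy documents, and the category labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (tokens, source_category, target_category) triples."""
    prior = Counter()                     # documents per target category
    word_counts = defaultdict(Counter)    # word frequencies per target category
    source_counts = defaultdict(Counter)  # source labels seen per target category
    vocab, sources = set(), set()
    for tokens, src, tgt in docs:
        prior[tgt] += 1
        word_counts[tgt].update(tokens)
        source_counts[tgt][src] += 1
        vocab.update(tokens)
        sources.add(src)
    return prior, word_counts, source_counts, vocab, sources


def classify(model, tokens, src=None):
    """Return the highest-scoring target category.

    With src=None this is plain multinomial naive Bayes with add-one
    smoothing. With src given, the source's own category label adds one
    more log-probability term (an assumption made for illustration).
    """
    prior, word_counts, source_counts, vocab, sources = model
    total_docs = sum(prior.values())
    best, best_score = None, -math.inf
    for tgt in sorted(prior):
        score = math.log(prior[tgt] / total_docs)
        n_words = sum(word_counts[tgt].values())
        for w in tokens:
            if w in vocab:  # ignore words never seen in training
                score += math.log(
                    (word_counts[tgt][w] + 1) / (n_words + len(vocab))
                )
        if src is not None:
            score += math.log(
                (source_counts[tgt][src] + 1) / (prior[tgt] + len(sources))
            )
        if score > best_score:
            best, best_score = tgt, score
    return best


# Hypothetical toy data: the word "python" alone is ambiguous between the
# two target categories, but the source's label resolves it.
docs = [
    (["python", "code"], "SourceA/Programming", "Computers"),
    (["java", "code"], "SourceA/Programming", "Computers"),
    (["python", "snake"], "SourceA/Reptiles", "Animals"),
    (["lizard", "snake"], "SourceA/Reptiles", "Animals"),
]
model = train(docs)
with_programming_label = classify(model, ["python"], "SourceA/Programming")
with_reptiles_label = classify(model, ["python"], "SourceA/Reptiles")
```

In this toy setting the text evidence is identical for both calls, so any difference in the predicted category comes entirely from the source label. The same idea could be applied to an SVM by appending the source label as an additional feature, though the paper's actual formulation may differ.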
Citation:
Jyh-Jong Tsay, Hsuan-Yu Chen, Chi-Feng Chang, Ching-Han Lin, "Enhancing Techniques for Efficient Topic Hierarchy Integration," Third IEEE International Conference on Data Mining (ICDM'03), pp. 657, 2003