Enhancing Techniques for Efficient Topic Hierarchy Integration
Third IEEE International Conference on Data Mining (ICDM'03), Melbourne, Florida, November 19-22, 2003
ISBN: 0-7695-1978-4
In this paper, we study the problem of integrating documents from different sources into a comprehensive topic hierarchy. Our objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating the categorization information provided by data sources into the categorization process. On the World Wide Web, such categorization information is often available from the information sources themselves. We present several techniques that use this information to enhance traditional methods such as naive Bayes and support vector machines (SVM). Experiments on collections from Openfind and Yam, and from Google and Yahoo!, popular web sites in Taiwan and the USA, respectively, show that our techniques significantly improve classification accuracy: for example, from 55% to 66% for naive Bayes and from 57% to 67% for SVM on the data set collected from Yam and Openfind.
Citation:
Jyh-Jong Tsay, Hsuan-Yu Chen, Chi-Feng Chang, Ching-Han Lin, "Enhancing Techniques for Efficient Topic Hierarchy Integration," icdm, pp.657, Third IEEE International Conference on Data Mining (ICDM'03), 2003