On Combining Classifier Mass Functions for Text Categorization
October 2005 (vol. 17 no. 10)
pp. 1307-1319
Experience shows that different text classification methods can give different results. We look here at a way of combining the results of two or more different classification methods using an evidential approach. The specific methods we have been experimenting with in our group include the Support Vector Machine (SVM), k-nearest neighbors (kNN), the kNN model-based approach (kNNM), and Rocchio methods, but the analysis and the combination method apply to any classification methods. We review these learning methods briefly and then describe our method for combining the classifiers. In a previous study, we suggested that the combination could be done using evidential operations [1] and that using only two focal points in the mass functions (see below) gives good results. However, there are conditions under which we should choose to use more focal points. We assess some aspects of this choice from an evidential reasoning perspective and suggest a refinement of the approach.
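As a rough illustration of the evidential combination the abstract refers to, the sketch below applies Dempster's rule of combination to two mass functions that each have only two focal points. It assumes, as one plausible reading, that the two focal points are a classifier's top-ranked category (as a singleton) and the whole frame of discernment. The helper names `two_point_mass` and `combine`, the category labels, and the scores are hypothetical and are not taken from the paper.

```python
from itertools import product


def two_point_mass(category, score, frame):
    """Build a mass function with two focal points (an assumed form):
    the singleton {category} gets `score`, the whole frame gets the rest."""
    return {frozenset([category]): score, frozenset(frame): 1.0 - score}


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of categories to its mass.
    Products landing on an empty intersection count as conflict and
    are normalized away.
    """
    raw = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            raw[inter] = raw.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in raw.items()}


# Hypothetical frame of categories and classifier scores.
FRAME = frozenset({"sports", "politics", "tech"})
svm_mass = two_point_mass("sports", 0.8, FRAME)
knn_mass = two_point_mass("politics", 0.6, FRAME)

combined = combine(svm_mass, knn_mass)
best = max(combined, key=combined.get)
print(sorted(best), round(combined[best], 4))
```

With these made-up scores the two classifiers disagree, so part of the product mass is conflict and is normalized out. The combined mass still sums to one, and the singleton backed by the more confident classifier ends up with the largest share.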

[1] Y. Bi, D. Bell, H. Wang, G. Guo, and K. Greer, “Combining Multiple Classifiers Using Dempster's Rule of Combination for Text Categorization,” Proc. First Conf. Modeling Decisions for Artificial Intelligence, pp. 127-138, 2004.
[2] F. Sebastiani, “Machine Learning in Automated Text Categorization,” ACM Computing Surveys, vol. 34, no. 1, 2002.
[3] L.S. Larkey and W.B. Croft, “Combining Classifiers in Text Categorization,” Proc. SIGIR-96, 19th ACM Int'l Conf. Research and Development in Information Retrieval, pp. 289-297, 1996.
[4] Y.H. Li and A.K. Jain, “Classification of Text Documents,” The Computer J., vol. 41, no. 8, pp. 537-546, 1998.
[5] Y. Yang, T. Ault, and T. Pierce, “Combining Multiple Learning Strategies for Effective Cross Validation,” Proc. 17th Int'l Conf. Machine Learning (ICML '00), pp. 1167-1182, 2000.
[6] Y. Bi, D. Bell, H. Wang, G. Guo, and K. Greer, “Combining Classification Decisions for Text Categorization: An Experimental Study,” Proc. 15th Int'l Conf. Database and Expert Systems Applications (DEXA '04), pp. 222-231, 2004.
[7] Y. Bi, D. Bell, and J.W. Guan, “Combining Evidence from Classifiers in Text Categorization,” Proc. Eighth Conf. Knowledge-Based Intelligent Information & Eng. Systems, pp. 521-528, 2004.
[8] Y. Bi, “Combining Multiple Classifiers for Text Categorization Using Dempster's Rule of Combination,” PhD thesis, Univ. of Ulster, 2004.
[9] D.J. Ittner, D.D. Lewis, and D.D. Ahn, “Text Categorization of Low Quality Images,” Proc. Symp. Document Analysis and Information Retrieval, pp. 301-315, 1995.
[10] Y. Yang, “A Study on Thresholding Strategies for Text Categorization,” Proc. ACM SIGIR Conf. Research and Development in Information Retrieval (SIGIR '01), pp. 137-145, 2001.
[11] T. Joachims, “A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization,” Proc. 14th Int'l Conf. Machine Learning (ICML '97), 1997.
[12] C.C. Chang and C.J. Lin, “LIBSVM: A Library for Support Vector Machines,” http://www.csie.ntu.edu.tw/cjlinlibsvm, 2001.
[13] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, “kNN Model-Based Approach in Classification,” Proc. Cooperative Information Systems (CoopIS) Int'l Conf., pp. 986-996, 2003.
[14] J.W. Guan and D.A. Bell, “Evidence Theory and Its Applications,” Studies in Computer Science and Artificial Intelligence 7-8, vols. 1-2, Elsevier, North-Holland, 1991-1992.
[15] T. Mitchell, Machine Learning. McGraw-Hill, 1997.
[16] G. Shafer, A Mathematical Theory of Evidence. Princeton, N.J.: Princeton Univ. Press, 1976.
[17] J.-H. Kang and D. Doermann, “Evaluation of the Information-Theoretic Construction of Multiple Classifier Systems,” Proc. Seventh Int'l Conf. Document Analysis and Pattern Recognition, pp. 789-793, 2003.
[18] L. Kuncheva, J. Bezdek, and R. Duin, “Decision Templates for Multiple Classifier Fusion: An Experimental Comparison,” Pattern Recognition, vol. 34, pp. 299-314, 2001.
[19] G. Salton, J. Allan, C. Buckley, and A. Singhal, “Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts,” Science, vol. 264, pp. 1421-1426, 1994.
[20] C.J. van Rijsbergen, Information Retrieval, second ed. Butterworths, 1979.
[21] S. Zhang, C. Zhang, and Q. Yang, “Information Enhancement for Data Mining,” IEEE Intelligent Systems, pp. 12-13, Mar./Apr. 2004.

Index Terms:
Data mining systems and tools, modeling of structured, textual and multimedia data, uncertainty reasoning.
Citation:
David A. Bell, J.W. Guan, Yaxin Bi, "On Combining Classifier Mass Functions for Text Categorization," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 10, pp. 1307-1319, Oct. 2005, doi:10.1109/TKDE.2005.167