Issue No.11 - November (2006 vol.18)
pp: 1457-1466
ABSTRACT
Although naive Bayes is quite effective in various data mining tasks, it performs disappointingly in automatic text classification. By observing how naive Bayes behaves on natural language text, we found a serious problem in the parameter estimation process that causes poor results in the text classification domain. In this paper, we propose two empirical heuristics: per-document text normalization and a feature weighting method. Although these methods are somewhat ad hoc, our proposed naive Bayes text classifier performs very well on standard benchmark collections, competing with state-of-the-art text classifiers based on highly complex learning methods such as SVM.
INDEX TERMS
Text classification, naive Bayes classifier, Poisson model, feature weighting.
CITATION
Sang-Bum Kim, Kyoung-Soo Han, Hae-Chang Rim, Sung Hyon Myaeng, "Some Effective Techniques for Naive Bayes Text Classification", IEEE Transactions on Knowledge & Data Engineering, vol. 18, no. 11, pp. 1457-1466, November 2006, doi:10.1109/TKDE.2006.180