Issue No.04 - April (2008 vol.30)
pp: 735-740
We derive a tight dependency-related bound on the difference between the Naïve Bayes (NB) error and Bayes error for two binary features and two equiprobable classes. A measure of discrepancy of feature dependencies is proposed for multiple features. Its correlation with NB is shown using 23 real data sets.
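To make the two quantities in the abstract concrete, here is a minimal Python sketch of the Bayes error and the NB error for two binary features and two equiprobable classes. It is an illustration only: it does not implement the paper's bound or its discrepancy measure. The function names, the dictionary layout, and the example distribution are all assumptions introduced here.

```python
from itertools import product

# Hypothetical layout: joint[c][(x1, x2)] = P(x1, x2 | class c), c in {0, 1}.
# Both classes are assumed equiprobable (prior 0.5 each), as in the abstract.
PRIOR = 0.5
POINTS = list(product((0, 1), repeat=2))


def bayes_error(joint):
    """Error of the optimal rule: at each point the smaller class mass is lost."""
    return sum(min(PRIOR * joint[0][x], PRIOR * joint[1][x]) for x in POINTS)


def naive_bayes_error(joint):
    """Error of the rule that multiplies per-feature marginals within each class."""
    err = 0.0
    for x in POINTS:
        scores = []
        for c in (0, 1):
            p1 = sum(joint[c][(x[0], b)] for b in (0, 1))  # P(x1 | c)
            p2 = sum(joint[c][(a, x[1])] for a in (0, 1))  # P(x2 | c)
            scores.append(PRIOR * p1 * p2)
        predicted = 0 if scores[0] >= scores[1] else 1  # ties go to class 0
        # The true joint mass of the class NOT predicted is misclassified.
        err += PRIOR * joint[1 - predicted][x]
    return err


# Illustrative (made-up) case with strong, opposite dependencies per class.
XOR_LIKE = {
    0: {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5},
    1: {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.0},
}

if __name__ == "__main__":
    print(bayes_error(XOR_LIKE))        # 0.0
    print(naive_bayes_error(XOR_LIKE))  # 0.5
```

In this made-up example the classes are perfectly separable, yet the per-class marginals are identical, so the NB rule cannot do better than chance. When the features are conditionally independent within each class, the two errors coincide.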
Pattern Recognition, Classifier design and evaluation, Feature evaluation and selection, Naive Bayes, Dependency
Ludmila Kuncheva, "Error-Dependency Relationships for the Naïve Bayes Classifier with Binary Features", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.30, no. 4, pp. 735-740, April 2008, doi:10.1109/TPAMI.2007.70845