Lower Bounds for Bayes Error Estimation
July 1999 (vol. 21 no. 7)
pp. 643-645

Abstract—We give a short proof of the following result. Let $(X,Y)$ be any distribution on ${\cal N} \times \{0,1\}$, and let $(X_1,Y_1),\ldots,(X_n,Y_n)$ be an i.i.d. sample drawn from this distribution. In discrimination, the Bayes error $L^* = \inf_g {\bf P}\{g(X) \not= Y \}$ is of crucial importance. Here we show that without further conditions on the distribution of $(X,Y)$, no rate-of-convergence results can be obtained. Let $\phi_n (X_1,Y_1,\ldots,X_n,Y_n)$ be an estimate of the Bayes error, and let $\{ \phi_n(.) \}$ be a sequence of such estimates. For any sequence $\{a_n\}$ of positive numbers converging to zero, a distribution of $(X,Y)$ may be found such that ${\bf E} \left\{ | L^* - \phi_n (X_1,Y_1,\ldots,X_n,Y_n) | \right\} \ge a_n$ infinitely often.
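To make the quantities in the abstract concrete, the following is a minimal sketch, not taken from the paper. It assumes $X$ has finite support, so that the Bayes error can be written as $L^* = \sum_x {\bf P}\{X = x\} \min(\eta(x), 1 - \eta(x))$ with $\eta(x) = {\bf P}\{Y = 1 \mid X = x\}$. The function names, the toy distribution, and the simple counting estimate standing in for $\phi_n$ are all illustrative assumptions.

```python
from collections import Counter


def bayes_error(p_x, eta):
    """Exact L* for a finitely supported X.

    p_x maps x -> P{X = x}; eta maps x -> P{Y = 1 | X = x}.
    """
    return sum(p * min(eta[x], 1.0 - eta[x]) for x, p in p_x.items())


def plug_in_estimate(sample):
    """Hypothetical phi_n: (1/n) * sum_x min(#{X_i = x, Y_i = 0}, #{X_i = x, Y_i = 1}).

    This is one simple counting estimate chosen for illustration,
    not the estimate studied in the paper.
    """
    counts = Counter(sample)  # keys are (x, y) pairs; missing keys count as 0
    xs = {x for x, _ in counts}
    return sum(min(counts[(x, 0)], counts[(x, 1)]) for x in xs) / len(sample)


# Toy distribution on {1, 2, 3} x {0, 1} (assumed for illustration only).
p_x = {1: 0.5, 2: 0.25, 3: 0.25}
eta = {1: 0.25, 2: 0.5, 3: 1.0}

# A fixed sample of size n = 8, written out by hand rather than drawn at random.
sample = [(1, 0), (1, 0), (1, 1), (2, 0), (2, 1), (3, 1), (3, 1), (1, 0)]

true_error = bayes_error(p_x, eta)
estimate = plug_in_estimate(sample)
print(true_error, estimate)
```

On this hand-picked sample the estimate happens to coincide with $L^*$. The result stated in the abstract concerns the worst case: for any sequence of estimates and any prescribed rate $a_n \to 0$, some distribution makes the expected error at least $a_n$ infinitely often.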

[1] L. Birgé, “On Estimating a Density Using Hellinger Distance and Some Other Strange Facts,” Probability Theory and Related Fields, vol. 71, pp. 271–291, 1986.
[2] Z. Chen and K.S. Fu, “Nonparametric Bayes Risk Estimation for Pattern Classification,” Proc. IEEE Conf. Systems, Man, and Cybernetics, Boston, 1973.
[3] T.M. Cover, “Rates of Convergence for Nearest Neighbor Procedures,” Proc. Hawaii Int'l Conf. Systems Sciences, pp. 413–415, Honolulu, 1968.
[4] L. Devroye, “Any Discrimination Rule Can Have an Arbitrarily Bad Probability of Error for Finite Sample Size,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 4, pp. 154–157, 1982.
[5] L. Devroye, “On Arbitrarily Slow Rates of Global Convergence in Density Estimation,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 62, pp. 475–483, 1983.
[6] L. Devroye, “Another Proof of a Slow Convergence Result of Birgé,” Statistics and Probability Letters, vol. 23, pp. 63–67, 1995.
[7] L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York/Berlin: Springer-Verlag, 1996.
[8] K. Fukunaga and D.M. Hummels, “Bias of Nearest Neighbor Estimates,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 9, pp. 103–112, 1987.
[9] K. Fukunaga and D.L. Kessel, “Estimation of Classification Error,” IEEE Trans. Computers, vol. 20, pp. 1,521–1,527, 1971.
[10] J.M. Garnett and S.S. Yau, “Nonparametric Estimation of the Bayes Error of Feature Extractors Using Ordered Nearest Neighbor Sets,” IEEE Trans. Computers, vol. 26, pp. 46–54, 1977.
[11] G. McLachlan, Discriminant Analysis and Statistical Pattern Recognition. New York: John Wiley, 1992.

Index Terms:
Discrimination, statistical pattern recognition, nonparametric estimation, Bayes error, lower bounds, rates of convergence.
András Antos, Luc Devroye, László Györfi, "Lower Bounds for Bayes Error Estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 7, pp. 643-645, July 1999, doi:10.1109/34.777375