A Bootstrap Technique for Nearest Neighbor Classifier Design
January 1997 (vol. 19, no. 1)
pp. 73-79

Abstract—A bootstrap technique for nearest neighbor classifier design is proposed. Our primary interest is in designing classifiers for small training sample size situations. Conventional bootstrap techniques sample the training set with replacement; our technique, in contrast, generates bootstrap samples by locally combining original training samples. The nearest neighbor classifier is designed on the bootstrap samples and tested on test samples independent of the training samples. The performance of the proposed classifier is demonstrated on three artificial data sets and one real data set. Experimental results show that the nearest neighbor classifier designed on the bootstrap samples outperforms conventional k-NN classifiers as well as edited 1-NN classifiers, particularly in high dimensions.
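
To illustrate the idea summarized in the abstract, the following is a minimal Python sketch of designing a 1-NN classifier on bootstrap samples formed by locally combining original training samples rather than by resampling with replacement. The function names (local_bootstrap_samples, nn_classify), the choice of r nearest same-class neighbors, and the random convex (Dirichlet) weighting are illustrative assumptions, not the exact combination rule specified in the paper (pp. 73-79).

```python
import numpy as np
from scipy.spatial.distance import cdist

def local_bootstrap_samples(X, y, r=3, rng=None):
    """Sketch: create one bootstrap sample per training sample by locally
    combining it with its r nearest neighbors of the same class.
    The exact combination rule used by Hamamoto et al. may differ."""
    rng = np.random.default_rng() if rng is None else rng
    X_boot, y_boot = [], []
    for cls in np.unique(y):
        Xc = X[y == cls]
        r_eff = min(r, len(Xc) - 1)              # guard against very small classes
        D = cdist(Xc, Xc)
        np.fill_diagonal(D, np.inf)              # exclude each sample itself
        nn_idx = np.argsort(D, axis=1)[:, :r_eff]
        for i in range(len(Xc)):
            group = np.vstack([Xc[i][None, :], Xc[nn_idx[i]]])
            w = rng.dirichlet(np.ones(len(group)))   # random convex weights (assumption)
            X_boot.append(w @ group)                 # local combination of the neighborhood
            y_boot.append(cls)
    return np.asarray(X_boot), np.asarray(y_boot)

def nn_classify(X_design, y_design, X_test):
    """Plain 1-NN decision on the bootstrap design set."""
    return y_design[np.argmin(cdist(X_test, X_design), axis=1)]
```

Following the abstract's train/test separation, the classifier would then be evaluated on a held-out test set, e.g. nn_classify(*local_bootstrap_samples(X_train, y_train), X_test).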

[1] T.M. Cover and P.E. Hart, "Nearest Neighbor Pattern Classification," IEEE Trans. Information Theory, vol. 13, pp. 21-27, 1967.
[2] K. Fukunaga and D.M. Hummels, "Bias of Nearest Neighbor Error Estimates," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 9, pp. 103-112, Jan. 1987.
[3] B. Efron, "Bootstrap Methods: Another Look at the Jackknife," Annals of Statistics, vol. 7, pp. 1-26, 1979.
[4] A.K. Jain, R.C. Dubes, and C.-C. Chen, "Bootstrap Techniques for Error Estimation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 9, pp. 628-633, 1987.
[5] M.C. Chernick, V.K. Murthy, and C.D. Nealy, "Application of Bootstrap and Other Resampling Techniques: Evaluation of Classifier Performance," Pattern Recognition Letters, vol. 3, pp. 167-178, 1985.
[6] S.M. Weiss, "Small Sample Error Rate Estimation for k-NN Classifiers," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, pp. 285-289, 1991.
[7] K. Fukunaga and D.M. Hummels, "Bayes Error Estimation Using Parzen and k-NN Procedures," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 9, pp. 634-643, 1987.
[8] K. Fukunaga, Introduction to Statistical Pattern Recognition, second ed. Academic Press, 1990.
[9] J. Van Ness, "On the Dominance of Non-Parametric Bayes Rule Discriminant Algorithms in High Dimensions," Pattern Recognition, vol. 12, pp. 355-368, 1980.
[10] A.K. Jain and B. Chandrasekaran, "Dimensionality and Sample Size Considerations in Pattern Recognition Practice," Handbook of Statistics, P.R. Krishnaiah and L.N. Kanal, eds., vol. 2, pp. 835-855. North-Holland, 1982.
[11] P. Lachenbruch and M. Mickey, "Estimation of Error Rates in Discriminant Analysis," Technometrics, vol. 10, pp. 1-11, 1968.
[12] E. Fix and J.L. Hodges, Jr., "Discriminatory Analysis: Nonparametric Discrimination: Consistency Properties," Report No. 4, USAF School of Aviation Medicine, Randolph Field, Texas, Feb. 1951.
[13] E. Fix and J.L. Hodges, Jr., "Discriminatory Analysis: Nonparametric Discrimination: Small Sample Performance," Report No. 11, USAF School of Aviation Medicine, Randolph Field, Texas, Aug. 1952.
[14] D. Gabor, "Theory of Communication," J. Inst. Elect. Engr., vol. 93, pp. 429-459, 1946.
[15] Y. Hamamoto, S. Uchimura, K. Masamizu, and S. Tomita, "Recognition of Handprinted Chinese Characters Using Gabor Features," Proc. Third Int'l Conf. Document Analysis and Recognition, pp. 819-823, Montreal, Aug. 1995.
[16] Y. Hamamoto, S. Uchimura, M. Watanabe, T. Yasuda, and S. Tomita, "Recognition of Handwritten Numerals Using Gabor Features," Proc. 13th Int'l Conf. Pattern Recognition, vol. 3, pp. 250-253, Vienna, Aug. 1996.
[17] P.E. Hart, "The Condensed Nearest Neighbor Rule," IEEE Trans. Information Theory, vol. 14, no. 3, pp. 515-516, 1968.
[18] G.W. Gates, "The Reduced Nearest Neighbor Rule," IEEE Trans. Information Theory, vol. 18, no. 3, pp. 431-433, 1972.
[19] Q. Xie, C.A. Laszlo, and R.K. Ward, "Vector Quantization Technique for Nonparametric Classifier Design," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 12, pp. 1326-1330, Dec. 1993.
[20] G.F. Hughes, "On the Mean Accuracy of Statistical Pattern Recognizers," IEEE Trans. Information Theory, vol. 14, pp. 55-63, 1968.
[21] D.J. Hand, "Recent Advances in Error Rate Estimation," Pattern Recognition Letters, vol. 4, pp. 335-346, 1986.
[22] B. Chandrasekaran and A.K. Jain, "On Balancing Decision Functions," J. Cybernetics and Information Science, vol. 2, no. 1, pp. 12-15, 1979.

Index Terms:
Bootstrap, nearest neighbor classifier, error rate, peaking phenomenon, small training sample size, high dimensions, outlier.
Citation:
Yoshihiko Hamamoto, Shunji Uchimura, Shingo Tomita, "A Bootstrap Technique for Nearest Neighbor Classifier Design," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 1, pp. 73-79, Jan. 1997, doi:10.1109/34.566814