Discriminant Adaptive Nearest Neighbor Classification
June 1996 (vol. 18 no. 6)
pp. 607-616

Abstract—Nearest neighbor classification assumes that the class conditional probabilities are locally constant, and it suffers from bias in high dimensions. We propose a locally adaptive form of nearest neighbor classification to ameliorate this curse of dimensionality. We use a local linear discriminant analysis to estimate an effective metric for computing neighborhoods. We determine the local decision boundaries from centroid information, then shrink neighborhoods in directions orthogonal to these local decision boundaries and elongate them parallel to the boundaries. Thereafter, any neighborhood-based classifier can be employed using the modified neighborhoods, in which the posterior probabilities tend to be more homogeneous. We also propose a method for global dimension reduction that combines local dimension information. In a number of examples, the methods demonstrate the potential for substantial improvements over standard nearest neighbor classification.
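The local metric described in the abstract can be sketched as follows. This is an illustrative reading, not the paper's exact procedure: function and parameter names (`k`, `epsilon`) are ours, the published method iterates the estimate and tunes its constants more carefully, and we use sample-weighted within-class (W) and between-class (B) covariances over an initial Euclidean neighborhood.

```python
import numpy as np

def dann_metric(X, y, x0, k=50, epsilon=1.0):
    """Sketch of a locally adaptive (DANN-style) metric at query point x0.

    The metric stretches neighborhoods parallel to the local decision
    boundary and shrinks them orthogonal to it. Names and defaults are
    illustrative, not the paper's.
    """
    # 1. Initial neighborhood: k nearest points of x0 in Euclidean distance.
    d = np.linalg.norm(X - x0, axis=1)
    idx = np.argsort(d)[:k]
    Xn, yn = X[idx], y[idx]

    # 2. Local within-class (W) and between-class (B) covariance matrices,
    #    weighted by local class proportions.
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    mean = Xn.mean(axis=0)
    for c in np.unique(yn):
        Xc = Xn[yn == c]
        pi_c = len(Xc) / k
        W += pi_c * np.cov(Xc, rowvar=False, bias=True)
        diff = (Xc.mean(axis=0) - mean)[:, None]
        B += pi_c * (diff @ diff.T)

    # 3. Sphere by W, soften B with epsilon*I so the neighborhood is not
    #    infinitely elongated, and map back:
    #    Sigma = W^{-1/2} [W^{-1/2} B W^{-1/2} + eps*I] W^{-1/2}
    evals, evecs = np.linalg.eigh(W + 1e-8 * np.eye(p))  # ridge for stability
    W_isqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    B_star = W_isqrt @ B @ W_isqrt
    return W_isqrt @ (B_star + epsilon * np.eye(p)) @ W_isqrt

def dann_distance(Sigma, x, x0):
    """Squared distance under the local metric Sigma."""
    diff = x - x0
    return float(diff @ Sigma @ diff)
```

Neighbors for the final classification would then be chosen by `dann_distance` rather than Euclidean distance, after which any neighborhood-based classifier (e.g., majority vote) applies unchanged.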

[1] T.M. Cover, "Rates of Convergence for Nearest Neighbor Procedures," Proc. Hawaii Int'l Conf. Systems Sciences, pp. 413-415, Western Periodicals, Honolulu, 1968.
[2] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[3] G.J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition. New York: Wiley, 1992.
[4] T.M. Cover and P.E. Hart, "Nearest Neighbor Pattern Classification," IEEE Trans. Information Theory, vol. 13, pp. 21-27, 1967.
[5] T. Hastie, A. Buja, and R. Tibshirani, "Penalized Discriminant Analysis," Annals of Statistics, 1994.
[6] J. Friedman, "Flexible Metric Nearest Neighbour Classification," technical report, Stanford Univ., Nov. 1994.
[7] B.D. Ripley, "Neural Networks and Related Methods for Classification," J. Royal Statistical Soc. (Series B) (with discussion), 1994.
[8] L. Breiman, J.H. Friedman, R. Olshen, and C.J. Stone, Classification and Regression Trees. Wadsworth, 1984.
[9] D. Michie, D. Spiegelhalter, and C. Taylor, eds., Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence. Ellis Horwood, 1994.
[10] R. Short and K. Fukunaga, "A New Nearest Neighbor Distance Measure," Proc. Fifth IEEE Int'l Conf. Pattern Recognition, pp. 81-86, 1980.
[11] R. Short and K. Fukunaga, "The Optimal Distance Measure for Nearest Neighbor Classification," IEEE Trans. Information Theory, vol. 27, pp. 622-627, 1981.
[12] J.P. Myles and D.J. Hand, "The Multi-Class Metric Problem in Nearest Neighbour Discrimination Rules," Pattern Recognition, vol. 23, pp. 1291-1297, 1990.
[13] D.G. Lowe, "Similarity Metric Learning for a Variable Kernel Classifier," technical report, Dept. of Computer Science, Univ. of British Columbia, 1993.
[14] P.Y. Simard, Y. LeCun, and J. Denker, "Efficient Pattern Recognition Using a New Transformation Distance," Advances in Neural Information Processing Systems, pp. 50-58. San Mateo, Calif.: Morgan Kaufmann, 1993.
[15] T. Hastie, P. Simard, and E. Sackinger, "Learning Prototype Models for Tangent Distance," technical report, AT&T Bell Labs, 1993.
[16] W.S. Cleveland, "Robust Locally-Weighted Regression and Smoothing Scatterplots," J. Am. Statistical Assoc., vol. 74, pp. 829-836, 1979.
[17] N. Duan and K.-C. Li, "Slicing Regression: A Link-Free Regression Method," Annals of Statistics, pp. 505-530, 1991.

Index Terms:
Classification, nearest neighbors, linear discriminant analysis, curse of dimensionality.
Trevor Hastie, Robert Tibshirani, "Discriminant Adaptive Nearest Neighbor Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 6, pp. 607-616, June 1996, doi:10.1109/34.506411