The Nearest Subclass Classifier: A Compromise between the Nearest Mean and Nearest Neighbor Classifier
September 2005 (vol. 27, no. 9)
pp. 1417-1429
We present the Nearest Subclass Classifier (NSC), a classification algorithm that unifies the flexibility of the nearest neighbor classifier with the robustness of the nearest mean classifier. The algorithm is based on the Maximum Variance Cluster algorithm and, as such, belongs to the class of prototype-based classifiers. The variance constraint parameter of the cluster algorithm serves to regularize the classifier, that is, to prevent overfitting. With a low variance constraint value, the classifier behaves like the nearest neighbor classifier; with a high value, it becomes the nearest mean classifier, inheriting the respective properties of each. In other words, the number of prototypes ranges from the whole training set down to one per class. In the experiments, we compared the NSC to several other prototype-based methods with respect to classification performance and data set compression ratio. On several data sets, the NSC performed comparably to the k-nearest neighbor classifier, which is well established in many domains. The NSC also has favorable storage requirements and classification speed, so it offers a good compromise between classification performance and efficiency.
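To make the regularization behavior concrete, below is a minimal Python sketch of a nearest subclass classifier. It is not the paper's implementation: plain k-means stands in for the Maximum Variance Cluster algorithm, and a fixed per-class subclass count (n_subclasses) stands in for the variance constraint parameter; the class and parameter names are illustrative assumptions.

import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    # Plain k-means, used only as a stand-in for the Maximum
    # Variance Cluster algorithm of the paper.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Recompute centers; keep the old one if a cluster empties.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return centers

class NearestSubclassClassifier:
    # Cluster each class separately and label a sample by the class
    # of the nearest subclass (cluster) mean over all classes.
    def __init__(self, n_subclasses=3):
        self.n_subclasses = n_subclasses  # stand-in for the variance constraint

    def fit(self, X, y):
        prototypes, proto_labels = [], []
        for c in np.unique(y):
            Xc = X[y == c]
            k = min(self.n_subclasses, len(Xc))
            centers = kmeans(Xc, k)
            prototypes.append(centers)
            proto_labels += [c] * len(centers)
        self.prototypes_ = np.vstack(prototypes)
        self.proto_labels_ = np.array(proto_labels)
        return self

    def predict(self, X):
        # Squared Euclidean distance from each sample to each prototype.
        d = ((X[:, None, :] - self.prototypes_[None, :, :]) ** 2).sum(-1)
        return self.proto_labels_[d.argmin(1)]

With n_subclasses=1 this reduces to the nearest mean classifier, and letting the subclass count grow toward the class size approaches the 1-nearest neighbor classifier, mirroring the range of behavior described above.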

Index Terms:
Classification, regularization, cross-validation, prototype selection.
Citation:
Cor J. Veenman, Marcel J.T. Reinders, "The Nearest Subclass Classifier: A Compromise between the Nearest Mean and Nearest Neighbor Classifier," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 9, pp. 1417-1429, Sept. 2005, doi:10.1109/TPAMI.2005.187