Discovering Useful Concept Prototypes for Classification Based on Filtering and Abstraction
August 2002 (vol. 24 no. 8)
pp. 1075-1090

The nearest-neighbor algorithm and its derivatives have been shown to perform well for pattern classification. Despite their high classification accuracy, they suffer from high storage requirements, high computational cost, and sensitivity to noise. We develop a new framework, called ICPL (Integrated Concept Prototype Learner), which integrates instance-filtering and instance-abstraction techniques by maintaining a balance of different kinds of concept prototypes according to instance locality. The abstraction component employed in our ICPL framework is based on typicality and is specially designed for concept integration. We have conducted experiments on a total of 50 real-world benchmark data sets. We find that our ICPL framework maintains or improves classification accuracy and gains a significant improvement in data reduction compared with existing filtering and abstraction techniques, as well as with other existing methods.
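The filter-then-abstract idea described above can be sketched in a few lines. The sketch below is an illustration only, not the authors' ICPL algorithm: it assumes a typicality measure defined as the ratio of an instance's average similarity to same-class instances over its average similarity to other-class instances (a common formulation, cf. Zhang [42]), filters out atypical (likely noisy) instances, and averages highly typical instances of each class into a single prototype while retaining borderline instances verbatim. The thresholds `noise_thresh` and `merge_thresh` are hypothetical parameters chosen for illustration.

```python
import numpy as np

def typicality(X, y):
    """Typicality of each instance: average intra-class similarity
    divided by average inter-class similarity (illustrative measure;
    ICPL's actual definition may differ)."""
    n = len(X)
    # Pairwise Euclidean distances mapped to similarities in (0, 1].
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sim = 1.0 / (1.0 + d)
    t = np.empty(n)
    for i in range(n):
        same = (y == y[i])
        same[i] = False                      # exclude the instance itself
        diff = (y != y[i])
        intra = sim[i, same].mean() if same.any() else 0.0
        inter = sim[i, diff].mean() if diff.any() else 1e-12
        t[i] = intra / inter
    return t

def filter_and_abstract(X, y, noise_thresh=1.0, merge_thresh=1.5):
    """Sketch of a filter-then-abstract prototype learner:
    drop atypical instances, merge the typical 'core' of each class
    into one averaged prototype, keep borderline instances as-is."""
    t = typicality(X, y)
    keep = t >= noise_thresh                 # filtering step
    Xk, yk, tk = X[keep], y[keep], t[keep]
    protos, labels = [], []
    for c in np.unique(yk):
        mask = yk == c
        core = mask & (tk >= merge_thresh)
        if core.any():                       # abstraction step: merge core
            protos.append(Xk[core].mean(axis=0))
            labels.append(c)
        for x in Xk[mask & ~core]:           # retain borderline instances
            protos.append(x)
            labels.append(c)
    return np.array(protos), np.array(labels)
```

Classification then proceeds by a 1-nearest-neighbor rule over the reduced prototype set instead of the full training set, which is where the storage and computation savings come from.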

[1] D.W. Aha, D. Kibler, and M.K. Albert, “Instance-Based Learning Algorithms,” Machine Learning, vol. 6, pp. 37-66, 1991.
[2] D.W. Aha, D. Kibler, and M.K. Albert, “Noise-Tolerant Instance-Based Learning Algorithms,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 794-799, 1991.
[3] J.C. Bezdek, T.R. Reichherzer, G.S. Lim, and Y. Attikiouzel, “Multiple-Prototype Classifier Design,” IEEE Trans. Systems, Man, and Cybernetics, pp. 67-79, 1998.
[4] G. Bradshaw, “Learning about Speech Sounds: The NEXUS Project,” Proc. Fourth Int'l Workshop Machine Learning, pp. 1-11, 1987.
[5] R.M. Cameron-Jones, “Instance Selection by Encoding Length Heuristic with Random Mutation Hill Climbing,” Proc. Eighth Australian Joint Conf. Artificial Intelligence, pp. 293-301, 1995.
[6] C.L. Chang, “Finding Prototypes for Nearest Neighbor Classifier,” IEEE Trans. Computers, vol. 23, no. 11, pp. 1179-1184, Nov. 1974.
[7] W.J. Conover, Practical Nonparametric Statistics. New York: John Wiley, 1971.
[8] S. Cost and S. Salzberg, “A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features,” Machine Learning, vol. 10, no. 1, pp. 57-78, Jan. 1993.
[9] B.V. Dasarathy, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. Los Alamitos, Calif.: IEEE CS Press, 1991.
[10] B.V. Dasarathy, “Minimal Consistent Set (MCS) Identification for Optimal Nearest Neighbor Decision Systems Design,” IEEE Trans. Systems, Man, Cybernetics, vol. 24, no. 3, pp. 511-517, 1994.
[11] B.V. Dasarathy, J.S. Sanchez, and S. Townsend, “Nearest Neighbour Editing and Condensing Tools-Synergy Exploitation,” Pattern Analysis and Applications, vol. 3, no. 1, pp. 19-30, 2000.
[12] P. Datta and D. Kibler, “Learning Prototypical Concept Description,” Proc. 12th Int'l Conf. Machine Learning, pp. 158-166, 1995.
[13] P. Datta and D. Kibler, “Learning Symbolic Prototypes,” Proc. 14th Int'l Conf. Machine Learning, pp. 75-82, 1997.
[14] P. Datta and D. Kibler, “Symbolic Nearest Mean Classifier,” Proc. 14th Nat'l Conf. Artificial Intelligence, pp. 82-87, 1997.
[15] P. Domingos, “Unifying Instance-Based and Rule-Based Induction,” Machine Learning, vol. 24, pp. 141-168, 1996.
[16] W. DuMouchel, C. Volinsky, T. Johnson, C. Cortes, and D. Pregibon, “Squashing Flat Files Flatter,” Proc. Fifth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 6-15, 1999.
[17] F.J. Ferri, J.V. Albert, and E. Vidal, “Considerations About Sample-Size Sensitivity of a Family of Edited Nearest-Neighbor Rules,” IEEE Trans. Systems, Man, and Cybernetics—Part B, vol. 29, no. 4, pp. 667-672, 1999.
[18] G.W. Gates, “The Reduced Nearest Neighbor Rule,” IEEE Trans. Information Theory, vol. 18, no. 3, pp. 431-433, 1972.
[19] A.R. Golding and P.S. Rosenbloom, “Improving Accuracy by Combining Rule-Based and Case-Based Reasoning,” Artificial Intelligence, vol. 87, pp. 215-254, 1996.
[20] Y. Guermeur, A. Elisseeff, and H. Paugam-Moisy, “A New Multi-class SVM Based on a Uniform Convergence Result,” Proc. IEEE-INNS-ENNS Int'l Joint Conf. Neural Networks, pp. 183-188, 2000.
[21] Y. Hamamoto, S. Uchimura, and S. Tomita, “A Bootstrap Technique for Nearest Neighbor Classifier Design,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 1, pp. 73-79, Jan. 1997.
[22] P.E. Hart, “The Condensed Nearest Neighbor Rule,” IEEE Trans. Information Theory, vol. 14, no. 3, pp. 515-516, 1968.
[23] C.K. Keung and W. Lam, “Prototype Generation Based on Instance Filtering and Averaging,” Proc. Fourth Pacific-Asia Conf. Knowledge Discovery and Data Mining, pp. 142-152, 2000.
[24] D. Kibler and D.W. Aha, “Comparing Instance-Averaging with Instance-Filtering Learning Algorithms,” Proc. Third European Working Session on Learning, pp. 63-80, 1988.
[25] L.I. Kuncheva and J.C. Bezdek, “Nearest Prototype Classification: Clustering, Genetic Algorithms, or Random Search?” IEEE Trans. Systems, Man, and Cybernetics, vol. 28, no. 1, pp. 160-164, 1998.
[26] W. Lam, C.K. Keung, and C.X. Ling, “Learning Good Prototypes for Classification Using Filtering and Abstraction of Instances,” Pattern Recognition, vol. 35, no. 7, pp. 1491-1506, July 2002.
[27] P.M. Murphy and D.W. Aha, “UCI Repository of Machine Learning Databases,” Dept. of Information and Computer Science, Univ. of Calif., Irvine, 1994.
[28] G.L. Ritter, H.B. Woodruff, and S.R. Lowry, “An Algorithm for a Selective Nearest Neighbor Decision Rule,” IEEE Trans. Information Theory, vol. 21, no. 6, pp. 665-669, 1975.
[29] S. Salzberg, “A Nearest Hyperrectangle Learning Method,” Machine Learning, vol. 6, pp. 251-276, 1991.
[30] A. Sato, “A Learning Method for Definite Canonicalization Based on Minimum Classification Error,” Proc. Int'l Conf. Pattern Recognition (ICPR), pp. 199-202, 2000.
[31] B. Schölkopf, C. Burges, and V. Vapnik, “Extracting Support Data for a Given Task,” Proc. First Int'l Conf. Knowledge Discovery and Data Mining, pp. 252-257, 1995.
[32] C. Stanfill and D. Waltz, “Toward Memory-Based Reasoning,” Comm. ACM, vol. 29, pp. 1213-1228, 1986.
[33] I. Tomek, “An Experiment with the Edited Nearest-Neighbor Rule,” IEEE Trans. Systems, Man, and Cybernetics, vol. 6, no. 6, pp. 448-452, 1976.
[34] A. van den Bosch, “Instance-Family Abstraction in Memory-Based Language Learning,” Proc. 16th Int'l Conf. Machine Learning, pp. 39-48, 1999.
[35] J. Weston and C. Watkins, “Multi-Class Support Vector Machines,” Technical Report CSD-TR-98-04, Dept. of Computer Science, Royal Holloway, Univ. of London, Egham, Surrey TW20 0EX, England, 1998.
[36] D. Wettschereck, “A Hybrid Nearest-Neighbor and Nearest-Hyperrectangle Algorithm,” Proc. Seventh European Conf. Machine Learning, pp. 323-335, 1994.
[37] D.L. Wilson, “Asymptotic Properties of Nearest Neighbor Rules Using Edited Data,” IEEE Trans. Systems, Man, and Cybernetics, vol. 2, no. 3, pp. 431-433, 1972.
[38] D.R. Wilson and T.R. Martinez, “Instance Pruning Techniques,” Proc. 14th Int'l Conf. Machine Learning, pp. 403-411, 1997.
[39] D.R. Wilson and T.R. Martinez, “Reduction Techniques for Instance-Based Learning Algorithms,” Machine Learning, vol. 38, pp. 257-286, 2000.
[40] D.R. Wilson and T.R. Martinez, “An Integrated Instance-Based Learning Algorithm,” Computational Intelligence, vol. 16, no. 1, pp. 1-28, 2000.
[41] Q. Xie, C.A. Laszlo, and R.K. Ward, “Vector Quantization Technique for Nonparametric Classifier Design,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 12, pp. 1326-1330, Dec. 1993.
[42] J. Zhang, “Selecting Typical Instances in Instance-Based Learning,” Proc. Int'l Conf. Machine Learning, pp. 470-479, 1992.

Index Terms:
Prototype learning, classification, instance abstraction, machine learning, data mining.
Citation:
Wai Lam, Chi-Kin Keung, Danyu Liu, "Discovering Useful Concept Prototypes for Classification Based on Filtering and Abstraction," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1075-1090, Aug. 2002, doi:10.1109/TPAMI.2002.1023804