Combining Feature Reduction and Case Selection in Building CBR Classifiers
March 2006 (vol. 18 no. 3)
pp. 415-429
CBR systems built for classification problems are called CBR classifiers. This paper presents a novel and fast approach to building efficient and competent CBR classifiers that combines feature reduction (FR) and case selection (CS). It makes three central contributions: 1) it develops a fast rough-set method, based on the relative attribute dependency among features, to compute the approximate reduct; 2) it constructs and compares different case selection methods based on the similarity measure and the concepts of case coverage and case reachability; and 3) it shows that CBR classifiers built using a combination of the FR and CS processes can reduce the training burden as well as the need to acquire domain knowledge. Experimental results on four real-life data sets show that the combined FR and CS method can preserve, and may even improve, solution accuracy while substantially reducing storage space. Case retrieval time is also greatly reduced, because the resulting CBR classifier contains fewer cases, each with fewer features. The combined FR and CS method is also compared with the kernel PCA and SVM techniques; their storage requirements, classification accuracy, and classification speed are presented and discussed.
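The feature-reduction step rests on the rough-set notion of relative attribute dependency: an attribute subset B is an approximate reduct when the number of equivalence classes it induces over the cases, relative to the number induced together with the decision attribute, matches that of the full attribute set. The sketch below illustrates this idea under assumptions of our own (helper names, a toy decision table, and a simple backward-elimination search are all hypothetical; the paper's actual algorithm may differ in search order and stopping criterion):

```python
# Illustrative sketch of feature reduction via relative attribute
# dependency (rough-set style). Each row is a tuple of condition-
# attribute values followed by the class label in the last position.

def partition_count(rows, cols):
    """Number of equivalence classes induced by the given columns."""
    return len({tuple(r[c] for c in cols) for r in rows})

def relative_dependency(rows, subset, n_features):
    """|U/IND(B)| / |U/IND(B + {d})| for condition-attribute subset B."""
    with_label = list(subset) + [n_features]  # decision attribute is last
    return partition_count(rows, subset) / partition_count(rows, with_label)

def approximate_reduct(rows, n_features):
    """Backward elimination: drop any attribute whose removal keeps the
    relative dependency at 1 (i.e., the table stays consistent)."""
    reduct = list(range(n_features))
    for a in list(reduct):
        trial = [x for x in reduct if x != a]
        if trial and relative_dependency(rows, trial, n_features) == 1.0:
            reduct = trial
    return reduct

# Toy decision table with 3 condition attributes and a class label.
data = [
    (0, 1, 0, 'yes'),
    (0, 1, 1, 'yes'),
    (1, 0, 0, 'no'),
    (1, 1, 1, 'no'),
]
print(approximate_reduct(data, 3))  # attribute 0 alone separates the classes
```

On this toy table, attributes 1 and 2 are eliminated because attribute 0 already induces a partition consistent with the labels, so the approximate reduct is [0]; case selection would then operate on cases described only by the retained features.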


Index Terms:
Case-based reasoning, CBR classifier, case selection, feature reduction, k-NN principle, rough sets.
Citation:
Yan Li, Simon C.K. Shiu, Sankar K. Pal, "Combining Feature Reduction and Case Selection in Building CBR Classifiers," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 3, pp. 415-429, March 2006, doi:10.1109/TKDE.2006.40