Issue No.03 - March (2012 vol.24)
pp: 452-464
Der-Chiang Li, National Cheng Kung University, Tainan
Chiao-Wen Liu, National Cheng Kung University, Tainan
Data quantity is the main issue in the small data set problem, because insufficient data usually fails to yield robust classification performance. How to extract more effective information from a small data set is thus of considerable interest. This paper proposes a new attribute construction approach that converts the original data attributes into a higher dimensional feature space, extracting more attribute information through a similarity-based algorithm that uses a classification-oriented fuzzy membership function. Seven data sets with different attribute sizes are used to examine the performance of the proposed method. The results show that the proposed method achieves better classification performance than principal component analysis (PCA), kernel principal component analysis (KPCA), and kernel independent component analysis (KICA) with a Gaussian kernel in the support vector machine (SVM) classifier.
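The abstract does not specify the membership function or the similarity-based algorithm, so the following is only a minimal sketch of the general idea of extending attributes with class-wise fuzzy memberships. It assumes a triangular membership function built from the per-class minimum, mean, and maximum of each attribute; the function names (`triangular_membership`, `fit_class_triangles`, `extend_attributes`) and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def triangular_membership(x, low, peak, high):
    """Membership degree of x in the triangle (low, peak, high), in [0, 1]."""
    if x == peak:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


def fit_class_triangles(X, y):
    """Assumption: one triangle per (class, attribute), from min/mean/max."""
    triangles = {}
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        for j in range(len(X[0])):
            values = [row[j] for row in rows]
            triangles[(label, j)] = (min(values), mean(values), max(values))
    return triangles


def extend_attributes(row, triangles):
    """Append one membership degree per (class, attribute) to the raw row."""
    extra = [
        triangular_membership(row[j], *tri)
        for (label, j), tri in sorted(triangles.items())
    ]
    return list(row) + extra


# Toy data: 6 samples, 2 attributes, 2 classes (illustrative only).
X = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0],
     [6.0, 20.0], [7.0, 22.0], [8.0, 24.0]]
y = [0, 0, 0, 1, 1, 1]

triangles = fit_class_triangles(X, y)
extended = extend_attributes([2.5, 13.0], triangles)
print(extended)  # [2.5, 13.0, 0.5, 0.5, 0.0, 0.0]
```

With m attributes and c classes, each sample grows from m to m + m·c values in this sketch. The extended vectors could then be passed to a classifier such as the Gaussian-kernel SVM mentioned in the abstract; that step is omitted here.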
Classification, small data set, feature construction, support vector machine.
Der-Chiang Li, Chiao-Wen Liu, "Extending Attribute Information for Small Data Set Classification", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 3, pp. 452-464, March 2012, doi:10.1109/TKDE.2010.254