A Niching Memetic Algorithm for Simultaneous Clustering and Feature Selection
July 2008 (vol. 20, no. 7), pp. 868-879
Clustering is an inherently difficult task, and it becomes even more difficult when relevant features must also be selected. In this paper, we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach, which we call NMA_CFS, makes feature selection an integral part of the global clustering search procedure. It attempts to overcome the problem of identifying less promising locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both the feature selection and the cluster centers, allowing for different numbers of clusters. Further, local search operations are introduced to refine the feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve population diversity and prevent premature convergence. In an experimental evaluation, we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data.
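
The abstract does not give the encoding or the local search operators in detail, so the following is only a minimal sketch of what a composite chromosome of this kind might look like. It is not the NMA_CFS implementation. The names `Chromosome`, `project`, `assign`, and `refine_centers` are hypothetical, and the single k-means-style update stands in for whatever local search the paper actually uses.

```python
from dataclasses import dataclass
from math import dist
from typing import List


@dataclass
class Chromosome:
    """Hypothetical composite encoding (an assumption, not the paper's layout)."""
    feature_mask: List[bool]     # True means the feature is selected
    centers: List[List[float]]   # variable length: one center per cluster


def project(vector, mask):
    """Keep only the coordinates whose feature is selected."""
    return [v for v, keep in zip(vector, mask) if keep]


def assign(points, chrom):
    """Label each point with its nearest center, measured on selected features only."""
    projected = [project(c, chrom.feature_mask) for c in chrom.centers]
    labels = []
    for p in points:
        q = project(p, chrom.feature_mask)
        distances = [dist(q, c) for c in projected]
        labels.append(distances.index(min(distances)))
    return labels


def refine_centers(points, chrom):
    """One k-means-style update, used here as a stand-in local search step."""
    labels = assign(points, chrom)
    new_centers = []
    for k, old in enumerate(chrom.centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(list(old))  # leave an empty cluster's center as is
            continue
        new_centers.append([sum(col) / len(members) for col in zip(*members)])
    return Chromosome(list(chrom.feature_mask), new_centers)


# Toy data: the first feature separates two groups, the second is noise.
points = [[0.0, 5.0], [0.2, -3.0], [4.0, 1.0], [4.2, -2.0]]
chrom = Chromosome(feature_mask=[True, False], centers=[[1.0, 0.0], [3.0, 0.0]])
labels = assign(points, chrom)           # [0, 0, 1, 1]
refined = refine_centers(points, chrom)  # centers move toward the group means
```

Because the center list has no fixed length, chromosomes in the same population can encode different numbers of clusters, which is the property the abstract attributes to the variable composite representation. A niching step would additionally need a distance between chromosomes; that is left out here because the abstract does not say how it is defined.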

Index Terms:
Clustering, feature selection, memetic algorithm, genetic algorithm, niching method, local search
Weiguo Sheng, Xiaohui Liu, Mike Fairhurst, "A Niching Memetic Algorithm for Simultaneous Clustering and Feature Selection," IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 7, pp. 868-879, July 2008, doi:10.1109/TKDE.2008.33