
Issue No.07 - July (2008 vol.20)

pp: 868-879

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2008.33

ABSTRACT

Clustering is inherently a difficult task, and is made even more difficult when the selection of relevant features is also an issue. In this paper we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach (which we call NMA_CFS) makes feature selection an integral part of the global clustering search procedure and attempts to overcome the problem of identifying less promising locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both feature selection and cluster centers with different numbers of clusters. Further, local search operations are introduced to refine feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve the population diversity and prevent premature convergence. In an experimental evaluation we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data.
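The abstract describes a chromosome that encodes a feature subset together with a variable number of cluster centers, refined by local search. The following is only a minimal sketch of that idea, not the authors' NMA_CFS implementation: the names `Chromosome`, `masked_sq_dist`, `assign`, `kmeans_step`, and `random_chromosome` are hypothetical, the k-means-style update is an assumed stand-in for the paper's local search operations, and the niching step and fitness function are not shown.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Chromosome:
    """Hypothetical composite encoding: a feature mask plus a variable number of centers."""
    mask: List[bool]            # mask[j] is True if feature j is selected
    centers: List[List[float]]  # one full-dimensional center per cluster


def masked_sq_dist(x, c, mask):
    """Squared Euclidean distance restricted to the selected features."""
    return sum((xi - ci) ** 2 for xi, ci, m in zip(x, c, mask) if m)


def assign(data, chrom):
    """Label each point with the index of its nearest center in the selected subspace."""
    return [
        min(range(len(chrom.centers)),
            key=lambda k: masked_sq_dist(x, chrom.centers[k], chrom.mask))
        for x in data
    ]


def kmeans_step(data, chrom):
    """One k-means-style center update (assumed stand-in for a local search move)."""
    labels = assign(data, chrom)
    new_centers = []
    for k, old in enumerate(chrom.centers):
        members = [x for x, lab in zip(data, labels) if lab == k]
        if not members:
            new_centers.append(list(old))  # keep the center of an empty cluster unchanged
            continue
        new_centers.append(
            [sum(x[j] for x in members) / len(members) for j in range(len(old))]
        )
    return Chromosome(mask=list(chrom.mask), centers=new_centers)


def random_chromosome(data, k_max, rng):
    """Random mask with at least one feature, and between 2 and k_max centers drawn from the data."""
    dim = len(data[0])
    mask = [rng.random() < 0.5 for _ in range(dim)]
    if not any(mask):
        mask[rng.randrange(dim)] = True
    k = rng.randint(2, min(k_max, len(data)))
    centers = [list(x) for x in rng.sample(data, k)]
    return Chromosome(mask=mask, centers=centers)


if __name__ == "__main__":
    # Toy data: the third feature is noise and is excluded by the mask.
    data = [[0.0, 0.1, 5.0], [0.2, 0.0, -3.0], [4.0, 4.1, 1.0], [4.2, 3.9, -2.0]]
    chrom = Chromosome(mask=[True, True, False],
                       centers=[[0.0, 0.0, 0.0], [4.0, 4.0, 0.0]])
    print(assign(data, chrom))  # [0, 0, 1, 1]
    print(kmeans_step(data, chrom).centers)
```

Because the list of centers can have any length, chromosomes in one population can represent different numbers of clusters, which is how a search of this kind can avoid fixing the cluster count in advance.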

INDEX TERMS

Clustering, feature selection, memetic algorithm, genetic algorithm, niching method, local search

CITATION

Weiguo Sheng, Mike Fairhurst, "A Niching Memetic Algorithm for Simultaneous Clustering and Feature Selection", *IEEE Transactions on Knowledge & Data Engineering*, vol. 20, no. 7, pp. 868-879, July 2008, doi:10.1109/TKDE.2008.33