17th International Conference on Pattern Recognition (ICPR'04) - Volume 2
Feature Selection and Gene Clustering from Gene Expression Data
Cambridge UK
August 23-August 26
ISBN: 0-7695-2128-2
In this article we describe an algorithm for feature selection and gene clustering from high-dimensional gene expression data. The method measures the similarity between features (genes) and removes the redundancy among them. Because it requires no search, it is fast. A novel feature similarity measure, called the maximal information compression index, is used. The feature selection algorithm also obtains gene clusters in a multiscale fashion. The superiority of the algorithm, in terms of speed and performance, is established on a real-life molecular cancer classification dataset.
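The abstract does not define the maximal information compression index. As a minimal sketch, assuming the commonly cited definition (the smallest eigenvalue of the 2x2 covariance matrix of two features), the measure could be computed as below. The function names, the use of population variance, and the toy expression values are illustrative assumptions and are not taken from the paper.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _covariance(xs, ys):
    # Population covariance (divide by n); an assumption of this sketch.
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def compression_index(xs, ys):
    """Smallest eigenvalue of the 2x2 covariance matrix of (xs, ys).

    A value of zero means the two features are linearly dependent, so
    one of them is redundant; larger values mean less redundancy.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("features must be non-empty and of equal length")
    var_x = _covariance(xs, xs)
    var_y = _covariance(ys, ys)
    cov_xy = _covariance(xs, ys)
    trace = var_x + var_y
    det = var_x * var_y - cov_xy ** 2
    # Clamp tiny negative values caused by floating-point rounding.
    disc = max(trace * trace - 4.0 * det, 0.0)
    return max((trace - math.sqrt(disc)) / 2.0, 0.0)


# Hypothetical expression profiles of three genes over four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]    # exact multiple of gene_a
gene_c = [1.0, 1.0, -1.0, -1.0]

redundant = compression_index(gene_a, gene_b)
distinct = compression_index(gene_a, gene_c)
```

Under this definition, a pair of genes with a low index is a candidate for merging into one cluster and keeping a single representative, which matches the redundancy-removal idea described in the abstract. How the paper groups and discards features is not specified here, so that step is left out of the sketch.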
Index Terms:
Microarray, maximal information compression index, cancer classification, representation entropy, data mining
Citation:
Pabitra Mitra, Dwijesh Dutta Majumder, "Feature Selection and Gene Clustering from Gene Expression Data," 17th International Conference on Pattern Recognition (ICPR'04), vol. 2, pp. 343-346, 2004