Issue No.02 - April-June (2009 vol.6)
pp: 244-259
ABSTRACT
This paper presents Fuzzy-Adaptive-Subspace-Iteration-based Two-way Clustering (FASIC) of microarray data for finding differentially expressed genes (DEGs) from two-sample microarray experiments. The concept of fuzzy membership is introduced to transform the hard adaptive subspace iteration (ASI) algorithm into a fuzzy-ASI algorithm to perform two-way clustering. The proposed approach follows a progressive framework to assign a relevance value to genes associated with each cluster. Subsequently, each gene cluster is scored and ranked based on its potential to provide a correct classification of the sample classes. These ranks are converted into P values using the R-test, and the significance of each gene is determined. A fivefold validation is performed on the DEGs selected using the proposed approach. Empirical analyses on a number of simulated microarray data sets are conducted to quantify the results obtained using the proposed approach. To exemplify the efficacy of the proposed approach, further analyses on different real microarray data sets are also performed.
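The abstract does not give the membership formula used in FASIC. As a rough illustration of what replacing a hard assignment with a fuzzy membership can look like, the sketch below uses the standard fuzzy c-means membership rule, which is an assumption here and not necessarily the rule the authors adopt. The function name `fuzzy_memberships`, the distance inputs, and the fuzzifier `m` are all hypothetical.

```python
def fuzzy_memberships(distances, m=2.0):
    """Return fuzzy c-means style memberships of one gene in each cluster.

    distances: distances from the gene to each cluster centroid (assumed input).
    m: fuzzifier, must be > 1; larger values give softer memberships.
    NOTE: this is the textbook fuzzy c-means rule, used for illustration only.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")

    # A gene lying exactly on one or more centroids belongs fully to them.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(distances))]

    # u_k = d_k^(-2/(m-1)) / sum_j d_j^(-2/(m-1))
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in distances]
    total = sum(inverse)
    return [value / total for value in inverse]


def hard_assignment(memberships):
    """Collapse fuzzy memberships to the index of the best-matching cluster."""
    return max(range(len(memberships)), key=lambda i: memberships[i])
```

With distances `[1.0, 3.0]` and the default `m=2.0`, the memberships come out as 0.9 and 0.1, whereas a hard assignment would place the gene entirely in the first cluster. Keeping the graded values is what allows a relevance value to be attached to each gene per cluster.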
INDEX TERMS
Clustering, classification and association rules, data mining, data and knowledge visualization, feature extraction or construction.
CITATION
Jahangheer Shaik, Mohammed Yeasin, "Fuzzy-Adaptive-Subspace-Iteration-Based Two-Way Clustering of Microarray Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.6, no. 2, pp. 244-259, April-June 2009, doi:10.1109/TCBB.2008.15