Issue No.06 - November/December (2011 vol.8)
pp: 1592-1603
Chun-Hou Zheng, The Hong Kong Polytechnic University, Hong Kong and Qufu Normal University, Rizhao, Shandong
Lei Zhang, The Hong Kong Polytechnic University, Hong Kong
Vincent To-Yee Ng, The Hong Kong Polytechnic University, Hong Kong
Simon Chi-Keung Shiu, The Hong Kong Polytechnic University, Hong Kong
De-Shuang Huang, Tongji University, Shanghai
ABSTRACT
Reliable and precise identification of tumor type is crucial to the effective treatment of cancer. With the rapid development of microarray technologies, tumor clustering based on gene expression data is becoming a powerful approach to cancer class discovery. In this paper, we apply the penalized matrix decomposition (PMD) to gene expression data to extract metasamples for clustering. The extracted metasamples capture the inherent structures of samples belonging to the same class. In turn, the PMD factors of a sample over the metasamples can be used as its class indicator. Compared with conventional methods such as hierarchical clustering (HC), self-organizing maps (SOM), affinity propagation (AP), and nonnegative matrix factorization (NMF), the proposed method can identify samples with complex classes. Moreover, the PMD factors can be used as an index to determine the number of clusters. The proposed method provides a reasonable explanation of the inconsistent classifications made by the conventional methods. In addition, it is able to discover the modules in gene expression data of conterminous developmental stages. Experiments on two representative problems show that the proposed PMD-based method is promising for discovering biological phenotypes.
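The idea of extracting metasamples and reading class membership off the PMD factors can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a genes-by-samples matrix, an L1 bound on the sample factors only (the abstract does not say which penalties are used), and the alternating soft-thresholding update from the general PMD literature. The function names, the bound `c_v`, and the toy data are all hypothetical.

```python
import math

def soft(a, delta):
    """Soft-threshold every entry of a by delta."""
    return [math.copysign(max(abs(x) - delta, 0.0), x) for x in a]

def unit(a):
    """Scale a to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a] if norm > 0 else list(a)

def l1_bounded_unit(a, c):
    """Unit-L2 vector S(a, delta)/||S(a, delta)|| whose L1 norm is at most c.
    delta is 0 if that already satisfies the bound, else found by bisection."""
    u = unit(a)
    if sum(abs(x) for x in u) <= c:
        return u
    lo, hi = 0.0, max(abs(x) for x in a)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sum(abs(x) for x in unit(soft(a, mid))) > c:
            lo = mid   # still too dense: threshold harder
        else:
            hi = mid   # feasible: try a smaller threshold
    return unit(soft(a, hi))

def pmd(X, k, c_v, iters=100):
    """Rank-k PMD sketch. X is genes x samples (list of rows).
    Returns k tuples (d, u, v): u is a metasample over genes,
    v holds the sparse factors of every sample on that metasample."""
    p, n = len(X), len(X[0])
    R = [row[:] for row in X]          # residual matrix, deflated per factor
    factors = []
    for _factor in range(k):
        # Deterministic start (an assumption): the residual row of largest norm.
        v = unit(max(R, key=lambda row: sum(x * x for x in row)))
        for _step in range(iters):
            u = unit([sum(R[i][j] * v[j] for j in range(n)) for i in range(p)])
            Rt_u = [sum(R[i][j] * u[i] for i in range(p)) for j in range(n)]
            v = l1_bounded_unit(Rt_u, c_v)
        d = sum(u[i] * R[i][j] * v[j] for i in range(p) for j in range(n))
        for i in range(p):
            for j in range(n):
                R[i][j] -= d * u[i] * v[j]
        factors.append((d, u, v))
    return factors

# Toy data (hypothetical): 4 genes x 6 samples, two sample groups plus background.
X = [
    [5.0, 6.0, 5.0, 0.2, 0.2, 0.2],
    [4.0, 5.0, 4.0, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 3.0, 2.0, 3.0],
    [0.2, 0.2, 0.2, 2.0, 3.0, 2.0],
]
factors = pmd(X, k=2, c_v=1.7)

# Class indicator: each sample goes to the metasample with its largest |factor|.
labels = [max(range(len(factors)), key=lambda m: abs(factors[m][2][j]))
          for j in range(len(X[0]))]
print(labels)
```

In this sketch each `u` plays the role of a metasample and each `v` lists the factors of all samples on it, so the largest absolute factor per sample serves as a simple class indicator. Choosing the bound `c_v` and the number of factors `k` is left open here.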
INDEX TERMS
Tumor clustering, penalized matrix decomposition, metasample, gene expression data, developmental biology.
CITATION
Chun-Hou Zheng, Lei Zhang, Vincent To-Yee Ng, Simon Chi-Keung Shiu, De-Shuang Huang, "Molecular Pattern Discovery Based on Penalized Matrix Decomposition", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 6, pp. 1592-1603, November/December 2011, doi:10.1109/TCBB.2011.79