Issue No.02 - March/April (2012 vol.9)
pp: 560-570
Qinghua Huang , Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou, China
Dacheng Tao , Centre for Quantum Comput. & Intell. Syst., Univ. of Technol., Sydney, NSW, Australia
Xuelong Li , State Key Lab. of Transient Opt. & Photonics, Xi'an Inst. of Opt. & Precision Mech., Xi'an, China
A. Liew , Sch. of Inf. & Commun. Technol., Griffith Univ., Griffith, QLD, Australia
ABSTRACT
The analysis of gene expression data obtained from microarray experiments is important for discovering the biological processes of genes. Biclustering algorithms have been shown to group genes with similar expression patterns under a number of experimental conditions. In this paper, we propose a new biclustering algorithm based on evolutionary learning. By converting the biclustering problem into a common clustering problem, the algorithm can be applied in a search space constructed from the conditions. To further reduce the size of the search space, we randomly separate the full set of conditions into a number of condition subsets (subspaces), each of which has a smaller number of conditions. The algorithm is applied to each subspace and is able to discover bicluster seeds within a limited computing time. Finally, an expanding and merging procedure combines the bicluster seeds into larger biclusters according to a homogeneity criterion. We test the performance of the proposed algorithm using synthetic and real microarray data sets. Compared with several previously developed biclustering algorithms, our algorithm demonstrates a significant improvement in discovering additive biclusters.
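Two ideas in the abstract can be illustrated with a minimal sketch: the random separation of conditions into subspaces, and a homogeneity criterion for additive biclusters. The abstract does not say which criterion the paper uses, so the sketch below assumes the mean squared residue of Cheng and Church purely as an illustration. The function names `split_conditions` and `mean_squared_residue` and the toy data are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence


def split_conditions(n_conditions: int, n_subsets: int, seed: int = 0) -> List[List[int]]:
    """Randomly partition condition indices into n_subsets disjoint subspaces.

    Hypothetical helper: the paper's own splitting rule is not given in the abstract.
    """
    rng = random.Random(seed)
    indices = list(range(n_conditions))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so subset sizes differ by at most one.
    return [sorted(indices[k::n_subsets]) for k in range(n_subsets)]


def mean_squared_residue(data: Sequence[Sequence[float]],
                         rows: Sequence[int],
                         cols: Sequence[int]) -> float:
    """Mean squared residue of the submatrix data[rows][cols].

    Assumed homogeneity score: it is zero for a perfectly additive
    bicluster, where every entry equals a row offset plus a column offset.
    """
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


# Toy additive pattern: entry = gene offset + condition offset.
gene_offsets = [0.0, 2.0, 5.0]
condition_offsets = [1.0, 3.0, 4.0, 8.0]
additive = [[g + c for c in condition_offsets] for g in gene_offsets]

subspaces = split_conditions(n_conditions=4, n_subsets=2, seed=1)
scores = [mean_squared_residue(additive, [0, 1, 2], cols) for cols in subspaces]
```

Because any column subset of an additive bicluster is itself additive, every entry of `scores` is zero up to floating-point error. This is why a seed found in one condition subspace could later be expanded and merged with compatible seeds under a residue threshold.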
INDEX TERMS
Gene expression, Clustering algorithms, Bioinformatics, Search problems, Computational biology, Algorithm design and analysis, Optics, Gene expression data analysis, Biclustering, Genetic learning, Subdimensional search strategy
CITATION
Qinghua Huang, Dacheng Tao, Xuelong Li, A. Liew, "Parallelized Evolutionary Learning for Detection of Biclusters in Gene Expression Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.9, no. 2, pp. 560-570, March/April 2012, doi:10.1109/TCBB.2011.53