IEEE/ACM Transactions on Computational Biology and Bioinformatics


Issue No.01 - January-February (2011 vol.8)

pp: 130-142

Mehmet Tan, Middle East Technical University, Ankara and University of Calgary, Calgary

Mohammed Alshalalfa, University of Calgary, Calgary

Reda Alhajj, University of Calgary, Calgary and Global University, Beirut

Faruk Polat, Middle East Technical University, Ankara

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2009.58

ABSTRACT

Constraint-based structure learning algorithms generally perform well on sparse graphs. Although sparsity is not uncommon, in some domains the underlying graph can have dense regions; gene regulatory networks are one such domain and are the main motivation for the study described in this paper. We propose a new constraint-based algorithm that can both increase the quality of the output and decrease the computational requirements for learning the structure of gene regulatory networks. The algorithm is based on and extends the PC algorithm. Two types of information are derived from the prior knowledge: the probability that each edge exists, and the nodes that appear to depend on a large number of nodes compared to the other nodes in the graph. We also propose a new method based on Gene Ontology for validating gene regulatory networks. We demonstrate the applicability and effectiveness of the proposed algorithms on both synthetic and real data sets.
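As a rough illustration of the idea in the abstract, the following sketch runs a PC-style skeleton search in which a prior edge probability influences the conditional independence tests. It is not the algorithm from the paper: the function names (`pc_skeleton`, `p_value`), the rule that scales the significance level by the prior, the ordering of edges by prior, and the use of the union of both endpoints' neighbours as the conditioning pool are all assumptions made for this example. The tests are Fisher z-tests on partial correlations, a common choice for continuous data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(data, i, j, cond):
    """Partial correlation of i and j given cond, via the recursive formula."""
    if not cond:
        return pearson(data[i], data[j])
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(data, i, j, rest)
    r_ik = partial_corr(data, i, k, rest)
    r_jk = partial_corr(data, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def p_value(data, i, j, cond):
    """Two-sided p-value of the Fisher z-test for zero partial correlation."""
    n = len(data[i])
    r = max(-0.999999, min(0.999999, partial_corr(data, i, j, cond)))
    z = math.atanh(r) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def pc_skeleton(data, prior=None, alpha=0.05, max_order=2):
    """PC-style skeleton search with a hypothetical prior-weighted threshold.

    data:  dict mapping variable name -> list of observations
    prior: dict mapping (a, b) with a < b -> assumed probability of the edge
    """
    prior = prior or {}
    names = sorted(data)
    adj = {v: set(names) - {v} for v in names}  # start from the complete graph
    for order in range(max_order + 1):
        # Assumption: test the least plausible edges first.
        edges = sorted(
            ((a, b) for a in names for b in adj[a] if a < b),
            key=lambda e: prior.get(e, 0.5),
        )
        for a, b in edges:
            # Assumption: a prior of 0.5 leaves alpha unchanged; a higher prior
            # raises the threshold, so the edge is harder to remove.
            alpha_e = 2 * alpha * prior.get((a, b), 0.5)
            # Simplification: condition on subsets of the joint neighbourhood.
            neighbours = sorted((adj[a] | adj[b]) - {a, b})
            for cond in combinations(neighbours, order):
                if p_value(data, a, b, cond) > alpha_e:
                    adj[a].discard(b)
                    adj[b].discard(a)
                    break
    return {(a, b) for a in names for b in adj[a] if a < b}

# Toy chain x -> y -> z built from mutually orthogonal, zero-mean patterns,
# so the sample partial correlation of x and z given y is zero up to rounding.
reps = 8
x = [1, 1, 1, 1, -1, -1, -1, -1] * reps
e1 = [1, 1, -1, -1, 1, 1, -1, -1] * reps
e2 = [1, -1, 1, -1, 1, -1, 1, -1] * reps
y = [a + 0.5 * b for a, b in zip(x, e1)]
z = [a + 0.5 * b for a, b in zip(y, e2)]
data = {"x": x, "y": y, "z": z}

skeleton = pc_skeleton(data, prior={("x", "y"): 0.9})
print(sorted(skeleton))  # [('x', 'y'), ('y', 'z')]
```

On this toy data the search keeps the two direct edges and removes the indirect x-z edge once it conditions on y. Limiting `max_order` keeps the number of tests manageable when some regions of the graph are dense.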

INDEX TERMS

Gene regulatory networks, transcription factors, genes, microarray data, gene ontology, prior knowledge-based learning.

CITATION

Mehmet Tan, Mohammed Alshalalfa, Reda Alhajj, Faruk Polat, "Influence of Prior Knowledge in Constraint-Based Learning of Gene Regulatory Networks",

*IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 8, no. 1, pp. 130-142, January-February 2011, doi:10.1109/TCBB.2009.58