An Information Theoretic Exploratory Method for Learning Patterns of Conditional Gene Coexpression from Microarray Data
Issue No. 01 - January-March (2008 vol. 5)
In this article, we introduce an exploratory framework for learning patterns of conditional co-expression in gene expression data. The main idea behind the proposed approach is to estimate how the information content shared by a set of M nodes in a network (where each node is associated with an expression profile) varies upon conditioning on a set of L conditioning variables (in the simplest case, represented by a separate set of expression profiles). The method is non-parametric and is based on the concept of statistical co-information, which, unlike conventional correlation-based techniques, is not restricted in scope to linear conditional dependency patterns. Moreover, such conditional co-expression relationships can potentially indicate regulatory interactions that do not manifest themselves when only pair-wise relationships are considered. A moment-based approximation of the co-information measure is derived that efficiently gets around the problem of estimating high-dimensional multivariate probability density functions from the data, a task usually not viable due to the intrinsic sample size limitations that characterize expression level measurements. By applying the proposed exploratory method, we analyzed a whole-genome microarray assay of the eukaryote Saccharomyces cerevisiae and were able to learn statistically significant patterns of conditional co-expression. A selection of such interactions that carry a meaningful biological interpretation is discussed.
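To make the notion of conditional dependency concrete, the following is a minimal sketch of co-information for three discrete variables, using the sign convention I(X;Y;Z) = I(X;Y) - I(X;Y|Z). It is not the moment-based approximation derived in the article, which targets continuous expression profiles; it is a plain plug-in estimate on a toy XOR data set, and the function names (`entropy`, `mutual_information`, `co_information`) are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples, cols):
    """Plug-in Shannon entropy (bits) of the selected columns."""
    counts = Counter(tuple(row[c] for c in cols) for row in samples)
    n = len(samples)
    return -sum((k / n) * log2(k / n) for k in counts.values())


def mutual_information(samples, x, y, given=()):
    """I(X;Y | given) expressed through joint entropies."""
    g = tuple(given)
    return (entropy(samples, (x,) + g) + entropy(samples, (y,) + g)
            - entropy(samples, (x, y) + g) - entropy(samples, g))


def co_information(samples, x, y, z):
    """Assumed convention: I(X;Y;Z) = I(X;Y) - I(X;Y|Z)."""
    return (mutual_information(samples, x, y)
            - mutual_information(samples, x, y, given=(z,)))


# Toy data (illustrative only): X and Y are independent bits, Z = X XOR Y.
samples = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

pairwise = mutual_information(samples, 0, 1)                  # X and Y alone
conditional = mutual_information(samples, 0, 1, given=(2,))   # given Z
print(pairwise, conditional, co_information(samples, 0, 1, 2))
```

In this toy case X and Y share no information pair-wise, yet become fully dependent once Z is known, so the co-information is negative. This is the kind of relationship that a purely pair-wise analysis would miss.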
Gene expression data, Statistical analysis, Information theory, Co-information, Entropy
R. Boscolo, V. P. Roychowdhury and J. C. Liao, "An Information Theoretic Exploratory Method for Learning Patterns of Conditional Gene Coexpression from Microarray Data," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 5, no. 1, pp. 15-24, 2007.