Issue No. 02 - March/April (2012 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.62
M. Hopfensitz, Res. Group of Bioinf. & Syst. Biol., Ulm Univ., Ulm, Germany
C. Mussel, Res. Group of Bioinf. & Syst. Biol., Ulm Univ., Ulm, Germany
C. Wawra, Res. Group of Bioinf. & Syst. Biol., Ulm Univ., Ulm, Germany
M. Maucher, Res. Group of Bioinf. & Syst. Biol., Ulm Univ., Ulm, Germany
M. Kuhl, Inst. of Biochem. & Mol. Biol., Ulm Univ., Ulm, Germany
H. Neumann, Inst. of Neural Inf. Process., Ulm Univ., Ulm, Germany
H. A. Kestler, Res. Group of Bioinf. & Syst. Biol., Ulm Univ., Ulm, Germany
Network inference algorithms can assist life scientists in unraveling gene-regulatory systems on a molecular level. In recent years, great attention has been drawn to the reconstruction of Boolean networks from time series. These need to be binarized, as such networks model genes as binary variables (either "expressed" or "not expressed"). Common binarization methods often cluster measurements or separate them according to statistical or information theoretic characteristics and may require many data points to determine a robust threshold. Yet, time series measurements frequently comprise only a small number of samples. To overcome this limitation, we propose a binarization that incorporates measurements at multiple resolutions. We introduce two such binarization approaches which determine thresholds based on limited numbers of samples and additionally provide a measure of threshold validity. Thus, network reconstruction and further analysis can be restricted to genes with meaningful thresholds. This reduces the complexity of network inference. The performance of our binarization algorithms was evaluated in network reconstruction experiments using artificial data as well as real-world yeast expression time series. The new approaches yield considerably improved correct network identification rates compared to other binarization techniques by effectively reducing the number of candidate networks.
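For orientation, the kind of clustering-based baseline the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the multiscale method proposed in the paper: it assumes a single gene's time series given as a list of floats, splits the sorted values into two groups with minimal within-group squared error (an exact one-dimensional two-cluster split), and places the threshold midway between the groups. The function name binarize_two_clusters is hypothetical.

```python
from typing import List, Tuple


def binarize_two_clusters(values: List[float]) -> Tuple[List[int], float]:
    """Binarize one gene's time series with an exact 1D two-cluster split.

    Illustrative baseline only (an assumption, not the paper's algorithm).
    Returns (binary_series, threshold).
    """
    if len(set(values)) < 2:
        raise ValueError("need at least two distinct values to set a threshold")

    ordered = sorted(values)

    def sse(chunk: List[float]) -> float:
        # Sum of squared deviations from the chunk mean.
        mean = sum(chunk) / len(chunk)
        return sum((x - mean) ** 2 for x in chunk)

    best_cost = None
    best_split = None
    for i in range(1, len(ordered)):
        if ordered[i - 1] == ordered[i]:
            continue  # no threshold can separate two equal values
        cost = sse(ordered[:i]) + sse(ordered[i:])
        if best_cost is None or cost < best_cost:
            best_cost, best_split = cost, i

    # Threshold: midpoint between the largest "low" and smallest "high" value.
    threshold = (ordered[best_split - 1] + ordered[best_split]) / 2.0
    binary = [1 if v > threshold else 0 for v in values]
    return binary, threshold


if __name__ == "__main__":
    series = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
    bits, thr = binarize_two_clusters(series)
    print(bits, thr)
```

With only a handful of samples, a split like this always returns some threshold, even when the data show no clear two-level structure. That is the gap a threshold validity measure, as described in the abstract, is meant to address.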
Time series analysis, Time measurement, Approximation error, Gene expression, Complexity theory, Bioinformatics, Computational biology
M. Hopfensitz et al., "Multiscale Binarization of Gene Expression Data for Reconstructing Boolean Networks," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 2, pp. 487-498, 2012.