IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2010, vol. 7, Issue No. 01, January-March
While clustering genes remains one of the most popular exploratory tools for expression data, it often results in highly variable and biologically uninformative clusters. This paper explores a data-fusion approach to clustering microarray data. Our method, which combines expression data with Gene Ontology (GO)-derived information, is applied to a real data set to perform genome-wide clustering. A set of novel tools is proposed to validate the clustering results and to pick a fair value of the infusion coefficient. These tools measure stability, biological relevance, and distance from the expression-only clustering solution. Our results indicate that data-fusion clustering leads to more stable, biologically relevant clusters that are still representative of the experimental data.
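As a rough illustration of what an infusion coefficient could control, the sketch below blends an expression-based dissimilarity matrix with a GO-based one. The abstract does not say how the two sources are combined, so the convex combination used here is an assumption, and the names `fuse_dissimilarities`, `d_expr`, `d_go`, and `lam` are hypothetical. The toy numbers are made up for illustration.

```python
def fuse_dissimilarities(d_expr, d_go, lam):
    """Blend two square dissimilarity matrices of equal size.

    Assumed form (not confirmed by the abstract):
        fused = (1 - lam) * d_expr + lam * d_go
    lam = 0 gives the expression-only matrix, lam = 1 the GO-only matrix.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    n = len(d_expr)
    same_shape = len(d_go) == n and all(
        len(row_e) == n and len(row_g) == n
        for row_e, row_g in zip(d_expr, d_go)
    )
    if not same_shape:
        raise ValueError("matrices must be square and of equal size")
    return [
        [(1.0 - lam) * d_expr[i][j] + lam * d_go[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy dissimilarities for three genes (illustrative values only).
d_expr = [[0.0, 0.2, 0.9], [0.2, 0.0, 0.8], [0.9, 0.8, 0.0]]
d_go = [[0.0, 0.6, 0.1], [0.6, 0.0, 0.7], [0.1, 0.7, 0.0]]

fused = fuse_dissimilarities(d_expr, d_go, 0.5)
```

The fused matrix could then be passed to any clustering routine that accepts precomputed dissimilarities. Repeating this over a grid of `lam` values is one way to compare candidate coefficients against criteria such as those named in the abstract.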
Clustering expression data, Gene Ontology, genomic data fusion, semantic similarity, cluster stability, knowledge-based validation.
Rafal Kustra, Adam Zagdański, "Data-Fusion in Clustering Microarray Data: Balancing Discovery and Interpretability", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.7, no. 1, pp. 50-63, January-March 2010, doi:10.1109/TCBB.2007.70267