Issue No. 01 - January-March (2010 vol. 7)
While clustering genes remains one of the most popular exploratory tools for expression data, it often results in highly variable and biologically uninformative clusters. This paper explores a data-fusion approach to clustering microarray data. Our method, which combines expression data and Gene Ontology (GO)-derived information, is applied to a real data set to perform genome-wide clustering. A set of novel tools is proposed to validate the clustering results and pick a fair value of the infusion coefficient. These tools measure stability, biological relevance, and distance from the expression-only clustering solution. Our results indicate that data-fusion clustering leads to more stable, biologically relevant clusters that are still representative of the experimental data.
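To make the role of the infusion coefficient concrete, here is a minimal sketch of one way such a fusion could work. It is not the authors' implementation: the abstract does not give the fusion formula, so the convex combination of an expression-based and a GO-based dissimilarity matrix, the names `fuse_dissimilarities`, `average_linkage`, and `lam`, and the toy four-gene matrices are all assumptions made for illustration.

```python
def fuse_dissimilarities(d_expr, d_go, lam):
    """Blend two square dissimilarity matrices.

    Assumed form: (1 - lam) * d_expr + lam * d_go, where lam plays the
    role of the infusion coefficient (0 = expression only, 1 = GO only).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    n = len(d_expr)
    return [[(1.0 - lam) * d_expr[i][j] + lam * d_go[i][j] for j in range(n)]
            for i in range(n)]


def average_linkage(dist, k):
    """Naive agglomerative clustering into k clusters (average linkage)."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean pairwise dissimilarity between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge it.
        p, q = min(
            ((p, q) for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: four genes whose expression profiles pair them
# as {0, 1} and {2, 3}, while their GO annotations pair them as {0, 2}
# and {1, 3}.
d_expr = [[0.0, 0.1, 0.9, 0.8],
          [0.1, 0.0, 0.7, 0.9],
          [0.9, 0.7, 0.0, 0.2],
          [0.8, 0.9, 0.2, 0.0]]
d_go = [[0.0, 0.9, 0.1, 0.8],
        [0.9, 0.0, 0.8, 0.2],
        [0.1, 0.8, 0.0, 0.9],
        [0.8, 0.2, 0.9, 0.0]]

expression_only = average_linkage(fuse_dissimilarities(d_expr, d_go, 0.0), 2)
go_only = average_linkage(fuse_dissimilarities(d_expr, d_go, 1.0), 2)
```

In this toy setting the two extremes of `lam` recover the expression-driven and the GO-driven partitions respectively. Intermediate values trade one off against the other, which is the balance that the validation tools described in the abstract are meant to assess.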
Clustering expression data, Gene Ontology, genomic data fusion, semantic similarity, cluster stability, knowledge-based validation.
A. Zagdański and R. Kustra, "Data-Fusion in Clustering Microarray Data: Balancing Discovery and Interpretability," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 1, pp. 50-63, 2010.