A Biologically Inspired Validity Measure for Comparison of Clustering Methods over Metabolic Data Sets
Issue No. 03 - May-June (2012 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.10
L. Kamenetzky, Partner Group, Max-Planck Inst. for Mol. Plant Physiol., Castelar, Argentina
D. H. Milone, Res. Center for Signals, Syst. & Comput. Intell., FICH-UNL, Santa Fe, Argentina
G. Stegmayer, Center for R&D of Inf. Syst. (CIDISI), UTN-FRSF, Santa Fe, Argentina
M. G. Lopez, Partner Group, Max-Planck Inst. for Mol. Plant Physiol., Castelar, Argentina
F. Carrari, Partner Group, Max-Planck Inst. for Mol. Plant Physiol., Castelar, Argentina
In the biological domain, clustering is based on the assumption that genes or metabolites involved in a common biological process are coexpressed/coaccumulated under the control of the same regulatory network. Thus, a detailed inspection of the grouped patterns to verify their membership in well-known metabolic pathways could be very useful for evaluating clusters from a biological perspective. The aim of this work is to propose a novel approach for comparing clustering methods over metabolic data sets that incorporates prior biological knowledge about the relations among the elements that constitute the clusters. A way of measuring the biological significance of clustering solutions is proposed. It is based on how useful the clusters are for identifying patterns that change in coordination and belong to common pathways of metabolic regulation. The measure compactly summarizes the objective analysis of clustering methods with respect to coherence and cluster distribution. It also evaluates the internal biological connections of the clusters by considering common pathways. The proposed measure was tested on two biological databases using three clustering methods.
statistical analysis, biochemistry, biology computing, genetics, molecular biophysics, biological internal connections, biologically inspired validity, clustering methods, metabolic data sets, genes, metabolites, regulatory network, metabolic regulation, coherence, clusters distribution, clustering algorithms, couplings, bioinformatics, biological processes, metabolic pathways, clustering, validation measure, biological assessment
L. Kamenetzky, D. H. Milone, G. Stegmayer, M. G. Lopez and F. Carrari, "A Biologically Inspired Validity Measure for Comparison of Clustering Methods over Metabolic Data Sets," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 706-716, 2012.