Automated Hierarchical Density Shaving: A Robust Automated Clustering and Visualization Framework for Large Biological Data Sets
Issue No. 02 - April-June (2010 vol. 7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.32
Gunjan Gupta, University of Texas at Austin
Alexander Liu, University of Texas at Austin
Joydeep Ghosh, University of Texas at Austin
A key application of clustering data obtained from sources such as microarrays, protein mass spectroscopy, and phylogenetic profiles is the detection of functionally related genes. Typically, only a small number of functionally related genes cluster into one or more groups, and the rest need to be ignored. For such situations, we present Automated Hierarchical Density Shaving (Auto-HDS), a framework that consists of a fast hierarchical density-based clustering algorithm and an unsupervised model selection strategy. Auto-HDS can automatically select clusters of different densities, present them in a compact hierarchy, and rank individual clusters using an innovative stability criterion. Our framework also provides a simple yet powerful 2D visualization of the hierarchy of clusters that is useful for further interactive exploration. We present results on Gasch and Lee microarray data sets to show the effectiveness of our methods. Additional results on other biological data are included in the supplemental material.
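The general idea of clustering only the dense part of a data set and ignoring the rest can be illustrated with a minimal sketch. This is not the Auto-HDS algorithm described in the paper; it is a simplified, single-level illustration under stated assumptions: density is taken to be the number of points within a fixed radius, the least dense points are "shaved" (labeled -1), and the remaining points are grouped by connectivity. The function name `density_shave` and the parameters `radius` and `keep_fraction` are hypothetical and chosen for illustration only.

```python
import math
from typing import List, Sequence


def density_shave(points: Sequence[Sequence[float]],
                  radius: float,
                  keep_fraction: float) -> List[int]:
    """Illustrative single-level density shaving (not the paper's method).

    Returns one label per point: 0, 1, ... for clusters, -1 for shaved points.
    """
    n = len(points)
    # Neighborhood of each point: all points within `radius` (itself included).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= radius]
        for i in range(n)
    ]
    density = [len(nb) for nb in neighbors]

    # Keep the densest fraction of points; ties are broken by index.
    n_keep = max(1, int(round(keep_fraction * n)))
    order = sorted(range(n), key=lambda i: (-density[i], i))
    kept = set(order[:n_keep])

    # Group the kept points into connected components of the radius graph.
    labels = [-1] * n
    next_label = 0
    for start in sorted(kept):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                if j in kept and labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Two tight groups plus two isolated points that should be ignored.
example_points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (2.5, 9.0), (9.0, 2.5),
]
example_labels = density_shave(example_points, radius=0.5, keep_fraction=0.8)
```

Repeating such a step with progressively smaller values of `keep_fraction` would give nested sets of dense points, which is one rough way to picture a hierarchy of clusters at different densities; how Auto-HDS actually builds, compacts, and ranks its hierarchy is described in the paper itself.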
Mining methods and algorithms, data and knowledge visualization, clustering, bioinformatics.
J. Ghosh, A. Liu and G. Gupta, "Automated Hierarchical Density Shaving: A Robust Automated Clustering and Visualization Framework for Large Biological Data Sets," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 2, pp. 223-237, 2008.