Scientific and Statistical Database Management, International Conference on (2006)
July 3, 2006 to July 5, 2006
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SSDBM.2006.35
Elke Achtert, University of Munich, Germany
Christian Bohm, University of Munich, Germany
Peer Kroger, University of Munich, Germany
Arthur Zimek, University of Munich, Germany
Detecting correlations between features in high-dimensional data sets is an important data mining task. These correlations can be arbitrarily complex: one or more features may be correlated with several other features, and both the noise features and the actual dependencies may differ from cluster to cluster. Each cluster therefore contains points located on a common hyperplane of arbitrary dimensionality in the data space, and thus defines a separate, arbitrarily oriented subspace of the original data space. The few recently proposed algorithms designed to uncover these correlation clusters have several disadvantages. In particular, they cannot detect correlation clusters of different dimensionality that are nested within each other. The complete hierarchical structure of correlation clusters of varying dimensionality can only be detected by a hierarchical clustering approach. We therefore propose the algorithm HiCO (Hierarchical Correlation Ordering), the first hierarchical approach to correlation clustering. The algorithm determines the cluster hierarchy and visualizes it using correlation diagrams. Several comparative experiments on synthetic and real data sets demonstrate the performance and effectiveness of HiCO.
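To make the notion of points "located on a common hyperplane of arbitrary dimensionality" concrete, the following is a minimal sketch, not the HiCO algorithm itself. It uses a generic PCA-based estimate: the dimensionality of a point set is taken to be the smallest number of principal components that explain a chosen share of the total variance. The function name `correlation_dimensionality`, the threshold `alpha=0.85`, and the synthetic data are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def correlation_dimensionality(points, alpha=0.85):
    """Smallest number of principal components that explain at least
    `alpha` of the total variance of `points` (rows are observations).
    Hypothetical helper for illustration only."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigenvalues = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)
    total = eigenvalues.sum()
    if total == 0.0:
        return 0  # all points coincide
    explained = np.cumsum(eigenvalues) / total
    # Index of the first cumulative share that reaches alpha, plus one.
    return int(np.searchsorted(explained, alpha) + 1)


rng = np.random.default_rng(0)
n = 500
noise_scale = 0.01  # small jitter so points lie near, not exactly on, the hyperplane

# Cluster A: points near a 1-dimensional line in 3-dimensional space.
t = rng.uniform(-1.0, 1.0, size=n)
line = np.column_stack([t, 2.0 * t, -t]) + rng.normal(scale=noise_scale, size=(n, 3))

# Cluster B: points near a 2-dimensional plane (z = x + y) in the same space.
u = rng.uniform(-1.0, 1.0, size=n)
v = rng.uniform(-1.0, 1.0, size=n)
plane = np.column_stack([u, v, u + v]) + rng.normal(scale=noise_scale, size=(n, 3))

print("line cluster dimensionality: ", correlation_dimensionality(line))
print("plane cluster dimensionality:", correlation_dimensionality(plane))
```

The two synthetic clusters live in the same three-dimensional space but have different dimensionalities, which is the situation the abstract describes. A per-cluster estimate of this kind says nothing about how clusters nest within each other; recovering that hierarchy is the problem the paper addresses.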
C. Bohm, P. Kroger, E. Achtert and A. Zimek, "Mining Hierarchies of Correlation Clusters," 18th International Conference on Scientific and Statistical Database Management (SSDBM), Vienna, 2006, pp. 119-128.