Issue No.12 - Dec. (2011 vol.17)
pp: 2581-2590
David Gotz , IBM T.J. Watson Research Center
Jimeng Sun , IBM T.J. Watson Research Center
Huamin Qu , Hong Kong University of Science and Technology
Clustering, a fundamental data analysis technique, is widely used in analytic applications. However, users often find it difficult to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, users often need high-level statistical information about the clusters to evaluate cluster quality, while a detailed display of the data's multidimensional attributes is necessary to understand what the clusters mean. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a treemap-like icon to represent a multidimensional cluster, so that the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm that generates similar icons for similar clusters, making clusters easier to compare. User interaction and clutter reduction are integrated into the system to help users analyze and refine clustering results for large datasets more effectively. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis.
Visual Analysis, Clustering, Information Visualization.
David Gotz, Jimeng Sun, Huamin Qu, "DICON: Interactive Visual Analysis of Multidimensional Clusters", IEEE Transactions on Visualization & Computer Graphics, vol. 17, no. 12, pp. 2581-2590, Dec. 2011, doi:10.1109/TVCG.2011.188