Issue No.06 - November/December (2010 vol.16)
pp: 1053-1062
Tuan Pham, Oregon State University
Rob Hess, Oregon State University
Crystal Ju, Oregon State University
Eugene Zhang, Oregon State University
Ronald Metoyer, Oregon State University
Understanding the diversity of a set of multivariate objects is an important problem in many domains, including ecology, college admissions, investing, machine learning, and others. However, to date, very little work has been done to help users achieve this kind of understanding. Visual representation is especially appealing for this task because it offers the potential to allow users to efficiently observe the objects of interest in a direct and holistic way. Thus, in this paper, we attempt to formalize the problem of visualizing the diversity of a large (more than 1000 objects), multivariate (more than 5 attributes) data set as one worth deeper investigation by the information visualization community. In doing so, we contribute a precise definition of diversity, a set of requirements for diversity visualizations based on this definition, and a formal user study design intended to evaluate the capacity of a visual representation for communicating diversity information. Our primary contribution, however, is a visual representation, called the Diversity Map, for visualizing diversity. An evaluation of the Diversity Map using our study design shows that users can judge elements of diversity consistently and as accurately as, or more accurately than, when using the only other representation specifically designed to visualize diversity.
information visualization, diversity, categorical data, multivariate data, evaluation
Tuan Pham, Rob Hess, Crystal Ju, Eugene Zhang, Ronald Metoyer, "Visualization of Diversity in Large Multivariate Data Sets", IEEE Transactions on Visualization & Computer Graphics, vol. 16, no. 6, pp. 1053-1062, November/December 2010, doi:10.1109/TVCG.2010.216