Parallel Sets: Interactive Exploration and Visual Analysis of Categorical Data
July/August 2006 (vol. 12 no. 4)
pp. 558-568
Robert Kosara, IEEE Computer Society
Helwig Hauser, IEEE Computer Society

Abstract—Categorical data dimensions appear in many real-world data sets, but few visualization methods exist that properly deal with them. Parallel Sets are a new method for the visualization and interactive exploration of categorical data that shows data frequencies instead of the individual data points. The method is based on the axis layout of parallel coordinates, with boxes representing the categories and parallelograms between the axes showing the relations between categories. In addition to the visual representation, we designed a rich set of interactions. Parallel Sets allow the user to interactively remap the data to new categorizations and, thus, to consider more data dimensions during exploration and analysis than usually possible. At the same time, a metalevel, semantic representation of the data is built. Common procedures, like building the cross product of two or more dimensions, can be performed automatically, thus complementing the interactive visualization. We demonstrate Parallel Sets by analyzing a large CRM data set, as well as investigating housing data from two US states.
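
The frequency-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each data item is a record of categorical values and shows three things: the per-category counts that would size the boxes on an axis, the joint counts that would size the parallelograms between two adjacent axes, and a cross product of two dimensions. The toy records and the names `category_counts`, `ribbon_counts`, and `cross_product` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records (not taken from the paper): one dict per data item.
records = [
    {"class": "first", "sex": "female", "survived": "yes"},
    {"class": "first", "sex": "male", "survived": "no"},
    {"class": "third", "sex": "female", "survived": "no"},
    {"class": "third", "sex": "male", "survived": "no"},
    {"class": "third", "sex": "male", "survived": "yes"},
    {"class": "first", "sex": "female", "survived": "yes"},
]


def category_counts(rows, dim):
    """Count how often each category of one dimension occurs (box sizes)."""
    return Counter(row[dim] for row in rows)


def ribbon_counts(rows, dim_a, dim_b):
    """Count each pair of categories on two axes (parallelogram sizes)."""
    return Counter((row[dim_a], row[dim_b]) for row in rows)


def cross_product(rows, dim_a, dim_b, new_dim):
    """Return copies of the rows with a new dimension combining two others."""
    return [{**row, new_dim: f"{row[dim_a]}/{row[dim_b]}"} for row in rows]


if __name__ == "__main__":
    print(category_counts(records, "class"))
    print(ribbon_counts(records, "class", "sex"))
    combined = cross_product(records, "class", "sex", "class_sex")
    print(category_counts(combined, "class_sex"))
```

Because only counts are kept, the size of these tables depends on the number of categories rather than on the number of data items, which is what makes a frequency-based display practical for large data sets.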

Index Terms:
Information visualization, interaction, nominal data, categorical data, multivariate data.
Citation:
Robert Kosara, Fabian Bendix, Helwig Hauser, "Parallel Sets: Interactive Exploration and Visual Analysis of Categorical Data," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 4, pp. 558-568, July-Aug. 2006, doi:10.1109/TVCG.2006.76