Issue No.06 - November/December (2010 vol.16)
pp: 1027-1035
ABSTRACT
When analyzing multidimensional, quantitative data, the comparison of two or more groups of dimensions is a common task. Typical sources of such data are experiments in biology, physics or engineering, which are conducted in different configurations and use replicates to ensure statistically significant results. One common way to analyze this data is to filter it using statistical methods and then run clustering algorithms to group similar values. The clustering results can be visualized using heat maps, which show differences between groups as changes in color. However, in cases where groups of dimensions have an a priori meaning, it is not desirable to cluster all dimensions combined, since a clustering algorithm can fragment continuous blocks of records. Furthermore, identifying relevant elements in heat maps becomes more difficult as the number of dimensions increases. To aid in such situations, we have developed Matchmaker, a visualization technique that allows researchers to arbitrarily arrange and compare multiple groups of dimensions at the same time. We create separate groups of dimensions which can be clustered individually, and place them in an arrangement of heat maps reminiscent of parallel coordinates. To identify relations, we render bundled curves and ribbons between related records in different groups. We then allow interactive drill-downs using enlarged detail views of the data, which enable in-depth comparisons of clusters between groups. To reduce visual clutter, we minimize crossings between the views. This paper concludes with two case studies. The first demonstrates the value of our technique for the comparison of clustering algorithms. In the second, biologists use our system to investigate why certain strains of mice develop liver disease while others remain healthy, informally showing the efficacy of our system when analyzing multidimensional data containing distinct groups of dimensions.
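The core idea of relating separately clustered groups and reducing crossings between them can be illustrated with a minimal sketch. It assumes each group has already been clustered and is given as a mapping from record ID to cluster label. The function names, the sample data, and the barycenter ordering heuristic are illustrative assumptions, not the method used in Matchmaker, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations

def ribbon_widths(left, right):
    """Count the records shared by each (left cluster, right cluster) pair."""
    return Counter((left[r], right[r]) for r in left if r in right)

def count_crossings(left_order, right_order, ribbons):
    """Count pairs of ribbons that cross for the given top-to-bottom cluster orders."""
    lpos = {c: i for i, c in enumerate(left_order)}
    rpos = {c: i for i, c in enumerate(right_order)}
    crossings = 0
    for (a, b), (c, d) in combinations(ribbons, 2):
        # Two ribbons cross when their endpoints are ordered oppositely on the two sides.
        if (lpos[a] - lpos[c]) * (rpos[b] - rpos[d]) < 0:
            crossings += 1
    return crossings

def barycenter_order(left_order, right_clusters, ribbons):
    """Order right-hand clusters by the weighted mean position of their left partners.

    This is a generic heuristic chosen for illustration only.
    """
    lpos = {c: i for i, c in enumerate(left_order)}

    def barycenter(rc):
        pairs = [(lpos[l], w) for (l, r), w in ribbons.items() if r == rc]
        total = sum(w for _, w in pairs)
        if total == 0:
            return float("inf")  # unconnected clusters go last
        return sum(p * w for p, w in pairs) / total

    return sorted(right_clusters, key=barycenter)

# Hypothetical data: five records clustered independently in two groups of dimensions.
left = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "C"}
right = {"g1": "Y", "g2": "Y", "g3": "X", "g4": "Z", "g5": "Z"}

ribbons = ribbon_widths(left, right)
left_order = ["A", "B", "C"]
before = count_crossings(left_order, ["X", "Y", "Z"], ribbons)
new_order = barycenter_order(left_order, ["X", "Y", "Z"], ribbons)
after = count_crossings(left_order, new_order, ribbons)
print(before, new_order, after)
```

Each ribbon width is the number of records two clusters have in common, and reordering the clusters of one group changes how many ribbons cross without changing the clustering itself.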
INDEX TERMS
statistical analysis, data analysis, data visualisation, pattern clustering, multidimensional data, comparative analysis, multidimensional quantitative data, biology, physics, engineering, statistical method, clustering algorithm, heat maps, Matchmaker, visualization technique, interactive drill-downs, visual clutter, liver disease, image color analysis, heating, biological cells, joining processes, bioinformatics visualization, cluster comparison
CITATION
A. Lex, M. Streit, C. Partl, K. Kashofer, D. Schmalstieg, "Comparative Analysis of Multidimensional, Quantitative Data", IEEE Transactions on Visualization & Computer Graphics, vol. 16, no. 6, pp. 1027-1035, November/December 2010, doi:10.1109/TVCG.2010.138