Issue No.12 - Dec. (2012 vol.18)
pp: 2621-2630
Cagatay Turkay, Department of Informatics, University of Bergen
Arvid Lundervold, Department of Biomedicine, University of Bergen
Astri Johansen Lundervold, Department of Biological and Medical Psychology, University of Bergen
Helwig Hauser, Department of Informatics, University of Bergen
ABSTRACT
Datasets with a large number of dimensions per data item (hundreds or more) are challenging for both computational and visual analysis. Moreover, these dimensions have different characteristics and relations that result in sub-groups and/or hierarchies over the set of dimensions. Such structures lead to heterogeneity within the dimensions. Although considering these structures is crucial for the analysis, most available analysis methods disregard the heterogeneous relations among the dimensions. In this paper, we introduce the construction and use of representative factors for the interactive visual analysis of structures in high-dimensional datasets. First, we present a selection of methods to investigate the sub-groups in the dimension set and to associate representative factors with those groups of dimensions. Second, we show how these factors are included in the interactive visual analysis cycle together with the original dimensions. We then provide the steps of an analytical procedure that iteratively analyzes the datasets through the use of representative factors. We discuss how our methods improve the reliability and interpretability of the analysis process by enabling more informed selections of computational tools. Finally, we demonstrate our techniques on the analysis of results from a brain imaging study performed over a large group of subjects.
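To make the idea of a representative factor concrete, the sketch below groups correlated dimensions and summarizes each group by a single derived column. It is an illustration under stated assumptions, not the authors' implementation: the greedy correlation-threshold grouping, the choice of the first principal component as the factor, the threshold value, and the function names `correlation_groups` and `representative_factor` are all hypothetical choices made for this example.

```python
import numpy as np


def correlation_groups(X, threshold=0.7):
    """Greedily group dimensions (columns of X) by absolute correlation.

    Assumption for illustration: a dimension joins the group of the first
    unassigned "seed" dimension if |corr(seed, dim)| >= threshold.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = list(range(corr.shape[0]))
    groups = []
    while unassigned:
        seed = unassigned[0]
        # The seed always belongs to its own group, even if corr is NaN.
        members = [seed] + [j for j in unassigned[1:] if corr[seed, j] >= threshold]
        groups.append(members)
        unassigned = [j for j in unassigned if j not in members]
    return groups


def representative_factor(X, group):
    """Summarize the columns in `group` by their first principal component.

    Returns the per-item factor scores and the fraction of the group's
    variance that this single factor explains.
    """
    sub = X[:, group]
    std = sub.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (sub - sub.mean(axis=0)) / std
    # Rows of Vt are the principal axes of the standardized sub-matrix.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[0]
    explained = S[0] ** 2 / np.sum(S ** 2)
    return scores, explained


# Synthetic example: five dimensions driven by two hidden variables.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack(
    [a + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [b + 0.1 * rng.normal(size=200) for _ in range(2)]
)

groups = correlation_groups(X)
factors = [representative_factor(X, g) for g in groups]
for g, (scores, explained) in zip(groups, factors):
    print(g, round(float(explained), 3))
```

In a workflow of the kind the abstract describes, such factor columns would be shown alongside the original dimensions, and the explained-variance value gives one rough indication of how well a single factor stands in for its group.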
INDEX TERMS
Correlation, Data visualization, Principal component analysis, Gaussian distribution, Reliability, Data mining, High-dimensional data analysis, Interactive visual analysis
CITATION
Cagatay Turkay, Arvid Lundervold, Astri Johansen Lundervold, Helwig Hauser, "Representative Factor Generation for the Interactive Visual Analysis of High-Dimensional Data", IEEE Transactions on Visualization & Computer Graphics, vol.18, no. 12, pp. 2621-2630, Dec. 2012, doi:10.1109/TVCG.2012.256