Issue No. 10 - Oct. 2013 (vol. 19)

pp: 1768-1781

Yu-Hsuan Chan, Dept. of Comput. Sci., Univ. of California at Davis, Davis, CA, USA

C. D. Correa, Dept. of Comput. Sci., Univ. of California at Davis, Davis, CA, USA

Kwan-Liu Ma, Dept. of Comput. Sci., Univ. of California at Davis, Davis, CA, USA

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2013.20

ABSTRACT

Scatterplots remain a powerful tool for visualizing multidimensional data. However, accurately understanding the shape of multidimensional points from 2D projections remains challenging because of overlap. Consequently, many variations on the scatterplot as a visual metaphor have been proposed to address this limitation. An important aspect often overlooked in scatterplots is sensitivity, or local trend, which may help identify the type of relationship between two variables. However, it is not well understood which factors influence the perception of trends in 2D scatterplots, or how they do so. To shed light on this aspect, we conducted an experiment in which we asked people to draw the perceived trends directly on a 2D scatterplot. We found that augmenting scatterplots with local sensitivity helps to fill the gaps in visual perception while retaining the simplicity and readability of a 2D scatterplot. We call this augmentation the generalized sensitivity scatterplot (GSS). In a GSS, sensitivity coefficients are depicted as flow lines, which give a sense of the continuity and orientation of the data and provide cues about how the data points are scattered in a higher-dimensional space. We introduce a series of glyphs and operations that facilitate the analysis of multidimensional data sets using GSS, and validate them on a number of well-known data sets for both regression and classification tasks.
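To make the idea of a local sensitivity coefficient concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the sensitivity at a point can be approximated by the slope of a Gaussian-weighted linear fit around that point, and that a flow line can be approximated by a short segment oriented along that slope. The function names `local_sensitivity` and `flow_segments`, the Gaussian kernel, and the `bandwidth` and `length` parameters are all illustrative assumptions that do not appear in the abstract.

```python
import math

def local_sensitivity(xs, ys, bandwidth=0.5):
    """Estimate dy/dx at every sample with a Gaussian-weighted linear fit.

    Hypothetical helper: the kernel and bandwidth are illustrative
    assumptions, not choices stated in the abstract.
    """
    slopes = []
    for x0 in xs:
        # Weight every sample by its distance to the current point.
        w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        total = sum(w)
        # Weighted means of x and y around x0.
        xm = sum(wi * x for wi, x in zip(w, xs)) / total
        ym = sum(wi * y for wi, y in zip(w, ys)) / total
        # Weighted least-squares slope = weighted cov(x, y) / weighted var(x).
        var = sum(wi * (x - xm) ** 2 for wi, x in zip(w, xs))
        cov = sum(wi * (x - xm) * (y - ym) for wi, x, y in zip(w, xs, ys))
        slopes.append(cov / var if var > 0 else 0.0)
    return slopes

def flow_segments(xs, ys, slopes, length=0.2):
    """Return one ((x0, y0), (x1, y1)) segment per point, oriented along its slope."""
    segments = []
    for x, y, s in zip(xs, ys, slopes):
        # Direction (1, s), normalised so every segment has the same length.
        norm = math.sqrt(1.0 + s * s)
        dx = 0.5 * length / norm
        dy = 0.5 * length * s / norm
        segments.append(((x - dx, y - dy), (x + dx, y + dy)))
    return segments

# Usage: a quadratic trend, whose local slope grows with x.
xs = [i / 10 for i in range(21)]
ys = [x * x for x in xs]
slopes = local_sensitivity(xs, ys, bandwidth=0.3)
segments = flow_segments(xs, ys, slopes)
```

Drawing each segment through its data point gives a rough impression of local orientation. Under this assumption, a smaller `bandwidth` follows local changes more closely but is more sensitive to noise.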

INDEX TERMS

Market research, Data visualization, Sensitivity analysis, Noise, Image color analysis, Interpolation, multidimensional data visualization, data transformations, model fitting

CITATION

Yu-Hsuan Chan, C. D. Correa, Kwan-Liu Ma, "The Generalized Sensitivity Scatterplot",

*IEEE Transactions on Visualization & Computer Graphics*, vol. 19, no. 10, pp. 1768-1781, Oct. 2013, doi:10.1109/TVCG.2013.20