High-Dimensional Visual Analytics: Interactive Exploration Guided by Pairwise Views of Point Distributions
November/December 2006 (vol. 12 no. 6)
pp. 1363-1372
Abstract—We introduce a method for organizing multivariate displays and for guiding interactive exploration through high-dimensional data. The method is based on nine characterizations of the 2D distributions of orthogonal pairwise projections on a set of points in multidimensional Euclidean space. These characterizations include such measures as density, skewness, shape, outliers, and texture. Statistical analysis of these measures leads to ways for 1) organizing 2D scatterplots of points for coherent viewing, 2) locating unusual (outlying) marginal 2D distributions of points for anomaly detection, and 3) sorting multivariate displays based on high-dimensional data, such as trees, parallel coordinates, and glyphs.
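As a rough illustration of the workflow the abstract describes (characterize every pairwise 2D projection, then analyze those characterizations to find unusual scatterplots), here is a minimal sketch. It is not the authors' implementation: the three measures below (absolute correlation, a quantile-based skewness, and a Tukey-fence outlier fraction) are simple stand-ins chosen for this example rather than the paper's nine characterizations, and the function names `pair_measures`, `scan_pairs`, and `rank_unusual` are hypothetical. Columns are assumed to be numeric and non-constant.

```python
import itertools
import numpy as np


def pair_measures(x, y):
    """Illustrative measures for one 2D scatterplot (stand-ins, not the paper's nine)."""
    # Rescale both axes to [0, 1] so measures are comparable across pairs.
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    # Stand-in 1: strength of linear association.
    corr = abs(np.corrcoef(x, y)[0, 1])
    # Distances of the points from the centroid of the rescaled plot.
    d = np.hypot(x - x.mean(), y - y.mean())
    # Stand-in 2: quantile-based skewness of those distances.
    q10, q50, q90 = np.percentile(d, [10, 50, 90])
    skew = (q90 - q50) / (q90 - q10) if q90 > q10 else 0.0
    # Stand-in 3: fraction of points beyond the upper Tukey fence.
    q25, q75 = np.percentile(d, [25, 75])
    outlying = float(np.mean(d > q75 + 1.5 * (q75 - q25)))
    return np.array([corr, skew, outlying])


def scan_pairs(data):
    """Compute the measure vector for every pair of columns."""
    pairs = list(itertools.combinations(range(data.shape[1]), 2))
    feats = np.array([pair_measures(data[:, i], data[:, j]) for i, j in pairs])
    return pairs, feats


def rank_unusual(pairs, feats):
    """Rank scatterplots by robust distance from the typical measure vector."""
    med = np.median(feats, axis=0)
    mad = np.median(np.abs(feats - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # avoid division by zero
    score = np.linalg.norm((feats - med) / mad, axis=1)
    order = np.argsort(-score)
    return [pairs[k] for k in order], score[order]


# Synthetic demo data (an assumption for illustration): five columns,
# where column 4 is nearly a copy of column 0 and the rest are independent.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 5))
data[:, 4] = data[:, 0] + 0.05 * rng.normal(size=1000)

pairs, feats = scan_pairs(data)
ranked, scores = rank_unusual(pairs, feats)
print(ranked[:3])
```

The matrix `feats` has one row per scatterplot, so the same rows could also serve as sort keys for ordering panels in a scatterplot matrix. The choice of median and MAD for the ranking step is likewise an assumption of this sketch.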
Index Terms:
Visualization, statistical graphics.
Citation:
Leland Wilkinson, Anushka Anand, Robert Grossman, "High-Dimensional Visual Analytics: Interactive Exploration Guided by Pairwise Views of Point Distributions," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 6, pp. 1363-1372, Nov./Dec. 2006, doi:10.1109/TVCG.2006.94