Visualization of High-Dimensional Point Clouds Using Their Density Distribution's Topology
November 2011 (vol. 17 no. 11)
pp. 1547-1559
Patrick Oesterling, Universität Leipzig, Leipzig
Christian Heine, Universität Leipzig, Leipzig
Heike Jänicke, Heidelberg University, Heidelberg
Gerik Scheuermann, Universität Leipzig, Leipzig
Gerhard Heyer, Universität Leipzig, Leipzig
We present a novel method to visualize multidimensional point clouds. While conventional visualization techniques, like scatterplot matrices or parallel coordinates, have issues with either overplotting of entities or handling many dimensions, we abstract the data using topological methods before presenting it. We assume the input points to be samples of a random variable with a high-dimensional probability distribution, which we approximate using kernel density estimates on a suitably reconstructed mesh. From the resulting scalar field we extract the join tree and present it as a topological landscape, a visualization metaphor that utilizes the human capability of understanding natural terrains. In this landscape, dense clusters of points show up as hills, and the nesting of hills indicates the nesting of clusters. We augment the landscape with the data points to allow selection and inspection of single points and point sets. We also present optimizations that make our algorithm applicable to large data sets and allow interactive adaptation of the visualization to the kernel window width used in the density estimation.
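
The pipeline described in the abstract (density estimate on a neighborhood structure, then join tree extraction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian kernel, the symmetric k-nearest-neighbor graph used in place of the paper's reconstructed mesh, and the names `gaussian_kde`, `knn_graph`, and `join_tree` are all assumptions made for illustration. The sketch stops at the tree's maxima, saddles, and arcs, omits the arc to the global minimum, and does not render a landscape.

```python
import math
import random


def gaussian_kde(points, h):
    """Unnormalized Gaussian kernel density estimate at each sample point.

    h is the kernel window width (assumed isotropic).
    """
    dens = []
    for p in points:
        total = 0.0
        for q in points:
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-d2 / (2.0 * h * h))
        dens.append(total / len(points))
    return dens


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph (assumed stand-in for the mesh)."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def join_tree(dens, adj):
    """Sweep vertices by decreasing density and track merging components.

    Returns (maxima, saddles, arcs); each arc is (upper critical vertex, saddle).
    """
    parent = {}  # union-find forest over already swept vertices
    head = {}    # component root -> most recent critical vertex on its branch

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    maxima, saddles, arcs = [], [], []
    for v in sorted(range(len(dens)), key=lambda i: -dens[i]):
        roots = {find(u) for u in adj[v] if u in parent}
        parent[v] = v
        if not roots:               # no denser neighbor: a new hill starts here
            maxima.append(v)
            head[v] = v
        elif len(roots) == 1:       # regular vertex: extends an existing hill
            parent[v] = roots.pop()
        else:                       # saddle: two or more hills merge at v
            saddles.append(v)
            for r in roots:
                arcs.append((head[r], v))
                parent[r] = v
            head[v] = v
    return maxima, saddles, arcs


if __name__ == "__main__":
    random.seed(0)
    # Two synthetic 5-D Gaussian blobs (illustrative data, not from the paper).
    def blob(center):
        return [tuple(random.gauss(center, 0.5) for _ in range(5))
                for _ in range(40)]
    pts = blob(0.0) + blob(4.0)
    dens = gaussian_kde(pts, h=1.0)
    maxima, saddles, arcs = join_tree(dens, knn_graph(pts, k=8))
    print(len(maxima), "maxima,", len(saddles), "saddles")
```

In this sketch each maximum corresponds to a hill (a dense cluster) and each saddle marks the density level at which hills merge, which is the nesting the landscape is meant to show. A larger window width `h` smooths the density and tends to yield fewer maxima.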

Index Terms:
Clustering, pattern analysis, point clouds, graphs, topology.
Patrick Oesterling, Christian Heine, Heike Jänicke, Gerik Scheuermann, Gerhard Heyer, "Visualization of High-Dimensional Point Clouds Using Their Density Distribution's Topology," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 11, pp. 1547-1559, Nov. 2011, doi:10.1109/TVCG.2011.27