Visual Exploration of Large Relational Data Sets through 3D Projections and Footprint Splatting
November/December 2003 (vol. 15 no. 6)
pp. 1460-1471
Li Yang, IEEE

Abstract: This paper discusses 3D visualization and interactive exploration of large relational data sets for visual data mining and exploratory data analysis, achieved by integrating several well-chosen multidimensional data visualization techniques. The basic idea is to combine the techniques of grand tour, direct volume rendering, and data aggregation in databases to deal with both the high dimensionality of the data and the large number of relational records. Each technique has been enhanced or modified for this application. Specifically, the positions of data clusters are used to decide the path of a grand tour. This cluster-guided tour produces intercluster-distance-preserving projections in which data clusters are displayed as far apart as possible. A tetrahedral mapping method applied to cluster centroids helps in choosing interesting cluster-guided projections. Multidimensional footprint splatting is used to directly render large relational data sets. This approach abandons the rendering techniques that enhance 3D realism and focuses instead on efficiently producing real-time explanatory images that give comprehensive insight into global features such as data clusters and holes. Examples are given in which the techniques are applied to large relational data sets (more than a million records).
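
The idea of an intercluster-distance-preserving 3D projection can be illustrated with a minimal sketch. This is not the paper's algorithm; it only assumes that four cluster centroids are given and relies on a standard geometric fact: an orthogonal projection onto the 3D subspace spanned by the edges between four centroids keeps every centroid-to-centroid distance unchanged. The centroid values and all function names below are hypothetical, chosen for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def orthonormal_basis(centroids):
    """Gram-Schmidt on the edge vectors from the first centroid to the others."""
    origin = centroids[0]
    basis = []
    for c in centroids[1:]:
        v = sub(c, origin)
        for b in basis:
            coeff = dot(v, b)
            v = [x - coeff * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-12:  # skip a centroid that adds no new direction
            basis.append([x / norm for x in v])
    return origin, basis

def project(point, origin, basis):
    """Coordinates of a point in the centroid-spanned subspace."""
    d = sub(point, origin)
    return [dot(d, b) for b in basis]

def distance(u, v):
    d = sub(u, v)
    return math.sqrt(dot(d, d))

# Four hypothetical cluster centroids in a 5-dimensional data space.
centroids = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [3.0, 1.0, 0.0, 2.0, 0.0],
    [0.0, 4.0, 1.0, 0.0, 2.0],
    [1.0, 0.0, 5.0, 1.0, 1.0],
]

origin, basis = orthonormal_basis(centroids)
projected = [project(c, origin, basis) for c in centroids]

# Compare each pairwise centroid distance before and after projection.
for i in range(len(centroids)):
    for j in range(i + 1, len(centroids)):
        before = distance(centroids[i], centroids[j])
        after = distance(projected[i], projected[j])
        print(f"centroids {i},{j}: {before:.6f} -> {after:.6f}")
```

Any record in the data set can be passed through the same `project` call to obtain its 3D coordinates. With more than four clusters, not all centroid distances can generally be preserved in 3D at once, which is presumably where a method for choosing among candidate projections becomes useful.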

Index Terms:
Data clustering, footprint splatting, grand tour, high-dimensional data, visual data exploration, volume rendering.
Li Yang, "Visual Exploration of Large Relational Data Sets through 3D Projections and Footprint Splatting," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 6, pp. 1460-1471, Nov.-Dec. 2003, doi:10.1109/TKDE.2003.1245285