From Visual Data Exploration to Visual Data Mining: A Survey
July-September 2003 (vol. 9 no. 3)
pp. 378-394

Abstract—We survey work on the different uses of graphical mapping and interaction techniques for visual data mining of large data sets represented as table data. Basic terminology related to data mining, data sets, and visualization is introduced. Previous work on information visualization is reviewed in light of different categorizations of techniques and systems. The role of interaction techniques is discussed, in addition to work addressing the question of selecting and evaluating visualization techniques. We review some representative work on the use of information visualization techniques in the context of mining data. This includes both visual data exploration and visually expressing the outcome of specific mining algorithms. We also review recent innovative approaches that attempt to integrate visualization into the DM/KDD process, using it to enhance user interaction and comprehension.

Index Terms:
Information visualization, visual data exploration, visual data mining, survey, framework, model.
Citation:
Maria Cristina Ferreira de Oliveira, Haim Levkowitz, "From Visual Data Exploration to Visual Data Mining: A Survey," IEEE Transactions on Visualization and Computer Graphics, vol. 9, no. 3, pp. 378-394, July-Sept. 2003, doi:10.1109/TVCG.2003.1207445