Issue No.03 - May/June (2008 vol.14)
pp: 564-575
ABSTRACT
The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers because of its potential application to data analysis of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points from the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points, given by a metric in $mD$. To perform the projection, only a small number of distance calculations is necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in $2D$. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly for mapping text sets, the setting in which it was mostly tested.
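The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how control points are chosen, how they are initially projected, or how neighborhoods are built, so the code assumes random control points, a PCA placement of the control points, k-nearest-neighbor neighborhoods with uniform weights, and a Euclidean metric. For brevity it also computes all pairwise distances, whereas the abstract stresses that only a small number of distance calculations is needed. The names `lsp_sketch`, `n_control`, and `k` are hypothetical.

```python
import numpy as np

def lsp_sketch(X, n_control=10, k=5, seed=0):
    """Project the rows of X (n x m) to 2D by solving one least-squares system."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Assumption: Euclidean metric, all pairs computed (a simplification).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Assumption: control points are picked uniformly at random.
    control = rng.choice(n, size=n_control, replace=False)

    # Assumption: control points are placed in 2D by PCA of their own coordinates.
    Xc = X[control] - X[control].mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Yc = Xc @ Vt[:2].T

    # Neighborhood rows: each point should sit at the mean of its k nearest neighbors.
    L = np.eye(n)
    for i in range(n):
        order = np.argsort(D[i])
        nbrs = order[order != i][:k]
        L[i, nbrs] = -1.0 / k

    # Control rows: each control point should stay at its initial 2D position.
    C = np.zeros((n_control, n))
    C[np.arange(n_control), control] = 1.0

    # Stack both sets of equations and solve for all 2D coordinates at once.
    A = np.vstack([L, C])
    b = np.vstack([np.zeros((n, 2)), Yc])
    Y, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Y, control, Yc
```

Calling `lsp_sketch(X)` on an `n x m` array returns an `n x 2` array of projected coordinates together with the chosen control indices and their initial positions. Because the system is solved in the least-squares sense, the control points are pulled toward, but not pinned exactly at, their initial positions.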
INDEX TERMS
Multivariate visualization, Data and knowledge visualization, Information visualization
CITATION
Fernando V. Paulovich, Luis G. Nonato, Rosane Minghim, Haim Levkowitz, "Least Square Projection: A Fast High-Precision Multidimensional Projection Technique and Its Application to Document Mapping", IEEE Transactions on Visualization & Computer Graphics, vol.14, no. 3, pp. 564-575, May/June 2008, doi:10.1109/TVCG.2007.70443