| | This Article | |
| |
| |
| | Share | |
| |
| |
| | Bibliographic References | |
| |
| |
| | Add to: | |
| |
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
| |
| | Search | |
| |
| |
| | |
On FastMap and the Convex Hull of Multivariate Data: Toward Fast and Robust Dimension Reduction
August 2005 (vol. 27 no. 8)
pp. 1340-1343
FastMap is a dimension reduction technique that operates on distances between objects. Although only distances are used, implicitly the technique assumes that the objects are points in a p\hbox{-}{\rm{dimensional}} Euclidean space. It selects a sequence of k\leq p orthogonal axes defined by distant pairs of points (called pivots) and computes the projection of the points onto the orthogonal axes. We show that FastMap uses only the outer envelope of a data set. Pivots are taken from the faces, usually vertices, of the convex hull of the data points in the original implicit Euclidean space. This provides a bridge to results in robust statistics, where the convex hull is used as a tool in multivariate outlier detection and in robust estimation methods. The connection sheds new light on the properties of FastMap, particularly its sensitivity to outliers, and provides an opportunity for a new class of dimension reduction algorithms, RobustMaps, that retain the speed of FastMap and exploit ideas in robust statistics.
[1] 1340 C. Faloutsos and K. Lin, “FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets,” Proc. ACM SIGMOD Conf., pp. 163-174, May 1995.[2] W.S. Torgerson, “Multidimensional Scaling I: Theory and Method,” Psychometrika, vol. 17, pp. 401-419, 1952.[3] H. Hotelling, “Analysis of a Complex of Statistical Variables into Principal Components,” J. Educational Psychology, vol. 24, pp. 417-441, 498-520, 1933.[4] G.R. Hjaltason and H. Samet, “Properties of Embedding Methods for Similarity Searching in Metric Spaces,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, pp. 530-549, 2003.[5] J. Erickson, “New Lower Bounds for Convex Hull Problems in Odd Dimensions,” SIAM J. Computing, vol. 28, no. 4, pp. 1198-1214, 1999.[6] J.H. Gallier, Geometric Methods and Applications for Computer Science and Engineering. Springer, 2000.[7] G.M. Ziegler, Lectures on Polytopes. Springer-Verlag, 1995.[8] F.N. Abu-Khzam, N. Samatova, G. Ostrouchov, M.A. Langston, and A. Geist, “Distributed Dimension Reduction Algorithms for Widely Dispersed Data,” Parallel and Distributed Computing and Systems, pp. 174-178, ACTA Press, 2002.[9] G.H. Golub and C.F.V. Loan, Matrix Computations, third ed. Baltimore: John's Hopkins Univ. Press, 1996.[10] P.J. Huber, Robust Statistics. New York: John Wiley & Sons, 1981.[11] P.J. Huber, “Robust Statistics: A Review,” Annals Math. Statistics, vol. 43, no. 4, pp. 1041-1067, 1972.[12] D.J. Downing, V.V. Fedorov, W.F. Lawkins, M.D. Morris, and G. Ostrouchov, “Large Data Series: Modeling the Usual to Identify the Unusual,” Computational Statistics & Data Analysis, vol. 32, pp. 245-258, 2000.[13] “Atmospheric Radiation Measurement Program Plan,” Technical Report DOE/ER-0441, US Dept. of Energy, Office of Health and Environmental Research, Atmospheric and Climate Research Division, Nat'l Technical Information Service, 1990.[14] D.L. Donoho and P.J. Huber, “The Notion of Breakdown-Point,” Festschrift for Erich L. Lehmann, Bickel, Doksum, and Hodges, eds., pp. 157-184. Belmont, Calif.: Wadsworth, 1983.[15] I. Ruts and P.J. Rousseeuw, “Computing Depth Contours of Bivariate Point Clouds,” Computational Statistics & Data Analysis, vol. 23, pp. 153-168, 1996.[16] “R Development Core Team,” R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2004, http:/www.R-project.org.
Index Terms:
Index Terms- Dimension reduction, convex hull, FastMap, RobustMap, principal components, multidimensional scaling, robust statistics, Euclidean distance.
Citation:
George Ostrouchov, Nagiza F. Samatova, "On FastMap and the Convex Hull of Multivariate Data: Toward Fast and Robust Dimension Reduction," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1340-1343, Aug. 2005, doi:10.1109/TPAMI.2005.164