
Gísli R. Hjaltason, Hanan Samet, "Properties of Embedding Methods for Similarity Searching in Metric Spaces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 530-549, May 2003.  
Keywords: Embedding methods, metric spaces, similarity search, multimedia databases, contractiveness, distortion, quality, Lipschitz embeddings, singular value decomposition (SVD), SparseMap, FastMap, MetricMap.  
Abstract—Complex data types—such as images, documents, DNA sequences, etc.—are becoming increasingly important in modern database applications. A typical query in many of these applications seeks to find objects that are similar to some target object, where (dis)similarity is defined by some distance function. Often, the cost of evaluating the distance between two objects is very high. Thus, the number of distance evaluations should be kept at a minimum, while (ideally) maintaining the quality of the result. One way to approach this goal is to