An Efficient Technique for Nearest-Neighbor Query Processing on the SPY-TEC
November/December 2003 (vol. 15 no. 6)
pp. 1472-1486

Abstract: The SPY-TEC (Spherical Pyramid-Technique) was proposed as an indexing method for high-dimensional data spaces; it uses a special partitioning strategy that divides a d-dimensional data space into 2d spherical pyramids. This partitioning supports an efficient algorithm for processing hyperspherical range queries. However, no technique was proposed for processing k-nearest-neighbor queries, which are frequently used in similarity search. In this paper, we propose an efficient algorithm for processing nearest-neighbor queries on the SPY-TEC by extending the incremental nearest-neighbor algorithm. We also introduce a metric that can be used to guide an ordered best-first traversal when finding nearest neighbors on the SPY-TEC. Finally, through extensive experiments, we show that our technique significantly outperforms the R*-tree, the X-tree, and the sequential scan in processing k-nearest-neighbor queries.
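
As an illustration of the general idea of an incremental, best-first nearest-neighbor traversal, the following is a minimal sketch and not the paper's algorithm. It assumes each index partition can be summarized by a lower bound on the distance from the query to anything inside it. Here that bound is the distance to an axis-aligned bounding box, used as a stand-in for the SPY-TEC metric, which the abstract does not specify. The names `min_dist_to_box`, `incremental_nn`, and `knn`, and the list-of-point-lists layout, are hypothetical.

```python
import heapq
import itertools
import math


def min_dist_to_box(q, lo, hi):
    """Lower bound on the Euclidean distance from q to any point in the box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))


def incremental_nn(q, partitions):
    """Yield (distance, point) pairs in nondecreasing distance from q.

    `partitions` is a list of point lists. Each list stands in for one index
    partition (an assumed layout, not the SPY-TEC's spherical pyramids).
    """
    tie = itertools.count()  # unique tie-breaker so the heap never compares payloads
    heap = []
    dim = len(q)
    for pts in partitions:
        if not pts:
            continue
        # Summarize the partition by its bounding box and queue it by lower bound.
        lo = [min(p[i] for p in pts) for i in range(dim)]
        hi = [max(p[i] for p in pts) for i in range(dim)]
        heapq.heappush(heap, (min_dist_to_box(q, lo, hi), next(tie), False, pts))
    while heap:
        dist, _, is_point, item = heapq.heappop(heap)
        if is_point:
            # No queued partition or point can be closer than this one.
            yield dist, item
        else:
            # Expand the partition: queue its points by exact distance.
            for p in item:
                heapq.heappush(heap, (math.dist(q, p), next(tie), True, p))


def knn(q, partitions, k):
    """Return the k nearest (distance, point) pairs by stopping the traversal early."""
    return list(itertools.islice(incremental_nn(q, partitions), k))
```

Because partitions and points share one priority queue, a partition is expanded only when its lower bound is smaller than the distance of every point already queued, so a call such as `knn((0.9, 0.9), parts, 2)` can finish without touching distant partitions.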

Index Terms:
Similarity search, high-dimensional index technique, nearest-neighbor query, incremental nearest-neighbor algorithm, SPY-TEC.
Citation:
Dong-Ho Lee, Hyoung-Joo Kim, "An Efficient Technique for Nearest-Neighbor Query Processing on the SPY-TEC," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 6, pp. 1472-1486, Nov.-Dec. 2003, doi:10.1109/TKDE.2003.1245286