Issue No.07 - July (2010 vol.22)
pp: 1014-1027
You Jung Kim, Oracle Corp, Redwood City and University of Michigan, Ann Arbor
Jignesh M. Patel, University of Wisconsin, Madison and University of Michigan, Ann Arbor
Multidimensional point indexing plays a critical role in a variety of data-centric applications, including image retrieval, sequence matching, and moving object database search. A common choice of indexing method for these applications is the "ubiquitous" R*-tree. Choosing the right indexing method requires careful consideration of various factors such as query operations and index construction methods. In this work, we present an experimental study comparing the R*-tree and the Quadtree using various criteria, including the query operations and index construction methods. Although a variety of query operations can be performed using these index structures, previous work has largely focused only on the range search operation. We go beyond this previous work and compare the performance of these index structures using k-nearest neighbor (kNN) and distance join queries. In addition, we consider the impact of index construction methods in evaluating these index structures. Our study sheds light on how the choice of the underlying index structure affects the performance of different query operations, and shows that the method used for constructing the index and the dynamic nature of the data set have a dramatic impact on the performance of these index structures.
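For readers unfamiliar with the two query types named above, the following is a minimal brute-force sketch of what each one returns. It is not the authors' implementation and uses no index at all; an R*-tree or Quadtree would be used to avoid scanning every point. The function names are hypothetical, Euclidean distance is assumed, and because the abstract does not say which form of distance join is evaluated, both a threshold-based form and a k-closest-pairs form are shown.

```python
import heapq
import math
from itertools import product


def knn(points, query, k):
    """Return the k points nearest to `query`, closest first (linear scan)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


def distance_join(r, s, eps):
    """Return every pair (p, q), p in r and q in s, at distance <= eps.

    Assumed threshold-based variant; checks all |r| * |s| pairs.
    """
    return [(p, q) for p, q in product(r, s) if math.dist(p, q) <= eps]


def k_closest_pairs(r, s, k):
    """Return the k pairs (p, q) with the smallest distance, closest first.

    Assumed k-closest-pairs variant of the distance join.
    """
    return heapq.nsmallest(k, product(r, s), key=lambda pq: math.dist(*pq))
```

Both join variants inspect every pair of points, which is the cost an index-based algorithm tries to reduce by pruning subtrees that cannot contribute results.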
Performance evaluation, indexing methods, R*-tree, quadtree, kNN, distance join.
You Jung Kim, Jignesh M. Patel, "Performance Comparison of the R*-Tree and the Quadtree for kNN and Distance Join Queries", IEEE Transactions on Knowledge & Data Engineering, vol. 22, no. 7, pp. 1014-1027, July 2010, doi:10.1109/TKDE.2009.141