## Performance Comparison of the R*-Tree and the Quadtree for kNN and Distance Join Queries

Issue No.07 - July (2010 vol.22)

pp: 1014-1027

You Jung Kim, Oracle Corp, Redwood City and University of Michigan, Ann Arbor

Jignesh M. Patel, University of Wisconsin, Madison and University of Michigan, Ann Arbor

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.141

ABSTRACT

Multidimensional point indexing plays a critical role in a variety of data-centric applications, including image retrieval, sequence matching, and moving object database search. A common choice of indexing method for these applications is the "ubiquitous" R*-tree. Choosing the right indexing method requires careful consideration of various factors such as query operations and index construction methods. In this work, we present an experimental study comparing the R*-tree and the quadtree using various criteria, including the query operations and index construction methods. Although a variety of query operations can be performed using these index structures, previous work has largely focused only on the range search operation. We go beyond this previous work and compare the performance of these index structures using k-nearest neighbor (kNN) and distance join queries. In addition, we consider the impact of index construction methods in evaluating these index structures. Our study sheds light on how the choice of the underlying index structure affects the performance of different query operations, and shows that the method used for constructing the index and the dynamic nature of the data set have a dramatic impact on the performance of these index structures.

INDEX TERMS

Performance evaluation, indexing methods, R*-tree, quadtree, kNN, distance join.

CITATION

You Jung Kim, Jignesh M. Patel, "Performance Comparison of the R*-Tree and the Quadtree for kNN and Distance Join Queries",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 22, no. 7, pp. 1014-1027, July 2010, doi:10.1109/TKDE.2009.141