
Yufei Tao, Dimitris Papadias, "Performance Analysis of R*-Trees with Arbitrary Node Extents," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 6, pp. 653-668, June 2004.
DOI: 10.1109/TKDE.2004.13
ISSN: 1041-4347
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: database, spatial database, R-tree, cost model
Abstract: Existing analysis of R-trees is inadequate for several traditional and emerging applications, such as temporal, spatio-temporal, and multimedia databases, because it assumes that the extents of a node are identical on all dimensions, an assumption that does not hold in these domains. In this paper, we propose analytical models that accurately predict R*-tree performance without this assumption. Our derivation is based on the novel concept of an extent regression function, which computes the node extents as a function of the number of node splits. Detailed experimental evaluation shows that the proposed models are accurate even in cases where previous methods fail completely.
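To see why the equal-extent assumption matters, consider the classical window-query estimate from the earlier R-tree literature. This is not the model proposed in this paper, and the extent regression function itself is not reproduced here. In a unit data space with uniformly placed queries, and ignoring boundary effects, a node with extents s_i is intersected by a query window with extents q_i with probability equal to the product of (s_i + q_i) over all dimensions. The sketch below applies that estimate to two hypothetical nodes of equal area; the function names and the numeric values are illustrative assumptions.

```python
from math import prod


def access_probability(node_extents, query_extents):
    """Chance that a uniformly placed query window intersects a node.

    Assumes a unit data space and ignores boundary effects; each factor is
    capped at 1 because a probability cannot exceed 1 on any dimension.
    """
    return prod(min(1.0, s + q) for s, q in zip(node_extents, query_extents))


def expected_node_accesses(nodes, query_extents):
    """Expected number of nodes visited: the sum of per-node probabilities."""
    return sum(access_probability(n, query_extents) for n in nodes)


# Two hypothetical nodes with the same area (0.0025) but different shapes.
square = (0.05, 0.05)
elongated = (0.01, 0.25)
query = (0.1, 0.1)

p_square = access_probability(square, query)        # roughly 0.0225
p_elongated = access_probability(elongated, query)  # roughly 0.0385
print(p_square, p_elongated)
```

Under these assumptions the elongated node is noticeably more likely to be visited than the square one, even though both cover the same area. A model that replaces every node with an equal-extent square would therefore misestimate the cost on data where node extents differ across dimensions.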