The Design and Implementation of Seeded Trees: An Efficient Method for Spatial Joins
January/February 1998 (vol. 10 no. 1)
pp. 136-152

Abstract: Existing methods for spatial joins require pre-existing spatial indices or other precomputation, but such approaches are inefficient and limited in generality. Operand data sets of spatial joins may not all have precomputed indices, particularly when they are dynamically generated by other selection or join operations. Also, existing spatial indices are mostly designed for spatial selections, and are not always efficient for joins. This paper explores the design and implementation of seeded trees [1], which are effective for spatial joins and efficient to construct at join time. Seeded trees are R-tree-like structures, but are divided into seed levels and grown levels. This structure makes it possible to use information about the join to accelerate the join process, and allows efficient buffer management. In addition to the basic structure and behavior of seeded trees, we present techniques for efficient seeded tree construction, a new buffer management strategy to lower I/O costs, and theoretical analysis for choosing algorithmic parameters. We also present methods for reducing space requirements and improving the stability of seeded tree performance with no additional I/O costs. Our performance studies show that the seeded tree method outperforms other tree-based methods by a wide margin, both in the number of disk pages accessed and in weighted I/O costs. Further, its performance gain is stable across different input data, and its incurred CPU penalties are also lower.

[1] M.L. Lo and C.V. Ravishankar, “Spatial Joins Using Seeded Trees,” Proc. 1994 ACM SIGMOD Int'l Conf. Management of Data, pp. 209-220, 1994.
[2] H. Samet, The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1990.
[3] D. Rotem, “Spatial Join Indices,” Proc. Seventh Int'l Conf. Data Eng., pp. 500-509, 1991.
[4] P. Valduriez, “Join Indices,” ACM Trans. Database Systems, vol. 12, no. 2, 1987.
[5] J. Nievergelt, H. Hinterberger, and K.C. Sevcik, “The Grid File: An Adaptable, Symmetric Multikey File Structure,” ACM Trans. Database Systems, vol. 9, no. 1, pp. 38-71, Mar. 1984.
[6] W. Lu and J. Han, “Distance-Associated Join Indices for Spatial Range Search,” Proc. Int'l Conf. Data Eng., pp. 284-292, 1992.
[7] J. Orenstein, “Redundancy in Spatial Databases,” Proc. ACM SIGMOD Conf. Management of Data, 1989.
[8] J.A. Orenstein, “A Comparison of Spatial Query Processing Techniques for Native and Parameter Spaces,” Proc. SIGMOD Int'l Conf. Management Data, pp. 343-352, ACM, 1990.
[9] J. Orenstein, “An Algorithm for Computing the Overlay of k-Dimensional Spaces,” Proc. Symp. Large Spatial Databases, pp. 381-400, Aug. 1991.
[10] R.H. Güting and W. Schilling, “A Practical Divide-and-Conquer Algorithm for the Rectangle Intersection Problem,” Information Sciences, vol. 42, pp. 95-112, 1987.
[11] O. Günther, “Efficient Computation of Spatial Joins,” Proc. Ninth Conf. Data Eng., pp. 50-60, 1993.
[12] A. Guttman, “R-Trees: A Dynamic Index Structure for Spatial Searching,” Proc. ACM SIGMOD Conf. Management of Data, 1984.
[13] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, “The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles,” Proc. ACM SIGMOD Conf. Management of Data, 1990.
[14] C. Faloutsos, T. Sellis, and N. Roussopoulos, “Analysis of Object Oriented Spatial Access Methods,” Proc. ACM SIGMOD Conf. Management of Data, 1987.
[15] T. Sellis, N. Roussopoulos, and C. Faloutsos, “The R+-Tree: A Dynamic Index for Multidimensional Objects,” Proc. 13th Int'l Conf. Very Large Data Bases (VLDB), 1987.
[16] T. Brinkhoff, H.-P. Kriegel, and B. Seeger, “Efficient Processing of Spatial Joins Using R-trees,” Proc. ACM SIGMOD Conf. Management of Data, 1993.
[17] J. Star and J. Estes, Geographic Information Systems. Englewood Cliffs, N.J.: Prentice Hall, 1990.
[18] U.S. Bureau of Census, “Tiger/Lines Precensus Files: 1990 Technical Documentation,” technical report, U.S. Bureau of Census, Washington, D.C., 1989.
[19] F. Olken and D. Rotem, “Sampling from Spatial Databases,” Proc. Int'l Conf. Data Eng., pp. 199-208, 1993.
[20] M.L. Lo and C.V. Ravishankar, “Generating Seeded Trees from Datasets,” Proc. Int'l Symp. Large Spatial Databases (Advances in Spatial Databases: SSD '95), pp. 328-347, Aug. 1995.
[21] W.J. Dixon, Introduction to Statistical Analysis, fourth ed. New York: McGraw-Hill, 1983.

Index Terms:
Spatial databases, query processing, join processing, database index, spatial index, buffer management.
Ming-Ling Lo, Chinya V. Ravishankar, "The Design and Implementation of Seeded Trees: An Efficient Method for Spatial Joins," IEEE Transactions on Knowledge and Data Engineering, vol. 10, no. 1, pp. 136-152, Jan.-Feb. 1998, doi:10.1109/69.667097