Efficient Join-Index-Based Spatial-Join Processing: A Clustering Approach
November/December 2002 (vol. 14 no. 6)
pp. 1400-1421

Abstract: A join-index is a data structure used for processing join queries in databases. Join-indices use precomputation techniques to speed up online query processing and are useful for data sets that are updated infrequently. The I/O cost of join computation using a join-index with limited buffer space depends primarily on the page-access sequence used to fetch the pages of the base relations. Given a join-index, we introduce a suite of methods based on clustering to compute the joins. We derive upper bounds on the length of the page-access sequences. Experimental results with Sequoia 2000 data sets show that the clustering method outperforms existing methods based on sorting and online-clustering heuristics.
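
The dependence of I/O cost on the page-access sequence can be illustrated with a small sketch. It is not the clustering method of the paper. It assumes a toy join-index given as (R page, S page) pairs, a two-page buffer, and an LRU replacement policy. The function name `count_page_fetches` and the data are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


def count_page_fetches(pair_sequence, buffer_pages):
    """Count page reads when join-index entries are processed in the given order.

    Assumption: the buffer holds `buffer_pages` pages and evicts the least
    recently used page (LRU). Other replacement policies give other counts.
    """
    buffer = OrderedDict()  # keys are ("R", page) or ("S", page); order = recency
    fetches = 0
    for r_page, s_page in pair_sequence:
        for page in (("R", r_page), ("S", s_page)):
            if page in buffer:
                buffer.move_to_end(page)        # hit: mark as most recently used
            else:
                fetches += 1                    # miss: read the page from disk
                buffer[page] = True
                if len(buffer) > buffer_pages:
                    buffer.popitem(last=False)  # evict least recently used page
    return fetches


# Hypothetical page-level join-index: every R page joins with every S page.
arbitrary_order = [(0, 0), (1, 1), (0, 1), (1, 0)]
sorted_order = sorted(arbitrary_order)               # sort by R page, then S page
grouped_order = [(0, 0), (0, 1), (1, 1), (1, 0)]     # consecutive pairs share a page

for name, order in [("arbitrary", arbitrary_order),
                    ("sorted", sorted_order),
                    ("grouped", grouped_order)]:
    print(name, count_page_fetches(order, buffer_pages=2))
```

Under these assumptions the three orderings need 7, 6, and 5 page fetches respectively, although all three compute the same join. The gap comes only from how often consecutive entries reuse a page that is still in the buffer.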

Index Terms:
Optimal page access sequence, join index, join processing, spatial join.
Citation:
Shashi Shekhar, Chang-Tien Lu, Sanjay Chawla, Sivakumar Ravada, "Efficient Join-Index-Based Spatial-Join Processing: A Clustering Approach," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 6, pp. 1400-1421, Nov.-Dec. 2002, doi:10.1109/TKDE.2002.1047776