Optimal Secondary Storage Access Sequence for Performing Relational Join
September 1989 (vol. 1 no. 3)
pp. 318-328

Two graph models are developed to determine the minimum buffer size required to achieve the theoretical lower bound on the number of disk accesses for performing relational joins. Here, the lower bound means that each joining block or page is read from disk only once. The first graph model is based on the block connectivity of the joining relations. Using this model, the problem of determining an ordered list of joining blocks that requires the smallest buffer is considered. It is shown that this problem, as well as the problem of computing the least upper bound on the buffer size, is NP-hard. The second graph model represents the page connectivity of the joining relations. It is shown that the problem of computing the least upper bound on the buffer size for the page connectivity model is also NP-hard. Heuristic procedures are presented for the page connectivity model, and it is shown that the sequence obtained using the heuristics requires a near-optimal buffer size. The authors also show the performance improvement of the proposed heuristics over the hybrid-hash join algorithm for a wide range of join factors.
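
To make the buffer-size question concrete, the following is a minimal illustrative sketch, not the authors' procedure. It assumes a page connectivity graph given as pairs of pages that contain joining tuples, and an eviction rule (an assumption made here) under which a page leaves the buffer as soon as every page it joins with has been read. The function name `peak_buffer_pages` and the page labels are hypothetical. Given a read order in which each page is fetched exactly once, it reports the largest number of pages held in the buffer at the same time.

```python
from collections import defaultdict


def peak_buffer_pages(edges, order):
    """Peak number of buffered pages when each page is read exactly once.

    edges: iterable of (page_a, page_b) pairs whose tuples join (assumed input).
    order: sequence listing every page once, in the order it is read.
    """
    # Build an undirected adjacency map for the page connectivity graph.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    fetched = set()   # pages read from disk so far
    resident = set()  # pages currently held in the buffer
    peak = 0

    for page in order:
        fetched.add(page)
        resident.add(page)
        peak = max(peak, len(resident))
        # Assumed eviction rule: a page can be dropped once all of its
        # joining partners have been read.
        finished = {p for p in resident if neighbors[p] <= fetched}
        resident -= finished

    return peak


# Hypothetical example: two pages of R and three pages of S.
edges = [("R1", "S1"), ("R1", "S2"), ("R2", "S2"), ("R2", "S3")]
interleaved = ["R1", "S1", "S2", "R2", "S3"]
r_pages_first = ["R1", "R2", "S1", "S3", "S2"]

print(peak_buffer_pages(edges, interleaved))    # 2
print(peak_buffer_pages(edges, r_pages_first))  # 3
```

Both orders read every page once, yet they need different buffer sizes. Finding the order with the smallest peak is the problem the abstract reports as NP-hard, which is why heuristics are used in place of exhaustive search.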


Index Terms:
optimal secondary storage access sequence; heuristic procedures; performing relational join; graph models; buffer size; lower bound; block connectivity; NP-hard; page connectivity; least upper bound; database theory; relational databases
F. Fotouhi, S. Pramanik, "Optimal Secondary Storage Access Sequence for Performing Relational Join," IEEE Transactions on Knowledge and Data Engineering, vol. 1, no. 3, pp. 318-328, Sept. 1989, doi:10.1109/69.87978