A Symmetric Fragment and Replicate Algorithm for Distributed Joins
December 1993 (vol. 4 no. 12)
pp. 1345-1354

It is shown that the fragment and replicate (FR) distributed join algorithm is a special case of the symmetric fragment and replicate (SFR) algorithm, which improves the FR algorithm by reducing its communication. The SFR algorithm, like the FR algorithm, is applicable to N-way joins and nonequijoins and does tuple balancing automatically. The authors derive formulae that show how to minimize the communication in the SFR algorithm, discuss its performance on a parallel database prototype, and evaluate its practicality under various conditions. It is claimed that SFR improves the worst-case cost for a distributed join, but it will not displace specialized distributed join algorithms when the latter are applicable.
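
The relationship between FR and SFR described above can be illustrated with a minimal two-relation sketch. It is not the authors' implementation and makes several assumptions that the abstract does not state: the processors are arranged as a rows x cols grid, relation R is split into `rows` fragments and S into `cols` fragments, processor (i, j) joins fragment i of R with fragment j of S, and communication is counted simply as the number of tuple copies delivered. Under these assumptions FR corresponds to a degenerate grid in which one relation is replicated to every processor. All function names are hypothetical.

```python
from itertools import product

def grid_shapes(n):
    """Return every (rows, cols) pair with rows * cols == n."""
    return [(r, n // r) for r in range(1, n + 1) if n % r == 0]

def comm_cost(size_r, size_s, rows, cols):
    """Assumed cost model: count tuple copies delivered to processors.

    Each R tuple is needed by the `cols` processors in its grid row;
    each S tuple is needed by the `rows` processors in its grid column.
    """
    return size_r * cols + size_s * rows

def best_shape(size_r, size_s, n):
    """Pick the grid shape with the lowest cost under the assumed model."""
    return min(grid_shapes(n), key=lambda rc: comm_cost(size_r, size_s, *rc))

def sfr_join(r_tuples, s_tuples, rows, cols, predicate):
    """Simulate the grid join sequentially; works for any join predicate."""
    r_frags = [r_tuples[i::rows] for i in range(rows)]  # round-robin split of R
    s_frags = [s_tuples[j::cols] for j in range(cols)]  # round-robin split of S
    result = []
    for i, j in product(range(rows), range(cols)):
        # Work done by the processor at grid position (i, j).
        result.extend((a, b) for a in r_frags[i] for b in s_frags[j]
                      if predicate(a, b))
    return result

if __name__ == "__main__":
    n = 16
    print("FR-like shape (16, 1):", comm_cost(1000, 1000, 16, 1))
    shape = best_shape(1000, 1000, n)
    print("best shape:", shape, comm_cost(1000, 1000, *shape))
```

Because every pair of an R fragment and an S fragment meets at exactly one processor, the sketch produces each matching pair once for any predicate, which is why the approach is not limited to equijoins. The round-robin split also gives every processor a near-equal share of tuples, a simple stand-in for the tuple balancing the abstract mentions.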

Index Terms:
fragment and replicate algorithm; distributed joins; symmetric fragment and replicate; SFR; tuple balancing; parallel database; worst case cost; distributed join; intratransaction parallelism; load balancing; multicast communication; performance evaluation; relational data model; symmetry; computational complexity; database theory; distributed algorithms
Citation:
J.W. Stamos, H.C. Young, "A Symmetric Fragment and Replicate Algorithm for Distributed Joins," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 12, pp. 1345-1354, Dec. 1993, doi:10.1109/71.250116