This Article 
 Bibliographic References 
 Add to: 
A Parallel Sort Merge Join Algorithm for Managing Data Skew
January 1993 (vol. 4 no. 1)
pp. 70-86

A parallel sort-merge-join algorithm which uses a divide-and-conquer approach to address the data skew problem is proposed. The proposed algorithm adds an extra, low-cost scheduling phase to the usual sort, transfer, and join phases. During the schedulingphase, a parallelizable optimization algorithm, using the output of the sort phase,attempts to balance the load across the multiple processors in the subsequent joinphase. The algorithm naturally identifies the largest skew elements, and assigns each ofthem to an optimal number of processors. Assuming a Zipf-like distribution of data skew,the algorithm is demonstrated to achieve very good load balancing for the join phase, andis shown to be very robust relative, among other things, to the degree of data skew andthe total number of processors.

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman,The Design and Analysis of Computer Algorithms. Menlo Park, CA: Addison-Wesley, 1974.
[2] S. G. Akl and N. Santoro, "Optimal parallel merging and sorting without memory conflicts,"IEEE Trans. Comput., vol. C-36, pp. 1367-1369, 1987.
[3] L. Bic and R. L. Hartman, "Hither hundreds of processors in a database machine," inProc. 1985 Int. Workshop Database Machines, Springer-Verlag, 1985.
[4] M. Blasgen and K. Eswaran, "Storage and access in relational databases,"IBM Syst. J., vol. 4, p. 363, 1977.
[5] M. Blum, R. W. Floyd, V. R. Pratt, R. L. Rivest, and R. E. Tarjan, "Time bounds for selection,"J. Comput. Syst. Sci., vol. 7, pp. 448-461, 1972.
[6] H. Boral, W. Alexander, L. Clay, G. Copeland, S. Danforth, M. Franklin, B. Hart, M. Smith, and P. Valduriez, "Prototyping Bubba, A highly parallel database system,"IEEE Trans. Knowledge Data Eng., vol. 2, no. 1, pp. 4-24, Mar. 1990.
[7] S. Christodoulakis "Estimating record selectivities,"Inform. Syst., vol. 8, no. 2, pp. 105-115, 1983.
[8] E. G. Coffman and P. J. Denning,Operating Systems Theory. Englewood Cliffs, NJ: Prentice-Hall, 1973.
[9] E. Coffman, M. Garey, and D. S. Johnson, "An application of bin packing to multiprocessor scheduling,"SIAM J. Comput., vol. 7, pp. 1-17, 1978.
[10] D. W. Cornell, D. M. Dias, and P. S. Yu, "On multisystem coupling through function request shipping,"IEEE Trans. Software Eng., vol. SE-12, no. 10, pp. 1006-1017, Oct. 1986.
[11] S. A. Demurjian, D. K. Hsiao, D. S. Kerr, J. Menon, P. R. Strawser, R. C. Tekampe, J. Trimble, and R. J. Watson, "Performance evaluation of a database system in multiple backend configurations," inProc. 1985 Int. Workshop Database Machines, Springer-Verlag, 1985.
[12] D. J. Dewitt and R. H. Gerber "Multiprocessor hash-based join algorithms," inProc. 11th Int. Conf. Very Large Databases, 1985.
[13] D. J. Dewitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H.-I. Hsiao, and R. Rasmussen, "The GAMMA database machine project,"IEEE Trans. Knowledge Data Eng., vol. 2, no. 1, pp. 44-62, Mar. 1990.
[14] D. J. DeWitt, M. Smith, and H. Boral, "A single-user performance evaluation of the Teradata database machine," MCC Tech. Rep. DB- 081-87, 1987.
[15] G. Frederickson and D. B. Johnson, "The complexity of selection and ranking in X+Y and matrices with sorted columns,"J. Comput. Syst. Sci., vol. 24, pp. 197-209, 1982.
[16] G. Frederickson and D. B. Johnson, "Generalized selection and ranking: Sorted matrices,"SIAM J. Comput., vol. 13, pp. 14-30, 1984.
[17] Z. Galil and N. Megiddo, "A fast selection algorithm and the problem of optimum distribution of effort,"J. ACM, vol. 26, pp. 58-64, 1979.
[18] R. Graham "Bounds on multiprocessing timing anomalies,"SIAM J. Appl. Mathemat., vol. 17, no. 2, pp. 416-429, 1969.
[19] D. Hsiao,Advanced Database Machine Architecture. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[20] T. Ibaraki and N. Katoh,Resource Allocation Problems. Cambridge, MA: MIT Press, 1988.
[21] B. R. Iyer and D. M. Dias, "System issues in parallel sorting for database systems," inProc. 6th Int. Conf. Data Eng., 1988.
[22] B. R. Iyer, G. R. Ricard, and P. J. Varman, "Percentile finding algorithm for multiple sorted runs," inProc. 15th Int. Conf. Very Large Databases, 1989.
[23] W. Kim, "A new way to compute the product and join of relations," inACM SIGMOD Conf. Proc., Santa Monica, CA, 1980, pp. 179- 187.
[24] M. Kitsuregawa, H. Tanaka, and T. Motooka, "Application of hash to data base machine and its architecture,"New Generation Comput., vol. 1, no. 1, 1983.
[25] D. E. Knuth,The Art of Computer Programming, Vol. 3, Reading, MA: Addison-Wesley, 1973.
[26] S. Lakshmi and P. Yu, "Analysis of effectiveness of parallel processing in database systems,"Comput. Syst. Sci. Eng., vol. 5, no. 2, pp. 73-81, 1990.
[27] S. Lakshmi and P. S. Yu, "Effectiveness of parallel joins,"IEEE Trans. Knowledge Data Eng., vol. 2, no. 4, pp. 410-424, 1990.
[28] C. Lynch, "Selectivity estimation and query optimization in large databases with highly skewed distributions of column values," inProc. 14th Int. Conf. Very Large Databases, 1988, pp. 240-251.
[29] A. Y. Montgomery, D. J. D'Souza, and S. B. Lee, "The cost of relational algebraic operations on skewed data: Estimates and experiments," inInform. Processing 83, IFIP, 1983.
[30] P. M. Neches and J. E. Shemer, "The genesis of a database computer,"IEEE Comput. Mag., vol. 17, no. 11, pp. 42-56, 1984.
[31] E. Ozkarahan,Database Machines and Database Management. Englewood Cliffs, NJ: Prentice-Hall, 1986.
[32] G. Z. Qadah, "The equi-join operation on a multiprocessor database machine: Algorithms and the evaluation of their performance," inProc. 1985 Int. Workshop Database Machines, Springer-Verlag, 1985, pp. 35-67.
[33] S. Salza, M. Terranova, and P. Velardi, "Performance modeling of the DBMAC architecture," inProc. 1983 Int. Workshop Database Machines, Springer-Verlag, 1983, pp. 74-90.
[34] D. Schneider and D. Dewitt, "A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment," inProc. ACM SIGMOD Conf.(Portland, OR), May-June 1989, p. 110.
[35] M. Stonebraker, "The case for shared nothing,"IEEE Database Eng., vol. 9, no. 1, 1986.
[36] A. N. Tantawi, D. Towsley, and J. L. Wolf, "An algorithm for a class-constrained resource allocation problem, " inProc. 1988 ACM Sigmetrics Conf., May 1988, pp. 253-260.
[37] P. Valduriez and G. Gardarin, "Join and semi-join algorithms for a multiprocessor database machine,"ACM Trans. Database Syst., vol. 9, no. 1, pp. 133-161, 1984.
[38] J. I. Wolf, D. M. Dias, P. S. Yu, and J Turek, "An efficient algorithm for parallelizing hash joins in presence of data skew," inProc. Int. Conf. Data Eng., pp. 200-209, Kobe, Japan, Apr. 1991.
[39] J. L. Wolf, B. R. Iyer, K. R. Pattipati, and J. Turek, "Optimal buffer partitioning for the nested block join algorithm," inProc. Seventh Int. Conf. Data Eng., Kobe, Japan, Apr. 1991, pp. 510-519.
[40] J. Wolf, D. Dias, P. Yu, and J. Turek, "Comparative performance of parallel join algorithms," inProc. 1st Int. Conf. Parallel Distributed Inform. Syst., pp. 78-88.
[41] G. K. Zipf,Human Behavior and the Principle of Least Effort. Reading, MA: Addison-Wesley, 1949.

Index Terms:
Index Termsdata skew management; transfer phase; sort phase; parallel sort merge join algorithm;divide-and-conquer; scheduling phase; join phases; parallelizable optimization algorithm;multiple processors; Zipf-like distribution; load balancing; distributed databases; merging; parallel algorithms; relational algebra; relational databases; sorting
J.L. Wolf, D.M. Dias, P.S. Yu, "A Parallel Sort Merge Join Algorithm for Managing Data Skew," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 1, pp. 70-86, Jan. 1993, doi:10.1109/71.205654
Usage of this product signifies your acceptance of the Terms of Use.