This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
A Parallel Hash Join Algorithm for Managing Data Skew
December 1993 (vol. 4 no. 12)
pp. 1355-1371

Presents a parallel hash join algorithm that is based on the concept of hierarchicalhashing, to address the problem of data skew. The proposed algorithm splits the usualhash phase into a hash phase and an explicit transfer phase, and adds an extrascheduling phase between these two. During the scheduling phase, a heuristicoptimization algorithm, using the output of the hash phase, attempts to balance the loadacross the multiple processors in the subsequent join phase. The algorithm naturallyidentifies the hash partitions with the largest skew values and splits them as necessary,assigning each of them to an optimal number of processors. Assuming for concreteness aZipf-like distribution of the values in the join column, a join phase which is CPU-bound,and a shared nothing environment, the algorithm is shown to achieve good join phaseload balancing, and to be robust relative to the degree of data skew and the totalnumber of processors. The overall speedup due to this algorithm is compared to someexisting parallel hash join methods. The proposed method does considerably better in high skew situations.

[1] L. Bic and R. Hartman, "Hither hundreds of processors in a database machine," inProc. 1985 Int. Workshop Database Machines, 1985, pp. 153-168.
[2] M. Blasgen and K. Eswaran, "Storage and access in relational databases,"IBM Syst. J., vol. 16, no. 4, pp. 363-377, 1977.
[3] H. Boral, W. Alexander, L. Clay, G. Copeland, S. Danforth, M. Franklin, B. Hart, M. Smith, and P. Valduriez, "Prototyping Bubba, a highly parallel database system,"IEEE Trans. Knowledge Data Eng., vol. 2, pp. 4-24, 1990.
[4] K. Bratbergsengen, "Hashing methods and relational algebra operations," inProc. Conf. Very Large Data Bases(Singapore), Aug. 1984, pp. 323-333.
[5] S. Christodoulakis, "Estimating record selectivities,"Inform. Syst., vol. 8, no. 2, pp. 105-115, 1983.
[6] E. Coffman, M. Garey, and D. Johnson, "An application of bin packing to multiprocessor scheduling,"SIAM J. Comput., vol. 7, pp. 1-17, 1978.
[7] D. W. Cornell, D. M. Dias, and P. S. Yu, "On multisystem coupling through function request shipping,"IEEE Trans. Software Eng., vol. SE-12, no. 10, pp. 1006-1017, Oct. 1986.
[8] S. Demurjian, D. Hsiao, D. Kerr, J. Menon, P. Strawser, R. Tekampe, J. Trimble, and R. Watson, "Performance evaluation of a database system in multiple backend configurations," inProc. 1985 Int. Workshop Database Machines, 1985, pp. 91-111.
[9] D. DeWitt and R. Gerber, "Multiprocessor hash-based join algorithms," inProc. 11th Int. Conf. Very Large Databases, 1985, pp. 151-164.
[10] D. Dewitt, R. H. Gerber, G. Graefe, M. L. Heytens, K. B. Kumar, and M. Muralikrishna, "GAMMA--A high performance dataflow database machine," inProc. 12th Int. Conf. VLDB, Kyoto, Japan, Aug. 1986, pp. 228-237.
[11] D. DeWitt, M. Smith, and H. Boral, "A single-user performance evaluation of the teradata database machine," MCC Tech. Rep. DB- 081-87, 1987.
[12] D. DeWitt, S. Ghandeharizadeh, D. Schneider, A. Bricker, H. Hsiao, and R. Rasmussen, "The gamma database machine project,"IEEE Trans. Knowledge Data Eng., vol. 2, pp. 44-62, 1990.
[13] D. DeWitt, J. Naughton, D. Schneider, and S. Seshadri, "Practical skew handling in parallel joins," inProc. 18th Int. Conf. Very Large Data Bases, 1992, pp. 27-40.
[14] R. Graham, "Bounds on multiprocessing timing anomalies,"SIAM J. Comput., vol. 17, pp. 416-429, 1969.
[15] D. Hsiao,Advanced Database Machine Architecture. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[16] R. Hu and R. Muntz, "Removing skew effect in join operation on parallel processors," UCLA Tech. Rep. CSD-890027, 1989.
[17] K. A. Hua and C. Lee, "Handling data skew in multiprocessor database computers using partition tuning," inProc. Int. Conf. VLDB, Barcelona, Spain, Sept. 1991, pp. 525-535.
[18] M. Kitsuregawa, H. Tanaka, and T. Motooka, "Application of hash to database machine and its architecture,"New Generation Comput., vol. 1, no. 1, pp. 62-74, 1983.
[19] M. Kitsuregawa, M. Nakayama, and M. Takagi, "The effect of bucket size tuning in dynamic hybrid GRACE hash join method," inProc. 15th Int. Conf. Very Large Data Bases, 1989, pp. 257-266.
[20] M. Kitsuregawa, M. Nakano, L. Harada, and M. Takagi, "Performance evaluation of functional disk system with nonuniform data distribution," inProc. 2nd Int. Symp. Databases Parallel Syst., 1990, pp. 80-89.
[21] M. Kitsuregawa and Y. Ogawa, "Bucket spreading parallel hash: A new robust, parallel hash join method for data skew in the super database computer (SDC)," inProc. 16th Int. Conf. VLDB, Brisbane, Australia, Aug. 1990, pp. 210-221.
[22] D. E. Knuth,The Art of Computer Programming, Vol. 3, Reading, MA: Addison-Wesley, 1973.
[23] S. Lakshmi and P. S. Yu, "Limiting factors of join performance on parallel processors,"Proc. 5th Int. Conf. Data Eng., Feb. 1989, pp. 488-496.
[24] S. Lakshmi and P. Yu, "Effectiveness of parallel joins,"IEEE Trans. Knowledge Data Eng., vol. 2, pp. 410-424, 1990.
[25] S. Lakshmi and P. Yu, "Analysis of effectiveness of parallel processing in database systems,"Comput. Syst. Sci. Eng., vol. 5, no. 2, pp. 73-81, 1990.
[26] Y. Lee and P. Yu, "Adaptive access path selection for relational database systems,"Comput. Syst. Sci. Eng., vol. 7, no. 1, pp. 52-61, 1992.
[27] C. Lynch, "Selectivity estimation and query optimization in large databases with highly skewed distributions of column values," inProc. 14th Int. Conf. Very Large Databases, 1988, pp. 240-251.
[28] C. Mohan, I. Narang, and S. Silen, "Solutions to hot spot problems in a shared disks transaction environment," inProc. 4th Int. Workshop High Perform. Trans. Syst., 1991.
[29] A. Montgomery, D. D'Souza, and S. Lee, "The cost of relational algebraic operations on skewed data: Estimates and experiments," inProc. Inform. Process., 1983, pp. 235-241.
[30] M. Nakayama, M. Kitsuregawa, and M. Takagi, "Hash-partitioned join method using dynamic destaging strategy," inProc. Conf. Very Large Databases(Los Angeles, CA), Aug. 1988, pp. 468-478.
[31] P. Neches and J. Shemer, "The genesis of a database computer,"IEEE Comput., vol. 17, pp. 42-56, 1984.
[32] E. Ozkarahan,Database Machines and Database Management. Englewood Cliffs, NJ: Prentice-Hall, 1986.
[33] G. Qadah, "The equi-join operation on a multiprocessor database machine: Algorithms and the evaluation of their performance," inProc. 1985 Int. Workshop Database Machines, 1985, pp. 35-67.
[34] E. Rahm, "Design of optimistic methods for concurrency control in database sharing systems," inProc. 7th Int. Conf. Distrib. Syst., 1987.
[35] S. Salza, M. Terranova, and P. Velardi, "Performance modeling of the DBMAC architecture," inProc. 1983 Int. Workshop Database Machines, pp. 74-90.
[36] D. Schneider and D. Dewitt, "A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment," inProc. ACM SIGMOD Conf.(Portland, OR), May-June 1989, p. 110.
[37] M. Stonebraker, "The case for shared nothing,"IEEE Database Eng., vol. 9, 1986.
[38] M. Stonebraker, "The design of XPRS," inProc. 14th Int. Conf. VLDB, pp. 318-330, Los Angeles, Aug. 1988.
[39] S. Tucker, "The IBM 3090 system: An overview,"IBM Syst. J., vol. 25, no. 1, pp. 4-19, 1986.
[40] J. Turek, J. Wolf, K. Pattipati, and P. Yu, "Scheduling parallelizable tasks: Putting it all on the shelf," inProc. 1992 ACM Sigmetrics Conf., pp. 225-236.
[41] J. Turek, J. Wolf, and P. Yu, "Approximate algorithms for scheduling parallelizable tasks," inProc. 4th Annu. Symp. Parallel Algorithms Architectures, 1992, pp. 323-332.
[42] J. Turek, J. Wolf, and P. Yu, "On the scheduling of parallelizable tasks in the presence of precedence constraints," IBM Res. Rep. RC 18425, 1992.
[43] P. Valduriez and G. Gardarin, "Join and semi-join algorithms for a multiprocessor database machine,"ACM Trans. Database Syst., vol. 9, no. 1, pp. 133-161, 1984.
[44] J. L. Wolf, D. Dias, and P. S. Yu, "An effective algorithm for parallelizing sort merge joins in the presence of data skew," inProc. Second Int. Conf. Databases in Parallel and Distributed Syst., Dublin, Ireland, July 1990, pp. 103-115.
[45] J. Wolf, D. Dias, and P. Yu, "A parallel sort merge join algorithm for managing data skew,"IEEE Trans. Parallel Distributed Syst., vol. 4, no. 1, pp. 70-86, 1993.
[46] J. Wolf, D. Dias, P. Yu, and J. Turek, "Comparative performance of parallel join algorithms," inProc. 1st Int. Conf. Parallel Distributed Inform. Syst., pp. 78-88.
[47] P. S. Yu, D. M. Dias, J. T. Robinson, B. R. Iyer, and D. W. Cornell, "On coupling multi-systems through data sharing,"Proc. IEEE, vol. 75, pp. 573-587, 1987.
[48] G. K. Zipf,Human Behavior and the Principle of Least Effort. Reading, MA: Addison-Wesley, 1949.

Index Terms:
Index Termsparallel hash join algorithm; data skew; hierarchical hashing; scheduling; heuristicoptimization; load balancing; Zipf-like distribution; join column; combinatorial optimization; complex queries; relational databases; hash joins; database theory; parallel algorithms; query processing; relational databases; resource allocation
Citation:
J.L. Wolf, P.S. Yu, J. Turek, D.M. Dias, "A Parallel Hash Join Algorithm for Managing Data Skew," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 12, pp. 1355-1371, Dec. 1993, doi:10.1109/71.250117
Usage of this product signifies your acceptance of the Terms of Use.