The 4th International Symposium on Parallel and Distributed Computing (ISPDC'05)
A New Join Algorithm for Cluster-Based Database Systems
University of Lille 1, France
July 04-July 06
ISBN: 0-7695-2434-6
This paper focuses on cluster-based parallel database systems in which only one node holds the database and the other nodes, which have no initial data, are used for parallel query processing. In such a system, the load of each node changes dynamically depending on the activities of its local users. In addition, database query processing is subject to data skew. Efficient load balancing/sharing algorithms are therefore essential. This paper proposes a new join algorithm called Symmetric Chunking Hash Join (SCHJ), which divides the hash buckets into chunks and uses those chunks as the unit of load balancing. The SCHJ algorithm is compared with a dynamic round-robin algorithm and a sampling algorithm. The results show that SCHJ performs best among these algorithms when there is data skew and background load variation.
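The abstract does not give the details of SCHJ, so the following Python sketch only illustrates the general idea it names: hash buckets are split into chunks, and the chunks are handed out according to node load. Everything in it is an assumption made for illustration, not the authors' algorithm: the function names, the fixed `chunk_size`, the cost estimate, and the greedy least-loaded assignment are all hypothetical, and only one relation is chunked here.

```python
from collections import defaultdict
import heapq


def hash_partition(rows, num_buckets):
    """Group rows into hash buckets on the join key (assumed to be field 0)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[0]) % num_buckets].append(row)
    return buckets


def split_into_chunks(rows, chunk_size):
    """Cut one bucket into fixed-size chunks; a skewed bucket yields many chunks."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def make_work_units(r_rows, s_rows, num_buckets, chunk_size):
    """Pair each chunk of an R bucket with the matching S bucket."""
    r_buckets = hash_partition(r_rows, num_buckets)
    s_buckets = hash_partition(s_rows, num_buckets)
    units = []
    for bucket_id, r_bucket in r_buckets.items():
        s_bucket = s_buckets.get(bucket_id, [])
        if not s_bucket:
            continue  # no possible matches in this bucket
        for chunk in split_into_chunks(r_bucket, chunk_size):
            units.append((chunk, s_bucket))
    return units


def unit_cost(unit):
    """Hypothetical cost estimate: rows the node must receive and process."""
    r_chunk, s_bucket = unit
    return len(r_chunk) + len(s_bucket)


def assign_units(units, background_load):
    """Greedy assumption: each unit goes to the currently least-loaded node."""
    heap = [(load, node) for node, load in enumerate(background_load)]
    heapq.heapify(heap)
    assignment = defaultdict(list)
    for unit in sorted(units, key=unit_cost, reverse=True):
        load, node = heapq.heappop(heap)
        assignment[node].append(unit)
        heapq.heappush(heap, (load + unit_cost(unit), node))
    return assignment


def join_unit(r_chunk, s_bucket):
    """Plain in-memory hash join of one chunk against its S bucket."""
    table = defaultdict(list)
    for r in r_chunk:
        table[r[0]].append(r)
    return [(r, s) for s in s_bucket for r in table.get(s[0], [])]


def parallel_join_sketch(r_rows, s_rows, background_load,
                         num_buckets=4, chunk_size=3):
    """Simulate the per-node joins sequentially and collect all result pairs."""
    units = make_work_units(r_rows, s_rows, num_buckets, chunk_size)
    assignment = assign_units(units, background_load)
    results = []
    for node_units in assignment.values():
        for r_chunk, s_bucket in node_units:
            results.extend(join_unit(r_chunk, s_bucket))
    return assignment, results
```

In this sketch a heavily skewed join key produces one large bucket, but because that bucket is cut into several chunks, its work can be spread over multiple nodes, and a node with a high `background_load` entry receives fewer chunks. A real system would update the load figures at run time rather than use a static estimate.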