10th Euromicro Workshop on Parallel, Distributed and Network-based Processing (EUROMICRO-PDP 2002)
On Improving the Performance of Data Partitioning Oriented Parallel Irregular Reductions
Canary Islands, Spain
January 09-January 11
ISBN: 0-7695-1444-8
Several parallelization techniques for reductions have been proposed elsewhere; in this paper we classify them into two classes: LPO (Loop Partitioning Oriented) techniques and DPO (Data Partitioning Oriented) techniques. We analyze both classes in terms of a set of performance properties: data locality, memory overhead, parallelism and workload balancing. We then propose several techniques to increase the exploited parallelism and to introduce load balancing into a DPO method. Regarding parallelism, the solution is based on the partial expansion of the reduction array. For load balancing, a first technique is generic, as it can deal with any kind of load imbalance present in the problem domain. A second technique handles a special case of load imbalance that appears when there is a large number of write operations on small regions of the reduction arrays. Efficient implementations of the proposed optimizations for the DWA-LIP DPO method are presented, experimentally tested on static and dynamic kernel codes, and compared with other parallel reduction methods.
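To make the terminology concrete, the following is a minimal sketch of an irregular reduction and of a generic data-partitioning (owner-computes) way of organizing it. It is an illustration under stated assumptions, not the DWA-LIP method or any implementation from the paper: the function names, the single indirection array `idx`, the contiguous block distribution, and the sequential simulation of the workers are all hypothetical choices made for this example.

```python
def irregular_reduction_sequential(idx, values, n):
    """Reference loop: A[idx[i]] += values[i], with write targets known only at run time."""
    a = [0.0] * n
    for i, target in enumerate(idx):
        a[target] += values[i]
    return a


def irregular_reduction_dpo(idx, values, n, num_workers):
    """Data-partitioning sketch (assumed scheme, not the paper's):
    each worker owns one contiguous block of the reduction array and
    applies only the loop iterations that write into that block."""
    block = (n + num_workers - 1) // num_workers  # ceiling division: block size

    # Inspector phase: group loop iterations by the owner of their write target.
    iterations_of = [[] for _ in range(num_workers)]
    for i, target in enumerate(idx):
        iterations_of[target // block].append(i)

    # Executor phase: workers write to disjoint blocks, so no locks and no
    # private copies of the whole array are needed. The outer loop is run
    # sequentially here; it is the loop that could be executed in parallel.
    a = [0.0] * n
    for worker in range(num_workers):
        for i in iterations_of[worker]:
            a[idx[i]] += values[i]

    # Per-worker iteration counts expose how evenly the work is spread.
    loads = [len(its) for its in iterations_of]
    return a, loads


if __name__ == "__main__":
    idx = [0, 1, 1, 5, 7, 1, 2, 6]      # hypothetical indirection array
    values = [1.0] * len(idx)
    result, loads = irregular_reduction_dpo(idx, values, n=8, num_workers=4)
    print(result)
    print(loads)
```

In this sketch the per-worker iteration counts returned next to the result show how writes concentrated in a small region of the array leave one worker with most of the iterations, which is the kind of load imbalance the abstract refers to.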
Index Terms:
Irregular reductions, data locality, workload balancing, shared-memory multiprocessor
Citation:
E. Gutierrez, O. Plata, E.L. Zapata, "On Improving the Performance of Data Partitioning Oriented Parallel Irregular Reductions," pdp, pp.0445, 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing (EUROMICRO-PDP 2002), 2002