Parallel and Distributed Processing Symposium, International (2011)
Anchorage, Alaska, USA
May 16, 2011 to May 20, 2011
We present GPMR, our stand-alone MapReduce library that leverages the power of GPU clusters for large-scale computing. To better utilize the GPU, we modify MapReduce by combining large numbers of map and reduce items into chunks and by using partial reductions and accumulation. We use persistent map and reduce tasks, and we stress different aspects of GPMR with a set of standard MapReduce benchmarks. We run these benchmarks on a GPU cluster and achieve desirable speedup and efficiency for all of them. To show the power of GPMR, we compare our implementation to the current-best GPU-MapReduce library (which runs only on a single GPU) and to a highly optimized multi-core MapReduce. We demonstrate how typical MapReduce tasks are easily modified to fit into GPMR and leverage a GPU cluster. We highlight how the total and relative amounts of communication affect GPMR. We conclude with an exposition on the types of MapReduce tasks that are well suited to GPMR, and on why some tasks need more modifications than others to work well with GPMR.
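The chunking, partial-reduction, and accumulation ideas named in the abstract can be illustrated with a small CPU-only sketch in Python. This is not GPMR's API or implementation: the function names (`chunked`, `map_chunk`, `partial_reduce`, `word_count`), the word-count task, and the chunk size are all assumptions chosen for illustration, and no GPU or cluster communication is modeled.

```python
from collections import Counter
from itertools import islice


def chunked(items, chunk_size):
    """Yield successive lists of at most chunk_size items (hypothetical helper)."""
    it = iter(items)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def map_chunk(chunk):
    """Map a whole chunk at once: emit a (word, 1) pair for every word in it."""
    return [(word, 1) for line in chunk for word in line.split()]


def partial_reduce(pairs):
    """Combine pairs that share a key within one chunk.

    In a distributed setting this step would shrink the data before it is
    communicated; here it only shrinks what is handed to the accumulator.
    """
    partial = Counter()
    for key, value in pairs:
        partial[key] += value
    return partial


def word_count(lines, chunk_size=2):
    """Accumulate the per-chunk partial results into one running total."""
    total = Counter()
    for chunk in chunked(lines, chunk_size):
        total.update(partial_reduce(map_chunk(chunk)))  # update() adds counts
    return dict(total)


if __name__ == "__main__":
    print(word_count(["a b a", "b c", "a"]))  # {'a': 3, 'b': 2, 'c': 1}
```

In this sketch each chunk emits at most one pair per distinct key instead of one pair per word, which is the kind of reduction in intermediate data that the abstract attributes to partial reductions and accumulation.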
J. D. Owens and J. A. Stuart, "Multi-GPU MapReduce on GPU Clusters," 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011), Anchorage, AK, 2011, pp. 1068-1079.