2014 IEEE International Conference On Cluster Computing (CLUSTER) (2014)
Sept. 22, 2014 to Sept. 26, 2014
Koichi Shirahata, Tokyo Institute of Technology, Japan
Hitoshi Sato, Tokyo Institute of Technology, Japan
Satoshi Matsuoka, Tokyo Institute of Technology, Japan
GPUs can accelerate edge-scan performance in graph processing applications; however, the capacity of GPU device memory limits the size of the graphs that can be processed, and efficient techniques for handling GPU memory overflows, including overflow detection and performance analysis on large-scale systems, have not been well investigated. To address this problem, we propose a MapReduce-based out-of-core GPU memory management technique for large-scale graph applications on heterogeneous GPU-based supercomputers. Our technique automatically handles memory overflows on GPUs by dynamically dividing graph data into multiple chunks, and overlaps CPU-GPU data transfer with computation on the GPUs as much as possible. Experimental results on TSUBAME2.5 using 1024 nodes (12288 CPU cores, 3072 GPUs) show that our GPU-based implementation runs 2.10x faster than a CPU-based implementation when the graph data does not fit in GPU memory. We also study the performance characteristics of the proposed out-of-core GPU memory management technique, including application performance and power efficiency under scale-up and scale-out approaches.
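The chunking-and-overlap idea described in the abstract can be illustrated with a minimal sketch. All names here (`split_into_chunks`, `pipelined_schedule`, `device_capacity`) are hypothetical, and the pipeline is only simulated; the paper's actual implementation partitions graph data at runtime and uses CUDA to overlap host-to-device transfers with GPU computation.

```python
# Illustrative sketch (not the paper's implementation): divide an edge list
# into chunks that each fit in a hypothetical GPU memory budget, then build
# a schedule in which the transfer of chunk i+1 overlaps compute on chunk i.

def split_into_chunks(edges, device_capacity):
    """Divide the edge list into chunks no larger than device_capacity."""
    return [edges[start:start + device_capacity]
            for start in range(0, len(edges), device_capacity)]

def pipelined_schedule(edges, device_capacity):
    """Return (chunks, schedule) where the schedule interleaves transfer
    of each chunk with compute on the previously transferred chunk."""
    chunks = split_into_chunks(edges, device_capacity)
    schedule = []
    for i in range(len(chunks)):
        schedule.append(("transfer", i))          # H2D copy of chunk i
        if i > 0:
            schedule.append(("compute", i - 1))   # overlaps with the copy above
    if chunks:
        schedule.append(("compute", len(chunks) - 1))  # drain the last chunk
    return chunks, schedule

# Toy example: 10 edges, device holds only 4 at a time.
chunks, schedule = pipelined_schedule(list(range(10)), device_capacity=4)
```

With these toy sizes the graph is split into chunks of 4, 4, and 2 edges, and every chunk after the first is copied while its predecessor is being processed, which is the overlap the abstract refers to.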
Graphics processing units, Memory management, Sorting, Data transfer, Supercomputers, Performance evaluation, Algorithm design and analysis
K. Shirahata, H. Sato and S. Matsuoka, "Out-of-core GPU memory management for MapReduce-based large-scale graph processing," 2014 IEEE International Conference On Cluster Computing (CLUSTER), Madrid, Spain, 2014, pp. 221-229.