2017 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) (2017)
Campinas, SP, Brazil
Oct. 17, 2017 to Oct. 20, 2017
The stencil pattern is important in many scientific and engineering domains, spurring great interest from researchers and industry. In recent years, various optimizations have been proposed for parallel stencil applications running on GPUs. However, most runtime systems that execute these applications fail to fully exploit the parallelism of modern heterogeneous systems. In this paper, we propose a machine-learning-based mechanism that automatically partitions stencil computations across the CPU and GPU. We implemented it in the PSkel framework and found that it improves the performance of stencil applications on average by 17.9x compared to their sequential CPU-only counterparts, by 1.34x compared to a GPU-only version, and by 1.48x compared to a parallel CPU-only version.
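As a rough illustration of what partitioning a stencil computation between two devices involves, the following minimal sketch splits the rows of a 2D grid into two contiguous parts and applies one 4-neighbour averaging step to each part. It is not the PSkel API and does not reproduce the paper's mechanism: the names `jacobi_step`, `split_rows`, and `partitioned_step` are hypothetical, both parts run on the CPU here, and `gpu_fraction` is supplied by hand as a stand-in for the split a learned model would choose.

```python
def jacobi_step(grid, row_start, row_end):
    """Apply one 4-neighbour averaging step to rows [row_start, row_end).

    Boundary cells (first/last row and column) are copied unchanged.
    """
    num_rows, num_cols = len(grid), len(grid[0])
    out = []
    for i in range(row_start, row_end):
        row = list(grid[i])
        if 0 < i < num_rows - 1:
            for j in range(1, num_cols - 1):
                row[j] = (grid[i - 1][j] + grid[i + 1][j]
                          + grid[i][j - 1] + grid[i][j + 1]) / 4.0
        out.append(row)
    return out


def split_rows(num_rows, gpu_fraction):
    """Return two half-open row ranges: (gpu_range, cpu_range).

    gpu_fraction is an assumed input in [0, 1]; in the paper's setting
    a trained model would pick it, here it is given by hand.
    """
    cut = max(0, min(num_rows, round(num_rows * gpu_fraction)))
    return (0, cut), (cut, num_rows)


def partitioned_step(grid, gpu_fraction):
    """Compute one stencil step as two independent row partitions."""
    gpu_rows, cpu_rows = split_rows(len(grid), gpu_fraction)
    # Each partition reads one "halo" row owned by the other partition.
    # Both read from the same input grid here; on real hardware those
    # halo rows would have to be copied between device memories.
    gpu_part = jacobi_step(grid, *gpu_rows)  # would run on the GPU
    cpu_part = jacobi_step(grid, *cpu_rows)  # would run on the CPU
    return gpu_part + cpu_part
```

For any `gpu_fraction`, `partitioned_step` produces the same grid as a single call to `jacobi_step` over all rows; only the division of work changes, which is why the choice of split affects performance but not the result.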
graphics processing units, learning (artificial intelligence)
A. D. Pereira, R. C. Rocha, L. Ramos, M. Castro and L. F. Goes, "Automatic Partitioning of Stencil Computations on Heterogeneous Systems," 2017 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW), Campinas, SP, Brazil, 2017, pp. 43-48.