2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT) (2012)
Minneapolis, MN, USA
Sept. 19, 2012 to Sept. 23, 2012
ISBN: 978-1-5090-6609-4
pp: 33-42
Sreepathi Pai , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
R. Govindarajan , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
Matthew J. Thazhuthaveetil , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
ABSTRACT
Exploiting the performance potential of GPUs requires managing the data transfers to and from them efficiently, which is an error-prone and tedious task. In this paper, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale accesses and uses a runtime to initiate transfers as necessary. This allows us to avoid redundant transfers that are exhibited by all other existing automatic memory management proposals. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers. Tested on eight programs ported from the Rodinia benchmark suite, it achieves (i) a 1.06× speedup over hand-tuned manual memory management, and (ii) a 1.29× speedup over another recently proposed compiler-runtime automatic memory management system. Compared to other existing runtime-only and compiler-only proposals, it also transfers 2.2× to 13.3× less data on average.
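The idea of runtime coherence checks guarding potentially stale accesses can be illustrated with a minimal sketch. It is not the authors' implementation and does not use X10: the class `CoherentBuffer`, its `read`/`write` methods, and `run_demo` are hypothetical names, the "GPU" copy is simulated with a plain Python list, and the calls to `read` and `write` stand in for the checks a compiler would insert before accesses it cannot prove are up to date.

```python
class CoherentBuffer:
    """Tracks which copy (CPU or simulated GPU) of one array is up to date.

    Hypothetical illustration only; not the paper's runtime.
    """

    def __init__(self, data):
        self.cpu = list(data)        # host copy
        self.gpu = [0] * len(data)   # simulated device copy
        self.cpu_valid = True
        self.gpu_valid = False
        self.transfers = 0           # copies actually performed

    def _sync_to(self, device):
        # Copy only when the requested side holds a stale copy.
        if device == "gpu" and not self.gpu_valid:
            self.gpu[:] = self.cpu
            self.gpu_valid = True
            self.transfers += 1
        elif device == "cpu" and not self.cpu_valid:
            self.cpu[:] = self.gpu
            self.cpu_valid = True
            self.transfers += 1

    def read(self, device):
        """Check placed before a read: fetch the data if it is stale."""
        self._sync_to(device)
        return self.gpu if device == "gpu" else self.cpu

    def write(self, device):
        """Check placed before a write: sync first (the write may be
        partial), then mark the other side's copy as stale."""
        self._sync_to(device)
        if device == "gpu":
            self.cpu_valid = False
        else:
            self.gpu_valid = False
        return self.gpu if device == "gpu" else self.cpu


def run_demo():
    buf = CoherentBuffer([1, 2, 3, 4])
    # Three back-to-back "kernels" that double every element on the GPU.
    for _ in range(3):
        g = buf.write("gpu")
        for i in range(len(g)):
            g[i] *= 2
    # The CPU reads the result once, after the last kernel.
    result = list(buf.read("cpu"))
    return result, buf.transfers
```

In this sketch, `run_demo()` performs two copies in total: one to the GPU before the first kernel and one back to the CPU before the final read. A scheme that copied the array in and out around every kernel launch would perform six, which is the kind of redundant transfer the abstract refers to.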
INDEX TERMS
Graphics processing units, Memory management, Kernel, Manuals, Rails, Runtime, Data transfer, Software Coherence, GPU, Memory Management, Data Transfers, Automatic
CITATION
Sreepathi Pai, R. Govindarajan, Matthew J. Thazhuthaveetil, "Fast and efficient automatic memory management for GPUs using compiler-assisted runtime coherence scheme", 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), vol. 00, no. , pp. 33-42, 2012, doi: