2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (2016)
Haifa, Israel
Sept. 11, 2016 to Sept. 15, 2016
ISBN: 978-1-5090-5308-7
pp: 449-450
Tsung Tai Yeh, Department of Electrical and Computer Engineering, Purdue University, United States of America
Amit Sabne, Department of Electrical and Computer Engineering, Purdue University, United States of America
Putt Sakdhnagool, Department of Electrical and Computer Engineering, Purdue University, United States of America
Rudolf Eigenmann, Department of Electrical and Computer Engineering, Purdue University, United States of America
Timothy G. Rogers, Department of Electrical and Computer Engineering, Purdue University, United States of America
ABSTRACT
Massively multithreaded GPUs achieve high throughput by running thousands of threads in parallel. To fully utilize the hardware, contemporary workloads spawn work to the GPU in bulk by launching large tasks, where each task is a kernel that contains thousands of threads that occupy the entire GPU.
INDEX TERMS
Graphics processing units, Parallel processing, Instruction sets, Runtime, Kernel, Micromechanical devices, Hardware
CITATION
Tsung Tai Yeh, Amit Sabne, Putt Sakdhnagool, Rudolf Eigenmann, Timothy G. Rogers, "POSTER: Pagoda: A runtime system to maximize GPU utilization in data parallel tasks with limited parallelism", 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), pp. 449-450, 2016, doi:10.1145/2967938.2974055