2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing
SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms
Munich, Germany
February 15-February 17
ISBN: 978-0-7695-4633-9
BibTeX:
@inproceedings{10.1109/PDP.2012.62,
  author = {Steffen Frey and Guido Reina and Thomas Ertl},
  title = {SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms},
  booktitle = {2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing},
  year = {2012},
  issn = {1066-6192},
  pages = {399--406},
  doi = {http://doi.ieeecomputersociety.org/10.1109/PDP.2012.62},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PDP.2012.62
The global scheduler of a current GPU distributes thread blocks to streaming multiprocessors (SMs), which schedule threads for execution at the granularity of a warp. Threads in a warp execute the same code path in lockstep, so divergent control flow can waste a large number of cycles. To overcome this general issue of SIMT architectures, we propose techniques that relax divergence on the fly within a computation kernel and thereby achieve a much higher total utilization of the processing cores. The techniques address branch divergence and loop divergence (and may be combined): every time divergence occurs during a GPU kernel run, they switch to suitable tasks. They can easily be applied to arbitrary iterative algorithms, and we evaluate their performance and effectiveness using synthetic and real-world applications.
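To make the cost of loop divergence concrete, the following is a minimal CPU-side sketch in Python, not the authors' implementation (the abstract does not describe it in detail). It models one warp whose lanes each run a task with a different iteration count and compares two policies: running each batch of tasks until its slowest lane finishes, and letting a finished lane immediately pick up the next task. The function names, the task queue, and the step-count cost model are all assumptions made for illustration.

```python
from collections import deque

WARP_SIZE = 32  # assumed lanes per warp; hypothetical default for this sketch


def lockstep_steps(iters, warp_size=WARP_SIZE):
    """Steps when tasks are processed in batches of warp_size and each
    batch runs until its slowest lane finishes (finished lanes idle)."""
    total = 0
    for start in range(0, len(iters), warp_size):
        total += max(iters[start:start + warp_size])
    return total


def refill_steps(iters, warp_size=WARP_SIZE):
    """Steps when a lane that finishes its task fetches the next one
    from a shared queue instead of idling (illustrative model only)."""
    queue = deque(iters)
    remaining = [0] * warp_size  # iterations left per lane
    steps = 0
    while True:
        # Idle lanes fetch new tasks; skip tasks that need zero iterations.
        for lane in range(warp_size):
            while remaining[lane] == 0 and queue:
                remaining[lane] = queue.popleft()
        if not any(remaining):
            break  # no lane has work and the queue is empty
        # One lockstep iteration: every busy lane advances by one.
        remaining = [r - 1 if r > 0 else 0 for r in remaining]
        steps += 1
    return steps


def utilization(iters, steps, warp_size=WARP_SIZE):
    """Fraction of lane-steps that performed useful work."""
    if steps == 0:
        return 0.0
    return sum(iters) / (warp_size * steps)


if __name__ == "__main__":
    work = [1, 1, 1, 8, 1, 1, 1, 8]  # iteration counts per task
    for name, fn in (("lockstep", lockstep_steps), ("refill", refill_steps)):
        s = fn(work, warp_size=4)
        print(name, s, round(utilization(work, s, warp_size=4), 3))
```

With a four-lane warp and the task list in the example, the batch policy needs 16 steps and the refill policy 10, so lane utilization rises from about 0.34 to 0.55. The model ignores the overhead of fetching tasks and any memory effects, which a real kernel would have to account for.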
Index Terms:
GPU, Scheduling, Divergence
Citation:
Steffen Frey, Guido Reina, Thomas Ertl, "SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms," pdp, pp.399-406, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2012
