2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing
SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms
Munich, Germany
February 15-February 17
ISBN: 978-0-7695-4633-9
The global scheduler of a current GPU distributes thread blocks to streaming multiprocessors (SMs), which schedule threads for execution at the granularity of a warp. Threads in a warp execute the same code path in lockstep, which can waste a large number of cycles when control flow diverges. To overcome this general issue of SIMT architectures, we propose techniques that relax divergence on the fly within a computation kernel in order to achieve a much higher total utilization of the processing cores. The techniques address branch divergence and loop divergence (and may be combined) by switching to suitable tasks during a GPU kernel run every time divergence occurs. They can easily be applied to arbitrary iterative algorithms, and we evaluate the performance and effectiveness of our approach using synthetic and real-world example applications.
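To make the loop-divergence problem concrete, the following is a minimal CPU-side sketch in Python, not the authors' implementation. It assumes a toy cost model in which one warp-cycle advances every active lane by one loop iteration, and it ignores the overhead of fetching new work, which a real kernel would have to pay. The function names (`static_warp_cycles`, `refill_warp_cycles`, `utilization`) and the warp size of 32 are illustrative assumptions. The first function models plain lockstep execution, where a warp is occupied until its slowest thread finishes; the second models the general idea of switching an idle lane to a new task as soon as its current one completes.

```python
from collections import deque

WARP_SIZE = 32  # assumed warp width; hypothetical constant for this sketch


def static_warp_cycles(iteration_counts, warp_size=WARP_SIZE):
    """Warp-cycles when tasks are statically assigned, one per thread.

    Each group of warp_size tasks forms a warp that runs in lockstep
    until its longest task has finished.
    """
    total_cycles = 0
    for start in range(0, len(iteration_counts), warp_size):
        warp = iteration_counts[start:start + warp_size]
        total_cycles += max(warp)
    return total_cycles


def refill_warp_cycles(iteration_counts, warp_size=WARP_SIZE):
    """Warp-cycles when a single warp refills idle lanes from a task queue.

    Assumption of this sketch: refilling a lane costs nothing.
    """
    queue = deque(iteration_counts)
    lanes = [0] * warp_size  # remaining iterations per lane
    cycles = 0
    while True:
        # Hand a new task to every lane that has run out of work.
        for i in range(warp_size):
            while lanes[i] == 0 and queue:
                lanes[i] = queue.popleft()
        if not any(lanes):
            break
        # One lockstep iteration: every active lane advances by one step.
        lanes = [max(0, remaining - 1) for remaining in lanes]
        cycles += 1
    return cycles


def utilization(iteration_counts, cycles, warp_size=WARP_SIZE):
    """Fraction of lane-cycles that performed useful iterations."""
    return sum(iteration_counts) / (cycles * warp_size)
```

For example, with 64 tasks in which every 32nd task needs 100 iterations and all others need 1, `static_warp_cycles` yields 200 warp-cycles, while `refill_warp_cycles` yields 102 under this idealized model. The gap would shrink in practice by whatever the task switch itself costs.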
Index Terms:
GPU, Scheduling, Divergence
Citation:
Steffen Frey, Guido Reina, Thomas Ertl, "SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms," pdp, pp.399-406, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2012