2009 International Conference on Parallel Processing
Efficient Scheduling of Nested Parallel Loops on Multi-Core Systems
Vienna, Austria
September 22-September 25
ISBN: 978-0-7695-3802-0
Parallel loops, such as the parallel DO loop in Fortran, account for a large percentage of total execution time. Given this, we focus on the problem of how to efficiently schedule nested perfect and non-perfect parallel loops on emerging multi-core systems. In this regard, a key aspect is how to determine the profitability of parallel execution and how to efficiently capture cache behavior, since the cache subsystem is often the main performance bottleneck in multi-core systems. In this paper, we present a novel profile-guided compiler technique for cache-aware scheduling of the iteration spaces of such loops. Specifically, we propose a technique for iteration space scheduling that captures the effect of variation in the number of cache misses across the iteration space. Subsequently, we propose a general approach that captures the variation of both the number of cache misses and the amount of computation across the iteration space. We demonstrate the efficacy of our approach on a dedicated 4-way Intel Xeon based multiprocessor using several kernels from industry-standard benchmarks.
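The abstract does not spell out the scheduling algorithm, so the following is only a rough illustration of the general idea of cost-weighted iteration space partitioning, not the authors' method. It assumes a hypothetical per-iteration profile (compute cycles and cache-miss counts), a hypothetical fixed miss penalty, and contiguous chunks; the names `iteration_cost` and `partition_by_cost` are invented for this sketch.

```python
import bisect
from itertools import accumulate


def iteration_cost(compute_cycles, cache_misses, miss_penalty=100):
    """Estimated cost of one iteration (miss_penalty is an assumed constant)."""
    return compute_cycles + miss_penalty * cache_misses


def partition_by_cost(costs, num_threads):
    """Split iterations 0..len(costs)-1 into contiguous [start, end) chunks
    whose total estimated costs are roughly equal, one chunk per thread."""
    total = sum(costs)
    prefix = list(accumulate(costs))  # prefix[i] = cost of iterations 0..i
    bounds = [0]
    for t in range(1, num_threads):
        target = total * t / num_threads
        # End the chunk just after the first iteration whose running
        # cost reaches this thread's share of the total.
        end = bisect.bisect_left(prefix, target) + 1
        end = min(max(end, bounds[-1]), len(costs))
        bounds.append(end)
    bounds.append(len(costs))
    return [(bounds[i], bounds[i + 1]) for i in range(num_threads)]


if __name__ == "__main__":
    # Hypothetical profile: the first two iterations miss in the cache often.
    compute = [10] * 8
    misses = [3, 3, 0, 0, 0, 0, 1, 1]
    costs = [iteration_cost(c, m) for c, m in zip(compute, misses)]
    for start, end in partition_by_cost(costs, 2):
        print(f"iterations [{start}, {end}) cost {sum(costs[start:end])}")
```

Compared with splitting the iterations evenly by count, a split of this kind gives fewer iterations to the thread that receives the cache-miss-heavy region. A real scheduler would also have to handle nested and non-perfect loops and decide whether parallel execution is profitable at all, which this sketch does not attempt.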
Index Terms:
Multithreading, Cost modeling, Load balancing, Cache misses
Citation:
Arun Kejariwal, Alexandru Nicolau, Alexander V. Veidenbaum, Utpal Banerjee, Constantine D. Polychronopoulos, "Efficient Scheduling of Nested Parallel Loops on Multi-Core Systems," 2009 International Conference on Parallel Processing (ICPP), pp. 74-83, 2009