2015 International Conference on Parallel Architecture and Compilation (PACT) (2015)
San Francisco, CA, USA
Oct. 18, 2015 to Oct. 21, 2015
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2015.55
Despite the proliferation of multi-core and multi-threaded architectures, exploiting implicit parallelism for a single semantic thread is still a crucial component in achieving high performance. Lookahead is a "tried-and-true" strategy in uncovering implicit parallelism. However, a conventional, monolithic out-of-order core quickly becomes resource-inefficient when looking beyond a small distance. One general approach to mitigate the impact of branch mispredictions and cache misses is to enable deep look-ahead. A particular approach that is both flexible and effective is to use an independent, decoupled look-ahead thread on a separate thread context guided by a program slice known as skeleton.While capable of generating significant performance gains, the look-ahead agent often becomes the new speed limit. We propose to accelerate the look-ahead thread by skipping branch based, sideeffect free code modules that do not contribute to the effectiveness of look-ahead. We call them Do-It-Yourself or DIY branches for which the main thread does not get any help from the look-ahead thread, instead relies on its own branch predictor and prefetcher. By skipping DIY branches, look-ahead thread propels ahead and provides performance-critical assistance down the stream to improve the performance of decoupled look-ahead system by up to 15%.
Skeleton, Context, Prefetching, Parallel processing, Out of order, Performance gain
R. Parihar and M. C. Huang, "Load Balancing in Decoupled Look-ahead: A Do-It-Yourself (DIY) Approach," 2015 International Conference on Parallel Architecture and Compilation (PACT), San Francisco, CA, USA, 2015, pp. 486-487.