Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information
2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT) (2010)
Sept. 11, 2010 to Sept. 15, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/
Georgios Tournavitis , Institute for Computing Systems Architecture, School of Informatics, University of Edinburgh, Scotland, United Kingdom
Bjorn Franke , Institute for Computing Systems Architecture, School of Informatics, University of Edinburgh, Scotland, United Kingdom
In recent years multi-core computer systems have left the realm of high-performance computing and virtually all of today's desktop computers and embedded computing systems are equipped with several processing cores. Still, no single parallel programming model has found widespread support and parallel programming remains an art for the majority of application programmers. In addition, there exists a plethora of sequential legacy applications for which automatic parallelization is the only hope to benefit from the increased processing power of modern multi-core systems. In the past automatic parallelization largely focused on data parallelism. In this paper we present a novel approach to extracting and exploiting pipeline parallelism from sequential applications. We use profiling to overcome the limitations of static data and control flow analysis enabling more aggressive parallelization. Our approach is orthogonal to existing automatic parallelization approaches and additional data parallelism may be exploited in the individual pipeline stages. The key contribution of this paper is a whole-program representation that supports profiling, parallelism extraction and exploitation. We demonstrate how this enhances conventional pipeline parallelization by incorporating support for multi-level loops and pipeline stage replication in a uniform and automatic way. We have evaluated our methodology on a set of multimedia and stream processing benchmarks and demonstrate speedups of up to 4.7 on a eight-core Intel Xeon machine.
Program Dependence Graph, Parallelization, Pipeline Parallelism, Streaming Applications
G. Tournavitis and B. Franke, "Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information," 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, 2010, pp. 377-388.