2006 International Conference on Parallel Architectures and Compilation Techniques (PACT) (2006)
Seattle, WA, USA
Sept. 16, 2006 to Sept. 20, 2006
DOI Bookmark: http://doi.ieeecomputersociety.org/
Zhelong Pan, Purdue University, School of ECE, West Lafayette, IN
Rudolf Eigenmann, Purdue University, School of ECE, West Lafayette, IN
This paper presents an automated performance tuning solution that partitions a program into a number of tuning sections and finds the best combination of compiler options for each section. Our solution builds on prior work on feedback-driven optimization, which tuned the whole program rather than each section. Our key novel algorithm partitions a program into appropriate tuning sections. We also present the architecture of a system that automates the tuning process; it includes several pre-tuning steps that partition and instrument the program, followed by the actual tuning and the post-tuning assembly of the individually optimized parts. Our system, called PEAK, achieves fast tuning speed by measuring a small number of invocations of each code section instead of the whole-program execution time, as common solutions do. Compared to these solutions, PEAK reduces tuning time from 2.19 hours to 5.85 minutes on average, while achieving similar program performance. PEAK improves the performance of SPEC CPU2000 FP benchmarks by 12% on average over GCC -O3, the highest optimization level, on a Pentium IV machine.
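The idea of tuning each section separately, rating each option combination from a few invocations of that section, can be illustrated with a minimal sketch. This is not PEAK's implementation: the option names, the section table, and the simulated timing function are all hypothetical, and the exhaustive search over option subsets is a simple stand-in for whatever search strategy the paper actually uses.

```python
from itertools import product
from statistics import median

# Hypothetical option names; they do not correspond to real compiler flags.
OPTIONS = ("opt_a", "opt_b", "opt_c")

# Hypothetical tuning sections. "effects" gives the assumed relative change in
# run time when an option is enabled (negative means faster).
SECTIONS = {
    "solver": {"base_time": 1.0,
               "effects": {"opt_a": -0.10, "opt_b": 0.05, "opt_c": 0.02}},
    "io_loop": {"base_time": 0.5,
                "effects": {"opt_a": 0.08, "opt_b": 0.01, "opt_c": -0.04}},
}


def simulated_invocation_time(section, enabled, invocation):
    """Stand-in for timing one invocation of a section built with `enabled`."""
    effect = sum(section["effects"][opt] for opt in enabled)
    jitter = 0.001 * (invocation % 3)  # deterministic measurement noise
    return section["base_time"] * (1.0 + effect) + jitter


def rate(section, enabled, invocations=5):
    """Rate a configuration from a few invocations, not a whole-program run."""
    samples = [simulated_invocation_time(section, enabled, i)
               for i in range(invocations)]
    return median(samples)


def tune_section(section):
    """Try every option subset and keep the one with the lowest rating."""
    best_options, best_time = None, None
    for mask in product((False, True), repeat=len(OPTIONS)):
        enabled = frozenset(o for o, on in zip(OPTIONS, mask) if on)
        t = rate(section, enabled)
        if best_time is None or t < best_time:
            best_options, best_time = enabled, t
    return best_options


def tune_program(sections):
    """Tune each section independently; sections may end up with different options."""
    return {name: tune_section(sec) for name, sec in sections.items()}


if __name__ == "__main__":
    for name, opts in tune_program(SECTIONS).items():
        print(name, sorted(opts))
```

In this toy setup the two sections prefer different options, which is the situation that whole-program tuning cannot exploit: a single global setting would have to compromise between them.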
Dynamic Compilation, Performance Tuning, Optimization Orchestration
Z. Pan and R. Eigenmann, "Fast, automatic, procedure-level performance tuning," 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT), Seattle, WA, USA, 2006, pp. 173-181.