Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2013)
Edinburgh, United Kingdom
Sept. 7, 2013 to Sept. 11, 2013
Majedul Haque Sujon, Dept. of Comput. Sci., Univ. of Texas at San Antonio, San Antonio, TX, USA
R. Clint Whaley, Sch. of EE & CS, Louisiana State Univ., Baton Rouge, LA, USA
Qing Yi, Dept. of Comput. Sci., Univ. of Colorado, Colorado Springs, CO, USA
Modern architectures increasingly rely on SIMD vectorization to improve performance for floating-point-intensive scientific applications. However, existing compiler optimization techniques for automatic vectorization are inhibited by the presence of unknown control flow surrounding partially vectorizable computations. In this paper, we present a new approach, speculative vectorization, which speculates past dependent branches to aggressively vectorize computational paths that are expected to be taken frequently at runtime, while simply restarting the calculation using scalar instructions when the speculation fails. We have integrated our technique into an iterative optimizing compiler and have employed empirical tuning to select the profitable paths for speculation. When applied to optimize 9 floating-point benchmarks, our optimizing compiler achieved up to 6.8X speedup for single-precision and 3.4X for double-precision kernels using AVX, while vectorizing some operations considered not vectorizable by prior techniques.
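The idea in the abstract can be illustrated with a small sketch. It is not the authors' generated code; it is a hand-written C approximation under stated assumptions. The example kernel (index of the largest absolute value), the function names iamax_scalar and iamax_speculative, the helper absf, and the lane count VLEN are all chosen here for illustration. A fixed-width inner loop stands in for one AVX vector operation: the block speculates that the branch is not taken in any lane, and if that guess fails, the block is simply recomputed with the scalar loop.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed lane count: 8 single-precision floats fill one 256-bit AVX register. */
#define VLEN 8

/* Hypothetical helper: absolute value without depending on libm. */
static float absf(float v) { return v < 0.0f ? -v : v; }

/* Reference scalar loop: index of the first element with the largest |x[i]|. */
size_t iamax_scalar(const float *x, size_t n)
{
    assert(n > 0);
    size_t imax = 0;
    float amax = absf(x[0]);
    for (size_t i = 1; i < n; i++) {
        float a = absf(x[i]);
        if (a > amax) {   /* dependent branch: amax feeds later iterations */
            amax = a;
            imax = i;
        }
    }
    return imax;
}

/* Speculative variant: assume the branch is rarely taken. */
size_t iamax_speculative(const float *x, size_t n)
{
    assert(n > 0);
    size_t imax = 0;
    float amax = absf(x[0]);
    size_t i = 1;

    for (; i + VLEN <= n; i += VLEN) {
        /* Speculated path: a branch-free test over VLEN lanes, standing in
           for one vector compare followed by a mask test. */
        int any = 0;
        for (size_t k = 0; k < VLEN; k++)
            any |= (absf(x[i + k]) > amax);

        if (any) {
            /* Speculation failed: redo this block with scalar instructions.
               Nothing was written on the fast path, so no state is undone. */
            for (size_t k = 0; k < VLEN; k++) {
                float a = absf(x[i + k]);
                if (a > amax) {
                    amax = a;
                    imax = i + k;
                }
            }
        }
    }

    /* Scalar cleanup for the remaining n mod VLEN elements. */
    for (; i < n; i++) {
        float a = absf(x[i]);
        if (a > amax) {
            amax = a;
            imax = i;
        }
    }
    return imax;
}
```

In this sketch the fast path only reads data, so recovery is just a scalar re-run of the same block, and both functions return the same index for any input. Whether such speculation pays off depends on how often the branch is actually taken, which is why the abstract pairs the technique with empirical tuning to select paths.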
Vectors, Kernel, Optimization, Algorithm design and analysis, Optimizing compilers, Benchmark testing, Safety
M. H. Sujon, R. C. Whaley and Q. Yi, "Vectorization past dependent branches through speculation," Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT), Edinburgh, United Kingdom, 2013, pp. 353-362.