Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (2009)
Raleigh, North Carolina, USA
Sept. 12, 2009 to Sept. 16, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2009.11
Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier transforms (FFTs) present a far more complex search space, consisting of many thousands of different implementations with widely varying access patterns, nesting, and code structures. As a result, some of the best available FFT implementations use heuristic search based on runtime measurements. In this paper we present the first analytical model that can successfully replace the measurement in this search on modern platforms. The model includes many details of the platform's memory system, including the TLBs and, for the first time, physically addressed caches and hardware prefetching. The effect, as we show, is a dramatically reduced search time to find the best FFT without significant loss in performance. Even though our model is adapted to the FFT in this paper, its underlying structure should be applicable to a much larger set of code structures and hence is a candidate for iterative compilation.
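The idea of letting a model stand in for runtime measurement during the search can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the constants `MAX_LEAF`, `CACHE_ELEMS`, and `MISS_PENALTY`, the functions `leaf_cost`, `step_cost`, and `best_plan`, and above all the toy cost formula, which is not the model presented in the paper. The sketch only shows the search structure: candidate Cooley-Tukey splits n = r * m are ranked by an estimated cost instead of by timing generated code.

```python
import math
from functools import lru_cache

# All constants below are hypothetical placeholders, not values from the paper.
MAX_LEAF = 32        # assumed largest size handled by a straight-line kernel
CACHE_ELEMS = 4096   # assumed cache capacity, in complex elements
MISS_PENALTY = 8.0   # assumed extra cost per element for strided out-of-cache access


def leaf_cost(n):
    """Toy cost of an unrolled kernel of size n (roughly n * log2(n) operations)."""
    return n * math.log2(n)


def step_cost(n, r, m):
    """Toy cost of one split n = r * m: twiddle work plus a crude memory term.

    The memory term grows with the stride m once the problem exceeds the
    assumed cache size. A real model would account for TLBs, physically
    addressed caches, and prefetching; this placeholder does not.
    """
    cost = float(n)
    if n > CACHE_ELEMS:
        cost += MISS_PENALTY * n * min(1.0, m / CACHE_ELEMS)
    return cost


@lru_cache(maxsize=None)
def best_plan(n):
    """Return (estimated_cost, plan); plan is a leaf size or a pair (plan_r, plan_m)."""
    candidates = []
    if n <= MAX_LEAF:
        candidates.append((leaf_cost(n), n))
    for r in range(2, n):
        if n % r == 0:
            m = n // r
            cost_r, plan_r = best_plan(r)
            cost_m, plan_m = best_plan(m)
            # m sub-transforms of size r, then r sub-transforms of size m.
            total = m * cost_r + r * cost_m + step_cost(n, r, m)
            candidates.append((total, (plan_r, plan_m)))
    if not candidates:
        raise ValueError(f"no plan for size {n} in this toy search space")
    # Rank by estimated cost only; no code is generated or timed.
    return min(candidates, key=lambda c: c[0])


def leaf_product(plan):
    """Multiply the leaf sizes of a plan; equals the transform size for a valid plan."""
    if isinstance(plan, int):
        return plan
    return leaf_product(plan[0]) * leaf_product(plan[1])


def max_leaf(plan):
    """Largest leaf size that appears in a plan."""
    if isinstance(plan, int):
        return plan
    return max(max_leaf(plan[0]), max_leaf(plan[1]))


if __name__ == "__main__":
    cost, plan = best_plan(16384)
    print(cost, plan)
```

Swapping `step_cost` and `leaf_cost` for a function that times generated code would turn the same search into the measurement-based variant. The search loop itself is unchanged, which is why replacing the evaluation step is what shortens the search.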
automatic performance tuning, discrete Fourier transform, FFT, high-performance computing, library generators, model-driven optimization, program optimization
Basilio B. Fraguela, Markus Püschel, Yevgen Voronenko, "Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling", Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, pp. 271-280, 2009, doi:10.1109/PACT.2009.11