Parallel and Distributed Processing Symposium, International (2009)
Rome, Italy
May 23, 2009 to May 29, 2009
ISBN: 978-1-4244-3751-1
pp: 1-12
Ananta Tiwari, University of Maryland, Department of Computer Science, College Park, 20740 USA
Chun Chen, University of Utah, School of Computing, Salt Lake City, 84112 USA
Jacqueline Chame, University of Southern California, Information Sciences Institute, Marina del Rey, 90292 USA
Mary Hall, University of Utah, School of Computing, Salt Lake City, 84112 USA
Jeffrey K. Hollingsworth, University of Maryland, Department of Computer Science, College Park, 20740 USA
ABSTRACT
We describe a scalable and general-purpose framework for auto-tuning compiler-generated code. We combine Active Harmony's parallel search backend with the CHiLL compiler transformation framework to generate, in parallel, a set of alternative implementations of computation kernels and automatically select the best-performing one. For the tested kernels, the resulting compiler-generated code achieves performance comparable to the fully automated version of the ATLAS library. The tuned kernels run 1.4 to 3.6 times faster than code produced by the native Intel compiler without search. Our search algorithm simultaneously evaluates different combinations of compiler optimizations and converges to solutions in only a few tens of search steps.
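As a rough illustration of evaluating several optimization combinations at once and keeping the best, the sketch below runs a simple neighborhood search over two hypothetical parameters (tile size and unroll factor). It is not Active Harmony's search algorithm or CHiLL's interface: the parameter lists, the `measure` cost function (a synthetic stand-in for generating, compiling, and timing a variant), and all function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space: tile size and unroll factor for one loop nest.
TILE_SIZES = [8, 16, 32, 64, 128]
UNROLL_FACTORS = [1, 2, 4, 8]


def measure(config):
    """Synthetic stand-in for 'generate the variant, compile it, time it'.

    Returns a made-up cost that is lowest at tile=32, unroll=4.
    A real tuner would return a measured run time instead.
    """
    tile, unroll = config
    return abs(tile - 32) / 32 + abs(unroll - 4) / 4 + 1.0


def neighbors(config):
    """Configurations one step away along each parameter axis."""
    ti = TILE_SIZES.index(config[0])
    ui = UNROLL_FACTORS.index(config[1])
    result = []
    for dt, du in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        t, u = ti + dt, ui + du
        if 0 <= t < len(TILE_SIZES) and 0 <= u < len(UNROLL_FACTORS):
            result.append((TILE_SIZES[t], UNROLL_FACTORS[u]))
    return result


def parallel_search(start, max_steps=20, workers=4):
    """Evaluate all neighbors of the current best point concurrently,
    move to the cheapest one, and stop when no neighbor improves."""
    best, best_cost = start, measure(start)
    steps = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_steps):
            candidates = neighbors(best)
            costs = list(pool.map(measure, candidates))
            steps += 1
            cost, candidate = min(zip(costs, candidates))
            if cost >= best_cost:
                break  # local minimum: no neighbor is better
            best, best_cost = candidate, cost
    return best, best_cost, steps


if __name__ == "__main__":
    print(parallel_search((128, 1)))
```

Each search step costs roughly one variant evaluation in wall-clock time, because the candidates of a step are measured concurrently; that is the sense in which a parallel backend keeps the number of sequential steps small.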
CITATION
Ananta Tiwari, Chun Chen, Jacqueline Chame, Mary Hall, Jeffrey K. Hollingsworth, "A scalable auto-tuning framework for compiler optimization", Parallel and Distributed Processing Symposium, International, pp. 1-12, 2009, doi:10.1109/IPDPS.2009.5161054