Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622) (2000)
Oct. 15, 2000 to Oct. 19, 2000
Kang Su Gatlin, University of California at San Diego
Larry Carter, University of California at San Diego
The Fast Fourier Transform (FFT) is one of the most important algorithms in computational science, accounting for large amounts of computing time. One major problem with modern FFT implementations is that they scale poorly to large problems. As the problem size increases, stride and associativity effects play a larger role. The result is a severe drop-off in performance. We use architecture-cognizance, a method for exploiting the interaction between architecture, compiler, and algorithm, to create a more scalable FFT package based on FFTW. Experiments validate our approach on four architectures: two generations of HPs (PA-8000 and PA-8500), an IBM POWER2, and a DEC Alpha 21164a. Performance increases of up to 65% are obtained.
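The stride and associativity effect mentioned in the abstract can be illustrated with a small sketch. It is not the authors' method or code. It only counts how many distinct cache sets a strided memory walk touches in a simple set-indexed cache model. The function name `cache_sets_touched` and the cache parameters (32-byte lines, 256 sets) are hypothetical values chosen for illustration, not those of the machines listed above.

```python
def cache_sets_touched(stride_bytes, accesses, line_bytes=32, num_sets=256):
    """Count distinct cache sets hit by `accesses` loads spaced `stride_bytes` apart.

    Assumes a simple model in which the set index is
    (address // line_bytes) % num_sets. All parameters are illustrative.
    """
    sets = set()
    for i in range(accesses):
        addr = i * stride_bytes
        sets.add((addr // line_bytes) % num_sets)
    return len(sets)


# A power-of-two stride equal to line_bytes * num_sets (8192 bytes here)
# sends every access to the same set, so only `associativity` lines can
# be resident at once.
conflicting = cache_sets_touched(8192, 64)

# Padding the stride by one cache line spreads the same 64 accesses
# over 64 different sets.
padded = cache_sets_touched(8192 + 32, 64)
```

Large power-of-two FFTs naturally produce power-of-two strides, which is why, in a model like this one, conflict misses grow with problem size even when the total data touched per pass is small.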
memory hierarchy, cache, TLB, divide-and-conquer, compiler optimization, runtime systems, feedback, ILP, associativity, registers
K. S. Gatlin and L. Carter, "Faster FFTs via Architecture-Cognizance," Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622) (PACT), Philadelphia, Pennsylvania, 2000, p. 249.