High Performance Computing and Grid in Asia Pacific Region, International Conference on (2005)
Beijing, China
Nov. 30, 2005 to Dec. 3, 2005
ISBN: 0-7695-2486-9
pp: 45-52
Paolo D'Alberto, Carnegie Mellon University
Alexandru Nicolau, University of California at Irvine
ABSTRACT
<p>Strassen's algorithm has practical performance benefits for architectures with simple memory hierarchies, because it trades computationally expensive matrix multiplications (MM) for cheaper matrix additions (MA). However, it offers no advantage on high-performance architectures with deep memory hierarchies, because MAs offer little data reuse.</p> <p>We present an easy-to-use adaptive algorithm that combines Strassen's recursion with a highly tuned version of ATLAS MM. Specifically, we add a final step to the ATLAS installation process that determines whether Strassen's algorithm can achieve any speedup on the target system. The resulting recursive algorithm achieves up to 30% speedup over ATLAS alone. We show experimental results for 14 different systems.</p>
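As a rough illustration of the idea in the abstract, the sketch below applies Strassen's recursion above a size threshold and hands smaller blocks to a conventional multiply. It is not the authors' implementation: `matmul_base` is a plain triple loop standing in for a tuned DGEMM kernel, the `cutoff` parameter and its default are hypothetical (the abstract says only that an installation step decides whether Strassen's algorithm pays off), and falling back to the base kernel for odd sizes is a simplification made here.

```python
def matmul_base(A, B):
    """Plain triple-loop multiply; a stand-in for a tuned DGEMM kernel."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a = A[i][k]
            for j in range(p):
                C[i][j] += a * B[k][j]
    return C


def add(X, Y):
    """Element-wise matrix addition (the 'MA' of the abstract)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Element-wise matrix subtraction."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split an even-sized square matrix into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=64):
    """Multiply square matrices, recursing only above `cutoff`.

    `cutoff` is a hypothetical tuning knob; odd sizes fall back to the
    base kernel, which is a simplification of this sketch.
    """
    n = len(A)
    if n <= cutoff or n % 2:
        return matmul_base(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products replace the eight of the classic scheme.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    # Recombine the products into the four quadrants of the result.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

Each level of recursion saves one multiplication at the cost of 18 extra additions and subtractions, which is why the choice of threshold matters: below some size the extra memory traffic of the additions can outweigh the saved multiply.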
CITATION
Paolo D'Alberto, Alexandru Nicolau, "Adaptive Strassen and ATLAS's DGEMM: A Fast Square-Matrix Multiply for Modern High-Performance Systems", High Performance Computing and Grid in Asia Pacific Region, International Conference on, pp. 45-52, 2005, doi:10.1109/HPCASIA.2005.18