Parallel Computing in Electrical Engineering, 2004. International Conference on (2004)
Sept. 7, 2004 to Sept. 10, 2004
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PCEE.2004.43
M. Fleury, University of Essex, UK
G. Tsilikas, University of Essex, UK
Cache-oblivious algorithms for matrix multiplication are confirmed as an effective way of exploiting Intel architecture shared-memory multiprocessors. Their performance also remains consistent across a wide range of matrix sizes. The Cilk programming environment remains an effective way of implementing this type of algorithm, but the need for portability and for a compiler upgrade route means that a portability library is a competitive alternative. The paper considers the interaction of matrix multiplication algorithms with the memory hierarchy, as well as multithreading across differing operating system variants and compilers.
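For readers unfamiliar with the term, the following is a minimal sketch of the general cache-oblivious (recursive divide-and-conquer) approach to matrix multiplication. It is not the authors' implementation, which the abstract does not reproduce. The function names `matmul_recursive` and `matmul`, the `base` cutoff, and the choice of plain Python lists are all assumptions made for illustration.

```python
def matmul_recursive(A, B, C, i0, i1, j0, j1, k0, k1, base=16):
    """Accumulate A[i0:i1, k0:k1] * B[k0:k1, j0:j1] into C[i0:i1, j0:j1].

    Hypothetical helper: halves the largest dimension until the block is
    small, without ever referring to an explicit cache size.
    """
    base = max(base, 1)  # a cutoff below 1 would prevent termination
    di, dj, dk = i1 - i0, j1 - j0, k1 - k0
    if max(di, dj, dk) <= base:
        # Base case: ordinary triple loop on a small block.
        for i in range(i0, i1):
            for k in range(k0, k1):
                a = A[i][k]
                for j in range(j0, j1):
                    C[i][j] += a * B[k][j]
        return
    if di >= dj and di >= dk:
        # Split the rows of A and C; the two halves write disjoint parts of C.
        mid = i0 + di // 2
        matmul_recursive(A, B, C, i0, mid, j0, j1, k0, k1, base)
        matmul_recursive(A, B, C, mid, i1, j0, j1, k0, k1, base)
    elif dj >= dk:
        # Split the columns of B and C; again the halves are independent.
        mid = j0 + dj // 2
        matmul_recursive(A, B, C, i0, i1, j0, mid, k0, k1, base)
        matmul_recursive(A, B, C, i0, i1, mid, j1, k0, k1, base)
    else:
        # Split the inner dimension; both halves accumulate into the same C.
        mid = k0 + dk // 2
        matmul_recursive(A, B, C, i0, i1, j0, j1, k0, mid, base)
        matmul_recursive(A, B, C, i0, i1, j0, j1, mid, k1, base)


def matmul(A, B, base=16):
    """Return A * B for list-of-lists matrices (illustrative wrapper)."""
    n, m = len(A), len(B)
    p = len(B[0]) if B else 0
    C = [[0] * p for _ in range(n)]
    matmul_recursive(A, B, C, 0, n, 0, p, 0, m, base)
    return C
```

In this sketch the row and column splits touch disjoint regions of `C`, so in a multithreaded setting such as Cilk those two recursive calls could in principle run in parallel, whereas the two halves of an inner-dimension split update the same entries and would need to run one after the other.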
M. Fleury, G. Tsilikas, "Matrix Multiplication Performance on Commodity Shared-Memory Multiprocessors", Parallel Computing in Electrical Engineering, 2004. International Conference on, vol. 00, pp. 13-18, 2004, doi:10.1109/PCEE.2004.43