Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2001)
Barcelona, Spain
Sept. 8, 2001 to Sept. 12, 2001
ISBN: 0-7695-1363-8
pp: 0083
Mateo Valero, Universitat Politècnica de Catalunya
Jesus Corbal, Universitat Politècnica de Catalunya
Roger Espasa, Universitat Politècnica de Catalunya
Abstract: Many important multimedia applications contain a significant fraction of reduction operations. Although multimedia applications are, in general, characterized by high amounts of data-level parallelism, reductions and accumulations are difficult to parallelize and show poor tolerance to increases in instruction latency. This is especially significant for µ-SIMD extensions such as MMX or AltiVec. To overcome the problem of reductions in µ-SIMD ISAs, designers tend to include increasingly complex instructions that handle the most common forms of reductions in multimedia. As the number of processor pipeline stages grows, the number of cycles needed to execute these multimedia instructions increases with every processor generation, severely compromising performance. This paper presents an in-depth discussion of how reductions/accumulations are performed in current µ-SIMD architectures and evaluates the performance trade-offs for near-future, highly aggressive superscalar processors with three different styles of µ-SIMD extensions. We compare an MMX-like alternative to an MDMX-like extension that has packed accumulators to attack the reduction problem, and we also compare it to MOM, a matrix register ISA. We show that while packed accumulators present several advantages, they introduce artificial recurrences that severely degrade performance for processors with a high number of registers and long-latency operations. On the other hand, this paper demonstrates that longer SIMD media extensions such as MOM can take great advantage of accumulators by exploiting the associative parallelism implicit in reductions.
Mateo Valero, Jesus Corbal, Roger Espasa, "On the Efficiency of Reductions in µ-SIMD Media Extensions", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 0083, 2001, doi:10.1109/PACT.2001.953290
