2007 IEEE International Symposium on Performance Analysis of Systems & Software
Performance Impact of Unaligned Memory Operations in SIMD Extensions for Video Codec Applications
San Jose, CA, April 25-April 27
ISBN: 1-4244-1081-9
Although SIMD extensions are a cost-effective way to exploit the data-level parallelism present in most media applications, we show that they have a very limited memory architecture with weak support for unaligned memory accesses. In video codecs and other applications, accessing unaligned positions without efficient architectural support incurs a large performance penalty and in some cases makes vectorization counterproductive. In this paper we analyze the performance impact of extending the Altivec SIMD ISA with unaligned memory operations. Results show that for several kernels in the H.264/AVC media codec, unaligned access support provides a speedup of up to 3.8x over the plain SIMD version, translating into an average speedup of 1.2x for the entire application. In addition to providing a significant performance advantage, unaligned memory instructions make SIMD code much easier to write, both for the manual developer and for the auto-vectorizing compiler.
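To make the overhead concrete, here is a minimal sketch in portable C of the software realignment that an aligned-only vector unit typically requires. It is an illustration, not code from the paper: the function names load_aligned and load_unaligned_emulated are hypothetical, and the byte array mem is assumed to start at a 16-byte boundary. It mirrors the common Altivec idiom in which two aligned loads (vec_ld, which ignores the low four address bits) are combined with a permute (vec_perm) driven by the misalignment (vec_lvsl).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16

/* Emulated aligned vector load: the low 4 bits of the offset are ignored,
 * as an aligned-only load instruction would do.
 * Assumption: mem[0] sits on a 16-byte boundary. */
static void load_aligned(const uint8_t *mem, size_t offset,
                         uint8_t out[VEC_BYTES])
{
    size_t aligned = offset & ~(size_t)(VEC_BYTES - 1);
    memcpy(out, mem + aligned, VEC_BYTES);
}

/* Software realignment: two aligned loads plus a byte selection that
 * plays the role of the permute. A hardware unaligned load would replace
 * all three steps with a single instruction. */
static void load_unaligned_emulated(const uint8_t *mem, size_t offset,
                                    uint8_t out[VEC_BYTES])
{
    uint8_t lo[VEC_BYTES], hi[VEC_BYTES];
    size_t shift = offset & (VEC_BYTES - 1);   /* misalignment in bytes */

    load_aligned(mem, offset, lo);
    /* offset + 15 selects the same block as lo when already aligned,
     * and the following block otherwise. */
    load_aligned(mem, offset + VEC_BYTES - 1, hi);

    for (size_t i = 0; i < VEC_BYTES; i++) {
        size_t j = i + shift;
        out[i] = (j < VEC_BYTES) ? lo[j] : hi[j - VEC_BYTES];
    }
}
```

Calling load_unaligned_emulated(mem, 5, out) fills out with mem[5] through mem[20]. The extra load and the selection step are paid on every unaligned access, which is the kind of overhead that dedicated unaligned memory instructions are meant to remove.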
Index Terms:
auto vectorizing compiler, unaligned memory operations, SIMD extensions, video codec applications, data level parallelism, memory architecture, unaligned memory accesses, H.264/AVC media codec
Citation:
M. Alvarez, E. Salami, A. Ramirez, M. Valero, "Performance Impact of Unaligned Memory Operations in SIMD Extensions for Video Codec Applications," ispass, pp.62-71, 2007 IEEE International Symposium on Performance Analysis of Systems&Software, 2007