2007 IEEE International Symposium on Performance Analysis of Systems & Software
Performance Impact of Unaligned Memory Operations in SIMD Extensions for Video Codec Applications
San Jose, CA
April 25-April 27
ISBN: 1-4244-1081-9
M. Alvarez, Dept. of Comput. Archit., Univ. Politecnica de Catalunya
E. Salami, Dept. of Comput. Archit., Univ. Politecnica de Catalunya
A. Ramirez, Dept. of Comput. Archit., Univ. Politecnica de Catalunya
M. Valero, Dept. of Comput. Archit., Univ. Politecnica de Catalunya
Although SIMD extensions are a cost-effective way to exploit the data-level parallelism present in most media applications, we show that they have a very limited memory architecture with weak support for unaligned memory accesses. In video codecs and other applications, accessing unaligned positions without efficient architectural support incurs a large performance penalty and in some cases makes vectorization counter-productive. In this paper we analyze the performance impact of extending the AltiVec SIMD ISA with unaligned memory operations. Results show that for several kernels in the H.264/AVC media codec, unaligned access support provides a speedup of up to 3.8x compared to the plain SIMD version, translating into an average speedup of 1.2x for the entire application. In addition to providing a significant performance advantage, unaligned memory instructions make programming SIMD code much easier for both the manual developer and the auto-vectorizing compiler.
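To illustrate the overhead the abstract refers to, the following is a minimal sketch in portable C of the software realignment commonly used when only aligned vector loads are available: two aligned 16-byte loads followed by a byte permute. It mirrors the usual AltiVec idiom built from vec_ld, vec_lvsl and vec_perm, but it is a scalar emulation, not code from the paper. The names load_aligned and load_unaligned_sw are hypothetical, and alignment is assumed to be relative to index 0 of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16

/* Hypothetical helper: emulates an aligned 16-byte vector load.
   The low four bits of the index are ignored, as an aligned-only
   load instruction would do. Assumes index 0 of base is aligned. */
static void load_aligned(const uint8_t *base, size_t byte_index,
                         uint8_t out[VEC_BYTES]) {
    size_t aligned = byte_index & ~(size_t)(VEC_BYTES - 1);
    memcpy(out, base + aligned, VEC_BYTES);
}

/* Hypothetical helper: software realignment. Costs two aligned loads
   plus a permute, which is the overhead that a single hardware
   unaligned load would avoid. The caller must make sure the aligned
   block containing byte_index + 15 lies inside the buffer. */
static void load_unaligned_sw(const uint8_t *base, size_t byte_index,
                              uint8_t out[VEC_BYTES]) {
    uint8_t lo[VEC_BYTES], hi[VEC_BYTES];
    size_t shift = byte_index & (VEC_BYTES - 1);

    assert(base != NULL);

    /* First aligned block, containing the first requested byte. */
    load_aligned(base, byte_index, lo);
    /* Second aligned block, containing the last requested byte.
       When shift is 0 this is the same block as lo. */
    load_aligned(base, byte_index + VEC_BYTES - 1, hi);

    /* Permute step: pick 16 consecutive bytes from the lo:hi pair,
       starting at offset shift. */
    for (size_t i = 0; i < VEC_BYTES; i++) {
        size_t src = shift + i;
        out[i] = (src < VEC_BYTES) ? lo[src] : hi[src - VEC_BYTES];
    }
}
```

Calling load_unaligned_sw(buf, 5, out) on a buffer of at least 32 bytes fills out with buf[5] through buf[20]. In motion compensation and similar video kernels the offset changes from block to block, so this extra load and permute would be paid on nearly every access.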
Index Terms:
auto vectorizing compiler, unaligned memory operations, SIMD extensions, video codec applications, data level parallelism, memory architecture, unaligned memory accesses, H.264/AVC media codec
Citation:
M. Alvarez, E. Salami, A. Ramirez, M. Valero, "Performance Impact of Unaligned Memory Operations in SIMD Extensions for Video Codec Applications," ISPASS, pp. 62-71, 2007 IEEE International Symposium on Performance Analysis of Systems & Software, 2007