Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing
Reducing 3D Wavelet Transform Execution Time through the Streaming SIMD Extensions
Genova, Italy
February 5-7, 2003
ISBN: 0-7695-1875-3
Gregorio Bernabé, Universidad de Murcia
José M. García, Universidad de Murcia
José González, Intel Barcelona Research Center
This paper focuses on reducing the execution time of video compression algorithms based on the 3D wavelet transform. We present several optimizations that could not be applied by the compiler due to the characteristics of the algorithm. First, we use the Streaming SIMD Extensions (SSE) for some of the dimensions of the sequence (y and time) in order to reduce the number of floating-point instructions, exploiting Data Level Parallelism. Then, we apply loop unrolling and data prefetching to critical parts of the code. Finally, the algorithm is vectorized by columns, allowing the use of SIMD instructions for the y dimension. Results show improvements of up to 1.54x over a version compiled with the maximum optimizations of the Intel C/C++ compiler. Our experiments also show that allowing the compiler to perform some of these optimizations (i.e., automatic code vectorization) causes a performance slowdown, which demonstrates the effectiveness of our optimizations.
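As a rough illustration of what "vectorized by columns" can mean for the y dimension, the sketch below applies one filtering step along y to four adjacent columns at a time using SSE intrinsics. It is not the authors' code: the abstract does not name the wavelet filter, so a simple Haar-style average/difference pair is assumed here, and the function names (`haar_y_scalar`, `haar_y_sse`), the row-major frame layout, and the requirement that the width be a multiple of 4 are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h> /* SSE intrinsics: __m128, _mm_add_ps, ... */

/* Scalar reference. Assumption: a Haar-style step along y on a
 * row-major frame; each pair of rows yields one low-pass row and
 * one high-pass row. */
void haar_y_scalar(const float *in, float *low, float *high,
                   size_t width, size_t height)
{
    for (size_t y = 0; y + 1 < height; y += 2) {
        for (size_t x = 0; x < width; x++) {
            float a = in[y * width + x];
            float b = in[(y + 1) * width + x];
            low[(y / 2) * width + x]  = 0.5f * (a + b);
            high[(y / 2) * width + x] = 0.5f * (a - b);
        }
    }
}

/* Column-vectorized variant: the filter still runs along y, but four
 * neighbouring columns are processed by each SSE instruction, so the
 * loads and stores touch contiguous memory within a row. */
void haar_y_sse(const float *in, float *low, float *high,
                size_t width, size_t height)
{
    assert(width % 4 == 0); /* simplifying assumption for this sketch */
    const __m128 half = _mm_set1_ps(0.5f);

    for (size_t y = 0; y + 1 < height; y += 2) {
        const float *row0 = in + y * width;
        const float *row1 = in + (y + 1) * width;
        float *lo = low + (y / 2) * width;
        float *hi = high + (y / 2) * width;

        for (size_t x = 0; x < width; x += 4) {
            /* Unaligned loads keep the sketch independent of allocation. */
            __m128 a = _mm_loadu_ps(row0 + x);
            __m128 b = _mm_loadu_ps(row1 + x);
            _mm_storeu_ps(lo + x, _mm_mul_ps(half, _mm_add_ps(a, b)));
            _mm_storeu_ps(hi + x, _mm_mul_ps(half, _mm_sub_ps(a, b)));
        }
    }
}
```

Both functions take the same arguments, so the scalar version can serve as a correctness check for the SSE one. Loop unrolling and data prefetching, which the abstract also mentions, are left out to keep the example short.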
Citation:
Gregorio Bernabé, José M. García, José González, "Reducing 3D Wavelet Transform Execution Time through the Streaming SIMD Extensions," pdp, pp.49, Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2003