Parallel Implementation of the 2D Discrete Wavelet Transform on Graphics Processing Units: Filter Bank versus Lifting
Issue No. 03 - March (2008 vol. 19)
Javier Setoain, IEEE
Manuel Prieto, IEEE
Luis Piñuel, IEEE
Francisco Tirado, IEEE
The widespread usage of the Discrete Wavelet Transform (DWT) has motivated the development of fast DWT algorithms and their tuning on all sorts of computer systems. Several studies have compared the performance of the most popular schemes, known as Filter Bank (FBS) and Lifting (LS), and have always concluded that Lifting is the most efficient option. However, there is no such study on streaming processors such as modern Graphics Processing Units (GPUs). Current trends have transformed these devices into powerful stream processors with enough flexibility to perform intensive and complex floating-point calculations. The opportunities opened up by these platforms, as well as the growing popularity of the DWT within the computer graphics field, make a new performance comparison of great practical interest. Our study indicates that FBS outperforms LS in current-generation GPUs. In our experiments, the actual FBS gains range between 10% and 140%, depending on the problem size and the type and length of the wavelet filter. Moreover, design trends suggest higher gains in future-generation GPUs.
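For readers unfamiliar with the two schemes, the following is a minimal sketch of how they differ on one level of a 1D transform. It is not the authors' GPU implementation: the choice of the LeGall 5/3 wavelet, the periodic boundary handling, the even-length input, and the function names `dwt53_filter_bank` and `dwt53_lifting` are all assumptions made for illustration. A 2D transform would apply such a 1D pass along rows and then along columns.

```python
def dwt53_filter_bank(x):
    """Filter-bank form: convolve with low/high-pass filters and keep every other sample.

    Assumes len(x) is even and treats the signal as periodic (illustrative choice).
    """
    n = len(x)
    low = (-1 / 8, 1 / 4, 3 / 4, 1 / 4, -1 / 8)  # 5-tap low-pass, centred on x[2i]
    high = (-1 / 2, 1.0, -1 / 2)                 # 3-tap high-pass, centred on x[2i+1]
    approx = [
        sum(low[k] * x[(2 * i + k - 2) % n] for k in range(5))
        for i in range(n // 2)
    ]
    detail = [
        sum(high[k] * x[(2 * i + k) % n] for k in range(3))
        for i in range(n // 2)
    ]
    return approx, detail


def dwt53_lifting(x):
    """Lifting form: split into even/odd samples, then a predict and an update step.

    Same assumptions as above (even length, periodic extension).
    """
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample minus the average of its two even neighbours.
    detail = [odd[i] - 0.5 * (even[i] + even[(i + 1) % half]) for i in range(half)]
    # Update: each even sample corrected with the two neighbouring details.
    approx = [even[i] + 0.25 * (detail[(i - 1) % half] + detail[i]) for i in range(half)]
    return approx, detail


if __name__ == "__main__":
    signal = [1, 4, 2, 8, 5, 7, 3, 6]
    print(dwt53_filter_bank(signal))
    print(dwt53_lifting(signal))
```

Both functions compute the same coefficients. In the filter-bank form every output depends only on input samples, while in the lifting form the update step consumes the results of the predict step; the lifting form uses fewer multiplications per output in exchange for that dependency.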
Graphics processors, Parallel processing, Parallel algorithms, Parallel and vector implementations, Wavelets and fractals, SIMD processors, Optimization
F. Tirado, J. Setoain, M. Prieto, C. Tenllado and L. Piñuel, "Parallel Implementation of the 2D Discrete Wavelet Transform on Graphics Processing Units: Filter Bank versus Lifting," in IEEE Transactions on Parallel & Distributed Systems, vol. 19, no. 3, pp. 299-310, 2008.