Parallel Architectures, Algorithms and Programming, International Symposium on (2010)
Dalian, Liaoning China
Dec. 18, 2010 to Dec. 20, 2010
ISBN: 978-0-7695-4312-3
pp: 138-143
In the field of network-based publishing and printing, there are enormous numbers of TIFF (CMYK) images, which require large amounts of storage space and considerable bandwidth for transmission. The need to manipulate such large volumes of data raises the issue of fast lossless compression. The 2D integer wavelet transform can be used for lossless compression of static images; for example, the 5/3 lifting wavelet is used for lossless compression in JPEG2000. Today, multi-core (dual-, quad-, or eight-core) CPU technology helps to accelerate the wavelet transform. However, the acceleration achievable with current multi-core CPUs is limited. This article presents the acceleration of the 2D integer wavelet transform using CUDA. On an NVIDIA graphics processing unit (GPU), multi-threaded parallelization offers attractive advantages over traditional CPU computation. Using a dual-core CPU and a CUDA device, the article accelerates the Haar and 5/3 lifting wavelets on TIFF images. For the Haar wavelet, the original image matrix and the matrix of the transform result were analyzed and compared, which indicates that four adjacent pixels of the original image matrix directly produce the corresponding four pixels of the transform result. In addition, these four pixels are independent of the other pixels of the transform result. Therefore, the Haar wavelet transform can be parallelized with CUDA, with each kernel unit operating on four pixels. For the 5/3 lifting wavelet, there are four groups of experiments, each using two CUDA memory methods (global and texture memory), for eight experiments in total. First, the kernel uses only the row transform together with a transpose, with a row as the unit of work. Second, without a transpose, the kernel performs both row and column transforms, with a row as the unit of work. Third, the kernel again computes the row transform and a transpose, but the unit of work is a single pixel. Finally, the kernel computes the row and column transforms without a transpose, again with a single pixel as the unit of work.
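The four-pixel independence described above can be illustrated with a minimal CPU-side sketch in Python. It assumes the S-transform form of the integer Haar wavelet (difference plus floored average); the abstract does not state the exact formulation used in the paper, so the function names, the output ordering, and the in-place block layout here are illustrative assumptions, not the authors' kernel.

```python
# Sketch only: an integer Haar (S-transform) on one 2x2 block.
# All names are hypothetical; the paper's exact formulas are not given.

def s_forward(a, b):
    """Reversible integer Haar step on a pair of values."""
    d = a - b                 # high-pass: difference
    s = b + (d >> 1)          # low-pass: floor((a + b) / 2)
    return s, d

def s_inverse(s, d):
    """Exact inverse of s_forward."""
    b = s - (d >> 1)
    a = b + d
    return a, b

def haar_block_forward(a, b, c, d):
    """Transform the 2x2 block [[a, b], [c, d]]: rows first, then columns."""
    s0, d0 = s_forward(a, b)      # top row
    s1, d1 = s_forward(c, d)      # bottom row
    ll, lh = s_forward(s0, s1)    # column step on the low-pass values
    hl, hh = s_forward(d0, d1)    # column step on the high-pass values
    return ll, hl, lh, hh

def haar_block_inverse(ll, hl, lh, hh):
    """Recover the original 2x2 block from its four coefficients."""
    s0, s1 = s_inverse(ll, lh)
    d0, d1 = s_inverse(hl, hh)
    a, b = s_inverse(s0, d0)
    c, d = s_inverse(s1, d1)
    return a, b, c, d

def haar2d_forward(img):
    """Apply the block transform to every 2x2 block of an even-sized image.

    Each loop iteration reads and writes only its own four pixels, so the
    iterations are independent; on a GPU, each one could be a single thread.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            ll, hl, lh, hh = haar_block_forward(
                img[y][x], img[y][x + 1], img[y + 1][x], img[y + 1][x + 1])
            out[y][x], out[y][x + 1] = ll, hl
            out[y + 1][x], out[y + 1][x + 1] = lh, hh
    return out
```

Because no block depends on any other, the two loops carry no ordering constraint, which is the property that makes a one-thread-per-block mapping possible.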
Experimental results on an NVIDIA GeForce 9800GT and a dual-core CPU indicate that GPU acceleration becomes more pronounced as image resolution increases, for both the Haar and the 5/3 lifting wavelets. For the 5/3 lifting wavelet, the second group of experiments with texture memory runs about 15 times faster than the CPU and reduces the running time by 5000 ms.
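For orientation, the 5/3 lifting wavelet and the "row transform plus transpose" arrangement can be sketched on the CPU as follows. This is a sketch under stated assumptions: it uses the standard reversible 5/3 lifting steps with symmetric boundary extension, and it assumes even-length rows and a layout with the low-pass half followed by the high-pass half. The function names are hypothetical, and the paper's kernels, memory layout, and boundary handling may differ.

```python
# Sketch only: reversible 5/3 lifting on one row, plus a 2D transform built
# from "row transform, transpose, row transform, transpose". Names are
# hypothetical and boundary handling is an assumption (symmetric extension).

def lift53_forward(x):
    """One level of the integer 5/3 lifting transform on an even-length row."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    even, odd = x[0::2], x[1::2]
    d = []
    for i in range(half):
        # Predict step; mirror the right neighbour at the boundary.
        right = even[i + 1] if i + 1 < half else even[i]
        d.append(odd[i] - ((even[i] + right) >> 1))
    s = []
    for i in range(half):
        # Update step; mirror the left detail coefficient at the boundary.
        left = d[i - 1] if i > 0 else d[0]
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d

def lift53_inverse(s, d):
    """Exact inverse of lift53_forward (undo update, then undo predict)."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))
    return x

def transform_rows(img):
    """Transform every row; store the low-pass half, then the high-pass half."""
    out = []
    for row in img:
        s, d = lift53_forward(row)
        out.append(s + d)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def lift53_2d_via_transpose(img):
    """2D transform using only a row transform and a transpose."""
    return transpose(transform_rows(transpose(transform_rows(img))))
```

The transpose-based variant reuses one row routine for both directions at the cost of two extra passes over the data; the alternative described in the abstract transforms columns directly and skips the transposes.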
TIFF, GPU, wavelet transform, parallelism, multi-thread

Y. Xie, Z. Wei, S. Yu and Z. Sun, "GPU Acceleration of Integer Wavelet Transform for TIFF Image," Parallel Architectures, Algorithms and Programming, International Symposium on(PAAP), Dalian, Liaoning China, 2010, pp. 138-143.