Ninth IEEE International Symposium on Multimedia (ISM 2007) (2007)
Dec. 10, 2007 to Dec. 12, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ISM.2007.12
The 2D DWT consists of two 1D DWTs, one in each direction: horizontal filtering processes the rows, followed by vertical filtering, which processes the columns. It is well known that a straightforward implementation of vertical filtering performs quite differently for different working set sizes. The only reasonable explanation for this is the access behavior of the cache memory. Vertical filtering is known to suffer from mapping conflicts in the cache when the working set size is a power of two. However, it is not clear how these conflicts form and whether cache problems also exist with other data sizes. Such knowledge is the basis for efficient code optimization. To acquire this knowledge and to identify the optimization potential more accurately, we apply a cache visualization tool to examine the runtime cache activity of the vertical filtering implementation. We find that, besides mapping conflicts, vertical filtering also shows a large number of capacity misses. More specifically, the visualization tool allows us to detect the parameters related to the optimization strategies. This guarantees the feasibility of the optimization. Our initial experimental results on several different architectures show a gain of up to 215% in execution time compared to an already optimized baseline implementation.
Keywords: Discrete wavelet transform, memory performance, visualization tool, code optimization.
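As a rough illustration of why a power-of-two working set provokes mapping conflicts during vertical filtering, the following Python sketch counts how many cache sets a single column walk touches. The cache geometry (64-byte lines, 64 sets), the image size, and the function name `sets_touched` are assumptions chosen for illustration and are not taken from the paper; row padding is shown only as a commonly used remedy, not as the authors' optimization strategy.

```python
def sets_touched(row_bytes, rows, line_bytes=64, num_sets=64):
    """Count the distinct cache sets hit when walking down one column.

    Assumes a row-major image whose first element sits at address 0 and a
    set-associative cache indexed by (address // line_bytes) % num_sets.
    """
    return len({(r * row_bytes // line_bytes) % num_sets for r in range(rows)})


# Hypothetical 512 x 512 image of 4-byte floats: each row is 2048 bytes,
# so the column stride is a power of two.
pow2 = sets_touched(row_bytes=512 * 4, rows=512)

# Same image with each row padded by one (assumed) 64-byte cache line.
padded = sets_touched(row_bytes=512 * 4 + 64, rows=512)

print(pow2, padded)
```

Under these assumed parameters, the power-of-two row length maps all 512 column elements onto only 2 cache sets, whereas the padded layout spreads them over all 64 sets.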
J. Tao, B. Juurlink, W. Karl, R. Buchty, A. Shahbahrami and S. Vassiliadis, "Optimizing Cache Performance of the Discrete Wavelet Transform Using a Visualization Tool," Ninth IEEE International Symposium on Multimedia (ISM 2007), Taichung, Taiwan, 2007, pp. 153-160.