Parallel Processing Symposium, International (1996)
Honolulu, HI
Apr. 15, 1996 to Apr. 19, 1996
ISSN: 1063-7133
ISBN: 0-8186-7255-2
pp: 39
Rafael H. Saavedra , Computer Science Department, University of Southern California
Weihua Mao , Computer Science Department, University of Southern California
Daeyeon Park , Computer Science Department, University of Southern California
Jacqueline Chame , Computer Science Department, University of Southern California
Sungdo Moon , Computer Science Department, University of Southern California
ABSTRACT
Unimodular transformations, tiling, and software prefetching are loop optimizations known to be effective in increasing parallelism, reducing cache miss rates, and eliminating processor stall time. Although each of these optimizations is quite effective on its own, there is the expectation that even better improvements can be obtained by combining them. In this paper we show that this is indeed the case when unimodular transformations are combined with either tiling or software prefetching. However, our results also show that although combining tiling with prefetching tends to improve on the performance of tiling alone, in some situations tiling can degrade the cache performance of software prefetching. The reasons for this unexpected behavior are threefold: 1) tiling introduces interference misses inside the localized space, which are difficult to characterize with current techniques based on locality analysis; 2) prefetch predicates are computed using only estimates of the number of capacity misses, so the latency induced by cache interference is not completely covered; and 3) tiling limits the maximum amount of latency that can be masked with prefetching.
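As an illustration of how tiling and software prefetching interact, the following is a minimal sketch in C and is not code from the paper. The matrix size `N`, tile size `TILE`, prefetch distance `PF_DIST`, and both function names are hypothetical choices made for this example. The prefetch uses the GCC/Clang builtin `__builtin_prefetch` and is compiled out on other compilers. The prefetch predicate only issues a prefetch while the target still lies inside the current tile, which is one simple way to see why a tile boundary caps how far ahead a prefetch can usefully run.

```c
#include <assert.h>
#include <stddef.h>

#define N 64       /* matrix dimension (hypothetical) */
#define TILE 16    /* tile edge length (hypothetical) */
#define PF_DIST 8  /* prefetch distance in elements (hypothetical) */

static_assert(N % TILE == 0, "this sketch assumes TILE divides N");

/* Untiled reference: C += A * B. */
void matmul_ref(double A[N][N], double B[N][N], double C[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t k = 0; k < N; k++)
            for (size_t j = 0; j < N; j++)
                C[i][j] += A[i][k] * B[k][j];
}

/* Tiled over k and j, with a software prefetch on B in the inner loop. */
void matmul_tiled_prefetch(double A[N][N], double B[N][N], double C[N][N]) {
    for (size_t kk = 0; kk < N; kk += TILE)
        for (size_t jj = 0; jj < N; jj += TILE)
            for (size_t i = 0; i < N; i++)
                for (size_t k = kk; k < kk + TILE; k++) {
                    double a = A[i][k];
                    for (size_t j = jj; j < jj + TILE; j++) {
#if defined(__GNUC__) || defined(__clang__)
                        /* Prefetch predicate: stay inside the current tile,
                           so the usable distance is bounded by TILE. */
                        if (j + PF_DIST < jj + TILE)
                            __builtin_prefetch(&B[k][j + PF_DIST], 0, 1);
#endif
                        C[i][j] += a * B[k][j];
                    }
                }
}
```

A production version would typically issue one prefetch per cache line rather than one per element, and would choose `PF_DIST` from the measured memory latency; both are omitted here to keep the sketch short.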
CITATION

J. Chame, S. Moon, D. Park, W. Mao and R. H. Saavedra, "The Combined Effectiveness of Unimodular Transformations, Tiling, and Software Prefetching," Parallel Processing Symposium, International (IPPS), Honolulu, HI, 1996, pp. 39.
doi:10.1109/IPPS.1996.508037