Issue No. 07 - July (2006 vol. 17)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2006.88
Hwansoo Han, IEEE
<p><b>Abstract</b>: Irregular scientific codes experience poor cache performance due to their irregular memory access patterns. In this paper, we present two new locality-improving techniques for irregular scientific codes. Our techniques exploit geometric structures hidden in data access patterns and computation structures. Our new data reordering (G<scp>part</scp>) finds the graph structure within data accesses and applies hierarchical clustering. Quality partitions are constructed quickly by clustering multiple neighbor nodes, with priority given to nodes of high degree, and repeating for a few passes. Overhead is kept low by clustering multiple nodes in each pass and considering only edges between partitions. Our new computation reordering (Z-S<scp>ort</scp>) treats the values of index arrays as coordinates and reorders the corresponding computations in Z-curve order. Applied to dense inputs, Z-S<scp>ort</scp> achieves performance close to that of data reordering combined with other computation reordering, but without the overhead involved in data reordering. Experiments on irregular scientific codes for a variety of meshes show that locality optimization techniques are effective for both sequential and parallelized codes, improving performance by 60-87 percent. G<scp>part</scp> achieved within 1-2 percent of the performance of more sophisticated partitioning algorithms, but with one third of the overhead. Z-S<scp>ort</scp> also yields a performance improvement of 64 percent for dense inputs, which is comparable with data reordering combined with computation reordering.</p>
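The Z-curve computation reordering mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each computation (for example, one edge of a mesh) is described by a pair of entries in two index arrays, here given the hypothetical names left and right, and it uses a plain comparison sort on a bit-interleaved (Morton) key. The function names and the bits parameter are likewise assumptions made for illustration.

```python
def interleave_bits(x, y, bits=16):
    """Build a Z-curve (Morton) key from two non-negative integers.

    Bit i of x lands at position 2*i and bit i of y at position 2*i + 1.
    Values are assumed to fit in `bits` bits (an assumption of this sketch).
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def z_sort(left, right, bits=16):
    """Reorder computations so they are visited in Z-curve order.

    `left[i]` and `right[i]` are the two data indices touched by
    computation i; they are treated as 2-D coordinates.
    Returns the reordered copies of both index arrays.
    """
    order = sorted(
        range(len(left)),
        key=lambda i: interleave_bits(left[i], right[i], bits),
    )
    return [left[i] for i in order], [right[i] for i in order]


if __name__ == "__main__":
    # Four hypothetical edges, given as (left[i], right[i]) pairs.
    new_left, new_right = z_sort([3, 0, 1, 0], [3, 0, 1, 1])
    print(list(zip(new_left, new_right)))
```

Computations whose index pairs are close in both coordinates receive nearby keys, so consecutive iterations tend to touch nearby data. The data arrays themselves are left in place, which is consistent with the abstract's point that this reordering avoids the overhead of data reordering.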
Compiler optimization, cache memories, inspector/executor, data reordering, computation reordering.
C. Tseng and H. Han, "Exploiting Locality for Irregular Scientific Codes," in IEEE Transactions on Parallel & Distributed Systems, vol. 17, no. 7, pp. 606-618, 2006.