<p><b>Abstract</b>: This paper studies how exploitation of the local memory hierarchy and of the communication network affects message sending, and how this in turn influences the decomposition of regular applications. We consider two parallel computers, a Cray T3E-900 and an SGI Origin 2000. On both systems, the bandwidth reduction caused by non-unit-stride memory access is significant and can outweigh the reduction caused by network contention. These findings affect the choice of optimal decompositions for regular domain problems. Although traditional 3D decompositions yield lower inherent communication-to-computation ratios and could exploit the interconnection network more efficiently, lower-dimensional decompositions prove more efficient because of the effect of the data decomposition on the spatial locality of the messages to be communicated. The growing importance of local optimisations is also shown with a well-known communication-computation overlapping technique, which increases execution time rather than reducing it as expected, owing to poor cache memory exploitation.</p>
MPI performance evaluation, data locality, data partitioning, domain decomposition applications.
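
The trade-off described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names, the 256^3 grid, the 64-process layouts, and the row-major (C-ordered) storage are all assumptions chosen for illustration, and every partitioned axis is assumed to exchange one layer with two neighbours. The sketch compares the communication-to-computation ratio of a decomposition with the contiguous run length of the boundary faces that must be sent.

```python
def face_run_lengths(local_shape):
    """Contiguous run length (in elements) of each boundary face of a
    row-major (C-ordered) local block of shape (nx, ny, nz).

    Assumption: the last index (z) varies fastest in memory.
    """
    nx, ny, nz = local_shape
    return {
        "x": ny * nz,  # fixed i: one whole plane, fully contiguous
        "y": nz,       # fixed j: nx separate runs of nz elements
        "z": 1,        # fixed k: every element is nz apart (non-unit stride)
    }


def comm_to_comp_ratio(n, procs_per_axis):
    """Boundary elements sent per locally computed element for an n^3 grid
    split into px * py * pz blocks.

    Assumption: one boundary layer and two neighbours per partitioned axis.
    """
    px, py, pz = procs_per_axis
    nx, ny, nz = n // px, n // py, n // pz
    boundary = 0
    if px > 1:
        boundary += 2 * ny * nz
    if py > 1:
        boundary += 2 * nx * nz
    if pz > 1:
        boundary += 2 * nx * ny
    return boundary / (nx * ny * nz)


if __name__ == "__main__":
    n = 256  # hypothetical grid size
    layouts = {"1D": (64, 1, 1), "2D": (8, 8, 1), "3D": (4, 4, 4)}
    for name, procs in layouts.items():
        local = tuple(n // p for p in procs)
        runs = face_run_lengths(local)
        # Only faces normal to a partitioned axis are communicated.
        sent = [runs[a] for a, p in zip("xyz", procs) if p > 1]
        print(name, comm_to_comp_ratio(n, procs), "shortest run:", min(sent))
```

Under these assumptions the 3D layout has the lowest ratio (0.09375 against 0.5 for the 1D layout), but one of its faces has a run length of 1, so packing it touches memory with non-unit stride. The 1D layout sends more data, yet every message is a single contiguous plane.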

F. Tirado, M. Prieto and I. M. Llorente, "Data Locality Exploitation in the Decomposition of Regular Domain Problems," in IEEE Transactions on Parallel & Distributed Systems, vol. 11, no. , pp. 1141-1150, 2000.