<p>In distributed-memory multicomputers, local memory accesses are much faster than those involving interprocessor communication. To reduce or even eliminate interprocessor communication, the array elements in a program must be carefully distributed across the local memories of the processors for parallel execution. We devote our efforts to techniques for allocating the array elements of nested loops onto multicomputers in a communication-free fashion for parallelizing compilers. We first analyze the pattern of references among all arrays referenced by a nested loop, and then partition the iteration space into blocks without interblock communication. The arrays can be partitioned under the communication-free criteria with nonduplicate or duplicate data. Finally, we propose a heuristic method for mapping the partitioned array elements and iterations onto fixed-size multicomputers while taking load balancing into account. With these methods, nested loops can execute without any communication overhead on distributed-memory multicomputers. Moreover, we study the performance of the nonduplicate- and duplicate-data strategies on matrix multiplication.</p>
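To give a concrete flavor of the duplicate-data idea described above, the following is a minimal sketch (not the paper's actual algorithm; all function names are illustrative). Each simulated processor owns a contiguous row block of A and of C, plus its own duplicated copy of B, so every loop iteration reads and writes only local data and no interprocessor messages are needed:

```python
def partition_rows(n, p):
    """Split row indices 0..n-1 into p contiguous, near-equal blocks
    (a simple load-balancing heuristic; sizes differ by at most one)."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for i in range(p):
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks

def communication_free_matmul(A, B, p):
    """Matrix multiply in which each simulated processor owns a row
    block of A (and of C) and a duplicated private copy of B, so every
    iteration touches only local data -- no interblock communication."""
    n, m, q = len(A), len(B), len(B[0])
    C = [[0.0] * q for _ in range(n)]
    for block in partition_rows(n, p):
        B_local = [row[:] for row in B]   # duplicate data: private copy of B
        for i in block:                   # iterations mapped to this processor
            for k in range(m):
                a = A[i][k]
                for j in range(q):
                    C[i][j] += a * B_local[k][j]
    return C
```

The duplication of B trades memory for communication: with p processors, B is stored p times, but the iteration space decomposes into fully independent blocks, which is the communication-free property the abstract refers to.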
Index Terms—distributed memory systems; parallel programming; program compilers; storage allocation; communication-free data allocation techniques; parallelizing compilers; multicomputers; distributed memory multicomputers; local memory accesses; interprocessor communication; array elements; parallel execution; nested loops; iteration space; interblock communication; communication-free criteria; duplicate data; heuristic method; partitioned array elements; fixed-size multicomputers; load balancing; communication overhead; matrix multiplication

J. Sheu and T. Chen, "Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers," in IEEE Transactions on Parallel and Distributed Systems, vol. 5, pp. 924-938, 1994.