<p>Parallel algorithms for several common problems, such as sorting and the FFT, involve a personalized exchange of data among all the processors. Previous work on complete exchange has followed one of two broad approaches: direct exchange or indirect message combining. While combining approaches reduce the number of message startups, direct exchange minimizes the volume of data transmitted. This paper presents a family of hybrid algorithms for wormhole-routed 2D meshes that can effectively utilize the complementary strengths of these two approaches to complete exchange. The performance of hybrid algorithms using Cyclic Exchange and Scott's Direct Exchange is studied using analytical models, simulation, and implementation on a Cray T3D system. The results show that hybrids achieve lower completion times than either pure algorithm for a range of mesh sizes, data block sizes, and message startup costs. It is also demonstrated that barriers may be used to enhance performance by reducing message contention, whether or not the target system provides hardware support for barrier synchronization. The analytical models are shown to be useful in selecting the optimum hybrid for any given combination of system parameters (mesh size, message startup time, flit transfer time, and barrier cost) and the problem parameter (data block size).</p>
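To make the term concrete, the following is a minimal sketch of what a complete (all-to-all personalized) exchange computes, together with the per-processor cost of a purely direct exchange. The function names `complete_exchange` and `direct_exchange_counts` are hypothetical and chosen only for illustration. The sketch models the data movement alone; it is not the paper's analytical model and ignores mesh distance, wormhole routing, and contention.

```python
def complete_exchange(send):
    """Return the result of a complete exchange among p processors.

    send[i][j] is the block that processor i holds for processor j.
    The result recv satisfies recv[j][i] == send[i][j], i.e. the
    exchange is a transpose of the p x p matrix of blocks.
    """
    p = len(send)
    if any(len(row) != p for row in send):
        raise ValueError("each processor must hold exactly p blocks")
    return [[send[i][j] for i in range(p)] for j in range(p)]


def direct_exchange_counts(p, block_size):
    """Per-processor cost of a purely direct exchange (illustrative).

    Every block travels straight to its destination with no combining,
    so each processor pays p - 1 message startups and transmits
    (p - 1) * block_size units of data.
    """
    startups = p - 1
    volume = (p - 1) * block_size
    return startups, volume


if __name__ == "__main__":
    p = 4
    # Label each block with its source and destination for readability.
    send = [[f"{i}->{j}" for j in range(p)] for i in range(p)]
    recv = complete_exchange(send)
    print(recv[2])  # blocks delivered to processor 2, ordered by source
    print(direct_exchange_counts(16, 64))
```

Combining algorithms lower the startup count by packing several blocks into each message, at the price of forwarding some blocks more than once; that is the trade-off the hybrids are designed to balance.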
Collective communication, complete exchange, combining, direct exchange, hybrid algorithms, message contention, barrier synchronization, mesh topology, wormhole routing

D. Jayasimha, N. Sundar, D. Panda and P. Sadayappan, "Hybrid Algorithms for Complete Exchange in 2D Meshes," in IEEE Transactions on Parallel & Distributed Systems, vol. 12, no. , pp. 1201-1218, 2001.