<p><b>Abstract</b>—All-to-all communication patterns occur in many important parallel algorithms. This paper presents new algorithms for all-to-all communication patterns (all-to-all broadcast and all-to-all personalized exchange) for wormhole-switched 2D/3D torus- and mesh-connected multiprocessors. The algorithms use message combining to minimize message start-ups at the expense of larger message sizes. To our knowledge, they are the first such algorithms to operate in a bottom-up fashion rather than in a recursive, top-down manner. For a 2<super><it>d</it></super> × 2<super><it>d</it></super> torus or mesh, the algorithms for all-to-all personalized exchange have a time complexity of <it>O</it>(2<super>3<it>d</it></super>). An important property of the algorithms is the <it>O</it>(<it>d</it>) time due to message start-ups, compared with <it>O</it>(2<super><it>d</it></super>) for current algorithms [<ref rid="bibl044215" type="bib">15</ref>], [<ref rid="bibl044218" type="bib">18</ref>]. This is particularly important for modern parallel architectures, where the start-up cost of message transmissions still dominates except for very large block sizes. Finally, the 2D algorithms for all-to-all personalized exchange are extended to <it>O</it>(2<super>4<it>d</it></super>) algorithms for a 2<super><it>d</it></super> × 2<super><it>d</it></super> × 2<super><it>d</it></super> 3D torus or mesh. These algorithms also retain the important property of <it>O</it>(<it>d</it>) time due to message start-ups.</p>
Interprocessor communication, parallel algorithms, collective communication, all-to-all communication, all-to-all broadcast, all-to-all personalized exchange, complete exchange.
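
As a rough illustration of the start-up claim in the abstract, the following Python sketch compares the two start-up counts under a simple linear cost model (start-up term plus transmission term). It is not the paper's analysis: the function names, the parameters t_s (cost per start-up) and t_c (cost per transmitted unit), and the choice of constant factor 1 in each asymptotic term are all assumptions made for illustration. The sketch also assumes, for simplicity, that both approaches share the same transmission term, so that only the start-up term differs.

```python
def modeled_time(startups, units, t_s, t_c):
    """Hypothetical linear cost model: start-up term plus transmission term."""
    return startups * t_s + units * t_c


def compare_startup_cost(d, block_size, t_s, t_c):
    """Compare O(d) and O(2^d) start-up counts on a 2^d x 2^d torus or mesh.

    Assumption: every asymptotic term is taken with constant factor 1, and
    both approaches are given the same O(2^(3d)) transmission term.
    """
    units = block_size * 2 ** (3 * d)                      # transmission volume
    combining = modeled_time(d, units, t_s, t_c)           # O(d) start-ups
    non_combining = modeled_time(2 ** d, units, t_s, t_c)  # O(2^d) start-ups
    return combining, non_combining


if __name__ == "__main__":
    # Illustrative values only: start-up cost 100x the per-unit cost.
    for d in (2, 3, 4, 5):
        combining, non_combining = compare_startup_cost(d, 1, 100, 1)
        print(d, combining, non_combining)
```

With small blocks and a large t_s, the start-up term makes up most of the modeled time for small d, so the O(d) count gives the larger saving there. As block_size or d grows, the shared transmission term takes over and the two modeled times converge, which matches the abstract's caveat about very large block sizes.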

