Parallel and Distributed Processing Symposium, International (2006)
Rhodes Island, Greece
Apr. 25, 2006 to Apr. 29, 2006
H. Ritzdorf, C&C Res. Labs., NEC Eur. Ltd., Sankt Augustin, Germany
J.L. Traff, C&C Res. Labs., NEC Eur. Ltd., Sankt Augustin, Germany
We give an overview of the algorithms and implementations of some of the most important MPI (Message Passing Interface) collective operations in the high-performance MPI libraries MPI/SX and MPI/ES. The infrastructure of MPI/SX makes it easy to incorporate new algorithms and algorithms for common special cases (e.g., a single SX node, or a single MPI process per SX node). Algorithms that are among the best known are employed, and special hardware features of the SX architecture and internode crossbar switch (IXS) are exploited wherever possible. We discuss in more detail the implementation of MPI_Barrier, MPI_Bcast, the MPI reduction collectives, MPI_Alltoall, and the gather/scatter collectives. Performance figures and comparisons to straightforward algorithms are given for a large SX-8 system and for the Earth Simulator. The measurements show excellent absolute performance and demonstrate the scalability of MPI/SX and MPI/ES to systems with large numbers of nodes.
Earth Simulator, high-performance MPI libraries, MPI/SX, MPI/ES, message passing interface, internode crossbar switch, MPI_Barrier, MPI_Bcast, MPI reduction collectives, MPI_Alltoall, gather/scatter collectives, SX-8 system
J. Traff and H. Ritzdorf, "Collective operations in NEC's high-performance MPI libraries," Parallel and Distributed Processing Symposium, International (IPDPS), Rhodes Island, Greece, 2006, pp. 77.