Portland, Oregon, USA
Nov. 13, 1999 to Nov. 18, 1999
Steve Sistare, Sun Microsystems, Inc.
Rolf vande Vaart, Sun Microsystems, Inc.
Eugene Loh, Sun Microsystems, Inc.
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SC.1999.10010
Implementors of message-passing libraries have focused on optimizing point-to-point protocols and have largely ignored the performance of collective operations. In addition, algorithms for collectives have been tuned to run well on networks of uniprocessor machines, ignoring the performance that may be gained on the large-scale SMPs now in widespread use as compute nodes. This is unfortunate, because the high backplane bandwidths and shared-memory capabilities of large SMPs are a perfect match for the requirements of collectives. We present new algorithms for MPI collective operations that take advantage of the capabilities of fat-node SMPs, and we provide models that show the characteristics of the old and new algorithms. Using the Sun™ MPI library, we present results on a 64-way Starfire™ SMP and a 4-node cluster of 8-way Sun Enterprise™ 4000 nodes that show performance improvements typically ranging from 2x to 5x for the collectives we studied.
MPI, MPICH, SMP, clustering, shared memory, collective
Steve Sistare, Rolf vande Vaart, Eugene Loh, "Optimization of MPI Collectives on Clusters of Large-Scale SMP's", SC Conference 1999, p. 23, doi:10.1109/SC.1999.10010