Parallel and Distributed Processing Symposium, International (2009)
May 23, 2009 to May 29, 2009
Matthew J. Koop , Department of Computer Science and Engineering, The Ohio State University, USA
Jaidev K. Sridhar , Department of Computer Science and Engineering, The Ohio State University, USA
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University, USA
The Message Passing Interface (MPI) is the de facto standard for parallel programming. As system scales increase, application writers often try to increase the overlap of communication and computation. Unfortunately, even on offloaded hardware such as InfiniBand, performance does not improve, because the underlying protocols within MPI implementations require control messages that prevent overlap unless expensive threads are used. In this work we propose a fully-asynchronous and zero-copy design to allow full overlap of communication and computation. We design TupleQ with a novel use of InfiniBand eXtended Reliable Connection (XRC) receive queues to allow zero-copy and asynchronous transfers for all message sizes. Our evaluation on 64 tasks reveals significant performance gains. By leveraging the network hardware we are able to provide fully-asynchronous progress. We show overlap of nearly 100% for all message sizes, compared to 0% for the traditional RPUT and RGET protocols. We also show a 27% improvement for NAS SP using our design over the existing designs.
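To make the overlap percentages above concrete, the following is a minimal sketch of one common way to quantify communication/computation overlap from three wall-clock timings. The abstract does not state the exact metric the authors used, so the formula, the function name `overlap_percent`, and its parameters are assumptions chosen for illustration only.

```python
def overlap_percent(t_comm: float, t_comp: float, t_total: float) -> float:
    """Estimate how much communication was hidden behind computation.

    Assumed (hypothetical) measurement setup, not taken from the paper:
      t_comm  -- time for the transfer alone (post, then wait immediately)
      t_comp  -- time for the computation alone
      t_total -- time when the transfer is posted, the computation runs,
                 and the wait happens afterwards
    """
    if t_comm <= 0 or t_comp <= 0:
        raise ValueError("t_comm and t_comp must be positive")

    # Time saved relative to running the two phases back to back.
    hidden = t_comm + t_comp - t_total

    # At most the shorter of the two phases can be hidden entirely.
    fraction = hidden / min(t_comm, t_comp)

    # Clamp to [0, 1] to absorb measurement noise.
    return 100.0 * max(0.0, min(1.0, fraction))
```

Under this assumed metric, `overlap_percent(2.0, 5.0, 5.0)` gives 100.0 (the transfer is fully hidden behind the computation), while `overlap_percent(2.0, 5.0, 7.0)` gives 0.0 (the two phases ran back to back).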
Matthew J. Koop, Jaidev K. Sridhar, Dhabaleswar K. Panda, "TupleQ: Fully-asynchronous and zero-copy MPI over InfiniBand", Parallel and Distributed Processing Symposium, International, vol. 00, no. , pp. 1-8, 2009, doi:10.1109/IPDPS.2009.5161056