Bingsheng He, Nanyang Technological University, Singapore
Yifan Gong, Nanyang Technological University, Singapore
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2013.96
This paper examines the performance of collective communication operations in the Message Passing Interface (MPI) in the cloud computing environment. Awareness of network topology has been a key factor in performance optimizations for existing MPI implementations. However, virtualization in the cloud environment not only hides network topology information from users, but also introduces traffic interference and makes network performance dynamic. Existing topology-aware optimizations are therefore no longer feasible in the cloud environment. We develop novel network performance aware algorithms for a series of collective communication operations, including broadcast, reduce, gather, and scatter. We further implement two common applications, N-body and conjugate gradient (CG). We have conducted experiments with two complementary methods (on Amazon EC2 and in simulations). Our experimental results show that network performance awareness yields 25.4% and 28.3% performance improvement over MPICH2 on Amazon EC2 and in simulations, respectively. Evaluations on N-body and CG show application performance improvements of 41.6% and 14.3%, respectively.
Communications Management, Computer Systems Organization, Processor Architectures, Multiple Data Stream Architectures (Multiprocessors), Parallel processors, Computer Systems Organization, Communication/Networking and Information Technology, Network Architecture and Design, Network communications, Software/Software Engineering, Operating Systems
Bingsheng He, Yifan Gong, "Network Performance Aware MPI Collective Communication Operations in the Cloud", IEEE Transactions on Parallel & Distributed Systems, no. 1, pp. 1, PrePrints, doi:10.1109/TPDS.2013.96