2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID)
High Performance Relay Mechanism for MPI Communication Libraries Run on Multiple Private IP Address Clusters
May 19-May 22
ISBN: 978-0-7695-3156-4
We have been developing a Grid-enabled MPI communication library called GridMPI, which is designed to run on multiple clusters connected to a wide-area network. Some of these clusters may use private IP addresses, so a mechanism that enables communication between private IP address clusters is required. Such a mechanism should be widely adoptable and should provide high communication performance. In this paper, we propose a message relay mechanism that supports private IP address clusters in the manner of the Interoperable MPI (IMPI) standard. Consequently, any MPI implementation that follows the IMPI standard can communicate with the relay. Furthermore, we propose a trunking method in which multiple pairs of relay nodes communicate simultaneously between clusters to improve the available communication bandwidth. While the relay mechanism introduces a one-way latency of about 25 usec, the extra overhead is negligible, since the communication latency through a wide-area network is a few hundred times as large. With trunking, the inter-cluster communication bandwidth improves as the number of trunks increases. We confirmed the effectiveness of the proposed method by experiments using a 10 Gbps emulated WAN environment. When relay nodes with 1 Gbps NICs are used, the performance of most of the NAS Parallel Benchmarks improved in proportion to the number of trunks. In particular, with 8 trunks, FT and IS are 4.4 and 3.4 times faster, respectively, than in the single-trunk case. The results show that the proposed method is effective for running MPI programs over networks with a high bandwidth-delay product.
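The figures quoted in the abstract can be put side by side with a short back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the 10 ms WAN latency is an assumed example value, and the bandwidth model simply assumes that each trunk contributes one 1 Gbps NIC until the 10 Gbps WAN link saturates.

```python
def ideal_trunk_bandwidth_gbps(num_trunks, nic_gbps=1.0, wan_gbps=10.0):
    """Idealized upper bound on inter-cluster bandwidth.

    Assumption (not from the paper): each trunk adds one NIC's worth of
    bandwidth until the WAN link itself becomes the bottleneck.
    """
    if num_trunks < 1:
        raise ValueError("num_trunks must be at least 1")
    return min(num_trunks * nic_gbps, wan_gbps)


def relay_overhead_fraction(wan_latency_usec, relay_latency_usec=25.0):
    """Relay latency as a fraction of the one-way WAN latency."""
    if wan_latency_usec <= 0:
        raise ValueError("wan_latency_usec must be positive")
    return relay_latency_usec / wan_latency_usec


if __name__ == "__main__":
    # 1 Gbps relay NICs over a 10 Gbps WAN, as in the abstract.
    for n in (1, 2, 4, 8):
        print(f"{n} trunk(s): at most {ideal_trunk_bandwidth_gbps(n):.1f} Gbps")

    # Assumed example: a 10 ms one-way WAN latency (hypothetical value).
    fraction = relay_overhead_fraction(10_000.0)
    print(f"relay overhead: {fraction:.2%} of the WAN latency")
```

Under this idealized model, 8 trunks allow at most an 8-fold bandwidth increase. The reported speedups of 4.4 (FT) and 3.4 (IS) are for whole benchmark runs, so they fall below that bound, as expected for programs that are not purely bandwidth-limited.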
Index Terms:
MPI, IMPI, Grid computing, Private IP address clusters
Citation:
Ryousei Takano, Motohiko Matsuda, Tomohiro Kudoh, Yuetsu Kodama, Fumihiro Okazaki, Yutaka Ishikawa, Yasufumi Yoshizawa, "High Performance Relay Mechanism for MPI Communication Libraries Run on Multiple Private IP Address Clusters," ccgrid, pp.401-408, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), 2008