MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems
October 2001 (vol. 12 no. 10)
pp. 1081-1093

Abstract: The IBM RS/6000 SP system is one of the most cost-effective commercially available high performance machines. IBM RS/6000 SP systems support the Message Passing Interface (MPI) standard and LAPI, a low-level, reliable, and efficient one-sided communication library implemented on these systems. This paper explains how the high performance of LAPI has been exploited to implement the MPI standard more efficiently than the existing native MPI. It describes how such an implementation can avoid unnecessary data copies at both the sending and receiving sides, and it discusses how problems arising from mismatches between the requirements of the MPI standard and the features of LAPI are resolved. This work also identifies certain enhancements to LAPI that enable an efficient implementation of MPI on top of it. The performance of the new MPI implementation is compared with that of the underlying LAPI itself. Its latency (in polling and interrupt modes) and bandwidth are compared with those of the native MPI implementation on RS/6000 SP systems. The results indicate that MPI on LAPI performs comparably to or better than the native MPI implementation in most cases. For certain message sizes, improvements of up to 17.3 percent in polling mode latency, 35.8 percent in interrupt mode latency, and 20.9 percent in bandwidth are obtained. The implementation of MPI on top of LAPI also outperforms the native MPI implementation on the NAS Parallel Benchmarks.

Index Terms:
Interprocessor communication, fast messaging layers, networks of workstations, Message Passing Interface (MPI), clustering.
Mohammad Banikazemi, Rama K. Govindaraju, Robert Blackmore, Dhabaleswar K. Panda, "MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 10, pp. 1081-1093, Oct. 2001, doi:10.1109/71.963419