Benchmark Evaluation of the IBM SP2 for Parallel Signal Processing
May 1996 (vol. 7 no. 5)
pp. 522-536

Abstract: This paper evaluates the IBM SP2 architecture, the AIX parallel programming environment, and the IBM message-passing library (MPL) through STAP (Space-Time Adaptive Processing) benchmark experiments. Only coarse-grain parallelism was exploited on the SP2 because of its high communication overhead. A new parallelization scheme is developed for programming message-passing multicomputers. Parallel STAP benchmark structures are illustrated with domain decomposition, efficient mapping of partitioned programs, and optimization of collective communication operations. We measure SP2 performance in terms of execution time, Gflop/s rate, speedup over a single SP2 node, and overall system utilization. With 256 nodes, the Maui SP2 achieved its best performance of 23 Gflop/s when executing the High-Order Post-Doppler program, corresponding to a 34% system utilization. We have conducted a scalability analysis to reveal the performance growth rate as a function of machine size and STAP problem size. Important lessons learned from these parallel processing benchmark experiments are discussed in the context of real-time, adaptive radar signal processing on massively parallel processors (MPP).
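
The utilization figure quoted in the abstract can be checked with a short calculation. The sketch below assumes that system utilization means sustained rate divided by aggregate peak rate, and that each SP2 node has a peak of 0.266 Gflop/s (the commonly quoted figure for a 66.7 MHz POWER2 processor). Neither the definition nor the per-node peak is stated in the abstract, so both are assumptions, and the function names are illustrative only.

```python
# Assumption: utilization = sustained rate / (node count * per-node peak rate).
# Assumption: 0.266 Gflop/s peak per node; this value is not given in the abstract.
ASSUMED_PEAK_PER_NODE_GFLOPS = 0.266


def system_utilization(sustained_gflops, nodes,
                       peak_per_node_gflops=ASSUMED_PEAK_PER_NODE_GFLOPS):
    """Fraction of the aggregate peak rate that was actually sustained."""
    return sustained_gflops / (nodes * peak_per_node_gflops)


def sustained_per_node_mflops(sustained_gflops, nodes):
    """Average sustained rate per node, in Mflop/s."""
    return sustained_gflops * 1000.0 / nodes


if __name__ == "__main__":
    # Figures from the abstract: 23 Gflop/s on 256 nodes.
    util = system_utilization(23.0, 256)
    per_node = sustained_per_node_mflops(23.0, 256)
    print(f"utilization ~ {util:.1%}, per node ~ {per_node:.1f} Mflop/s")
```

Under these assumptions the result is close to 34%, consistent with the reported utilization, and the average sustained rate works out to roughly 90 Mflop/s per node.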

Index Terms:
Message passing, data parallelism, massively parallel processors, adaptive sensor array processing, scalability, programmability, performance evaluation, STAP benchmarks, real-time applications.
Citation:
Kai Hwang, Zhiwei Xu, Masahiro Arakawa, "Benchmark Evaluation of the IBM SP2 for Parallel Signal Processing," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 5, pp. 522-536, May 1996, doi:10.1109/71.503777