Toward Formally-Based Design of Message Passing Programs
March 2000 (vol. 26 no. 3)
pp. 276-288

Abstract—We present a systematic approach to the development of message passing programs. Our programming model is SPMD, with communications restricted to collective operations: scan, reduction, gather, etc. The design process in such an architecture-independent language is based on correctness-preserving transformation rules, provable in a formal functional framework. We develop a set of design rules for composition and decomposition. For example, scan followed by reduction is replaced by a single reduction, and global reduction is decomposed into two faster operations. The impact of the design rules on the target performance is estimated analytically and tested in machine experiments. As a case study, we design two provably correct, efficient programs using the Message Passing Interface (MPI) for the famous maximum segment sum problem, starting from an intuitive, but inefficient, algorithm specification.
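
The two design ideas named in the abstract can be illustrated with a small sketch. It is plain Python that simulates processes as list blocks, not MPI code, and it is not taken from the paper: the function names (`max_prefix_fused`, `mss_blocks`, and so on) are hypothetical, and the tuple-based operator is the standard textbook formulation of maximum segment sum as a reduction, assumed here to be in the spirit of the derivation. Segments may be empty (sum 0) and inputs are assumed non-empty.

```python
from functools import reduce
from itertools import accumulate


def mss_spec(xs):
    """Intuitive but inefficient specification: try every contiguous segment."""
    n = len(xs)
    return max(sum(xs[i:j]) for i in range(n + 1) for j in range(i, n + 1))


def max_prefix_two_step(xs):
    """Scan with + followed by reduction with max (two collective steps)."""
    return reduce(max, accumulate(xs))


def fuse(a, b):
    """Associative operator on pairs (best prefix sum, total sum)."""
    s1, r1 = a
    s2, r2 = b
    return (max(s1, r1 + s2), r1 + r2)


def max_prefix_fused(xs):
    """Same result as max_prefix_two_step, using a single reduction over pairs."""
    return reduce(fuse, ((x, x) for x in xs))[0]


def lift(x):
    """Map one element to (best segment, best prefix, best suffix, total)."""
    p = max(x, 0)
    return (p, p, p, x)


def combine(a, b):
    """Associative operator that merges the summaries of two adjacent blocks."""
    m1, p1, s1, t1 = a
    m2, p2, s2, t2 = b
    return (max(m1, m2, s1 + p2),   # best segment may straddle the boundary
            max(p1, t1 + p2),       # best prefix of the concatenation
            max(s2, s1 + t2),       # best suffix of the concatenation
            t1 + t2)                # total sum


def mss_blocks(xs, nprocs):
    """Simulated SPMD run: reduce each block locally, then combine the partial results."""
    size = -(-len(xs) // nprocs)  # ceiling division; assumes xs is non-empty
    blocks = [xs[i:i + size] for i in range(0, len(xs), size)]
    local = [reduce(combine, map(lift, block)) for block in blocks]
    return reduce(combine, local)[0]


if __name__ == "__main__":
    data = [3, -4, 2, -1, 6, -3]
    print(mss_spec(data), mss_blocks(data, 3))
    print(max_prefix_two_step(data), max_prefix_fused(data))
```

Because `combine` and `fuse` are associative, the final combination step could in principle be carried out by a collective reduction with a user-defined operator; how the paper maps these steps onto MPI calls is not shown here.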

Index Terms:
Message passing, MPI, collective operations, program transformations, systematic program design, reduction, scan, homomorphisms, skeletons, maximum segment sum.
Sergei Gorlatch, "Toward Formally-Based Design of Message Passing Programs," IEEE Transactions on Software Engineering, vol. 26, no. 3, pp. 276-288, March 2000, doi:10.1109/32.842952