A Design Methodology for Data-Parallel Applications
April 2000 (vol. 26 no. 4)
pp. 293-314

Abstract: A methodology for the design and development of data-parallel applications and components is presented. Data parallelism is a well-understood form of parallel computation, yet developing even simple applications can require substantial effort to express the problem in low-level notations. We describe a software development process for data-parallel applications that starts from high-level specifications and generates successive refinements of designs to match different architectural models and performance constraints, enabling cost-benefit analysis during development. The primary issues are algorithm choice, correctness, and efficiency, followed by data decomposition, load balancing, and message-passing coordination. Development of a data-parallel multitarget tracking application is used as a case study, showing the progression from high-level to low-level refinements. We conclude by describing tool support for the process.

Index Terms:
Software design, high-level programming languages, parallel algorithms, prototyping, software templates, multitarget tracking algorithms.
Lars S. Nyland, Jan F. Prins, Allen Goldberg, Peter H. Mills, "A Design Methodology for Data-Parallel Applications," IEEE Transactions on Software Engineering, vol. 26, no. 4, pp. 293-314, April 2000, doi:10.1109/32.844491