POEMS: End-to-End Performance Design of Large Parallel Adaptive Computational Systems
November 2000 (vol. 26 no. 11)
pp. 1027-1048

Abstract—The POEMS project is creating an environment for end-to-end performance modeling of complex parallel and distributed systems, spanning the domains of application software, runtime and operating system software, and hardware architecture. Toward this end, the POEMS framework supports composition of component models from these different domains into an end-to-end system model. This composition can be specified using a generalized graph model of a parallel system, together with interface specifications that carry information about component behaviors and evaluation methods. The POEMS Specification Language compiler, under development, will generate an end-to-end system model automatically from such a specification. The components of the target system may be modeled using different modeling paradigms (analysis, simulation, or direct measurement) and may be modeled at various levels of detail. As a result, evaluation of a POEMS end-to-end system model may require a variety of evaluation tools including specialized equation solvers, queuing network solvers, and discrete-event simulators. A single application representation based on static and dynamic task graphs serves as a common workload representation for all these modeling approaches. Sophisticated parallelizing compiler techniques allow this representation to be generated automatically for a given parallel program. POEMS includes a library of predefined analytical and simulation component models of the different domains and a knowledge base that describes performance properties of widely used algorithms. This paper provides an overview of the POEMS methodology and illustrates several of its key components. The methodology and modeling capabilities are demonstrated by predicting the performance of alternative configurations of Sweep3D, a complex benchmark for evaluating wavefront application technologies and high-performance, parallel architectures.
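As a rough illustration of the composition idea in the abstract, the sketch below attaches a cost model to each node of a small static task graph and derives an end-to-end time prediction from the longest dependency chain. It is not the POEMS Specification Language or any POEMS tool: the names `TaskGraph`, `analytic_compute`, and `measured_comm`, and all numeric parameters, are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical component models. Each returns a predicted time in seconds.
def analytic_compute(work_flops: float, flops_per_sec: float = 1e9) -> float:
    """Analytical model: time = work / assumed processing rate."""
    return work_flops / flops_per_sec

def measured_comm(nbytes: float, latency: float = 1e-5,
                  bandwidth: float = 1e8) -> float:
    """Stand-in for a measured model: assumed latency plus bytes / bandwidth."""
    return latency + nbytes / bandwidth

class TaskGraph:
    """Static task graph whose nodes carry costs from different models."""

    def __init__(self) -> None:
        self.cost: Dict[str, float] = {}
        self.preds: Dict[str, List[str]] = {}

    def add_task(self, name: str, model: Callable[[float], float],
                 arg: float, preds: Sequence[str] = ()) -> None:
        # Predecessors must already exist, so insertion order is topological.
        for p in preds:
            if p not in self.cost:
                raise ValueError(f"unknown predecessor: {p}")
        self.cost[name] = model(arg)
        self.preds[name] = list(preds)

    def predicted_time(self) -> float:
        """Longest path through the graph (critical-path estimate)."""
        finish: Dict[str, float] = {}
        for name in self.cost:  # dicts preserve insertion order
            start = max((finish[p] for p in self.preds[name]), default=0.0)
            finish[name] = start + self.cost[name]
        return max(finish.values(), default=0.0)

g = TaskGraph()
g.add_task("compute_a", analytic_compute, 2e9)
g.add_task("send_ab", measured_comm, 1e8, preds=["compute_a"])
g.add_task("compute_b", analytic_compute, 1e9, preds=["send_ab"])
g.add_task("compute_c", analytic_compute, 3e9, preds=["compute_a"])
print(g.predicted_time())
```

The point of the sketch is only that one workload representation (the task graph) can be shared by component models built with different paradigms. A real end-to-end model would also have to account for resource contention, which a critical-path estimate ignores.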

Index Terms:
Performance modeling, parallel system, message passing, analytical modeling, parallel simulation, processor simulation, task graph, parallelizing compiler, compositional modeling, recommender system.
Citation:
Vikram S. Adve, Rajive Bagrodia, James C. Browne, Ewa Deelman, Aditya Dube, Elias N. Houstis, John R. Rice, Rizos Sakellariou, David J. Sundaram-Stukel, Patricia J. Teller, Mary K. Vernon, "POEMS: End-to-End Performance Design of Large Parallel Adaptive Computational Systems," IEEE Transactions on Software Engineering, vol. 26, no. 11, pp. 1027-1048, Nov. 2000, doi:10.1109/32.881716