Parallelized Direct Execution Simulation of Message-Passing Parallel Programs
October 1996 (vol. 7 no. 10)
pp. 1090-1105

Abstract—As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper, we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine, such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, LAPSE (Large Application Parallel Simulation Environment), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10% relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.

Index Terms:
Direct execution simulation, parallel simulation, architectural simulation, message-passing programs, MIMD, synchronization, contention.
Citation:
Phillip M. Dickens, Philip Heidelberger, David M. Nicol, "Parallelized Direct Execution Simulation of Message-Passing Parallel Programs," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 10, pp. 1090-1105, Oct. 1996, doi:10.1109/71.539740