Proceedings 15th Workshop on Parallel and Distributed Simulation (2001)
Lake Arrowhead, California
May 15–18, 2001
Learning Not to Share
Jason Liu and David M. Nicol, Dartmouth College
Strong reasons exist for executing a large-scale discrete-event simulation on a cluster of processor nodes (each of which may be a shared-memory multiprocessor or a uniprocessor). This is the architecture of the largest-scale parallel machines, and so the largest simulation problems can only be solved this way. It is a common architecture even in less esoteric settings, and is suitable for memory-bound simulations. This paper describes our approach to porting the SSF simulation kernel to this architecture, using the Message Passing Interface (MPI) system. The notable feature of this transformation is its support for an efficient two-level synchronization and communication scheme that addresses the cost discrepancies between shared memory and distributed memory. In the initial implementation, we use a globally synchronous approach between distributed-memory nodes, and an asynchronous shared-memory approach within an SMP cluster. The SSF API reflects inherently shared-memory assumptions; we therefore report on our approach for porting an SSF kernel to a cluster of SMP nodes. Experimental results on two architectures are described, for a model of TCP/IP traffic flows over a hierarchical network. The performance on a distributed network of commodity SMPs connected through Ethernet is seen to frequently exceed performance on a Sun shared-memory multiprocessor.
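The two-level scheme the abstract describes can be sketched in miniature: threads within one SMP node consume events asynchronously inside a global synchronization window, while a barrier between windows stands in for the MPI-level global reduction that would establish the next window bound across distributed-memory nodes. The sketch below is illustrative only; the class and method names are hypothetical and do not reflect SSF's actual kernel, and a single-process barrier is used in place of a real MPI collective.

```python
import threading

class TwoLevelSim:
    """Toy sketch of window-based two-level synchronization (not SSF's API).

    Each logical process (LP) holds a sorted list of event timestamps.
    Workers drain events strictly below the current window bound without
    coordinating; the barrier action plays the role of an MPI reduction
    that computes the next window from the earliest pending event plus
    the model's lookahead (the minimum cross-node event delay).
    """

    def __init__(self, event_lists, lookahead):
        self.event_lists = [sorted(e) for e in event_lists]
        self.lookahead = lookahead
        self.processed = [[] for _ in event_lists]
        self.window_end = 0.0
        self.done = False
        self.barrier = threading.Barrier(len(event_lists),
                                         action=self._advance_window)

    def _advance_window(self):
        # Stand-in for a global min-reduction across nodes: the new window
        # extends one lookahead beyond the earliest pending event anywhere.
        pending = [e[0] for e in self.event_lists if e]
        if pending:
            self.window_end = min(pending) + self.lookahead
        else:
            self.done = True    # no events left anywhere: terminate

    def _worker(self, i):
        evts = self.event_lists[i]
        while not self.done:
            # Asynchronous phase: events below the window are safe to
            # process with no cross-LP coordination (lookahead guarantee).
            while evts and evts[0] < self.window_end:
                self.processed[i].append(evts.pop(0))
            # Synchronous phase: all LPs meet here between windows.
            self.barrier.wait()

    def run(self):
        self._advance_window()  # establish the first window
        threads = [threading.Thread(target=self._worker, args=(i,))
                   for i in range(len(self.event_lists))]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return self.processed
```

A positive lookahead guarantees progress, since each window strictly covers at least the globally earliest pending event; in the real system the window computation is an inter-node MPI operation while the intra-node workers share the event structures directly.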
J. Liu and D. M. Nicol, "Learning Not to Share," Proceedings of the 15th Workshop on Parallel and Distributed Simulation (PADS), Lake Arrowhead, California, 2001, p. 46.