Accuracy of Memory Reference Traces of Parallel Computations in Trace-Driven Simulation
January 1992 (vol. 3 no. 1)
pp. 97-109
For a given input, the global trace generated by a parallel program in a shared-memory multiprocessing environment may change as the memory architecture and management policies change. A method is proposed for ensuring that a correct global trace is generated in the new environment. This method involves a new characterization of a parallel program that identifies its address change points and address affecting points. An extension of traditional process traces, called the intrinsic trace of each process, is developed. The intrinsic traces maximize the decoupling of program execution from simulation by describing the address flow graph and path expressions of each process program. At each point where an address is issued, the trace-driven simulator uses the intrinsic traces and the sequence of loads and stores before the current cycle to determine the next address. The mapping between load and store sequences and the next addresses to issue sometimes requires partial program re-execution. Programs that do not require partial program re-execution are called graph-traceable.
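
The dependence of the global trace on timing can be illustrated with a toy model. The following is a minimal sketch, not the authors' method: it assumes each process is described by a small graph whose nodes issue one load or store and whose successor is chosen from the value a load returns. The names `producer_graph`, `consumer_graph`, `simulate`, and `spin_loads`, the addresses, and the per-process `latency` parameter are all hypothetical, and path expressions and partial re-execution are not modeled.

```python
# Toy illustration (assumed model, not the paper's algorithm).
# Each node is (op, address, value_to_store, successor), where
# successor maps the value returned by a load to the next node name
# (None means the process has finished).

FLAG, DATA = 0x100, 0x200  # hypothetical shared addresses


def producer_graph():
    return {
        "write": ("store", DATA, 42, lambda v: "flag"),
        "flag": ("store", FLAG, 1, lambda v: None),
    }


def consumer_graph():
    return {
        # Spin until the flag is nonzero; the loop count depends on timing.
        "spin": ("load", FLAG, None, lambda v: "read" if v else "spin"),
        "read": ("load", DATA, None, lambda v: None),
    }


def simulate(graphs, starts, latency):
    """Process p issues one reference every latency[p] cycles."""
    memory = {}
    current = dict(starts)
    trace = []  # global trace: (cycle, process, op, address)
    cycle = 0
    while any(node is not None for node in current.values()):
        for pid in sorted(graphs):
            node = current[pid]
            if node is None or cycle % latency[pid] != 0:
                continue
            op, addr, value, successor = graphs[pid][node]
            if op == "store":
                memory[addr] = value
                loaded = None
            else:
                loaded = memory.get(addr, 0)
            trace.append((cycle, pid, op, addr))
            # The next address follows from the graph plus the loaded value.
            current[pid] = successor(loaded)
        cycle += 1
    return trace


def spin_loads(trace):
    """Count the consumer's loads of the flag in a global trace."""
    return sum(1 for _, pid, _, addr in trace
               if pid == "consumer" and addr == FLAG)


graphs = {"producer": producer_graph(), "consumer": consumer_graph()}
starts = {"producer": "write", "consumer": "spin"}
fast = simulate(graphs, starts, {"producer": 1, "consumer": 1})
slow = simulate(graphs, starts, {"producer": 3, "consumer": 1})
print(spin_loads(fast), spin_loads(slow))
```

With the same program and input, slowing the producer lengthens the consumer's spin loop, so a global trace recorded under one memory timing would not be correct for the other. In this toy case the next address is always recoverable from the graph and the loaded values alone.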

Index Terms:
load sequences; memory management; memory reference traces; parallel computations; trace-driven simulation; global trace; parallel program; shared memory multiprocessing environment; memory architecture; address change points; address affecting points; process traces; intrinsic trace; address flow graph; path expressions; store sequences; partial program re-execution; graph-traceable; parallel programming; storage management
M.A. Holliday, C.S. Ellis, "Accuracy of Memory Reference Traces of Parallel Computations in Trace-Driven Simulation," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 1, pp. 97-109, Jan. 1992, doi:10.1109/71.113085