April 1987 (vol. 36 no. 4)
pp. 471-482
T.J. Leblanc, Department of Computer Science, University of Rochester
The debugging cycle is the most common methodology for finding and correcting errors in sequential programs. Cyclic debugging is effective because sequential programs are usually deterministic. Debugging parallel programs is considerably more difficult because successive executions of the same program often do not produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel programs, termed Instant Replay. During program execution we save the relative order of significant events as they occur, not the data associated with such events. As a result, our approach requires less time and space to save the information needed for program replay than other methods. Our technique is not dependent on any particular form of interprocess communication. It provides for replay of an entire program, rather than individual processes in isolation. No centralized bottlenecks are introduced and there is no need for synchronized clocks or a globally consistent logical time. We describe a prototype implementation of Instant Replay on the BBN Butterfly™ Parallel Processor, and discuss how it can be incorporated into the debugging cycle for parallel programs.
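The idea of saving only the relative order of events, and not their data, can be illustrated with a minimal sketch. It is not the paper's implementation: the class `ReplayableObject`, the helper `run`, and the per-worker log layout are hypothetical names chosen for illustration, and every access is treated as an exclusive update, which ignores the concurrent-reader half of a CREW protocol. Each shared object carries a version counter. In record mode a worker logs the version it observed at each access. In replay mode a worker waits until the object reaches the logged version before proceeding. The logs are kept per worker, so no central log or global clock is involved.

```python
import threading


class ReplayableObject:
    """Shared value whose accesses are ordered by a version counter (illustrative)."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0
        self._cond = threading.Condition()

    def update(self, fn, log, replay=False):
        """Apply fn to the value; record or enforce the access order via log."""
        with self._cond:
            if replay:
                # Wait until the object reaches the version this access saw
                # during the recorded run.
                expected = log.pop(0)
                self._cond.wait_for(lambda: self.version == expected)
            else:
                # Record only the version observed, never the data itself.
                log.append(self.version)
            self.value = fn(self.value)
            self.version += 1
            self._cond.notify_all()


def run(workers, logs, replay):
    """Run each worker's list of update functions in its own thread."""
    obj = ReplayableObject()

    def body(name, fns):
        for fn in fns:
            obj.update(fn, logs[name], replay)

    threads = [
        threading.Thread(target=body, args=(name, fns))
        for name, fns in workers.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.value


if __name__ == "__main__":
    workers = {
        "a": [lambda v: v + 1, lambda v: v * 2],
        "b": [lambda v: v * 3, lambda v: v - 5],
    }
    logs = {"a": [], "b": []}
    recorded = run(workers, logs, replay=False)
    replayed = run(workers, {k: list(v) for k, v in logs.items()}, replay=True)
    print(recorded == replayed)
```

The final value of the recorded run depends on how the threads happened to interleave, but the replayed run follows the logged version order and so arrives at the same value. The logs hold one small integer per access, regardless of how large the shared data is.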
Index Terms:
shared objects, CREW protocols, distributed debugging, execution replay, parallel programming, program instrumentation
Citation:
T.J. Leblanc, J.M. Mellor-Crummey, "Debugging Parallel Programs with Instant Replay," IEEE Transactions on Computers, vol. 36, no. 4, pp. 471-482, April 1987, doi:10.1109/TC.1987.1676929