Computer Architecture, International Symposium on (2003)
San Diego, California
June 9, 2003 to June 11, 2003
M. Xu, Comput. Sci. Dept. & ECE Dept., Wisconsin Univ., Madison, WI, USA
R. Bodik, Comput. Sci. Dept. & ECE Dept., Wisconsin Univ., Madison, WI, USA
M.D. Hill, Comput. Sci. Dept. & ECE Dept., Wisconsin Univ., Madison, WI, USA
Debuggers have proven indispensable in improving software reliability. Unfortunately, on most real-life software, debuggers fail to deliver their most essential feature: a faithful replay of the execution. The reason is nondeterminism caused by multithreading and nonrepeatable inputs. A common solution to faithful replay has been to record the nondeterministic execution. Existing recorders, however, either work only for data-race-free programs or have prohibitive overhead. As a step towards powerful debugging, we develop a practical low-overhead hardware recorder for cache-coherent multiprocessors, called the flight data recorder (FDR). Like an aircraft flight data recorder, FDR continuously records the execution, even on deployed systems, logging it for post-mortem analysis. FDR is practical because it piggybacks on the cache coherence hardware and logs nearly the minimal thread-ordering information necessary to faithfully replay the multiprocessor execution. Our studies, based on simulating a four-processor server with commercial workloads, show that when allocated less than 7% of the system's physical memory, our FDR design can capture the last one second of the execution at modest (less than 2%) slowdown.
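The record-and-replay idea in the abstract can be illustrated with a small software toy. This is only a sketch under stated assumptions, not FDR's hardware mechanism: FDR observes ordering through the cache coherence hardware, whereas here a simulated scheduler interleaves logical threads (Python generators) and the log is simply the sequence of thread ids chosen. All names (`worker`, `run`, `record`, `replay`) are hypothetical and do not come from the paper.

```python
import random


def worker(tid, shared, steps):
    """A logical thread; each step is one racy, order-sensitive update."""
    for i in range(steps):
        yield  # scheduling point: another thread may run before this access
        shared["x"] = shared["x"] * 3 + tid + i


def run(num_threads, steps, pick_next):
    """Interleave the threads; pick_next decides who advances next.

    Returns the final shared value and the log of scheduling decisions.
    """
    shared = {"x": 0}
    threads = {tid: worker(tid, shared, steps) for tid in range(num_threads)}
    log = []
    while threads:
        tid = pick_next(sorted(threads))
        log.append(tid)  # record the thread-ordering decision
        try:
            next(threads[tid])
        except StopIteration:
            del threads[tid]  # this thread has finished
    return shared["x"], log


def record(num_threads, steps, seed):
    """Nondeterministic run (modeled by a seeded RNG) that logs the ordering."""
    rng = random.Random(seed)
    return run(num_threads, steps, rng.choice)


def replay(num_threads, steps, log):
    """Re-execute, forcing the recorded ordering instead of choosing afresh."""
    decisions = iter(log)
    return run(num_threads, steps, lambda runnable: next(decisions))


if __name__ == "__main__":
    value, log = record(num_threads=4, steps=5, seed=1)
    replayed_value, _ = replay(num_threads=4, steps=5, log=log)
    print(value == replayed_value)
```

The point of the sketch is that the log holds only ordering decisions, not the data values: replaying the same order against the same program regenerates the same values. The toy logs every scheduling decision; it does not attempt the log-size reduction the abstract refers to.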
program debugging, multi-threading, data recording, software reliability, multiprocessing systems, system monitoring, performance evaluation
M. Xu, R. Bodik and M. Hill, "A 'flight data recorder' for enabling full-system multiprocessor deterministic replay," Computer Architecture, International Symposium on (ISCA), San Diego, California, 2003, pp. 122-133.