32nd Annual IEEE International Computer Software and Applications Conference (2008)
July 28, 2008 to Aug. 1, 2008
We present an online framework that captures and recovers from program failures and prevents them from recurring through safe execution perturbations. The perturbations are safe because they respect the semantics of the program. A checkpointing/logging mechanism captures a program execution in an event log. If the execution results in a failure, the framework automatically searches for a perturbation of the execution by altering the event log and replaying the execution with the altered log to avoid the failure. If one is found, the perturbation is recorded as a dynamic patch, which all future executions of the application then apply to prevent the failure from occurring again. Our experiments show that the proposed framework is very effective in avoiding concurrency faults, heap memory overflow faults, and malicious requests. The overhead it adds to normal execution is very low (2-18%).
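The capture, perturb, and replay loop described above can be illustrated with a toy model. The sketch below is not the authors' implementation: it assumes a hypothetical three-event log for an atomicity violation (thread A checks a shared pointer and then uses it, while thread B frees it in between), and it treats a perturbation as "safe" when each thread's own events keep their original order. The names `replay`, `preserves_program_order`, and `find_patch` are invented for illustration.

```python
from itertools import permutations

def replay(log):
    """Replay a log of (thread, op) events; return True if no failure occurs."""
    ptr = "buffer"       # shared state (hypothetical)
    checked_ok = False   # thread A's local result of its check
    for thread, op in log:
        if op == "check":
            checked_ok = ptr is not None
        elif op == "use":
            if checked_ok and ptr is None:
                return False  # use of a freed pointer: the failure
        elif op == "free":
            ptr = None
    return True

def preserves_program_order(original, candidate):
    """Assumed safety rule: every thread's own events stay in the same order."""
    threads = {t for t, _ in original}
    return all(
        [e for e in original if e[0] == t] == [e for e in candidate if e[0] == t]
        for t in threads
    )

def find_patch(log):
    """Search reorderings of the log for one that is safe and avoids the failure."""
    for candidate in permutations(log):
        candidate = list(candidate)
        if (candidate != log
                and preserves_program_order(log, candidate)
                and replay(candidate)):
            return candidate  # would be recorded as the dynamic patch
    return None

# Captured execution that fails: B frees the pointer between A's check and use.
failing_log = [("A", "check"), ("B", "free"), ("A", "use")]
patch = find_patch(failing_log)
```

In this toy setting the search returns a schedule in which thread A's check and use run back to back before thread B's free. A real system would search over a far larger log and would cover the other fault classes mentioned in the abstract (heap overflows, bad requests), which this sketch does not model.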
avoiding failures, environmental faults, logging/replay tools, atomicity violation, heap overflow, bad user requests
Xiangyu Zhang, Rajiv Gupta, Sriraman Tallam, Chen Tian, "Avoiding Program Failures Through Safe Execution Perturbations", 32nd Annual IEEE International Computer Software and Applications Conference (COMPSAC), pp. 152-159, 2008, doi:10.1109/COMPSAC.2008.23