2009 15th IEEE Pacific Rim International Symposium on Dependable Computing
Quantitative Analysis of Long-Latency Failures in System Software
Shanghai, China
November 16-November 18
ISBN: 978-0-7695-3849-5
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PRDC.2009.13
This paper presents a study of long-latency failures using accelerated fault injection. Data collected from the experiments are used to analyze the significance, causes, and characteristics of long-latency failures caused by soft errors in the processor and memory. The results indicate that a non-negligible portion of soft errors in code and data memory lead to long-latency failures. These failures are caused by errors with long fault activation times and by errors that cause failures only under certain runtime conditions. In contrast, less than 0.5% of soft errors in the processor registers used in kernel mode lead to a failure with latency longer than a thousand seconds, owing to the strong temporal locality of register values. The study also shows that the insight obtained can guide the design and placement (in the application code and/or the system) of application-specific error detectors.
Index Terms:
Long latency failures, accelerated fault injection, and operating system robustness testing
Citation:
Keun Soo Yim, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer, "Quantitative Analysis of Long-Latency Failures in System Software," prdc, pp.23-30, 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing, 2009