2011 International Conference on Parallel Architectures and Compilation Techniques (2011)
Galveston, Texas USA
Oct. 10, 2011 to Oct. 14, 2011
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2011.39
Fault-tolerance has become an essential concern for processor designers due to increasing transient and permanent fault rates. In this study we propose Symptom TM, a symptom-based error detection technique that recovers from errors by leveraging the abort mechanism of Transactional Memory (TM). To the best of our knowledge, this is the first architectural fault-tolerance proposal using Hardware Transactional Memory (HTM). Symptom TM can recover from 86% and 65% of catastrophic failures caused by transient and permanent errors respectively with no performance overhead in error-free executions.
Fault Tolerance, Hardware Transactional Memory
G. Yalcin, O. S. Unsal, A. Cristal, M. Valero and I. Hur, "SymptomTM: Symptom-Based Error Detection and Recovery Using Hardware Transactional Memory," 2011 International Conference on Parallel Architectures and Compilation Techniques(PACT), Galveston, Texas USA, 2011, pp. 199-200.