2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (2015)
Rio de Janeiro, Brazil
June 22, 2015 to June 25, 2015
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/DSN.2015.20
Transient and permanent errors in memory and CPUs occur with alarming frequency. Although most of these errors are masked at the hardware level or result in crashes, a non-negligible number of them lead to Silent Data Corruptions (SDCs), i.e., incorrect results of computations. Safety-critical programs require a very high level of confidence that such faults are detected and not propagated to the outside. Unfortunately, state-of-the-art fault detection techniques generally assume a limited Single Event Upset fault model, concentrating only on transient faults. We present δ-encoding: a software-only approach to detect hardware faults with very high probability. δ-encoding makes no assumptions on the rate and type of faults. Our approach combines AN codes and duplicated instructions to harden programs against transient and permanent hardware errors. Our evaluation shows that δ-encoding detects 99.997% of all injected errors with a performance slowdown of 2 to 4 times.
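The two building blocks named in the abstract, AN codes and duplicated instructions, can be illustrated with a minimal sketch. This is a generic textbook-style illustration and not the paper's actual scheme: the constant `A`, the `FaultDetected` exception, the function names, and the `fault_mask` injection hook are all assumptions made for the example. An AN code represents a value n as the code word A * n, so any valid code word is a multiple of A, and addition of code words yields the code word of the sum.

```python
# Illustrative sketch only: a plain AN code plus a duplicated addition.
# A, FaultDetected, and fault_mask are hypothetical names for this example.

A = 58659  # arbitrary odd constant chosen for illustration


class FaultDetected(Exception):
    """Raised when a check on a code word or a duplicated result fails."""


def encode(n: int) -> int:
    # An AN code word is the value multiplied by the constant A.
    return n * A


def is_valid(code: int) -> bool:
    # Valid code words are exactly the multiples of A.
    return code % A == 0


def decode(code: int) -> int:
    if not is_valid(code):
        raise FaultDetected("code word is not a multiple of A")
    return code // A


def checked_add(x_enc: int, y_enc: int, fault_mask: int = 0) -> int:
    # Addition is preserved by AN codes: A*x + A*y == A*(x + y).
    # The addition is executed twice; fault_mask flips bits in the first
    # copy to simulate a hardware fault (0 means no fault).
    first = (x_enc + y_enc) ^ fault_mask
    second = x_enc + y_enc
    if first != second:
        raise FaultDetected("duplicated results disagree")
    if not is_valid(first):
        raise FaultDetected("result is not a multiple of A")
    return first


if __name__ == "__main__":
    total = checked_add(encode(5), encode(7))
    print(decode(total))
    try:
        checked_add(encode(5), encode(7), fault_mask=1 << 3)
    except FaultDetected as err:
        print("detected:", err)
```

In this sketch, a single bit flip changes a code word by a power of two, which is never a multiple of an odd `A` greater than 1, so the divisibility check alone catches it; the duplicated execution adds a second, independent check on the operation itself.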
Hardware, Decoding, Encoding, Transient analysis, Random access memory, Fault tolerance, Fault tolerant systems
D. Kuvaiskii and C. Fetzer, "Δ-Encoding: Practical Encoded Processing," 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Rio de Janeiro, Brazil, 2015, pp. 13-24.