Issue No. 04 - October-December (2010 vol. 7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TDSC.2010.33
P. Bernardi, Politecnico di Torino, Torino
L.M. Bolzani Poehls, Pontificia Universidade Catolica do Rio Grande do Sul, Porto Alegre
M. Grosso, Politecnico di Torino, Torino
M. Sonza Reorda, Politecnico di Torino, Torino
Critical applications based on Systems-on-Chip (SoCs) require techniques that ensure a sufficient level of reliability. Several techniques have been proposed to improve the detection and correction of faults affecting SoCs. This paper proposes a hybrid approach able to detect and correct the effects of transient faults in SoC data memories and caches. The proposed solution combines some software modifications, which are easy to automate, with the introduction of a hardware module, which is independent of the specific application. The method fits well in a typical SoC design flow and is shown to achieve a better trade-off between the achieved results and the required costs than corresponding purely hardware or purely software techniques. In fact, the proposed approach offers the same fault-detection and -correction capabilities as a purely software-based approach, while introducing nearly the same low memory and performance overhead as a purely hardware-based one.
Fault tolerance, SoCs, transient faults, online test.
M. Grosso, P. Bernardi, L. Bolzani Poehls and M. Sonza Reorda, "A Hybrid Approach for Detection and Correction of Transient Faults in SoCs," in IEEE Transactions on Dependable and Secure Computing, vol. 7, no. 4, pp. 439-445, 2010.