Issue No.04 - July-Aug. (2013 vol.33)
pp: 56-65
Lukasz G. Szafaryn, University of Virginia
Brett H. Meyer, McGill University
Kevin Skadron, University of Virginia
As circuit feature sizes shrink, multibit errors become more significant and previously unprotected combinational logic becomes more vulnerable, requiring a reevaluation of the resiliency design space within a processor core. The authors present Svalinn, a framework that provides comprehensive analysis of multibit error protection overheads to facilitate better architecture-level design choices. Supported protection techniques include hardening, parity, error-correcting codes, parity prediction, residue codes, and spatial and temporal redundancy. The overheads of these techniques are characterized via synthesis and, as a case study, presented here in the context of a simple OpenRISC core. Svalinn's analysis shows how protection overheads differ per component and circuit category in terms of area, delay, and energy. The authors show that the contribution of logic components to the area of a simple core increases from 35 percent to as much as 54 percent with comprehensive multibit error protection. They also observe that the overhead of protection could increase from 29 percent to as much as 97 percent when transitioning from single-bit to multibit protection. The analysis also suggests that storage components will continue to benefit from error-correcting codes, whereas products requiring comprehensive coverage of logic components might use redundancy and residue codes. Optimal core-level protection will require novel combinations of these techniques.
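Two of the techniques named in the abstract, parity and residue codes, can be illustrated with a minimal Python sketch. It is not taken from Svalinn or the article: the function names, the 4-bit example word, and the choice of a mod-3 residue are assumptions made only for illustration. The sketch shows why a single parity bit misses a two-bit upset in storage, and how a residue check can flag a corrupted adder result without duplicating the adder.

```python
from typing import Optional, Tuple


def parity_bit(word: int) -> int:
    """Even-parity bit: 1 when the word holds an odd number of set bits."""
    return bin(word).count("1") & 1


def residue(value: int, modulus: int = 3) -> int:
    """Residue of a non-negative integer (mod 3 is an assumed, common choice)."""
    return value % modulus


def checked_add(a: int, b: int, observed: Optional[int] = None,
                modulus: int = 3) -> Tuple[int, bool]:
    """Add a and b, then compare the result's residue with the predicted one.

    `observed` is a hypothetical hook that lets the caller inject a
    corrupted adder output; when omitted, the true sum is used.
    """
    result = a + b if observed is None else observed
    predicted = (residue(a, modulus) + residue(b, modulus)) % modulus
    return result, residue(result, modulus) == predicted


# Parity on a stored word: one flipped bit is caught, two are not.
stored = 0b1011
stored_parity = parity_bit(stored)
single_flip_detected = parity_bit(stored ^ 0b0100) != stored_parity
double_flip_detected = parity_bit(stored ^ 0b0110) != stored_parity

# Residue check on an adder: a single-bit upset in the sum is flagged,
# because a power of two is never a multiple of 3.
clean_sum, clean_ok = checked_add(20, 22)
bad_sum, bad_ok = checked_add(20, 22, observed=42 ^ 1)
```

Here `single_flip_detected` is true and `double_flip_detected` is false, which is the gap that motivates multibit protection. The residue check covers the logic path at the cost of a small residue generator and comparator rather than a second adder.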
Multicore processing, Random access memory, Redundancy, Error correction codes, Logic circuits, multibit error protection, architecture, reliability, Svalinn
Lukasz G. Szafaryn, Brett H. Meyer, Kevin Skadron, "Evaluating Overheads of Multibit Soft-Error Protection in the Processor Core," IEEE Micro, vol. 33, no. 4, pp. 56-65, July-Aug. 2013, doi:10.1109/MM.2013.68