Issue No. 11 - Nov. (2012 vol. 61)

pp: 1521-1534

A. Savino, Dept. of Control & Comput. Eng., Politec. di Torino, Torino, Italy

S. Di Carlo, Dept. of Control & Comput. Eng., Politec. di Torino, Torino, Italy

G. Politano, Dept. of Control & Comput. Eng., Politec. di Torino, Torino, Italy

A. Benso, Dept. of Control & Comput. Eng., Politec. di Torino, Torino, Italy

A. Bosio, Lab. d'Inf., de Robot. et de Microelectron. de Montpellier, Univ. of Montpellier II, Montpellier, France

G. Di Natale, Lab. d'Inf., de Robot. et de Microelectron. de Montpellier, Univ. of Montpellier II, Montpellier, France

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2011.188

ABSTRACT

What is the probability that the execution state of a given microprocessor running a given application is correct, in a certain working environment with a given soft-error rate? Trying to answer this question using fault injection can be very expensive and time-consuming. This paper proposes the baseline for a new methodology, based on microprocessor error probability profiling, that aims at estimating fault injection results without the need for a typical fault injection setup. The proposed methodology is based on two main ideas: a one-time fault-injection analysis of the microprocessor architecture to characterize the probability of successful execution of each of its instructions in the presence of a soft error, and a static and very fast analysis of the control and data flow of the target software application to compute its probability of success. The presented work goes beyond the dependability evaluation problem; it also has the potential to become the backbone for new tools able to help engineers choose the best hardware and software architecture to structurally maximize the probability of a correct execution of the target software.
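
As a rough illustration of the two ideas named in the abstract, the sketch below combines per-instruction success probabilities into a success probability for a whole instruction sequence. It is a minimal sketch under strong assumptions that the abstract does not state: errors in different instructions are treated as independent, the program is a straight-line trace with no branches, and the table `P_SUCCESS`, its values, and the function `program_success_probability` are hypothetical names and numbers chosen for illustration only, not results or definitions from the paper.

```python
from math import prod

# Hypothetical per-instruction success probabilities. In the methodology these
# would come from a one-time fault-injection characterization of the processor;
# the numbers here are invented for illustration.
P_SUCCESS = {
    "mov": 0.9999,
    "add": 0.9998,
    "jmp": 0.9995,
}


def program_success_probability(trace, p_success):
    """Return the probability that every instruction in `trace` succeeds.

    Assumes (for illustration) that errors in different instructions are
    independent, so the per-instruction probabilities simply multiply.
    """
    return prod(p_success[op] for op in trace)


# A straight-line instruction trace; a real static analysis would also have to
# weigh the different control-flow paths of the application.
trace = ["mov", "add", "add", "jmp"]
p = program_success_probability(trace, P_SUCCESS)
print(f"estimated success probability: {p:.6f}")
```

The point of the sketch is only the division of labor: the probability table is built once per processor, while the per-application step is a cheap pass over the program's instructions.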

INDEX TERMS

statistical analysis, multiprocessing systems, probability, service-oriented architecture, software reliability, target software, statistical reliability estimation, microprocessor-based systems, fault injection, microprocessor error probability profiling, one-time fault-injection analysis, target software application, data flow, dependability evaluation problem, software architecture, Hardware, Reliability, Microprocessors, Software, Estimation, Integrated circuit modeling, Probability, Microprocessor reliability, safety-critical systems

CITATION

A. Savino, S. Di Carlo, G. Politano, A. Benso, A. Bosio, G. Di Natale, "Statistical Reliability Estimation of Microprocessor-Based Systems",

*IEEE Transactions on Computers*, vol. 61, no. 11, pp. 1521-1534, Nov. 2012, doi:10.1109/TC.2011.188