Issue No. 09 - September (2011 vol. 60)
pp. 1260-1273
Michail Maniatakos, Yale University, New Haven
Naghmeh Karimi, University of Tehran, Tehran
Chandrasekharan (Chandra) Tirumurti, Intel Corporation, Santa Clara
Abhijit Jas, Intel Corporation, Austin
Yiorgos Makris, Yale University, New Haven
ABSTRACT
We investigate the correlation between low-level faults in the control logic of a modern microprocessor and their instruction-level impact on the execution of typical workloads. Such information can prove immensely useful in accurately assessing and prioritizing faults with regard to their criticality, as well as in commensurately allocating resources to enhance online testability and error/fault resilience through concurrent error detection/correction methods. To this end, we developed an extensive fault simulation infrastructure which allows injection of stuck-at faults and transient errors of arbitrary starting time and duration, as well as cost-effective simulation and classification of their repercussions into various instruction-level error types. As a test vehicle for our study, we employ a superscalar, dynamically scheduled, out-of-order, Alpha-like microprocessor, on which we execute SPEC2000 integer benchmarks. Extensive fault injection campaigns in control modules of this microprocessor facilitate valuable observations regarding the distribution of low-level faults into the instruction-level error types that they cause. Experimentation with both Register Transfer (RT-) and Gate-Level faults, as well as with both stuck-at faults and transient errors, confirms the validity and corroborates the utility of these observations.
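To illustrate the distinction the abstract draws between stuck-at faults and transient errors of arbitrary starting time and duration, the following is a minimal sketch and not the authors' infrastructure. The `Fault` descriptor, the `inject` and `mismatches` helpers, and the one-bit signal trace are all hypothetical. Modeling a transient error as a value forced onto a signal for a bounded window of cycles is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fault:
    """Hypothetical fault descriptor for a single 1-bit signal."""
    value: int                      # logic value forced onto the signal (0 or 1)
    start: int = 0                  # first cycle at which the fault is active
    duration: Optional[int] = None  # None = permanent (stuck-at), else cycles

    def active(self, cycle: int) -> bool:
        """True if the fault affects the signal during the given cycle."""
        if cycle < self.start:
            return False
        return self.duration is None or cycle < self.start + self.duration


def inject(trace: List[int], fault: Fault) -> List[int]:
    """Return a copy of the fault-free trace with the fault applied."""
    return [fault.value if fault.active(c) else bit
            for c, bit in enumerate(trace)]


def mismatches(golden: List[int], faulty: List[int]) -> List[int]:
    """Cycles at which the faulty trace deviates from the fault-free one."""
    return [c for c, (g, f) in enumerate(zip(golden, faulty)) if g != f]


# Assumed fault-free ("golden") values of one control signal over 8 cycles.
golden = [0, 1, 1, 0, 1, 0, 0, 1]

# Stuck-at-0: active from cycle 0 for the whole run.
stuck_at_0 = inject(golden, Fault(value=0))

# Transient: signal forced to 1 for 2 cycles starting at cycle 3.
transient = inject(golden, Fault(value=1, start=3, duration=2))

print(mismatches(golden, stuck_at_0))  # cycles corrupted by the stuck-at fault
print(mismatches(golden, transient))   # cycles corrupted by the transient
```

In this sketch a fault only matters in cycles where the forced value differs from the fault-free value. The same effect explains why many injected faults may never surface as instruction-level errors.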
INDEX TERMS
Fault simulation, instruction-level error, microprocessor controller, concurrent error detection.
CITATION
Michail Maniatakos, Naghmeh Karimi, Chandrasekharan (Chandra) Tirumurti, Abhijit Jas, Yiorgos Makris, "Instruction-Level Impact Analysis of Low-Level Faults in a Modern Microprocessor Controller", IEEE Transactions on Computers, vol. 60, no. 9, pp. 1260-1273, September 2011, doi:10.1109/TC.2010.60