Issue No.09 - September (2011 vol.60)
Michail Maniatakos, Yale University, New Haven
Naghmeh Karimi, University of Tehran, Tehran
Chandrasekharan (Chandra) Tirumurti, Intel Corporation, Santa Clara
Abhijit Jas, Intel Corporation, Austin
Yiorgos Makris, Yale University, New Haven
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2010.60
We investigate the correlation between low-level faults in the control logic of a modern microprocessor and their instruction-level impact on the execution of typical workloads. Such information can prove immensely useful in accurately assessing and prioritizing faults with regard to their criticality, as well as in commensurately allocating resources to enhance online testability and error/fault resilience through concurrent error detection/correction methods. To this end, we developed an extensive fault simulation infrastructure that allows injection of stuck-at faults and transient errors of arbitrary starting time and duration, as well as cost-effective simulation and classification of their repercussions into various instruction-level error types. As a test vehicle for our study, we employ a superscalar, dynamically scheduled, out-of-order, Alpha-like microprocessor, on which we execute SPEC2000 integer benchmarks. Extensive fault injection campaigns in control modules of this microprocessor facilitate valuable observations regarding the distribution of low-level faults into the instruction-level error types that they cause. Experimentation with both Register Transfer (RT-) and gate-level faults, as well as with both stuck-at faults and transient errors, confirms the validity and corroborates the utility of these observations.
Fault simulation, instruction-level error, microprocessor controller, concurrent error detection.
Michail Maniatakos, Naghmeh Karimi, Chandrasekharan (Chandra) Tirumurti, Abhijit Jas, Yiorgos Makris, "Instruction-Level Impact Analysis of Low-Level Faults in a Modern Microprocessor Controller", IEEE Transactions on Computers, vol. 60, no. 9, pp. 1260-1273, September 2011, doi:10.1109/TC.2010.60