Issue No. 08 - August (2009 vol. 58)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2009.52
Valeria Bertacco, University of Michigan, Ann Arbor
Todd Austin, University of Michigan, Ann Arbor
Kypros Constantinides, University of Michigan, Ann Arbor
Onur Mutlu, Microsoft Research, Redmond
This work proposes a new software-based defect detection and diagnosis technique. We introduce a novel set of instructions, called Access-Control Extensions (ACE), that can access and control the microprocessor's internal state. Special firmware periodically suspends microprocessor execution and uses the ACE instructions to run directed tests on the hardware. When a hardware defect is present, these tests can diagnose and locate it, and then activate system repair through resource reconfiguration. The software nature of our framework makes it flexible: testing techniques can be modified or upgraded in the field to trade off performance against reliability without requiring any change to the hardware. We describe and evaluate different execution models for using the ACE framework. We also describe how the proposed ACE framework can be extended and used to improve the quality of post-silicon debugging and manufacturing testing of modern processors. We evaluated our technique on a commercial chip-multiprocessor based on Sun's Niagara and found that it can provide very high coverage, with 99.22 percent of all silicon defects detected. Moreover, our results show that the average performance overhead of software-based testing is only 5.5 percent. Based on a detailed register transfer level (RTL) implementation of our technique, we find its area and power consumption overheads to be modest, with a 5.8 percent increase in total chip area and a 4 percent increase in the chip's overall power consumption.
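The test loop described in the abstract (periodically suspend execution, write and read internal state through ACE instructions, locate a defect, reconfigure around it) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Unit`, `ace_set`, `ace_get`, `directed_test`, `run_test_epoch`, and `simulate`, the 8-bit state register, the stuck-at-zero fault model, and the test patterns are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

# Hypothetical test patterns; the paper's actual directed tests are not given here.
TEST_PATTERNS = (0x00, 0xFF, 0xAA, 0x55)


@dataclass
class Unit:
    """A hypothetical hardware unit with one 8-bit internal state register."""
    name: str
    stuck_at_zero_mask: int = 0  # bits forced to 0 by a simulated defect
    state: int = 0
    enabled: bool = True

    def ace_set(self, value: int) -> None:
        # Stand-in for an ACE instruction that writes internal state.
        self.state = (value & 0xFF) & ~self.stuck_at_zero_mask

    def ace_get(self) -> int:
        # Stand-in for an ACE instruction that reads internal state.
        return self.state


def directed_test(unit: Unit) -> bool:
    """Write each pattern and read it back; return True if all match."""
    saved = unit.ace_get()          # preserve state of the suspended program
    passed = True
    for pattern in TEST_PATTERNS:
        unit.ace_set(pattern)
        if unit.ace_get() != pattern:
            passed = False
            break
    unit.ace_set(saved)             # restore state before resuming
    return passed


def run_test_epoch(units: list[Unit]) -> list[str]:
    """Test every enabled unit; disable and report the ones that fail."""
    defective = []
    for unit in units:
        if unit.enabled and not directed_test(unit):
            unit.enabled = False    # "repair" by reconfiguring around the unit
            defective.append(unit.name)
    return defective


def simulate(units: list[Unit], total_cycles: int, test_interval: int):
    """Suspend 'execution' every test_interval cycles and run a test epoch."""
    log = []
    for cycle in range(test_interval, total_cycles + 1, test_interval):
        failed = run_test_epoch(units)
        if failed:
            log.append((cycle, failed))
    return log
```

For example, calling `simulate([Unit("alu0"), Unit("alu1", stuck_at_zero_mask=0x04)], 300, 100)` flags `alu1` at the first test epoch and leaves `alu0` enabled. In this sketch, shortening `test_interval` finds a defect sooner at the cost of more frequent suspensions, which loosely mirrors the performance versus reliability trade-off the abstract mentions.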
Reliability, hardware defects, online defect detection, testing, online self-test, post-silicon debugging, manufacturing test.
Valeria Bertacco, Todd Austin, Kypros Constantinides, Onur Mutlu, "A Flexible Software-Based Framework for Online Detection of Hardware Defects", IEEE Transactions on Computers, vol. 58, no. 8, pp. 1063-1079, August 2009, doi:10.1109/TC.2009.52