31st Annual International Symposium on Computer Architecture (ISCA'04)
Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor
München, Germany
June 19-June 23
ISBN: 0-7695-2143-6
Christopher Weaver, Intel Corporation, Hudson MA
Joel Emer, Intel Corporation, Hudson MA
Shubhendu S. Mukherjee, Intel Corporation, Hudson MA
Steven K. Reinhardt, Intel Corporation, Hudson MA; University of Michigan, Ann Arbor
Transient faults due to neutron and alpha particle strikes pose a significant obstacle to increasing processor transistor counts in future technologies. Although fault rates of individual transistors may not rise significantly, incorporating more transistors into a device makes that device more likely to encounter a fault. Hence, maintaining processor error rates at acceptable levels will require increasing design effort.
This paper proposes two simple approaches to reduce error rates and evaluates their application to a microprocessor instruction queue. The first technique reduces the time instructions sit in vulnerable storage structures by selectively squashing instructions when long delays are encountered. A fault is less likely to cause an error if the structure it affects does not contain valid instructions. We introduce a new metric, MITF (Mean Instructions To Failure), to capture the trade-off between performance and reliability introduced by this approach.
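The trade-off that MITF captures can be illustrated with a small sketch. The abstract does not give a formula, so this assumes MITF is the average number of instructions committed between errors, i.e. instruction throughput multiplied by mean time to failure (MTTF). The function names, the raw fault rate, and the IPC and vulnerability figures below are hypothetical values chosen only for illustration, not results from the paper.

```python
def mttf_seconds(raw_fault_rate_per_s: float, vulnerable_fraction: float) -> float:
    """Mean time to failure, assuming only faults that hit valid
    (vulnerable) state turn into errors. Both inputs are assumptions."""
    return 1.0 / (raw_fault_rate_per_s * vulnerable_fraction)


def mitf(ipc: float, frequency_hz: float, mttf_s: float) -> float:
    """Mean Instructions To Failure, taken here as throughput times MTTF."""
    return ipc * frequency_hz * mttf_s


FREQ_HZ = 2.0e9          # hypothetical clock frequency
RAW_FAULT_RATE = 1.0e-9  # hypothetical raw faults per second in the queue

# Baseline: instructions wait in the queue during long stalls, so a larger
# fraction of the queue holds valid state (illustrative numbers).
baseline = mitf(1.20, FREQ_HZ, mttf_seconds(RAW_FAULT_RATE, 0.30))

# With squashing: slightly lower IPC, but less valid state is exposed.
squashing = mitf(1.17, FREQ_HZ, mttf_seconds(RAW_FAULT_RATE, 0.20))

improvement = squashing / baseline
print(f"MITF improvement from squashing: {improvement:.2f}x")
```

Under these made-up numbers the small IPC loss is outweighed by the drop in exposure, so MITF rises. If squashing cost more performance than it saved in vulnerability, the ratio would fall below 1, which is the case the metric is meant to expose.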
The second technique addresses false detected errors. In the absence of a fault detection mechanism, such errors would not have affected the final outcome of a program. For example, a fault affecting the result of a dynamically dead instruction would not change the final program output, but could still be flagged by the hardware as an error. To avoid signaling such false errors, we modify a pipeline's error detection logic to mark affected instructions and data as possibly incorrect rather than immediately signaling an error. Then, we signal an error only if we determine later that the possibly incorrect value could have affected the program's output.
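The deferred-signaling idea can be sketched as a toy software model. This is not the hardware design from the paper; the class and method names (RegisterFile, write, add, read_for_output, MachineCheck) are hypothetical. It assumes each value carries a "possibly incorrect" mark that propagates through dependent operations, that overwriting a value discards its mark, and that an error is raised only when a marked value reaches program output.

```python
from dataclasses import dataclass


class MachineCheck(Exception):
    """Raised only when a possibly incorrect value reaches program output."""


@dataclass
class Value:
    data: int
    possibly_incorrect: bool = False


class RegisterFile:
    def __init__(self) -> None:
        self.regs: dict[str, Value] = {}

    def write(self, dst: str, data: int, fault_detected: bool = False) -> None:
        # A detected fault marks the value instead of signaling immediately.
        # Overwriting a register drops any mark on the old (now dead) value.
        self.regs[dst] = Value(data, fault_detected)

    def add(self, dst: str, a: str, b: str) -> None:
        # The mark propagates to anything computed from a marked value.
        va, vb = self.regs[a], self.regs[b]
        self.regs[dst] = Value(
            va.data + vb.data,
            va.possibly_incorrect or vb.possibly_incorrect,
        )

    def read_for_output(self, src: str) -> int:
        # Only here, where the value could affect output, is an error signaled.
        value = self.regs[src]
        if value.possibly_incorrect:
            raise MachineCheck(f"possibly incorrect value reached output via {src}")
        return value.data


rf = RegisterFile()

# Dynamically dead result: faulted, then overwritten before any use.
rf.write("r1", 5, fault_detected=True)
rf.write("r1", 7)
dead_case = rf.read_for_output("r1")  # no error is signaled

# Live result: the mark follows the dependence chain to the output.
rf.write("r2", 1, fault_detected=True)
rf.add("r3", "r2", "r1")
try:
    rf.read_for_output("r3")
    live_case_signaled = False
except MachineCheck:
    live_case_signaled = True
```

In the first case the faulty value is never consumed, so signaling an error at detection time would have been a false detected error. In the second case the mark reaches an output, so the error is reported.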
Citation:
Christopher Weaver, Joel Emer, Shubhendu S. Mukherjee, Steven K. Reinhardt, "Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor," isca, pp.264, 31st Annual International Symposium on Computer Architecture (ISCA'04), 2004