31st Annual International Symposium on Computer Architecture (ISCA'04)
The Case for Lifetime Reliability-Aware Microprocessors
München, Germany
June 19-June 23
ISBN: 0-7695-2143-6
Jayanth Srinivasan, University of Illinois at Urbana-Champaign
Sarita V. Adve, University of Illinois at Urbana-Champaign
Pradip Bose, IBM T.J. Watson Research Center, Yorktown Heights, NY
Jude A. Rivers, IBM T.J. Watson Research Center, Yorktown Heights, NY
Ensuring long processor lifetimes by limiting failures due to wear-out-related hard errors is a critical requirement for all microprocessor manufacturers. We observe that continuous device scaling and increasing temperatures are making lifetime reliability targets even harder to meet. However, current methodologies for qualifying lifetime reliability are overly conservative since they assume worst-case operating conditions. This paper makes the case that the continued use of such methodologies will significantly and unnecessarily constrain performance. Instead, lifetime reliability awareness at the microarchitectural design stage can mitigate this problem by designing processors that dynamically adapt in response to the observed usage to meet a reliability target.
We make two specific contributions. First, we describe an architecture-level model and its implementation, called RAMP, that can dynamically track lifetime reliability, responding to changes in application behavior. RAMP is based on state-of-the-art device models for different wear-out mechanisms. Second, we propose dynamic reliability management (DRM) - a technique where the processor can respond to changing application behavior to maintain its lifetime reliability target. In contrast to current worst-case behavior based reliability qualification methodologies, DRM allows processors to be qualified for reliability at lower (but more likely) operating points than the worst case. Using RAMP, we show that this can save cost and/or improve performance, that dynamic voltage scaling is an effective response technique for DRM, and that dynamic thermal management neither subsumes nor is subsumed by DRM.
Citation:
Jayanth Srinivasan, Sarita V. Adve, Pradip Bose, Jude A. Rivers, "The Case for Lifetime Reliability-Aware Microprocessors," 31st Annual International Symposium on Computer Architecture (ISCA'04), p. 276, 2004