ReviveNet: A Self-Adaptive Architecture for Improving Lifetime Reliability via Localized Timing Adaptation
September 2011 (vol. 60 no. 9)
pp. 1219-1232
Guihai Yan, Chinese Academy of Sciences, Institute of Computing Technology, Beijing
Yinhe Han, Chinese Academy of Sciences, Beijing
Xiaowei Li, Chinese Academy of Sciences, Institute of Computing Technology, Beijing
Aggressive technology scaling poses serious challenges to lifetime reliability. A paramount challenge comes from a variety of aging mechanisms that can cause gradual performance degradation of circuits. Prior work shows that such progressive degradation can be reliably detected by dedicated aging sensors, which provides a good foundation for a new scheme to improve lifetime reliability. In this paper, we propose ReviveNet, a hardware-implemented, aging-aware, self-adaptive architecture. Aging awareness is realized by deploying dedicated aging sensors, and self-adaptation is achieved by employing a group of synergistic agents. Each agent implements a localized timing adaptation mechanism to tolerate aging-induced delay on critical paths. For evaluation, a reliability model based on the widely used Weibull distribution is presented. Experimental results show that, without compromising nominal architectural performance, ReviveNet can improve the Mean-Time-To-Failure by up to 48.7 percent, at the expense of 9.5 percent area overhead and a small power increase.
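
For readers unfamiliar with Weibull-based reliability modeling, the following is a minimal sketch of the standard two-parameter Weibull survival function and its mean, which is how a Mean-Time-To-Failure figure is commonly derived from such a model. It is not the reliability model of the paper itself; the function names and the `scale` and `shape` parameters are assumptions chosen for illustration.

```python
import math


def weibull_reliability(t, scale, shape):
    """Probability that a unit survives past time t: exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_mttf(scale, shape):
    """Mean time to failure of a Weibull lifetime: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def relative_improvement(baseline_mttf, improved_mttf):
    """Fractional MTTF gain of an improved design over a baseline."""
    return (improved_mttf - baseline_mttf) / baseline_mttf


if __name__ == "__main__":
    # Hypothetical parameters, not taken from the paper.
    base = weibull_mttf(scale=10.0, shape=2.0)
    better = weibull_mttf(scale=12.0, shape=2.0)
    print(f"baseline MTTF: {base:.3f}")
    print(f"improved MTTF: {better:.3f}")
    print(f"relative gain: {relative_improvement(base, better):.1%}")
```

With the shape parameter held fixed, the MTTF is proportional to the scale parameter, so in this sketch any mechanism that stretches the characteristic lifetime raises the MTTF by the same ratio.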

Index Terms:
Lifetime reliability, self-adaptive, aging sensor, timing adaptation, NBTI.
Guihai Yan, Yinhe Han, Xiaowei Li, "ReviveNet: A Self-Adaptive Architecture for Improving Lifetime Reliability via Localized Timing Adaptation," IEEE Transactions on Computers, vol. 60, no. 9, pp. 1219-1232, Sept. 2011, doi:10.1109/TC.2011.33