Issue No.01 - January (2011 vol.60)
pp: 50-63
Zeshan Chishti, Intel Labs, Hillsboro
Alaa R. Alameldeen, Intel Labs, Hillsboro
Wei Wu, Intel Labs, Hillsboro
Shih-Lien Lu, Intel Labs, Hillsboro
The performance/energy trade-off is widely acknowledged as a primary design consideration for modern processors. A less discussed, though equally important, trade-off is the reliability/energy trade-off. Many design features that increase reliability (e.g., redundancy, error detection, and correction) have the side effect of consuming more energy. Conversely, many energy-saving features (e.g., voltage scaling) have the side effect of making systems less reliable. In this paper, we propose an adaptive cache design that enables the operating system to optimize for performance or energy efficiency without sacrificing reliability. Our proposed mechanism enables a cache with a wide operating range, where the cache can use a variable part of its data array to store error-correcting codes. A reliable, energy-efficient cache can use up to half of its data array to store error-correcting codes so that it can operate reliably at a low voltage and thereby reduce energy. A reliable, high-performance cache uses its whole data array for data, but must operate at a higher voltage to remain reliable, at the cost of higher energy consumption. We propose a hardware mechanism that allows the operating system to choose different points within that operating range based on the desired levels of performance, energy, and reliability.
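The capacity side of this operating range can be illustrated with a small sketch. It only models the arithmetic stated in the abstract (between none and half of the data array is given over to error-correcting codes); the function name, the mode names, and the 2 MB cache size are hypothetical and are not taken from the paper, and no voltage or energy figures are modeled.

```python
# Illustrative sketch only: names and the 2 MB size are assumptions,
# not the paper's design. It models just the capacity trade-off.

MAX_ECC_FRACTION = 0.5  # the abstract states "up to half" of the data array


def effective_data_capacity(total_bytes: int, ecc_fraction: float) -> int:
    """Return the bytes left for data when ecc_fraction of the array stores ECC."""
    if not 0.0 <= ecc_fraction <= MAX_ECC_FRACTION:
        raise ValueError("ecc_fraction must lie between 0 and 0.5")
    return int(total_bytes * (1.0 - ecc_fraction))


# Hypothetical operating points an operating system might choose between.
OPERATING_POINTS = {
    "high_performance": 0.0,   # whole array holds data, higher voltage
    "energy_efficient": 0.5,   # half the array holds ECC, lower voltage
}

if __name__ == "__main__":
    cache_bytes = 2 * 1024 * 1024  # assumed cache size for illustration
    for mode, fraction in OPERATING_POINTS.items():
        usable = effective_data_capacity(cache_bytes, fraction)
        print(f"{mode}: {usable} bytes available for data")
```

Intermediate fractions between the two end points correspond to the "different points within that operating range" that the abstract describes; how the hardware maps them to voltages is not specified here.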
Design, reliability, fault tolerance, dependable design.
Zeshan Chishti, Alaa R. Alameldeen, Wei Wu, Shih-Lien Lu, "Adaptive Cache Design to Enable Reliable Low-Voltage Operation", IEEE Transactions on Computers, vol.60, no. 1, pp. 50-63, January 2011, doi:10.1109/TC.2010.207