Issue No.07 - July (2009 vol.58)
pp: 865-877
Madhu Mutyam, Indian Institute of Technology, Madras, Chennai
Feng Wang, Qualcomm Inc., San Diego
Ramakrishnan Krishnan, Penn State University, State College
Vijaykrishnan Narayanan, Pennsylvania State University, State College
Mahmut Kandemir, Pennsylvania State University, State College
Yuan Xie, Pennsylvania State University, State College
Mary Jane Irwin, Pennsylvania State University, State College
Fabricating circuits that employ ever-smaller transistors leads to dramatic variations in critical process parameters. This in turn results in large variations in the execution/access latencies of different hardware components. The situation is even more severe for memory components because of the minimum-sized transistors used in their design. Current design methodologies that are tuned for worst-case scenarios are becoming increasingly pessimistic in terms of performance and thus may not be a viable option at all for future designs. This paper makes two contributions targeting on-chip data caches. First, it presents an adaptive cache management policy based on nonuniform cache access. Second, it proposes a latency compensation approach that employs several circuit-level techniques to change the access latency of selected cache lines based on the criticality of the load instructions that access them. Our experiments reveal that both techniques can recover a significant amount of the performance lost to worst-case designs.
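The pessimism of worst-case design mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's policy or its simulator: the function names, the per-line latencies, and the access trace are hypothetical values chosen only to show why a single worst-case latency for every cache line costs more cycles than letting each line be accessed at its own latency.

```python
def mean_latency_worst_case(line_latencies, accesses):
    """Every access pays the latency of the slowest line (worst-case design)."""
    if not accesses:
        raise ValueError("access trace must not be empty")
    return float(max(line_latencies))


def mean_latency_nonuniform(line_latencies, accesses):
    """Each access pays only the latency of the line it touches."""
    if not accesses:
        raise ValueError("access trace must not be empty")
    return sum(line_latencies[i] for i in accesses) / len(accesses)


# Hypothetical per-line access latencies in cycles after process variation.
line_latencies = [2, 2, 3, 4]
# Hypothetical access trace: indices of the cache lines touched by loads.
accesses = [0, 1, 0, 2]

worst = mean_latency_worst_case(line_latencies, accesses)
nonuniform = mean_latency_nonuniform(line_latencies, accesses)
print(f"worst-case design: {worst} cycles per access")
print(f"nonuniform access: {nonuniform} cycles per access")
```

In this toy trace the slowest line is never touched, yet the worst-case design still charges its latency on every access. The gap between the two averages is the kind of lost performance the abstract says the proposed techniques aim to recover.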
Process variation, cache, address prediction, superscalar processors.
Madhu Mutyam, Feng Wang, Ramakrishnan Krishnan, Vijaykrishnan Narayanan, Mahmut Kandemir, Yuan Xie, Mary Jane Irwin, "Process-Variation-Aware Adaptive Cache Architecture and Management", IEEE Transactions on Computers, vol.58, no. 7, pp. 865-877, July 2009, doi:10.1109/TC.2009.30