Issue No.06 - November/December (2006 vol.23)
Published by the IEEE Computer Society
Shekhar Borkar, Intel
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MDT.2006.156
Variability and reliability will be the barriers to future technology scaling. Every discipline, from fabrication to software, needs to cooperate and make the VLSI system reliable in the presence of variability and the resulting inherent unreliability of components.
Technology scaling will continue to follow Moore's law, letting designers use billions of transistors. But variability and reliability will be the barriers to future scaling, just as the barriers today stem from power and energy consumption. Die size, chip yields, and design productivity have thus far limited transistor integration in VLSI designs. But the focus is shifting to energy consumption, power dissipation, and power delivery. Transistor subthreshold leakage continues to increase, although circuit techniques exist to avoid, tolerate, and control it. However, as technology scales further, new challenges will emerge, such as variability, single-event upsets, and device degradation. These problems will inevitably lead to inherent unreliability in components, posing serious design and test challenges.
This problem is not new. Even today, systems design takes variability and reliability issues into account. For example, memories use error-correcting codes to detect and correct soft errors. We cope with variability in transistor performance through careful design, as well as through testing for frequency binning. But with continued technology scaling, the impact of these issues is becoming greater, and we must devise techniques to deal with them effectively.
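As a rough illustration of how an error-correcting code detects and corrects a soft error, the following minimal sketch implements a textbook Hamming(7,4) code in Python. It is an assumption chosen for brevity, not the specific code used in any particular memory; production memories typically use wider codes over larger words. The function names `hamming74_encode` and `hamming74_decode` are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword.

    Codeword positions are numbered 1..7; parity bits sit at
    positions 1, 2, and 4, and data bits at positions 3, 5, 6, 7.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (data_bits, syndrome).

    A syndrome of 0 means no error was detected; otherwise it is the
    1-based position of the bit that was flipped and then corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[4] ^= 1  # model a single-event upset at position 5
    recovered, position = hamming74_decode(stored)
    print(recovered == word, position)
```

This code corrects any single-bit upset in a 7-bit codeword. Two simultaneous upsets in the same codeword would be miscorrected, which is one reason practical designs add an extra parity bit for double-error detection.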
Random dopant fluctuations cause variability in transistor threshold voltages, affecting static memory stability. Subwavelength lithography, which causes line edge roughness and thus variability in transistors, will continue until extreme-ultraviolet technology becomes available. Increasing power density increases heat flux, leading to greater demand on the power distribution system. This greater demand causes voltage variations, as well as hot spots on the die with increased subthreshold leakage power consumption. Thus, we are facing static (process technology) and dynamic (circuit operation) variability in VLSI systems, and it could get worse.
Designs must deal with variability from day one. Today's design methodologies optimize performance and power, but ignore test and yield in the presence of variability. We need a multivariate design optimization capability for probabilistic and statistical design. Circuit design techniques such as body biasing will help, but their effect diminishes with technology scaling.
Gate dielectric scaling will increase gate leakage exponentially, and burn-in power could become prohibitive, making burn-in testing obsolete. Then, how would we screen for defects and infant mortality in VLSI chips? One-time factory testing will not be enough. Test hardware must be embedded in the design to detect errors dynamically, isolate and confine faults, reconfigure the system to work around faults using spare hardware, and recover from errors on the fly.
All this is possible thanks to the abundance of transistors, but every discipline, from fabrication to software, needs to cooperate and make the VLSI system reliable in the presence of variability and the resulting inherent unreliability of components. Significant research and development are necessary to make this concept a reality. This might sound like science fiction, and it is certainly challenging. But when has the electronics or semiconductor industry not faced difficult challenges?
Shekhar Borkar is an Intel Fellow and director of microprocessor research at Intel. Contact him at Shekhar.Y.Borkar@intel.com.