
Guest Editors' Introduction: Managing Uncertainty through Postfabrication Calibration and Repair

Swarup Bhunia, Case Western Reserve University
Rahul Rao, IBM T.J. Watson Research Center

Pages: pp. 4-5

Abstract—This special issue presents six articles that highlight challenges, and approaches toward improving, design yield and reliability through postsilicon optimizations. The articles cover postsilicon adaptation and repair issues in a wide range of areas including analog circuits, embedded memories, and multicore systems.

Keywords—design and test, built-in self-repair, calibration, many-core, multicore, postsilicon optimization, reliability, self-repair, thermal management, yield improvement

With increasing parameter variations in the nanometer technology regime, design and test considerations for improving circuit-limited yield and reliability of operation have gained high visibility and significance. Die-to-die and within-die process variations have emerged as critical yield and reliability limiters for digital, analog, and mixed-signal circuits. Increasing device densities and the proliferation of multicore systems have made designs more susceptible to dynamic thermal and electrical fluctuations. Postsilicon calibration and repair strategies constitute a promising class of solutions to address the variation effects as well as power-induced yield and reliability concerns.

The design of intelligent systems capable of postsilicon adaptation is an area of active interest that presents both new challenges and enormous opportunities. Understanding the limitations of the manufacturing process and technology, and the design's sensitivity to these limitations, is vital in adopting the appropriate adaptation strategy. The design, validation, and calibration costs associated with postsilicon adaptation place additional constraints on the selection of control knobs and on the decision of when to activate these features in the field. EDA has an important role to play in enabling efficient design analysis and in optimizing a design with knowledge of the postsilicon correction approach being adopted. Variation in workload and application characteristics provides an opportunity to design intelligent systems that can dynamically adapt and reach a better energy-performance operating point. Further, because these self-healing, self-repairing systems may increase test time and introduce additional test challenges, novel test strategies become vital.

This special issue of Design and Test presents six articles that highlight challenges in, and approaches to, improving design yield and reliability through postsilicon optimizations. The articles cover postsilicon adaptation and repair issues in a wide range of areas including analog circuits, embedded memories, and multicore systems.

The first article, "Analog Signature-Driven Postmanufacture Multidimensional Tuning of RF Systems" by Vishwanath Natarajan et al., addresses the challenges in calibrating tunable knobs for performance optimization in a cost-effective manner. The hardware-iterative approach uses a steepest-descent-based gradient search and genetic-algorithm-based test generation. The utility of the approach is demonstrated for a 2.4-GHz transmitter system.
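
As a rough illustration of the steepest-descent idea mentioned above, the following Python sketch tunes two knobs by repeatedly measuring a cost, estimating its gradient with small perturbations, and stepping downhill. It is not the authors' algorithm: the function names, the toy cost model, the knob targets (0.7 and 1.2), and the step sizes are all assumptions made for illustration, and a real flow would replace toy_cost with a measurement on the device.

```python
def tune_knobs(measure_cost, knobs, step=0.1, delta=1e-3, iterations=200):
    """Steepest-descent search over tuning knobs (illustrative sketch).

    measure_cost stands in for a hardware measurement that returns a
    scalar "distance from spec" for a given knob setting.
    """
    knobs = list(knobs)
    for _ in range(iterations):
        base = measure_cost(knobs)
        gradient = []
        for i in range(len(knobs)):
            # Perturb one knob at a time to estimate the local slope.
            probe = list(knobs)
            probe[i] += delta
            gradient.append((measure_cost(probe) - base) / delta)
        # Move every knob a small step against its estimated gradient.
        knobs = [k - step * g for k, g in zip(knobs, gradient)]
    return knobs


def toy_cost(knobs):
    """Hypothetical cost surface with its minimum at bias=0.7, gain=1.2."""
    bias, gain = knobs
    return (bias - 0.7) ** 2 + 2.0 * (gain - 1.2) ** 2


if __name__ == "__main__":
    tuned = tune_knobs(toy_cost, [0.0, 0.0])
    print("tuned knobs:", tuned)
```

Each iteration costs one measurement per knob plus one baseline measurement, which is why the number of knobs and iterations matters when every measurement is taken on real hardware.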

Next is an article by Wu-Hsin Chen and Byunghoo Jung titled "Self-Healing Phase-Locked Loop in Deep-Scaled CMOS Technologies," which discusses the application of self-healing techniques, along with their limitations and challenges, to voltage-controlled oscillators in phase-locked loops. The authors present open- and closed-loop automatic frequency calibration and amplitude control that rely on a negative feedback loop using an integrated sensor and difference detector, with emphasis on digitally assisted calibration.

The next article, "Postsilicon Adaptation for Low-Power SRAM under Process Variation" by Minki Cho et al., looks at SRAM arrays that are highly sensitive to device parameter variations. The authors present postsilicon repair approaches for faulty SRAM arrays using on-chip leakage monitors and adaptive body biasing. Further, the article explores the use of adaptive voltage scaling for energy efficiency in multimedia-like applications that can tolerate a small number of failures in select locations.

The fourth article, "The Dawn of Predictive Chip Yield Design: Along and Beyond the Memory Lane" by Rajiv Joshi et al., highlights the role of statistical simulation and analysis techniques toward improving design yield. A mixture importance sampling approach is discussed that relies on distorting the natural Monte Carlo sampling function to produce more samples in important regions of the design space with a relatively small number of simulations. This approach enables sensitivity analysis and yield-driven design of memories, peripheral circuits, and logic blocks.
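
The idea of distorting the sampling function can be sketched in a few lines of Python. The example below is a minimal, generic importance-sampling illustration and not the article's implementation: it assumes a single standard-normal parameter, a failure defined as the parameter exceeding a threshold, and a two-component mixture of the natural distribution and a shifted copy. The function names, mixture weight, and shift value are illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def estimate_failure_probability(threshold, n_samples, shift,
                                 w_natural=0.2, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from a mixture.

    The sampling density is
        w_natural * N(0, 1) + (1 - w_natural) * N(shift, 1),
    so most samples land near the assumed failure region around `shift`.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw from the mixture: pick a component, then sample from it.
        if rng.random() < w_natural:
            x = rng.gauss(0.0, 1.0)
        else:
            x = rng.gauss(shift, 1.0)
        if x > threshold:
            mixture = (w_natural * normal_pdf(x)
                       + (1.0 - w_natural) * normal_pdf(x, mu=shift))
            # Reweight each failing sample by natural density / sampling density.
            total += normal_pdf(x) / mixture
    return total / n_samples


if __name__ == "__main__":
    exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
    estimate = estimate_failure_probability(4.0, 20000, shift=4.0)
    print(f"estimate={estimate:.3e} exact={exact:.3e}")
```

Plain Monte Carlo with the same 20,000 samples would be expected to see fewer than one failure at this threshold, whereas the shifted mixture places a large fraction of its samples in the failure region and corrects for the distortion through the weights.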

A fifth article by Tsu-Wei Tseng et al., "A Built-In Method to Repair SoC RAMs in Parallel," focuses on area vs. test time optimization in a multiple-embedded-memory system. It introduces a shared parallel, built-in self-repair (BISR) scheme along with a global time-multiplexed redundancy analyzer that enables a nearly 20% reduction in area with test and repair time comparable to that of a dedicated repair scheme. The article presents an efficient automation flow for the parallel BISR with an automatic sizing approach for local bitmap determination. The scheme's validity is demonstrated through an extensive analysis for repair efficiency, area cost, and test time, with measurement results from a 0.18-μm CMOS test chip.

A final article, "Runtime Thermal Management Using Software Agents for Multi- and Many-Core Architectures" by Mohammad Abdullah Al Faruque et al., addresses the problem of thermal hot spots in multi/many-core designs. The authors' approach is a scalable system-level distributed runtime mapping and remapping algorithm that focuses on task migration under temperature excursions using an agent-based negotiation policy. Minimization of thermal hot spots reduces transient errors and wear-out failures, improving design reliability.

These articles provide a glimpse into the various technologies that are likely to shape the world of self-adapting intelligent systems of the future. We hope that they introduce new concepts, tickle your intellect, and spur innovation in the field of variation tolerance through postsilicon remedies.


This special issue resulted from the inputs and efforts of many individuals to whom we are thankful. We are grateful to all the reviewers for their time and commitment. Further, we appreciate the insightful Last Byte column by Stephen Kosonocky that provides an apt ending to this issue. We thank Design and Test editor-in-chief Krishnendu Chakrabarty for his continued support throughout this process. And, finally, the editorial staff of the IEEE Computer Society deserves special thanks for their wonderful job in editing and organizing this issue.

About the Authors

Swarup Bhunia is an assistant professor of electrical engineering and computer science at Case Western Reserve University, Cleveland, Ohio. His research interests include low-power and robust design, hardware security, and implantable electronics. He has a PhD in electrical engineering from Purdue University with a focus on yield-aware, low-power, and testable design approaches.
Rahul Rao is a research staff member with IBM T.J. Watson Research Center. His research interests include analysis and optimization of technology challenges and features in high-performance circuits. He has been actively involved in the design of processors for IBM's high-performance servers. He has a PhD in electrical and computer engineering from the University of Michigan, Ann Arbor, with a focus on approaches for low-power VLSI design.