MARCH/APRIL 2006 (Vol. 23, No. 2) pp. 86-87
0740-7475/06/$31.00 © 2006 IEEE
Published by the IEEE Computer Society
Guest Editor's Introduction: Evolving Methods for Detecting and Handling Reliability Defects
This issue focuses on methods and models for reliability defect acceleration and screening. Latent defects, or reliability defects, are defects that let an IC function correctly at first but cause it to fail later, after stress or after a period of use in the application. Clearly, suppliers focus heavily on ensuring that customers do not receive ICs with reliability defects. The four articles in this issue cover various methods for reliability defect acceleration and detection.
Reliability defect screening methods
Historically, reliability defect screening methods have relied mostly on defect acceleration: stressing ICs so that latent defects that are initially benign degrade and become detectable at production testing. Burn-in stress, which accelerates reliability defects by applying high temperature and voltage, is the most widely used defect acceleration method. Recently, voltage stress, which applies high voltage during production testing, has emerged as a cheaper alternative to burn-in for some products.
Instead of relying solely on defect acceleration, another approach to reliability defect screening attempts to detect reliability defects before they become functional failures. Typically, these methods rely on test screens that are more sensitive to the subtle abnormalities such defects cause, or on statistical methods that identify the ICs most likely to fail in the future. Historically, IDDQ testing, which rejects ICs with abnormal quiescent leakage current measurements, has been used to screen ASICs and low-power logic products for reliability defects. The challenge for reliability defect detection methods is to identify future reliability failures without also rejecting many ICs that would never have failed.
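The outlier-based flavor of IDDQ screening can be sketched in a few lines. This is a minimal illustration, not any specific published method; the robust median/MAD statistic and the threshold `k` are assumptions chosen for the example:

```python
import statistics

def iddq_outlier_screen(measurements_ua, k=4.0):
    """Flag dice whose IDDQ leakage reading is a statistical outlier.

    measurements_ua: per-die IDDQ readings in microamps.
    k: rejection threshold in robust standard deviations (assumed value).
    Returns the indices of dice to reject as reliability risks.
    """
    med = statistics.median(measurements_ua)
    # Median absolute deviation, scaled by 1.4826 to approximate a
    # standard deviation for normally distributed data.
    mad = statistics.median(abs(x - med) for x in measurements_ua)
    sigma = 1.4826 * mad if mad > 0 else 1e-9
    return [i for i, x in enumerate(measurements_ua)
            if abs(x - med) > k * sigma]

# A die leaking 50 uA in a ~1 uA population stands out clearly,
# even though it may still pass every functional test.
flagged = iddq_outlier_screen([1.0, 1.1, 0.9, 1.05, 0.95, 50.0, 1.02])
```

A robust statistic matters here: with rising background leakage in advanced processes, a fixed absolute current limit quickly loses discrimination, whereas an outlier test adapts to each lot's baseline.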
The 2005 International Technology Roadmap for Semiconductors ( ITRS) lists "screening for reliability" as one of the IC industry's most difficult challenges. The ITRS states that reliability defect screening is becoming more difficult as a result of the following:
• increasing implementation challenges and decreasing efficacy of burn-in, IDDQ testing, and voltage stress; and
• erratic, nondeterministic, and intermittent device behavior.
In addition to these technical difficulties, reliability defect screening methods pose major cost and technology challenges. For many products, accelerating reliability defects through burn-in is unaffordable, so manufacturers must turn to other screening methods. Moreover, the number of products for which burn-in is unaffordable is growing, and now includes consumer electronics, game processors, and some networking and data-processing applications. Consequently, "getting out of burn-in," or GOBI, is a major effort at many semiconductor companies.
It is also becoming more difficult to achieve historic levels of reliability for products made with advanced semiconductor processes. This is because of exponentially increasing leakage currents, new failure modes, and increased integration. Increased leakage currents are causing IDDQ testing to lose screening effectiveness. Also, leakage currents increase exponentially at high voltage and temperature, causing high-performance products to dissipate hundreds of watts at burn-in and under voltage stress conditions. Advanced technologies are also more prone to subtle failure mechanisms, such as negative-bias-temperature instability, subtle SRAM defects causing cell stability failures, and subtle performance-changing defects. These mechanisms are more likely to cause failures than in the past because of smaller design margins and larger process and timing variations. Given that failure modes are changing for advanced technologies, there is a need to reevaluate reliability screening methods (and develop new ones) for each technology node.
Developing new reliability defect screening methods, and evaluating existing ones, is difficult for two reasons: the huge number of parts required to statistically quantify a method's effectiveness, and the destructive nature of defect acceleration. Manufacturers often measure reliability in defective parts per million (DPPM), and some products (for example, certain automotive applications) require fewer than 10 DPPM. It is also difficult to quantify the effectiveness of defect detection methods because, by definition, the suspect devices initially pass functional tests. It is possible to measure these methods' effectiveness using burn-in or life stress tests, but such tests are costly and might not cause precisely the same failures as normal use.
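The sample-size problem can be made concrete with a standard zero-failure confidence-bound calculation (a textbook statistical formula, not drawn from the articles in this issue): if n parts survive stress with zero failures, the upper bound on the failure probability p at a given confidence level satisfies (1 - p)^n = 1 - confidence.

```python
import math

def parts_needed(dppm_target, confidence=0.95):
    """Parts that must survive stress with zero failures to claim,
    at the given confidence level, that the true failure rate is
    below dppm_target defective parts per million."""
    p = dppm_target / 1e6
    # Solve (1 - p)^n = 1 - confidence for n, rounding up.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

# Demonstrating < 10 DPPM at 95% confidence takes roughly 300,000
# fail-free parts; tightening the target inflates this further.
n = parts_needed(10)
```

This is why statistically qualifying a screen against a sub-10-DPPM target is so expensive: the experiment consumes hundreds of thousands of parts even in the best case of zero observed failures.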
Unfortunately, even though reliability defect acceleration is a significant issue for the industry, IC manufacturers tend to publish little about applied methods and industry data. The lack of published data makes it more difficult for university researchers to contribute to developing reliability screening solutions.
This special issue
The four articles in this issue provide an excellent overview of methods and models for reliability defect screening. Each article addresses a unique area within the field of reliability defect screening.
First, Zakaria et al. describe methods for reducing burn-in cost by applying voltage stress during production test. Industry has used voltage stress for many years, but little has been published on how to evaluate its effectiveness or how to combine it with burn-in to optimize reliability levels and cost. In their article, Zakaria et al. demonstrate an opportunity to substantially reduce burn-in durations by using voltage stress.
Next, Turakhia et al. describe test and analysis techniques that identify the ICs most likely to fail in the future. Their article extends outlier detection to statistical testing, which can improve reliability defect detection while minimizing test cost and yield loss.
A fundamental issue is how to model reliability defects. As mentioned earlier, it is very difficult to thoroughly evaluate the effectiveness and efficiency of reliability screening methods—particularly for new and advanced semiconductor technologies. However, IC suppliers do develop and validate yield models for advanced technologies. In their article, Barnett et al. show the close relationship between yield and reliability models, suggesting that it is possible to estimate reliability using existing yield data.
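The intuition behind estimating reliability from yield data can be illustrated with a toy model. This is not Barnett et al.'s actual formulation; it assumes (hypothetically) that latent defect density is a fixed fraction, `latent_ratio`, of the killer defect density, and applies the classic negative-binomial yield model to both densities:

```python
def negative_binomial_yield(area_cm2, d0_per_cm2, alpha):
    """Classic negative-binomial yield model, where alpha is the
    defect clustering parameter."""
    return (1 + area_cm2 * d0_per_cm2 / alpha) ** (-alpha)

def estimated_reliability_dppm(area_cm2, d0_per_cm2, alpha, latent_ratio):
    """Toy estimate of reliability fallout from yield parameters.

    Assumes latent (reliability) defects occur at latent_ratio times
    the killer defect density d0_per_cm2, so the fraction of shipped
    parts carrying at least one latent defect follows the same yield
    formula applied to the latent density.
    """
    y_latent = negative_binomial_yield(area_cm2,
                                       latent_ratio * d0_per_cm2, alpha)
    return (1 - y_latent) * 1e6  # fallout in DPPM

# Example: 1 cm^2 die, 0.5 killer defects/cm^2, alpha = 2, and a
# hypothetical 1% latent-to-killer ratio.
dppm = estimated_reliability_dppm(1.0, 0.5, 2.0, 0.01)
```

The key property the toy model shares with the real yield-reliability relationship is directional: lower-yielding material carries proportionally more latent defects, so yield data can rank material by reliability risk even before any stress testing.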
There are many different types of defects that can cause reliability failures. A difficulty of reliability defect modeling is how to capture each defect type in a model—including the likelihood of the defect, the defect's behavior, and the effectiveness of detection and acceleration methods. Carulli and Anderson describe how to capture these defect characteristics and how to optimize detection and acceleration methods using such a model.
Reliability defect screening is becoming significantly more difficult for advanced semiconductor technologies. To overcome these barriers, new methods and modeling methodologies are required. I hope that the articles in this special issue will contribute to the industry's ability to overcome these barriers.
Phil Nigh is a senior technical staff member at IBM Microelectronics. His research interests include test strategy, DFT, DPM reduction, and methods to use testing to drive yield and defect learning. Nigh has a PhD in electrical and computer engineering from Carnegie Mellon University.