IBM T.J. Watson Research
University of Rochester
Carnegie Mellon University
Pages: pp. 8-11
The quest for higher performance via deep pipelining (for high clock rate) and speculative, out-of-order execution (for high instructions per cycle) has yielded processors with increasing design complexity. The costs of higher complexity are manyfold: increased verification time, higher power dissipation, reduced scalability in terms of microarchitectural resource size parameters and process shrinks, and so on. The recent industry-wide focus on processor power consumption as a major design constraint has forced companies to pay further attention to the underlying complexity issue.
It is important to understand that the two issues, power and complexity, do not necessarily track each other: an increase or decrease in one metric does not necessarily imply a corresponding increase or decrease in the other. To the extent that higher complexity generally translates into a higher transistor count, we can assert that complexity and area (and therefore leakage power, as well as unmanaged dynamic power) are positively correlated. On the other hand, power management through clock gating, V_DD gating, or adaptive resizing techniques usually increases verification complexity, even though these techniques reduce the net power consumption. Thus, in a sense, we must manage power and complexity separately, while remaining aware of the subtle interdependencies.
Future designs will require microarchitects, circuit designers, compiler developers, pre-silicon modelers, verification engineers, and system designers to cooperatively develop hardware and software techniques and tools that can break the power and complexity barriers. This is no less an issue today than the memory wall problem that architects have long worried about from a performance-only viewpoint.
With this key challenge in perspective, we decided to put together a special issue on this theme: power- and complexity-aware design (PCAD). We are pleased to present five excellent articles selected from 16 submissions. We based our selections on the IEEE Micro technical review process, which includes at least three reviews for each submission. These articles cover a broad range of topics related to the theme, but they do not address some aspects, such as verification complexity or design for "verifiability." As such, before formally introducing each article, we provide a brief tutorial-style overview of the PCAD theme. This overview will hopefully help future researchers focus on PCAD topics not covered in this special issue.
Perhaps the best-appreciated measure of design complexity is the cost of verification. Companies spend a major fraction (approximately 60 to 70 percent) of the net development cost of a processor or a system on chip (SoC) on verification and validation. As Moore's law continues and it becomes possible to offer more and more transistors on a die, microarchitects invariably find ways of using all of those available devices. This has caused a steady escalation in chip verification costs. Can you estimate the verification complexity beforehand, during the microarchitecture definition phase? If so, can you choose a microarchitecture that bounds the verification cost to an affordable value? These are tough questions that remain unanswered in the PCAD work covered by articles in this special issue.
Informally, however, architects are aware of the following:
Many microprocessor chip (or chipset) designs employ a combination of these heuristics to manage the overall verification budget. A systematic and quantitative methodology of ensuring scalable verification complexity over the life of a product family is still a topic for future research.
The power a processor consumes is a tangible metric measurable in familiar units (watts). Package and cooling costs increase with chip power, and these cost sensitivities are reasonably well known to those working on chip design teams. As such, much recent work proposes and evaluates power-efficient design ideas. The advent of microarchitecture-level power estimation tools such as Wattch (from Princeton University) and SimplePower (from Penn State) has facilitated academic research in this important aspect of PCAD. Industrial R&D groups have also developed their own proprietary toolsets to enable early stage power-performance tradeoffs. Virtually all of the articles in this special issue touch on the energy-efficiency aspect of PCAD, either in terms of design or the underlying presilicon modeling issues.
Managing power by adding on-chip controls can enhance overall energy efficiency but at the price of increased verification complexity. This type of design tradeoff that attempts to balance power reduction against verification cost is inadequately covered in current PCAD research.
Microarchitecture complexity can have a negative impact on testability, manufacturing cost, and effective chip yield. However, quantifying these aspects of complexity is not easy. Again, in qualitative terms, it seems intuitive that chip designs using complex, irregular microarchitectural constructs suffer from poor yield and testability metrics. These and other related factors could also lead to an increase in the microprocessor's effective manufacturing cost.
As the preceding discussion shows, defining quantitative metrics for evaluating complexity effectiveness is not easy, even if you interpret the measure of complexity from a single, restricted viewpoint, such as verification cost, power, testability, yield, or manufacturability. So far, the architecture community has found it easiest to view complexity effectiveness primarily in terms of power efficiency. But even here, the issue of using the right metric in the right context (millions of instructions per second (MIPS) per watt, MIPS^2 per watt, MIPS^3 per watt, and so on) remains rather poorly understood in architectural research today. In putting together this issue, we expected to publish some new thinking on the topic of "metrics" that measure complexity effectiveness and power efficiency, but the submissions did not adequately cover this topic.
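To make the choice of metric concrete, the following is a minimal sketch in Python, not a method taken from any article in this issue. The two design points and their numbers are hypothetical, chosen only for illustration, and the helper names `efficiency` and `best_design` are our own. MIPS per watt is commonly read as the inverse of energy per instruction, MIPS^2 per watt as the inverse of the energy-delay product, and MIPS^3 per watt as the inverse of energy times delay squared, so a higher exponent weights performance more heavily.

```python
def efficiency(mips: float, watts: float, exponent: int = 1) -> float:
    """Return MIPS**exponent per watt for one design point."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return mips ** exponent / watts


# Hypothetical design points (illustrative numbers only, not measurements).
DESIGNS = {
    "simple": {"mips": 1000.0, "watts": 10.0},
    "complex": {"mips": 2000.0, "watts": 30.0},
}


def best_design(exponent: int) -> str:
    """Name the design that maximizes MIPS**exponent per watt."""
    return max(
        DESIGNS,
        key=lambda name: efficiency(
            DESIGNS[name]["mips"], DESIGNS[name]["watts"], exponent
        ),
    )


if __name__ == "__main__":
    for exponent in (1, 2, 3):
        print(f"MIPS^{exponent}/W favors: {best_design(exponent)}")
```

With these assumed numbers, the slower, lower-power design wins on MIPS per watt, while the faster design wins once the exponent reaches 2. The ranking of two designs can therefore flip with the metric alone, which is why the context in which a metric is applied matters.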
Huang et al., in the first article of this issue, focus on the branch prediction logic within a current-generation superscalar microprocessor. This piece of modern microarchitecture has undergone progressive increases in relative area and control complexity in an attempt to attain ever-higher branch prediction accuracies. Clearly, not all applications require large prediction tables and multilevel decision schemes to yield acceptably high prediction accuracy. It would therefore make sense to architect the branch prediction mechanism to be reconfigurable, so that the effective (dynamic) complexity and power consumption are a function of the workload's inherent predictability. This is the basic idea behind the research reported in this article. The authors demonstrate reductions in branch predictor energy consumption of up to 90 percent without appreciable loss in prediction accuracy and performance.
In their article on statistical simulation, Eeckhout et al. address the problem of deriving fast, early stage design tradeoff decisions. To explore the PCAD design space at high speed, it is fruitful to complement classical trace-driven, cycle-accurate simulation tools with more abstract statistical models. This article presents a comprehensive view of such statistical analysis methods.
Julien et al. present an article that describes a new approach to characterizing the power dissipation in complex digital signal processors (DSPs). They illustrate this methodology by applying it to the Texas Instruments C6201 DSP. After proposing a power model, they validate it against actual measurements, with the observed error margin being within 4 percent.
The issue queues within a current-generation, out-of-order superscalar processor are known hot spots in terms of power dissipation and also present a complex problem in terms of control. Recently, various research groups have focused on power- and complexity-aware designs for issue logic in general. In this issue, the article by Abella, Canal, and Gonzalez provides a sound survey of PCAD techniques applied to this key aspect of modern microprocessors.
Last but not least, Fryman et al. present an article that explores the energy and delay tradeoffs that occur when you move some or all of the local storage out of a given embedded device and onto a remote network server. The authors demonstrate that using the network to access remote storage in lieu of local DRAM results in significant power savings.
We hope you enjoy this theme issue on PCAD, consisting of a carefully reviewed selection of articles that touch on a broad range of topics.