Trends in CMOS technology point to an era of high-performance microprocessor design in which problems such as power consumption and cooling, device- and chip-level variability, and hard and soft errors threaten to slow down historically established performance growth rates. One approach is to add more hardware to separately control each of these threatening technological trends. Let’s examine this approach more explicitly. In such a design, you could use the following mechanisms to mitigate the adverse effects:

- **Utilization monitors.** Built-in utilization monitors could adapt the computing resources in tune with input workload requirements. Such an architecture, supported by clock- and/or power-gating circuitry, would give designers a handle on minimizing average power consumption. This is useful if the goal is to save battery power in portable computers or to reduce the electric bills that sustain a server farm. Limiting maximum power consumption and temperature (which is important in reducing the cost of the chip package and therefore the overall cost-performance ratio) would require additional controls. For example, throttling the clock or the instruction flow rate on detecting a power or temperature overrun is a commonly used mechanism.

- **GALS design.** The use of multiple voltage or clock domains permits a globally asynchronous, locally synchronous (GALS) design. The asynchronous queues that decouple the domains and carry information from one to another serve as adjustable “springs” that correct for unpredictable speed variations in each domain.

- **Reliability controls.** Reliability monitoring, budgeting, and dynamic management can help keep lifetime failure rates within design specifications. Understanding the vulnerable hot spots in a chip and intelligently managing the distribution of tasks across multiple (possibly redundant) resources to keep temperatures and long-term failure probabilities within budget are the goals of this set of hardware controls.

- **Enhanced error detection and recovery.** Using pervasive error detection and correction circuitry, where feasible, prevents the storing of corrupt data in architected register or memory state, in the presence of soft errors. To support highly reliable computing, designs can even use lock-step duplicated (redundant) computation, backed by ECC-protected checkpointing and recovery mechanisms.
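To make the last of these mechanisms concrete, here is a minimal Python sketch of lock-step duplicated computation with checkpoint-and-rollback recovery. It is a toy software model, not a hardware design: the function `run_lockstep`, the `accumulate` workload, the single-bit-flip fault model, and the checkpoint interval are all illustrative assumptions, and ECC protection of the checkpoint itself is not modeled.

```python
import copy


def run_lockstep(step, state, n_steps, checkpoint_every=4,
                 faulty_steps=(), fault_key="acc"):
    """Run `step` n_steps times on two redundant copies and compare them.

    `faulty_steps` lists step indices at which one transient bit flip is
    injected into the shadow copy (a hypothetical fault model; each fault
    fires only once). Returns (final_state, number_of_rollbacks).
    """
    pending_faults = set(faulty_steps)
    checkpoint = (0, copy.deepcopy(state))  # (step index, saved state)
    pos, rollbacks = 0, 0
    while pos < n_steps:
        # Execute the same step on two independent copies of the state.
        primary = step(copy.deepcopy(state))
        shadow = step(copy.deepcopy(state))
        if pos in pending_faults:
            pending_faults.discard(pos)
            shadow[fault_key] ^= 1 << 3  # simulated soft error: flip one bit
        if primary != shadow:
            # Mismatch detected: commit nothing, restore the last checkpoint.
            pos, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            rollbacks += 1
            continue
        state = primary  # results agree, so commit
        pos += 1
        if pos % checkpoint_every == 0:
            checkpoint = (pos, copy.deepcopy(state))
    return state, rollbacks


def accumulate(s):
    """Toy workload: add a running counter into an accumulator."""
    s["acc"] += s["k"]
    s["k"] += 1
    return s


final, rollbacks = run_lockstep(accumulate, {"acc": 0, "k": 0}, 10,
                                faulty_steps={2, 5})
print(final, rollbacks)  # {'acc': 45, 'k': 10} 2
```

In this toy run, both injected faults are caught by the comparison before they reach committed state, at the cost of re-executing the steps since the last checkpoint. A longer checkpoint interval lowers checkpointing overhead but lengthens that re-execution.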

Each of these hardware mechanisms requires additional area and verification cost, and one approach can often adversely affect another dimension of the overall problem. For example, dynamic power management increases the variability of the on-chip power supply voltage because of increased inductive noise, which in turn increases the susceptibility to timing and soft errors. Similarly, redundancy and recovery support to enhance error tolerance would increase chip power consumption.

The need for an integrated approach in designing future microprocessors is therefore paramount. First, the management of mutually conflicting optimization goals requires the use of a consolidated monitor-and-control system to detect application and environmental changes, and take appropriate action. Second, designs must employ software support, wherever possible, to simplify the area and power complexity of hardware controls. This means that the compiler and operating system must play an increasingly important role in the scheduling and control of on-chip hardware resources.
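One way to picture such a consolidated monitor-and-control system is as a single policy function that sees all the sensor readings at once and can therefore weigh the conflicting goals against each other. The Python sketch below is purely illustrative: the `Sample` and `Limits` fields, the threshold values, the action names, and the assumed power cost of redundancy are all hypothetical, not taken from any real design.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading from the (hypothetical) on-chip monitors."""
    utilization: float      # fraction of time busy, 0..1
    temperature_c: float    # hottest sensor, degrees Celsius
    power_w: float          # chip power, watts
    soft_error_rate: float  # detected errors per million cycles


@dataclass
class Limits:
    """Assumed budgets; real values would come from the design spec."""
    max_temperature_c: float = 85.0
    max_power_w: float = 100.0
    max_soft_error_rate: float = 1.0
    low_utilization: float = 0.3
    redundancy_power_w: float = 15.0  # assumed extra power of redundancy


def decide(sample, limits, redundancy_on=False):
    """Return the list of actions for this control interval."""
    over_budget = (sample.temperature_c > limits.max_temperature_c
                   or sample.power_w > limits.max_power_w)
    if over_budget:
        # Hard limits take priority over every other goal.
        return ["throttle_clock"]
    actions = []
    if sample.soft_error_rate > limits.max_soft_error_rate and not redundancy_on:
        headroom_w = limits.max_power_w - sample.power_w
        if headroom_w < limits.redundancy_power_w:
            # Redundancy costs power: trade performance for the headroom.
            actions.append("throttle_clock")
        actions.append("enable_redundancy")
    if sample.utilization < limits.low_utilization:
        actions.append("gate_idle_units")
    return actions


print(decide(Sample(0.9, 70.0, 95.0, 5.0), Limits()))
# ['throttle_clock', 'enable_redundancy']
```

Because the power cost of redundancy and the power budget are visible to the same policy, the controller can pay for error tolerance with a clock throttle instead of overrunning the budget, which separate per-problem controllers could not coordinate. In the spirit of the software support argued for above, a policy like this could plausibly live in the operating system or firmware rather than in dedicated hardware.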

I posit a view, therefore, that the future of high-performance, multicore microprocessors promises to herald a new era of system-on-chip designs with integrated hardware-software elements. Early-stage integrated modeling methodologies are a key aspect of this new design era. As such, the new generation of microprocessor designs will require significant investment in presilicon design, analysis, and verification tools, many of which will require key new inventions. Thus, the future promises to be one that poses significant new challenges to computer architects, designers, compiler experts, systems software developers, and presilicon modelers. However, notwithstanding the availability of a new generation of sophisticated toolsets, the overarching need of the era will be an integrated design team of experts in individual domains who also have the breadth to communicate and work at ease with other experts.

In this special “Future Trends” issue of *IEEE Micro*, you will find a series of interesting articles from some of the leading experts and visionaries in the field. I hope the microprocessor and microsystems R&D community will be able to draw from the ideas and opinions expressed in these articles to further accelerate and enhance the race to find integrated, efficient solutions to combat the obstacles posed by future CMOS technologies.


---

**EIC’S MESSAGE**

**Call for Papers**

**Special Issue on Reliability-Aware Microarchitectures**

**Guest editors: Prof. Sarita Adve and Dr. Pia Sanda**

Future devices will become increasingly unreliable and unpredictable. We are entering an exciting era where novel microarchitectural approaches will need to be developed to build reliable and dependable computers out of unreliable and unpredictable elements. *IEEE Micro* seeks original manuscripts for a special issue covering reliable microarchitectures. Articles concerning applied research and practical experience reports are welcome.

- **23 June 2005:** Deadline for abstract submission
- **30 June 2005:** Deadline for manuscript submission
- **22 August 2005:** Authors notified of acceptance with requested revisions
- **15 September 2005:** Final copy due

The abstract should be submitted in plain text. No other format will be accepted.

For detailed information see http://www.computer.org/micro/articles/CFP/cfpsi0404.htm.

Suggested topics include, but are not limited to:

- Error-tolerant microarchitectures: support for hard and/or soft error tolerance
- Low overhead spatial and/or temporal redundancy techniques
- Low cost error-detecting/correcting codes for storage and logic
- (Micro)architectural support for tolerating timing errors
- Modeling the effect of hard and soft errors at the microarchitecture and system level
- Characterization and modeling of thermally-induced failures and transient errors
- Low overhead checkpointing and recovery schemes
- Variation-tolerant design principles at the microarchitecture level
- Tools and methodologies for reliability-power-performance tradeoff analysis at the processor and system level
- Compiler and operating system support for reliability and availability

---