Hot Chips 18

John Kubiatowicz, University of California, Berkeley
Howard Sachs, Telairity

Pages: pp. 7-9

Over its 18–year history, the Hot Chips conference has become a leading forum for the latest computing, communications, and networking chips. The conference covers a variety of technical details about these chips, including technology, fundamental algorithms, packaging techniques, architecture, and circuit details. The emphasis is on real chips and applications that are part of actual products; although we do have our share of research talks, we explicitly avoid theoretical talks and marketing hype.

The 2006 conference reached an all-time high in number of submissions. We received 65 submissions from all corners of the globe, including Europe, Asia, and the Middle East. We believe that the quality of the program benefited greatly from this diverse pool of submissions. The final program for Hot Chips 18 consisted of 27 wonderful presentations. As usual, we featured some high-end CPUs, including those destined for PCs and for game machines. We also included embedded CPUs, since these represent some of the highest-volume applications of silicon. A resurgent interest in multiprocessing was reflected by three talks in the large-scale multiprocessing domain. We also included audio-video processors and communication controllers. To spice things up, we even included talks on some novel uses of silicon.

This issue of IEEE Micro brings you six of the best presentations from Hot Chips 18, expanded to full articles.

The rise of the multiprocessor/multicore system

It wasn't all that long ago that consumers of microprocessors expected Moore's law–driven performance gains of 60 percent per year as a matter of course. According to industry analysts such as David Patterson from the University of California, Berkeley, such gains abruptly ceased in 2002. At that point, a variety of issues conspired to make it extremely difficult to continue scaling clock rates and issue widths as before; in fact, the latest microprocessors are more than a factor of three behind where they "should be" in performance. Since the sea change of 2002, all the major chip manufacturers have jumped onto the multicore bandwagon in an attempt to continue growth in functionality, if not in single-thread performance. Unfortunately, this trend completely neglects questions of how such systems will be programmed and raises many issues, such as how to efficiently communicate between processors, how to do so at reasonable power, and whether switching to a non-von-Neumann programming model might be necessary. Reflecting this uncertainty, four of our articles fall squarely into the multiprocessor/multicore domain.

The first two articles discuss new northbridge architectures from AMD and Intel targeted at interprocessor communication. In "The AMD Opteron Northbridge Architecture, Present and Future," Pat Conway and Bill Hughes discuss the advantages of a glueless multiprocessor technology utilizing point-to-point networking. They describe the HyperTransport technology that is part of AMD's Opteron line of processors and illustrate where this technology is going in the future. Our second article illustrates an alternate approach. "The Blackford Northbridge Chipset for the Intel 5000," by Sivakumar Radhakrishnan, Sundaram Chinthamani, and Kai Cheng, talks about how Intel has expanded the front-side bus architecture in the latest northbridge chipsets to handle the higher bandwidth requirements of multicore chipsets; it also details several of the interesting architectural features of the Blackford chipset. Both of these articles reflect the belief in the industry that high-performance, coherent communication between processors will be extremely important in the future.

Next, "AsAP: A Fine-Grained Many-Core Platform for DSP Applications," by Bevan Baas et al., presents an interesting multicore architecture targeted at DSP applications. The AsAP architecture utilizes a "globally asynchronous, locally synchronous" methodology and is highly scalable, power-efficient, and apparently quite programmable for its target applications. This article describes the technology, architecture, and programming environment for AsAP.

Finally, "RAMP: Research Accelerator for Multiple Processors," by John Wawrzynek et al., makes a case for utilizing modern FPGAs to enable the codesign of large-scale multiprocessor/multicore hardware and software. Wawrzynek and his coauthors argue that a large system of FPGAs could emulate a 1,000–node multiprocessor with sufficient performance to permit the development of next-generation operating systems and applications, without the expense and time of producing a complete hardware system for each proposed hardware feature. They make the case that a system such as RAMP may be the only viable method to smoothly transition the world from uniprocessing to large-scale multiprocessing.

Embedded processing

The bulk of the world's microprocessors appear in embedded applications, not workstation or laptop environments. Reflecting the importance of this segment of the market, we have included "ARM996HS: The First Licensable, Clockless 32–Bit Processor Core," by Arjan Bink and Richard York. The ARM996HS core is particularly interesting because even though it can be embedded in a variety of technologies, it is clockless—or perhaps "self-clocked" is a better term. In the process of describing the architecture of the ARM996HS, the authors discuss the advantages of the clockless design methodology, including lower power consumption and decreased noise generation.

Low-power processing

Finally, given ever-present concerns about power consumption, we have included "Low-Power, High-Performance Architecture of the PWRficient Processor Family," by Tse-Yu Yeh, which describes the PWRficient processors developed by P.A. Semi. These processors are descendants of the extremely power-efficient StrongARM processor developed by Dan Dobberpuhl, then at Digital Equipment Corporation. The author discusses how the P.A. Semi processors achieve high performance while consuming very low power. This article contains some very interesting technical nuggets about low-power design.

Space limitations prevent us from including more presentations from Hot Chips 18. Most of these presentations, as well as others from previous years, are available online. We hope that you find these articles as exciting as we do. We also encourage you to attend Hot Chips 19 in August of this year.

About the Authors

John Kubiatowicz is an associate professor of electrical engineering and computer science at the University of California, Berkeley. His specialties include computer architecture, operating systems, and networking. His research interests include speculative approaches for computer design, such as quantum, biological, and autonomic computing, as well as issues in Internet-scale systems design, namely security, privacy, and denial-of-service resilience. He currently leads the OceanStore research effort, which is exploring a utility storage architecture that targets millions of servers and billions of users. He is also exploring architectures for quantum transport in quantum computers. Kubiatowicz has a PhD in electrical engineering and computer science from the Massachusetts Institute of Technology, where he was one of the principal designers of the Alewife multiprocessor. He also holds dual BS degrees in electrical engineering and physics, as well as MS degrees in electrical engineering and computer science, from MIT.
Howard Sachs is president and CEO of Telairity. Sachs has held a long list of strategic positions with industry-leading companies dealing in computer hardware-software design, microprocessor design, and ASIC design. He held management positions with Fujitsu Microelectronics, Intergraph Corporation, Fairchild Semiconductor Corporation, and Cray Laboratories, among others. He has dedicated many years to solving SoC problems as president of the Virtual Socket Industry Alliance and with Fujitsu Microelectronics. He holds many major industry patents that have contributed significantly to cache memory and VLIW design today. Sachs has a BSE from California State University, Los Angeles, and an MSEE from the University of Southern California.