Issue No. 02 - March/April (2003 vol. 23)
Keith Diefendorff , MIPS Technologies
John Wawrzynek , University of California, Berkeley
Despite the staggering economic losses suffered by the electronics industry in 2001 and 2002, technological innovation shows no evidence of being stifled. Indeed, if the 14th annual Hot Chips conference at Stanford is any indicator, just the opposite seems to have occurred. The summer 2002 conference brought forth a spectacular assortment of hot, new chips no less stunning than those from prior years.
Equally surprising was the breadth of products. Although some segments of our industry were particularly hard hit, companies in those segments still rolled out impressive new chips. Processors for PCs, servers, networking, embedded applications, and digital-signal processing were all well represented. The turnout bodes well for the future. Apparently, many companies recognize that those who invest during the hard times stand to reap the rewards when good times return.
In keeping with Hot Chips tradition, the conference began with a day of invited tutorial sessions followed by two days of technical presentations. Hot Chips' emphasis remains on real chips and usable technology, not theoretical research or marketing hyperbole. This year's tutorials included a particularly fascinating session on trends in IC scaling, put together by an organization that is among the most respected in this area: International Sematech, Austin, Texas (http://www.sematech.org). It is Sematech that coordinates many of the activities around the semiconductor industry's bible, the International Technology Roadmap for Semiconductors (http://public.itrs.net). The ITRS identifies the grand challenges and plots solutions to keep the industry on Moore's law, the importance of which is impossible to overstate.
The speakers for this tutorial covered a myriad of topics in the domains of MOSFET scaling, lithography, and on-chip interconnects. The topics were too numerous to do justice to in a single article, but Alfred K. Wong, a speaker at the tutorial and a research investigator in the area of resolution enhancement techniques at the University of Hong Kong, has prepared an article for Micro on near-term microlithography trends and challenges and how they relate to chip design.
Building Reliability From Commodity PCs
In 2002, Hot Chips was fortunate to have Eric Schmidt, chair and CEO of Google (http://www.google.com), present a keynote talk describing the technology behind everyone's favorite Internet search engine. Urs Hölzle, a Google Fellow, along with Luiz André Barroso and Jeffrey Dean, has followed up with an article for Micro detailing the architecture of the Google cluster. Schmidt and Hölzle explain that the Google cluster works its magic through the use of two innovative concepts that depart from traditional approaches. For one, the Google cluster achieves its extraordinarily high degree of reliability through software running on redundant, low-cost commodity PCs, rather than on server-class, high-reliability processors. For another, Google optimizes the cluster for high throughput, not low latency, by massive parallelization of queries distributed to many PCs simultaneously. Google credits the profitability of its business in large part to these two innovations.
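The two ideas described above, reliability from redundant machines and throughput from fanning a query out to many machines at once, can be illustrated with a toy sketch. This is not Google's software: the names `make_replica`, `query_shard`, and `query_cluster`, the in-memory "shards," and the `ReplicaDown` exception are all hypothetical, chosen only to show the general pattern of retrying a failed replica and merging partial results from parallel shard lookups.

```python
import concurrent.futures


class ReplicaDown(Exception):
    """Raised by a hypothetical replica that is unavailable."""


def make_replica(docs, healthy=True):
    """Return a search function over a small in-memory document shard."""
    def search(term):
        if not healthy:
            raise ReplicaDown()
        return sorted(doc_id for doc_id, text in docs.items()
                      if term in text.split())
    return search


def query_shard(replicas, term):
    """Try each replica of one shard in turn; a failed machine is skipped."""
    for search in replicas:
        try:
            return search(term)
        except ReplicaDown:
            continue
    raise RuntimeError("all replicas of this shard are down")


def query_cluster(shards, term):
    """Send the query to every shard in parallel and merge the partial hits."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda replicas: query_shard(replicas, term), shards)
        return sorted(hit for partial in partials for hit in partial)


# Illustrative data: two shards; the first replica of shard A has "failed".
shard_a = {1: "hot chips conference", 2: "google cluster"}
shard_b = {3: "commodity pc cluster", 4: "itanium server"}
shards = [
    [make_replica(shard_a, healthy=False), make_replica(shard_a)],
    [make_replica(shard_b)],
]

print(query_cluster(shards, "cluster"))  # [2, 3]
```

In this sketch the software layer, not the hardware, absorbs the failure: the query still succeeds because a second copy of shard A answers in place of the broken one.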
LOW-POWER DSP FOR VOICE, DATA, AND MULTIMEDIA
From one of the economically hardest hit segments of the industry came a hot new communications chip: Calisto. The chip, which Broadcom (http://www.broadcom.com) dubs the BCM1510, is built on technology acquired when the company purchased Silicon Valley start-up Silicon Spice. At Hot Chips, John Nickolls, director of architecture at Broadcom, described how Calisto brings an unprecedented level of parallelism to bear on the job of adapting voice, data, and multimedia streams for transport over the packet-switched networks that are beginning to augment public circuit-switched telephony networks. Using chip multiprocessor and vector-processing parallel technologies, the 0.13-micron, 130-million-transistor Calisto provides enough digital-signal-processing horsepower to implement 240 channels of G.711 packet-voice gateway services with 32 ms of echo cancellation while consuming only 5 mW of power per channel. At such low power levels, designers can assemble 10 or more Calisto chips on a single telecom blade to implement a 2,016-channel, OC-3 packet-voice and multimedia gateway.
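The channel and power figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below uses only the numbers in the text (240 channels per chip, 5 mW per channel, 2,016 channels per gateway); the function names are illustrative, and the calculation assumes every channel runs the quoted G.711 workload. By this arithmetic nine fully loaded chips would cover 2,016 channels, so the "10 or more" in the text presumably allows headroom or heavier per-channel workloads; that reading is an assumption.

```python
import math

CHANNELS_PER_CHIP = 240    # G.711 channels per Calisto, from the text
POWER_PER_CHANNEL_MW = 5   # milliwatts per channel, from the text
GATEWAY_CHANNELS = 2016    # OC-3 gateway size, from the text


def chips_needed(total_channels, channels_per_chip=CHANNELS_PER_CHIP):
    """Minimum number of chips if every chip is fully loaded."""
    return math.ceil(total_channels / channels_per_chip)


def power_watts(channels, mw_per_channel=POWER_PER_CHANNEL_MW):
    """DSP power in watts for a given number of active channels."""
    return channels * mw_per_channel / 1000


print(chips_needed(GATEWAY_CHANNELS))   # 9
print(power_watts(CHANNELS_PER_CHIP))   # 1.2  (watts per fully loaded chip)
print(power_watts(GATEWAY_CHANNELS))    # 10.08 (watts for the whole gateway)
```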
INTEL'S MCKINLEY REVEALED
Hewlett-Packard and Intel gave two Hot Chips presentations on their new McKinley processor, which the companies officially label Itanium 2. These presentations were the most detailed public disclosures to date of McKinley's microarchitecture and benchmark performance. Cameron McNairy of Intel and Don Soltis of HP have further expanded the microarchitecture description and present it in this issue of Micro.
In this article, the authors describe the enhancements their respective companies have made to boost the performance of Itanium 2 over its predecessor, Itanium, while retaining full binary compatibility. They made improvements to many features, from pipeline depth, pipeline control, and branch prediction to the cache hierarchy and memory system interface. These improvements boosted Itanium 2's frequency 25 percent higher than that of Itanium in the same 0.18-micron process technology, more than doubling SPECint2000 and SPECfp2000 performance. The cost of these improvements was about 15 million additional transistors in the core, L1, and L2 caches (an increase of about 60 percent), plus another 180 million transistors in a newly added on-chip L3 cache.
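The transistor figures above imply a rough budget for the whole chip. The sketch below works backward from the two approximate numbers in the text (15 million added transistors, a 60 percent increase) to an implied baseline for the core, L1, and L2 caches, then adds the L3 cache. The function name is illustrative, and because the inputs are quoted as "about," the results are estimates rather than official counts.

```python
ADDED_MILLIONS = 15       # extra transistors in core + L1 + L2, from the text
INCREASE_PERCENT = 60     # relative increase, from the text
L3_MILLIONS = 180         # new on-chip L3 cache, from the text


def transistor_budget(added, increase_percent, l3):
    """Return (baseline, enhanced core, total) in millions of transistors."""
    baseline = added * 100 / increase_percent   # size before the enhancements
    enhanced = baseline + added                 # core + L1 + L2 afterward
    total = enhanced + l3                       # including the L3 cache
    return baseline, enhanced, total


baseline, enhanced, total = transistor_budget(
    ADDED_MILLIONS, INCREASE_PERCENT, L3_MILLIONS)
print(baseline, enhanced, total)  # 25.0 40.0 220.0
```

On these estimates, the L3 cache accounts for roughly four-fifths of the chip's transistors.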
But Intel hasn't spent all its time on 64-bit Itanium processors. The company has also rolled out a new feature in its latest 32-bit x86 desktop processor, the Pentium 4. With this new feature, the Pentium 4 becomes the first mainstream PC microprocessor to implement simultaneous multithreading (SMT), a technology Intel has relabeled hyperthreading. At Hot Chips, Intel's hyperthreading technology architect, Deborah Marr, described how Intel worked this technology into the Pentium 4's NetBurst microarchitecture. David Koufaty and Marr have documented that effort for us in this issue of Micro. One remarkable aspect of Intel's hyperthreading implementation is the small amount of silicon it occupies. Intel claims the feature increases Pentium 4's die area by less than 5 percent, yet it yields as much as a 27 percent performance boost on a variety of multitasking and multithreaded applications. Intel is not the only company to recognize the potential of SMT; expect other companies to follow its lead and incorporate this technology into their next-generation microprocessors.
AMD Extends to 64 Bits
Meanwhile, Advanced Micro Devices has not been idle. AMD, however, is taking a different tack than Intel. Rather than opting for a new 64-bit architecture, as Intel has with Itanium, AMD is extending the x86 architecture into a new, unified 32- and 64-bit architecture it calls Hammer. At Hot Chips, three AMD presentations described the new Hammer architecture; its first implementation in a commercial microprocessor, which the company calls Opteron; and the shared-memory multiprocessor system architecture that supports it. For Micro, AMD condensed these presentations into an article by Chetana Keltcher, Kevin McGrath, Ardsher Ahmed, and Pat Conway. The Opteron chip they describe primarily targets server applications, but observers expect AMD to push the Hammer architecture into PC desktops as well. Opteron is unique not only in its 64-bit internal architecture but also in the chip's external system interface, which includes an integrated 5.3 Gbytes/s double-data-rate SDRAM memory controller and three 6.4 Gbytes/s HyperTransport links. AMD will build the processor in 0.13-micron silicon-on-insulator technology at its new fabrication facility in Dresden, Germany.
Computing has certainly come a long way since the first practical stored-program digital computer went into operation 53 years ago. That computer, EDSAC, was a monstrous device capable of executing a whopping 650 instructions per second, nearly seven decimal orders of magnitude slower than some of the hot chips at this year's conference, although certainly no less innovative. Interestingly, Maurice Wilkes, who developed EDSAC while director of the Cambridge Computer Laboratory, attended Hot Chips 14. Wilkes' list of accomplishments includes the ACM Turing Award, the ACM/IEEE Eckert-Mauchly Award, the IEEE Computer Society Pioneer Award, and an almost uncountable number of others. It was an honor to see him at Hot Chips in 2002; it provided a great opportunity for attendees to meet one of the true gentlemen and pioneers of the computing industry. The Hot Chips program committee wishes him many safe returns.
Finally, we thank the members of the Hot Chips 14 program committee, who were responsible for soliciting, selecting, and shepherding presentations for the conference and for this special issue of Micro. We cochaired the committee, which consisted of Siamak Arya (Telairity), Forest Baskett (New Enterprise Associates), Pradeep Dubey (Broadcom), Mike Flynn (Stanford), John Kubiatowicz (UC Berkeley), Hidetaka Magoshi (Sony), John Mashey (Sensei Partners), Tom Riordan (PMC-Sierra), Howard Sacks (Telairity), John Sell (AMD), John Shen (Intel), Alan Jay Smith (UC Berkeley), and Marc Tremblay (Sun Microsystems).
John Wawrzynek is a professor of electrical engineering and computer science at the University of California, Berkeley, where he teaches courses in computer systems engineering, computer architecture, and VLSI system design. His current research interests include the design and application of reconfigurable computing systems; he heads the Berkeley Reconfigurable Architectures, Software, and Systems (BRASS) group. John also leads research in the application of computers and networks to music synthesis and performance. John has an MS in electrical engineering from the University of Illinois, Urbana-Champaign, and a PhD in computer science from the California Institute of Technology. He is a member of the IEEE and the IEEE Computer Society.
Keith Diefendorff is vice president of product strategy at MIPS Technologies Inc., where he is responsible for the definition of MIPS products. His experience includes work on numerous microprocessors at leading semiconductor companies, including AMD, Texas Instruments, and Motorola, where he was chief architect for the PowerPC. Diefendorff also worked on PowerPC at Apple Computer, as a distinguished scientist and director of processor architecture. Prior to MIPS, he was vice president of research at ARC Cores and spent two years as editor-in-chief of Microprocessor Report. Diefendorff has an MSEE from the University of Akron and holds 12 US patents. He is a member of the IEEE and the IEEE Computer Society.