Issue No. 02 - March/April (2012 vol. 32)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MM.2012.33
Bevan Baas, University of California, Davis
Allen Baum, Intel
Hot Chips has become a bellwether of computer architecture trends. We can trace innovation and maturity in the industry by the rise and fall of these trends: some product categories filled multiple Hot Chips sessions in some years, then dwindled in the years that followed. Those numbers often shrank not because the category failed, but because it succeeded and became so widely accepted that it no longer needed to be presented.
We've seen this in Hot Chips over the years with reduced-instruction-set computing (RISC) versus complex-instruction-set computing (CISC), new instruction set architectures, instruction-level parallelism, graphics and video processors, very long instruction word (VLIW) and digital signal processing (DSP) architectures, network processors, and the growing trend of entire systems, including servers, on a chip.
Many cores, many threads, much power
Hot Chips 23 was clearly the year of many-core and many-thread processors, whether the threads shared core hardware, ran on the separate cores of a many-core chip, or ran on partially shared cores; there wasn't a product category that didn't have multiple cores or threads as a defining feature. The products presented stood out not by their sheer number of threads, but by the way the cores and threads communicated and synchronized, and by the architectural innovations that enabled the threads to be used efficiently.
But the parallelism enabled by multiple threads isn't enough in a new world of power constraints. The product innovations presented at Hot Chips specifically targeted the conflicting goals of keeping threads as busy as possible, while using as little power as possible.
The articles in this special issue (and subsequent issues) exemplify these trends with innovative approaches to achieving high performance under power constraints.
Throughput performance from multiple cores and threads
Oracle's "Sparc T4: A Dynamically Threaded Server-on-a-Chip," by Manish Shah et al., shows stunning performance increases over previous generations of chips by optimizing what is shared—and what isn't—in a system optimized for throughput. Its eight cores process 64 threads, and the chip is implemented in a 40-nm technology.
Power management in servers
Intel's "Power-Management Architecture of the Intel Microarchitecture Code-Named Sandy Bridge," by Efraim Rotem et al., shows the effort needed to actively and dynamically manage the power of multiple cores at extremely fine-grained time intervals.
"AMD Fusion APU: Llano" by Alexander Branover et al. combines core power management with GPU power management, dynamically sharing the power constraints between them as application demands change. The 32-nm silicon on insulator (SOI) die contains four x86 cores, each of which has a 1-Mbyte L2 cache, and a graphics processor containing 400 stream cores.
Coordinating cores and threads
"Godson-T: An Efficient Many-Core Processor Exploring Thread-Level Parallelism" by Dongrui Fan et al. focuses on dedicated hardware that accelerates synchronization and communication among the cores running a single application, so that cores spend fewer cycles idle. The 16-core, 130-nm, 300-MHz prototype chip gives a good picture of what can be expected in the upcoming 64-processor Godson-T.
Finally, in "The IBM Blue Gene/Q Compute Chip," Ruud A. Haring and the IBM Blue Gene team show how cache, memory, floating-point architectures, and a server network interface can all be combined in a single chip to yield performance that can best humans at what humans (think they) do best. The 45-nm SOI die contains 18 processors and a massive 32-Mbyte L2 cache made from embedded DRAM.
Of course, these weren't the only interesting presentations made at Hot Chips; they are only the ones that could fit into an IEEE Micro special issue. The full set of presentation slides for Hot Chips 23 can be accessed online at http://www.hotchips.org/conference-archives/hot-chips-23.
We hope you enjoy this special issue. We thank our numerous reviewers for insightful comments and the Hot Chips program committee for putting together a program with so many fine presentations.
Allen Baum is a computer architect at Intel. He has worked in the industry for 40 years on products ranging from the Apple I and II to HP's HP45 and PA-RISC, ARM and StrongArm, and Alpha, Itanium, and x86 server processors. Baum has an MS in electrical engineering from the Massachusetts Institute of Technology. He is a senior member of IEEE and a member of the ACM and the IEEE Computer Society.
Bevan Baas is an associate professor in the Department of Electrical and Computer Engineering at the University of California, Davis. His research interests include algorithms, architectures, circuits, and VLSI design for high-performance, energy-efficient, and area-efficient computation. Recent projects include the 36-processor and 167-processor AsAP chips and applications. Baas has a PhD in electrical engineering from Stanford University. He is a senior member of IEEE.