How to make your own processor architecture

Scott, Sun Microsystems

Pages: pp. 96-98

Reviewed in this issue

Processor Design: System-on-Chip Computing for ASICs and FPGAs, edited by Jari Nurmi (Springer, 2007, ISBN: 978-1-4020-5529-4, 528 pp., $99.00).



Processor Design: System-on-Chip Computing for ASICs and FPGAs is a far-ranging book on computer architectures customized for particular jobs. Because processors are now embedded in SoCs and programmable devices, a system designer is not limited to chips available from major manufacturers. The theory is that a system built of specialized processors will be more efficient, and this book covers a wide range of such architectures. This book includes three main types of chapters. The first group consists of background information; the second focuses on stages of the processor design process; and the third includes examples of architecture types and experimental architectures, mostly from universities.

Chapter 2 gives a short overview of processor architectures, covering both instruction set and cache architectures. Although this chapter covers a lot of ground in only 20 pages, I wish it had included examples of the processor architecture types discussed. For instance, I'm not sure whether the author classifies a traditional x86 architecture as RISC (reduced-instruction-set computing) or CISC (complex-instruction-set computing). Chapter 3, "Beyond the Valley of the Lost Processors," is a delight, at least for this former computer architect. It consists of 13 problems with architecture: things computer architects have often gotten wrong in the past. These include high-level-language-directed architectures, stack machines, very long instruction word (VLIW) architectures, and many more. (I regret to say that I supported some of those in my youth.) All the points are illustrated with real examples from real architectures. It's a fascinating take on the history of computer architecture. However, I think customized architectures should have been added to the list.

We now come to the chapters on design. Chapter 4 describes the design of an instruction set and architecture for a speech encoding/decoding processor in a mobile phone. The design process begins with an analysis of the instructions used in the application, and continues with the design of instructions to support these. I'm not sure that anyone not already knowledgeable in instruction set design would understand the rationale behind this design, because few alternatives are given.

Another dimension of embedded-processor design is the implementation platform. Chapter 11 describes processors implemented using FPGAs. Such an approach can significantly impact design decisions, and this chapter gives several hints on each design area, from arithmetic units to instruction sets. The example at the end of this chapter illustrates the impact of using an FPGA on processor design.

The last few chapters are about the design process. Chapter 15 is an excellent tutorial on processor clock generation and distribution. The author illustrates the distribution schemes of major processors and methods for reducing power consumption (to which clocks are a major contributor), and he also presents some forward-looking work. I like how he always describes the trade-offs involved in each scheme, which makes the chapter useful and practical. We now move from clock design to clockless design. The next chapter, well written if a bit enthusiastic, is a useful guide for those wishing to explore asynchronous design. Asynchronous processors have been successful for applications in which low power consumption is essential, because such processors don't have the power overhead of clock switching.

The next two chapters discuss different types of verification. Chapter 17 concerns early-estimation modeling. But although the first few sections talk about modeling, the chapter meanders into leakage and physical-design problems, which are quite irrelevant to high-level estimation. Moreover, there is little on the process of estimating, and the new reader won't have a clue about how to proceed. Chapter 18 discusses system-level simulation. After a short introduction, the authors briefly describe a SystemC simulator for TACO (Tools for Application-Specific Hardware/Software Codesign), one of the example processors described in an earlier chapter, and then an interesting simulator generator for another example processor discussed earlier, called Coffee. Both chapters are at a very high level, but serve as a good introduction to the subject.

Any processor requires programming tools, which are hard to buy for instruction sets that you've designed yourself. Chapter 19 describes tools that let you create other tools, such as compilers, that can work with many different machines, given an instruction set description. Test is relegated to the penultimate chapter (Chapter 20), which describes software-based self-test, a processor test that executes out of the processor cache or from an external source. Although this was a good chapter, including an excellent historical overview and results, generation of these tests is still a research activity, despite over a quarter century of work. As a survey, this chapter is very effective. As a guide to the creation of tests for processors in SoCs, it is unrealistic.

The bulk of this book describes experimental implementations. For example, Chapter 5 describes the design of the Coffee RISC embedded processor. Chapter 6 presents a high-level overview of digital signal processors (DSPs) and their history. The author is Texas Instruments' Gene Frantz, so this chapter is somewhat TI centric. But most of the chapter is an excellent, though a bit too brief, tutorial on DSPs' characteristics and architecture.

Chapter 7 describes a reconfigurable VLIW DSP core, called 3a. Most interesting here is an evaluation process that involves simulating example application programs to provide information needed to choose a proper configuration.

Chapter 8 describes another embedded processor, this one from Tensilica. Of high interest here is the Tensilica Instruction Extension (TIE) language, which designers can use to describe additional hardware they want to add to the processor. The first section of this chapter gives the advantages of customizing processors but doesn't discuss possible downsides. The Tensilica processors are reconfigured during design. Runtime-reconfigurable processors, described in Chapter 9, can be fine-grained, in which case their instruction sets can be extended, or coarse-grained, in which case their functional units can be configured for what is usually a single instruction, multiple data (SIMD) architecture.

At times a reconfigurable processor is unnecessary, and an application can be performed efficiently using a coprocessor. That is the topic of Chapter 10, with a section on requirements and a nice section on some coprocessors described in the literature. The last part of the chapter describes two coprocessors designed at the authors' university. Although this section was interesting, it seemed more like a report, and I didn't find much that would be useful to a reader interested in either using or designing a coprocessor. There was a short historical overview at the beginning of the chapter, but it began with the 1990s, which is quite late.

An advantage of embedded processors is their ability to be customized for specific tasks. Processors can be optimized for processing protocols, such as asynchronous transfer mode (ATM) and Internet Protocol Version 6 (IPv6). Most of Chapter 12 describes the authors' TACO processor, which consists of a set of functional units connected by an interconnection network; each unit processes the data placed in its local registers when a trigger register is loaded. Although this is an interesting architecture, reminiscent of the data-flow architectures popular 25 years ago, I wish more had been said about the problems of dealing with protocols.

Chapter 13 describes a Java coprocessor. Although this chapter was interesting, I think it is premature. There are no results given other than a statement that the current system doesn't provide reasonable execution speed because of communication overhead. The reader, thus, has no idea whether the proposed architecture is feasible. Chapter 14 doesn't have this problem. It describes the Raw processor, which consists of 16 homogeneous tiles arranged in a grid. This processor has been implemented, and extensive benchmarking results against the Pentium 3 are given. In fact, as I was writing this review, a commercial processor based on this technology was announced, and is now available. I wish more of the chapters had followed this example.

The theme of this book, increasing performance through diversity of processors, sounds familiar to some of us. User microprogrammable machines, a subject of much interest in the late 1970s, were supposed to do exactly that. Many of the chapters in this book would map nicely to one on microprogramming. Special-purpose architectures addressing a wide market need are clearly practical, as the example of DSPs shows, but only the most computationally intensive applications require new architectures. Defining, implementing, and verifying such architectures takes time and effort from developers. How often would the longer time to market be justified by performance improvements? Our experience with microprogramming would seem to indicate that the answer is, not very often. This isn't to say that there aren't some applications in which the additional effort makes sense, but this book doesn't describe the pitfalls of using customizable processors. Perhaps that's the reason the vast majority of examples in the book are academic. Finally, this book could have been edited more carefully to improve the grammar and word choice of some of the chapters.

Nevertheless, this book is still a good guide to customizable processors, their architectures, and the design processes needed to support them. Anyone looking for a good overview of this area is sure to get a lot out of this book. I also find it encouraging that there is still a lot of work being done on the creation of innovative architectures.
