Issue No. 04 - July/August (2011 vol. 28)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MDT.2011.87
George Constantinides, Imperial College London
Nicola Nicolici, McMaster University
FPGAs have been considered as possible implementation platforms for computation since the early 1990s. At that time, however, FPGAs were small devices, capable only of a low computational throughput, and their internal structure did not allow for efficient synthesis of arithmetic components. Over the technology generations, the regularity of FPGA designs has allowed them to stay at the leading edge of each new technology node. In response to their adoption in computational applications, several architectural innovations, such as the integration of "hard" multipliers and embedded memory blocks, have further improved device density.
The digital signal processing (DSP) community has responded well to these platforms: the large number of integer multiply-accumulate operations possible on a high-end FPGA makes these devices an ideal target for high data-rate DSP. Several stable design flows now exist from DSP-oriented input specifications, as exemplified by the Xilinx System Generator (Simulink) and National Instruments LabVIEW flows.
The adoption of high-end FPGAs for DSP applications has not gone unnoticed in the high-performance computing sector, and several established companies and start-ups have tried to enter the market of FPGA-based acceleration for scientific computing, including Cray, SRC, and SGI, among others. The potential of the technology for scientific computation is well understood; however, there is currently a mismatch in expectation between the electronic design tool flows supported by FPGA vendors and the compilation flows expected by the scientific computing community.
This mismatch has hindered adoption of FPGAs in scientific computing applications. An example of the distinctions between the signal processing and scientific computing domains is numerical analysis, which in most cases is a key problem that must be addressed before a scientific application can be ported to an FPGA. We currently stand on a threshold, where various key research contributions and initiatives could radically alter the landscape of this emerging field, thus propelling FPGA-based computation from the DSP and embedded space into the scientific computing domain.
This special issue of IEEE Design & Test surveys the landscape of research in designing FPGA-based accelerators at a critically important time. It presents a series of peer-reviewed contributions, with perspectives that we believe are likely to have long-term impact on the future of parallel computation in this domain. It is our hope, ultimately, that bringing together the research in this special issue will help readers to bridge the historically distinct FPGA, high-performance computing, and numerical analysis communities.
The first article in this special issue is a survey of the current status of research and practice in "Numerical Data Representations for FPGA-Based Scientific Computing." Because of the flexibility of FPGAs in implementing arbitrary data paths, this subject is crucial to achieving high-performance designs, and a number of advances have been made in the field in recent years.
The second article, "Designing Custom Arithmetic Data Paths with FloPoCo," delves more deeply into the design and design automation of data-path cores, presenting a complete design flow that allows the production of numerical cores for FPGA implementation.
The third article, "High-Level Languages and Floating-Point Arithmetic for FPGA-Based CFD Simulations," investigates a case study in which modern FPGA design tools are used to accelerate a classical scientific computing problem.
The fourth article, "Data Reorganization and Prefetching of Pointer-Based Data Structures," presents some recent efforts to overcome the inability of existing design flows from high-level languages to cope with pointers and their manipulation.
The fifth article, "FPGA-Based Particle Recognition in the HADES Experiment," highlights a case study on the use of FPGAs in real-time scientific computing environments for particle physics.
The sixth article, "Computational Mass Spectrometry in a Reconfigurable Coherent Coprocessing Architecture," discusses a bioinformatics application, one of the current growth areas for customized acceleration architectures.
The final article, "An End-to-End Tool Flow for FPGA-Accelerated Scientific Computing," showcases the work of the authors with the National Science Foundation Center for High-Performance Reconfigurable Computing (CHREC) to target the productivity of FPGA-based scientific computing.
FPGA technology is established as an implementation fabric for hardware acceleration; however, alternatives have emerged in recent years. A notable example is general-purpose graphics processing unit (GPGPU) computing, which has seen significant adoption by the scientific computing community. Although both high-density, massively parallel FPGAs and GPUs have been facilitated by the same advancements in semiconductor technology (enabled by Moore's law), there has recently been significant research, development, and investment in improving the mapping of scientific applications onto GPUs, ranging from specification languages to design environments and compiler technology.
Combined with the ability to configure the communication architecture to the application at hand, FPGAs can potentially provide a larger number of processing engines on a silicon die than GPUs. Despite such promise, however, the lack of design methods and tools for FPGA-based accelerators has led to a faster adoption of GPGPU technology, often for reasons of productivity rather than performance.
We believe that the articles in this special issue will help to highlight both the current status of FPGA-based acceleration and some of the emerging solutions to overcome current productivity limitations. Moreover, it is our strong belief that the techniques discussed in this issue, which at an abstract level describe how to map scientific computing problems onto spatial parallelism, will be of critical importance both for FPGAs and for future many-core architectures, as the number of cores scales with time.
George A. Constantinides is a reader in digital systems and head of the Circuits and Systems Research Group at Imperial College London. His research interests include field-programmable gate arrays, numerical computation, and design automation. He has a PhD in electrical and electronic engineering from Imperial College London. He is a Senior Member of the IEEE and a Fellow of the British Computer Society.
Nicola Nicolici is an associate professor in the Department of Electrical and Computer Engineering at McMaster University. His research interests include computer-aided design and test. He has a PhD in electronics and computer science from the University of Southampton, UK, and is a member of the IEEE and the ACM.