Architectures and Execution Models for Hardware/Software Compilation and Their System-Level Realization
October 2010 (vol. 59 no. 10)
pp. 1363-1377
Holger Lange, Technische Universität Darmstadt, Darmstadt
Andreas Koch, Technische Universität Darmstadt, Darmstadt
We propose an execution model that orchestrates the fine-grained interaction of a conventional general-purpose processor (GPP) and a high-speed reconfigurable hardware accelerator (HA), the latter having full master-mode access to memory. We then describe how the resulting requirements can be realized efficiently in a custom computer through hardware architecture and system software measures. One of these is a low-latency HA-to-GPP signaling scheme with latency up to 23 times shorter than that of conventional approaches. Another is a high-bandwidth shared memory interface that does not interfere with time-critical operating system functions executing on the GPP, yet still makes 89 percent of the physical memory bandwidth available to the HA. Finally, we show two schemes with different flexibility/performance trade-offs for running the HA in protected virtual memory scenarios. All of the techniques and their interactions are evaluated at the system level using the full-scale virtual memory variant of the Linux operating system on actual hardware.

Index Terms:
Reconfigurable computing, FPGA, hardware accelerator, memory system, operating system integration, virtual memory.
Holger Lange, Andreas Koch, "Architectures and Execution Models for Hardware/Software Compilation and Their System-Level Realization," IEEE Transactions on Computers, vol. 59, no. 10, pp. 1363-1377, Oct. 2010, doi:10.1109/TC.2009.180