2014 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA)
Aug. 26, 2014 to Aug. 28, 2014
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ISPA.2014.29
This paper presents a binary acceleration approach that extends a General Purpose Processor (GPP) with a Reconfigurable Processing Unit (RPU), with both sharing an external data memory. In this approach, repeating sequences of GPP instructions are migrated to the RPU. The RPU resources are selected and organized offline using execution trace information. The RPU core is composed of Functional Units (FUs), each corresponding to a single CPU instruction. The FUs are arranged in stages of mutually independent operations. The RPU can enable several stages in tandem, depending on the data dependencies. External data memory accesses are handled by a configurable dual-port cache. A prototype implementation of the architecture on a Spartan-6 FPGA was validated with 12 benchmarks and achieved an overall geometric mean speedup of 1.91x.
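The arrangement of FUs into stages of mutually independent operations can be illustrated with a minimal sketch. It is not the authors' tool flow: the `Instr` record, the register names, and the `assign_stages` function are hypothetical, and the sketch assumes that only true (read-after-write) dependencies matter, as if each result were carried on its own wire. Each traced instruction is placed in the earliest stage that follows the producers of all its operands, so instructions sharing a stage never depend on one another.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instr(NamedTuple):
    """One traced instruction (hypothetical format, not the paper's)."""
    op: str
    dst: str
    srcs: Tuple[str, ...]


def assign_stages(trace: List[Instr]) -> List[List[Instr]]:
    """Place each instruction in the earliest stage after its producers.

    Assumption: only read-after-write dependencies are tracked; values
    read but never written inside the trace are treated as live-in inputs.
    """
    last_writer_stage: Dict[str, int] = {}  # register -> stage of latest writer
    stages: List[List[Instr]] = []
    for ins in trace:
        stage = 0
        for src in ins.srcs:
            if src in last_writer_stage:
                # Must come strictly after the stage that produced this operand.
                stage = max(stage, last_writer_stage[src] + 1)
        while len(stages) <= stage:
            stages.append([])
        stages[stage].append(ins)
        last_writer_stage[ins.dst] = stage
    return stages


# Illustrative five-instruction trace; r1, r2 and r5 are live-in values.
example_trace = [
    Instr("add", "r3", ("r1", "r2")),
    Instr("sub", "r4", ("r1", "r5")),
    Instr("mul", "r6", ("r3", "r4")),
    Instr("xor", "r7", ("r6", "r1")),
    Instr("add", "r8", ("r2", "r5")),
]

stages = assign_stages(example_trace)
for index, stage in enumerate(stages):
    print(index, [f"{ins.op} {ins.dst}" for ins in stage])
```

In this toy trace the two independent additions and the subtraction share the first stage, while the multiplication and the exclusive-or each wait for their operands. A real RPU generator would also have to account for memory accesses and FU availability, which this sketch ignores.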
Acceleration, Clocks, Arrays, Synchronization, Ports (Computers), Benchmark testing, Registers
N. M. Paulino, J. C. Ferreira and J. M. Cardoso, "Trace-Based Reconfigurable Acceleration with Data Cache and External Memory Support," 2014 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), Milan, Italy, 2014, pp. 158-165.