2014 Second International Symposium on Computing and Networking (CANDAR) (2014)
Shizuoka, Japan
Dec. 10, 2014 to Dec. 12, 2014
ISBN: 978-1-4799-4152-0
pp: 322-328
ABSTRACT
The REPLICA architecture is a massively hardware-threaded very long instruction word (VLIW) architecture. REPLICA has two execution modes, PRAM and NUMA, which are supported by the underlying on-chip memory and can be switched between at runtime. PRAM mode is considered the standard execution mode and mainly targets applications with very high thread-level parallelism (TLP). In contrast, NUMA mode is intended for sequential legacy applications and applications with a low amount of TLP, although in some cases very regular applications suit NUMA mode as well. However, the cost of switching between the modes is not negligible. We combine machine learning (symbolic regression) with a shortest-path formulation to optimize the software composition of parameterized stencil-like algorithms, which have regular control flow and memory access patterns. Using the tool Eureqa Pro, which is based on symbolic regression, together with training data, we create predictors of execution time for parameterized software components. We use the predictors to formulate an optimization problem based on shortest paths that maps component executions to the available modes (PRAM or NUMA). When composing three randomly selected components from an evaluation set, we get speedups of up to 2.9 times and an average speedup of 1.4, both including overhead. The overhead costs, which include running the predictors, solving the shortest-path problem, and switching to the selected runtime modes, are just a few percent.
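The mode-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual graph construction, so the following assumes a simple layered graph: one node per (component, mode) pair, edge weights equal to the predicted execution time of the next component plus a fixed switching cost whenever the mode changes. The function name `select_modes`, the single constant `switch_cost`, and all numbers are hypothetical and serve only as illustration; in the paper the execution times come from predictors built with symbolic regression.

```python
from typing import Dict, List, Tuple

MODES = ("PRAM", "NUMA")


def select_modes(
    predicted: List[Dict[str, float]],
    switch_cost: float,
    start_mode: str = "PRAM",
) -> Tuple[float, List[str]]:
    """Pick one mode per component so that total predicted time is minimal.

    predicted[i][m] is the (assumed) predicted time of component i in mode m.
    switch_cost is an assumed constant penalty for changing mode.
    Returns (total cost, list of chosen modes).
    """
    # best[m] = (cost, path) of the cheapest way to finish the components
    # processed so far with the machine left in mode m.
    best: Dict[str, Tuple[float, List[str]]] = {start_mode: (0.0, [])}

    for times in predicted:
        nxt: Dict[str, Tuple[float, List[str]]] = {}
        for mode in MODES:
            candidates = []
            for prev, (cost, path) in best.items():
                penalty = 0.0 if prev == mode else switch_cost
                candidates.append((cost + penalty + times[mode], path + [mode]))
            # Shortest path into node (component, mode) of the layered graph.
            nxt[mode] = min(candidates, key=lambda c: c[0])
        best = nxt

    return min(best.values(), key=lambda c: c[0])


if __name__ == "__main__":
    # Made-up predictions for three components (arbitrary time units).
    example = [
        {"PRAM": 10.0, "NUMA": 4.0},
        {"PRAM": 3.0, "NUMA": 9.0},
        {"PRAM": 5.0, "NUMA": 5.0},
    ]
    print(select_modes(example, switch_cost=2.0))
```

Because the graph is layered and acyclic, a single forward pass over the components is enough to find the shortest path, which is consistent with the abstract's remark that solving the optimization problem adds little overhead. A larger switching cost in this sketch makes staying in one mode more attractive.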
INDEX TERMS
Phase change random access memory, Switches, Optimization, Computer architecture, Runtime, Instruction sets
CITATION

E. Hansson and C. Kessler, "Global Optimization of Execution Mode Selection for the Reconfigurable PRAM-NUMA Multicore Architecture REPLICA," 2014 Second International Symposium on Computing and Networking (CANDAR), Shizuoka, Japan, 2014, pp. 322-328.
doi:10.1109/CANDAR.2014.72