A Speed-up Technique for an Auto-Memoization Processor by Reusing Partial Results of Instruction Regions
2012 Third International Conference on Networking and Computing (2012)
Okinawa, Japan
Dec. 5, 2012 to Dec. 7, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICNC.2012.17
We have proposed an auto-memoization processor based on computation reuse. The auto-memoization processor dynamically detects functions and loop iterations as reusable blocks and memoizes them automatically. In our previous model, computation reuse cannot be applied if the current input sequence differs from all past input sequences by even a single value, since the processing results would differ. This paper proposes a new partial reuse model, which can apply computation reuse to the early part of a reusable block as long as the early part of the current input sequence matches one of the past sequences. In addition, in order to gain sufficient benefit from the partial reuse model, we also propose a technique that reduces the search overhead for the memoization table by partitioning it. Experimental results with the SPEC CPU95 benchmark suite show that the new method improves the maximum speedup from 40.6% to 55.1%, and the average speedup from 10.6% to 22.8%.
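The partial reuse idea described above can be illustrated with a minimal software sketch. This is a hypothetical model for exposition only, not the authors' hardware mechanism: a memo table stores past input sequences together with their per-step results, and a lookup finds the past entry sharing the longest common prefix with the current inputs, reuses the results for that prefix, and recomputes only the remaining tail. The per-step computation `acc = acc * 2 + x` is an arbitrary placeholder for the work done inside a reusable block.

```python
def run_block(inputs):
    """Stand-in for executing one reusable instruction region step by step.

    Returns the list of intermediate results, one per input value.
    The per-step computation is an arbitrary placeholder.
    """
    results = []
    acc = 0
    for x in inputs:
        acc = acc * 2 + x
        results.append(acc)
    return results


class PartialMemoTable:
    """Illustrative memo table with prefix matching (partial reuse model)."""

    def __init__(self):
        self.entries = []  # list of (input_sequence, per_step_results)

    def execute(self, inputs):
        # Find the past entry sharing the longest common prefix with `inputs`.
        best_len, best_results = 0, None
        for seq, res in self.entries:
            n = 0
            while n < min(len(seq), len(inputs)) and seq[n] == inputs[n]:
                n += 1
            if n > best_len:
                best_len, best_results = n, res
        # Reuse the results for the matched prefix; recompute only the tail.
        if best_len:
            results = list(best_results[:best_len])
            acc = results[-1]
        else:
            results, acc = [], 0
        for x in inputs[best_len:]:
            acc = acc * 2 + x
            results.append(acc)
        # Record this run so later executions can reuse it.
        self.entries.append((list(inputs), results))
        return results
```

Under this sketch, executing `[1, 2, 5]` after `[1, 2, 3]` recomputes only the final step, since the two-element prefix matches a past entry; in the previous all-or-nothing model the single differing value would have forced a full re-execution.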
auto-memoization processor, microprocessor architecture, computation reuse, memoization
K. Kamimura, R. Oda, T. Yamada, T. Tsumura, H. Matsuo and Y. Nakashima, "A Speed-up Technique for an Auto-Memoization Processor by Reusing Partial Results of Instruction Regions," 2012 Third International Conference on Networking and Computing (ICNC), Okinawa, Japan, 2012, pp. 49-57.