2006 International Conference on Parallel Architectures and Compilation Techniques (PACT) (2006)
Seattle, WA, USA
Sept. 16, 2006 to Sept. 20, 2006
ISBN: 978-1-5090-3022-4
pp: 192-201
Chengmo Yang , Computer Science and Engineering Department, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093
Alex Orailoglu , Computer Science and Engineering Department, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093
ABSTRACT
As power dissipation inexorably becomes the major bottleneck in system integration and reliability, the front-end instruction delivery path in a traditional out-of-order superscalar processor needs to deliver high application performance in an energy-effective manner. This challenge can be addressed by efficiently reusing the work of fetch and decode performed during preceding loop iterations and resident mostly within the processor itself. As a large percentage of the instructions currently under fetch have previously dispatched copies resident in the Reorder Buffer (ROB), in this paper we develop a mechanism to utilize the ROB as a storage location for previously decoded instructions. Instructions can thus be fed directly from the ROB into the rename and issue stages, enabling the fetch and decode logic to be gated off for long periods of time and delivering significant power savings. The power and performance criticality of the ROB requires an efficient reuse identification mechanism; we outline such a cost-efficient Reuse Identification Unit (RIU), which enables effective identification of the matches between the ROB entries and the instructions currently under fetch. Simulation results on both multimedia and SPEC 2000 benchmarks confirm that incorporating the proposed technique in traditional out-of-order superscalar processors results in not only a slight improvement in performance, but also significant savings in overall system power dissipation, achieved within a limited hardware budget.
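The reuse idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's RIU design, which the abstract does not detail; it assumes a simplified ROB that merely retains the addresses of the most recently dispatched instructions, and the names `simulate_reuse`, `pc_trace`, and `rob_size` are hypothetical. It counts how many instructions in a trace already have a dispatched copy in that buffer and so could, in principle, bypass fetch and decode.

```python
from collections import deque


def simulate_reuse(pc_trace, rob_size):
    """Toy model: count instructions that could be supplied from a ROB-like buffer.

    pc_trace: instruction addresses in dispatch order.
    rob_size: number of most recently dispatched instructions retained
              (assumption: a plain FIFO window, ignoring commit and squash).
    Returns a (reused, fetched) tuple of counts.
    """
    rob = deque(maxlen=rob_size)  # oldest entries drop off when the buffer is full
    reused = fetched = 0
    for pc in pc_trace:
        if pc in rob:
            reused += 1   # a decoded copy is resident: fetch/decode could be gated
        else:
            fetched += 1  # no copy resident: take the normal fetch and decode path
        rob.append(pc)    # either way, the instruction is dispatched into the buffer
    return reused, fetched


# A 4-instruction loop body executed 5 times (hypothetical addresses).
loop = [0x100, 0x104, 0x108, 0x10C] * 5
print(simulate_reuse(loop, rob_size=8))  # (16, 4)
print(simulate_reuse(loop, rob_size=3))  # (0, 20)
```

In this model, a buffer at least as large as the loop body lets every iteration after the first be served from resident copies, while a buffer smaller than the loop body yields no reuse at all, since each instruction is evicted before it recurs.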
INDEX TERMS
instruction delivery, low-power design, adaptive processor
CITATION
Chengmo Yang, Alex Orailoglu, "Power-efficient instruction delivery through trace reuse", 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 192-201, 2006.