Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2004)
Antibes Juan-les-Pins, France
Sept. 29, 2004 to Oct. 3, 2004
ISSN: 1089-795X
ISBN: 0-7695-2229-7
pp: 153-164
YongKang Zhu, University of Rochester, New York
Michael L. Scott, University of Rochester, New York
Chen Ding, University of Rochester, New York
David H. Albonesi, University of Rochester, New York
Grigorios Magklis, Intel Barcelona Research Center, Barcelona, Spain
Loop fusion combines corresponding iterations of different loops. It is traditionally used to decrease program run time, by reducing loop overhead and increasing data locality. In this paper, however, we consider its effect on energy.

Fusion tends to increase the uniformity, or balance, of demand for system resources. On a conventional superscalar processor, increased balance tends to increase IPC, and thus dynamic power, so that fusion-induced improvements in program energy are slightly smaller than improvements in program run time. If IPC is held constant, however, by reducing frequency and voltage (particularly on a processor with multiple clock domains), then energy improvements may significantly exceed run time improvements.

We demonstrate the benefits of increased program balance under a theoretical model of processor energy consumption. We then evaluate the benefits of fusion empirically on synthetic and real-world benchmarks, using our existing loop-fusing compiler and a heavily modified version of the SimpleScalar/Wattch simulator. For the real-world benchmarks, we demonstrate energy savings ranging from 7% to 40%, with run times ranging from a 1% slowdown to a 17% speedup. In addition to validating our theoretical model, the simulation results allow us to "tease apart" the factors that contribute to fusion-induced time and energy savings.
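As an illustration of the transformation the abstract describes, the following is a minimal sketch of loop fusion on two loops that share an index range. It is not taken from the paper or its compiler; the function names `unfused` and `fused` and the array computations are hypothetical, chosen only to show how corresponding iterations are combined.

```python
def unfused(a, b):
    """Two separate loops over the same index range (hypothetical example)."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):      # first loop: produces c from a and b
        c[i] = a[i] + b[i]
    for i in range(n):      # second loop: rereads c and a to produce d
        d[i] = c[i] * a[i]
    return c, d


def fused(a, b):
    """The same computation after fusion: one loop, one pass over the data."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):
        c[i] = a[i] + b[i]  # iteration i of the first loop
        d[i] = c[i] * a[i]  # iteration i of the second loop, c[i] just computed
    return c, d
```

Both functions return the same result. The fused version has one loop's worth of control overhead rather than two, and uses `c[i]` immediately after computing it, which is the loop-overhead and data-locality benefit the abstract mentions. Fusion is only legal when, as here, no iteration of the second loop depends on a later iteration of the first.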
YongKang Zhu, Michael L. Scott, Chen Ding, David H. Albonesi, Grigorios Magklis, "The Energy Impact of Aggressive Loop Fusion", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 153-164, 2004, doi:10.1109/PACT.2004.10011