Issue No.03 - March (2010 vol.21)
pp: 342-353
Sangyeun Cho , University of Pittsburgh, Pittsburgh
Rami G. Melhem , University of Pittsburgh, Pittsburgh
ABSTRACT
This paper derives simple yet fundamental formulas to describe the interplay between parallelism of an application, program performance, and energy consumption. Given the ratio of serial to parallel portions in an application and the number of processors, we derive optimal frequencies allocated to the serial and parallel regions in an application to either minimize the total energy consumption or minimize the energy-delay product. The impact of static power is revealed by considering the ratio between static and dynamic power and quantifying the advantages of adding to the architecture the capability to turn off individual processors and save static energy. We further determine the conditions under which one can obtain both energy and speed improvement, as well as the amount of improvement. While the formulas we obtain use simplifying assumptions, they provide valuable theoretical insights into energy-aware processor resource management. Our results form a basis for several interesting research directions in the area of energy-aware multicore processor architectures.
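The abstract does not state the formulas themselves, so the following is only an illustrative sketch of the kind of trade-off it describes, not the paper's own derivation. It assumes a textbook model that is not taken from this page: frequency is normalized so that 1.0 is the full-speed clock, dynamic power scales as the cube of frequency, static power is ignored, idle processors draw nothing during the serial region, and the parallel run must finish no later than a sequential run at full speed. Under those assumptions a Lagrange-multiplier argument gives closed-form frequencies for the serial and parallel regions. All function and parameter names are hypothetical.

```python
import math


def min_energy_frequencies(serial_fraction, num_processors):
    """Return (f_serial, f_parallel) minimizing dynamic energy.

    Assumed model (not from the source): power ~ f**3, no static power,
    deadline equal to sequential run time at normalized frequency 1.0.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")
    s = serial_fraction
    p = 1.0 - s
    n = float(num_processors)
    # Stationarity of the Lagrangian gives f_parallel = f_serial / n**(1/3);
    # substituting into the deadline constraint yields f_serial.
    f_serial = s + p / n ** (2.0 / 3.0)
    f_parallel = f_serial / n ** (1.0 / 3.0)
    return f_serial, f_parallel


def run_time(serial_fraction, num_processors, f_serial, f_parallel):
    """Normalized run time: serial work on one core, parallel work on all."""
    s = serial_fraction
    return s / f_serial + (1.0 - s) / (num_processors * f_parallel)


def dynamic_energy(serial_fraction, f_serial, f_parallel):
    """Normalized dynamic energy (power * time summed over active cores)."""
    s = serial_fraction
    # Serial: one core at f**3 for s/f. Parallel: n cores at f**3 for p/(n*f).
    return s * f_serial ** 2 + (1.0 - s) * f_parallel ** 2


if __name__ == "__main__":
    s, n = 0.5, 8
    fs, fp = min_energy_frequencies(s, n)
    print("f_serial =", fs, "f_parallel =", fp)
    print("time     =", run_time(s, n, fs, fp))
    print("energy   =", dynamic_energy(s, fs, fp))
```

In this simplified model the minimum energy equals the cube of the serial-region frequency, so more processors or a smaller serial fraction both lower energy without extending run time. Adding a static-power term, as the abstract discusses, would change these optima.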
INDEX TERMS
Multicore processor, Amdahl's law, dynamic voltage and frequency scaling (DVFS), energy-delay product (EDP).
CITATION
Sangyeun Cho, Rami G. Melhem, "On the Interplay of Parallelization, Program Performance, and Energy Consumption", IEEE Transactions on Parallel & Distributed Systems, vol.21, no. 3, pp. 342-353, March 2010, doi:10.1109/TPDS.2009.41