Issue No.12 - December (2009 vol.58)
pp: 1668-1681
Davy Genbrugge, Ghent University, Gent
Lieven Eeckhout, Ghent University, Gent
ABSTRACT
Developing fast chip multiprocessor simulation techniques is a challenging problem. Solving it is especially valuable for design space exploration during the early stages of the design cycle, when a large number of design points must be evaluated quickly. This paper studies statistical simulation as a fast simulation technique for chip multiprocessor (CMP) design space exploration. The idea of statistical simulation is to measure a number of program execution characteristics from a real program execution through profiling, to generate a synthetic trace from those characteristics, and to simulate that synthetic trace as a proxy for the original program. The key benefit is that the synthetic trace is much shorter than a real program trace, which leads to substantial simulation speedups. This paper enhances state-of-the-art statistical simulation 1) by modeling the memory address stream behavior in a more microarchitecture-independent way and 2) by modeling a program's time-varying execution behavior. These two enhancements enable accurate modeling of conflicts in shared resources, as observed in the memory hierarchy of contemporary chip multiprocessors when multiple programs are coexecuting on the CMP. Our experimental evaluation using the SPEC CPU benchmarks demonstrates an average prediction error of 7.3 percent across a range of CMP configurations while varying the number of cores and memory hierarchy configurations.
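The profile-then-synthesize idea described in the abstract can be illustrated with a toy sketch. This is not the authors' model: it assumes, purely for illustration, that the only characteristic profiled is how often one instruction class follows another, and it skips the final step of feeding the synthetic trace to a simulator. The names `profile`, `generate_synthetic_trace`, and the instruction classes are hypothetical.

```python
import random
from collections import Counter, defaultdict


def profile(trace):
    """Count how often each instruction class follows each other class."""
    transitions = defaultdict(Counter)
    for prev, cur in zip(trace, trace[1:]):
        transitions[prev][cur] += 1
    return transitions


def generate_synthetic_trace(transitions, start, length, seed=0):
    """Sample a short trace whose class-to-class statistics follow the profile."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    synthetic = [start]
    while len(synthetic) < length:
        successors = transitions.get(synthetic[-1])
        if not successors:
            # No observed successor (end of the profiled trace): restart.
            synthetic.append(start)
            continue
        classes = list(successors)
        weights = [successors[c] for c in classes]
        synthetic.append(rng.choices(classes, weights=weights, k=1)[0])
    return synthetic


# Hypothetical "real" trace: 10,000 instructions from a repeating loop body.
real_trace = ["load", "add", "add", "store", "branch"] * 2000

stats = profile(real_trace)
synthetic_trace = generate_synthetic_trace(stats, real_trace[0], length=100)
```

Here the synthetic trace is 100 entries long against 10,000 in the profiled trace, which is where the simulation speedup in this style of approach comes from. The paper's contribution concerns much richer characteristics (memory address stream behavior and time-varying execution behavior), which this sketch does not attempt to capture.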
INDEX TERMS
Performance of systems (modeling techniques, simulation).
CITATION
Davy Genbrugge, Lieven Eeckhout, "Chip Multiprocessor Design Space Exploration through Statistical Simulation", IEEE Transactions on Computers, vol. 58, no. 12, pp. 1668-1681, December 2009, doi:10.1109/TC.2009.77