
Issue No.03 - March (2008 vol.57)

pp: 389-403

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2007.70817

ABSTRACT

Advances in semiconductor technology enable a larger processor design space, leading to increasingly complex systems. Designers must evaluate many architecture design points to achieve the optimal design. Currently, most architecture exploration is performed using cycle-accurate simulators. Although accurate, these tools are slow, which limits how comprehensively the design space can be searched. The vast design space of today's complex processors and time-to-market economic pressures motivate the need for faster architectural evaluation methods. This paper presents a superscalar processor performance model that enables rapid exploration of the architecture design space for superscalar processors. It supplements current design tools by quickly identifying promising areas for more thorough and time-consuming exploration with traditional tools. The model estimates the instruction throughput of a superscalar processor based on early architectural design parameters and application properties. It has been validated with the SimpleScalar out-of-order simulator. The model, which executes 40,000 times faster, produces instruction throughput estimates that are within 5.5% of the corresponding SimpleScalar values.

INDEX TERMS

Modeling of computer architecture, Pipeline processors, Modeling techniques

CITATION

Tarek M. Taha, Scott Wills, "An Instruction Throughput Model of Superscalar Processors",

*IEEE Transactions on Computers*, vol. 57, no. 3, pp. 389-403, March 2008, doi:10.1109/TC.2007.70817