Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (2005)
St. Louis, Missouri
Sept. 17, 2005 to Sept. 21, 2005
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2005.42
John D. Davis, Stanford University
James Laudon, Sun Microsystems, Inc.
Kunle Olukotun, Stanford University
<p>In this paper, we compare the performance of area-equivalent small-, medium-, and large-scale multithreaded chip multiprocessors (CMTs) using throughput-oriented applications. We use area models based on SPARC processors incorporating these architectural features. We examine CMTs with in-order scalar processor cores, 2-way or 4-way in-order superscalar cores, private primary instruction and data caches, and a shared secondary cache. We explore a large design space, ranging from processor-intensive to cache-intensive CMTs. We use SPEC JBB2000, TPC-C, TPC-W, and XML Test to demonstrate that the scalar simple-core CMTs do a better job of addressing the problems of low instruction-level parallelism and high cache miss rates that dominate web-service middleware and online transaction processing applications. For the best overall CMT performance, smaller cores with lower performance, so-called "mediocre" cores, maximize the total number of CMT cores and outperform CMTs built from larger, higher-performance cores.</p>
J. D. Davis, J. Laudon, and K. Olukotun, "Maximizing CMP Throughput with Mediocre Cores," 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), St. Louis, MO, USA, 2005, pp. 51-62.