Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2007)
Brasov, Romania
Sept. 15, 2007 to Sept. 19, 2007
ISSN: 1089-795X
ISBN: 0-7695-2944-5
pp: 13-24
Miquel Pericas, Universitat Politecnica de Catalunya, Spain; Barcelona Supercomputing Center, Spain
Adrian Cristal, Barcelona Supercomputing Center, Spain
Francisco J. Cazorla, Barcelona Supercomputing Center, Spain
Ruben Gonzalez, Universitat Politecnica de Catalunya, Spain
Daniel A. Jimenez, The University of Texas at San Antonio, USA
Mateo Valero, Universitat Politecnica de Catalunya, Spain; Barcelona Supercomputing Center, Spain
ABSTRACT
Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a challenge as application mixes in this environment are nonuniform. Thus, multi-core processors should be flexible enough to provide high throughput for uniform parallel applications as well as high performance for more general workloads. Heterogeneous architectures are a first step in this direction, but partitioning remains static and only roughly fits application requirements.

This paper proposes the Flexible Heterogeneous MultiCore processor (FMC), the first dynamic heterogeneous multi-core architecture capable of reconfiguring itself to fit application requirements without programmer intervention. The basic building block of this microarchitecture is a scalable, variable-size window microarchitecture that exploits the concept of Execution Locality to provide large-window capabilities. This makes it possible to overcome the memory wall for applications with high memory-level parallelism (MLP). The microarchitecture contains a set of small and fast cache processors that execute high-locality code and a network of small in-order memory engines that together exploit low-locality code. Single-threaded applications can use the entire network of cores while multi-threaded applications can efficiently share the resources. The sizing of critical structures remains small enough to handle current power envelopes.

In single-threaded mode this processor is able to outperform previous state-of-the-art high-performance processor research by 12% on SpecFP. We show how in a quad-threaded/quad-core environment the processor outperforms a statically allocated configuration in both throughput and harmonic mean, two commonly used metrics to evaluate SMT performance, by around 2-4%. This is achieved while using a very simple sharing algorithm.
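The abstract names throughput and harmonic mean as evaluation metrics but does not define them. The sketch below shows one common formulation from the SMT literature, which may differ from the paper's exact definitions: throughput as the sum of per-thread IPC, and harmonic mean computed over each thread's IPC relative to its stand-alone IPC. The function names and all IPC values are hypothetical and are not taken from the paper.

```python
def throughput(ipc_shared):
    """Sum of per-thread IPC values measured while the threads run together."""
    return sum(ipc_shared)


def harmonic_mean_speedup(ipc_shared, ipc_alone):
    """Harmonic mean of per-thread relative IPC (shared IPC / stand-alone IPC).

    Assumed formulation: it penalizes configurations that starve one thread,
    which plain throughput does not.
    """
    if len(ipc_shared) != len(ipc_alone):
        raise ValueError("both lists need one entry per thread")
    relative = [shared / alone for shared, alone in zip(ipc_shared, ipc_alone)]
    return len(relative) / sum(1.0 / r for r in relative)


# Hypothetical IPC values for a four-thread workload (illustration only).
ipc_alone = [2.0, 1.0, 4.0, 2.0]   # each thread running by itself
ipc_shared = [1.0, 0.5, 2.0, 1.0]  # the same threads sharing the processor

print(throughput(ipc_shared))
print(harmonic_mean_speedup(ipc_shared, ipc_alone))
```

Comparing two resource-sharing policies under this formulation would mean evaluating both functions on the shared IPC values each policy produces, with the same stand-alone baseline.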
CITATION
Miquel Pericas, Adrian Cristal, Francisco J. Cazorla, Ruben Gonzalez, Daniel A. Jimenez, Mateo Valero, "A Flexible Heterogeneous Multi-Core Architecture", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 13-24, 2007, doi:10.1109/PACT.2007.5