Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2005)
St. Louis, Missouri
Sept. 17, 2005 to Sept. 21, 2005
ISSN: 1089-795X
ISBN: 0-7695-2429-X
pp: 255-266
Avi Mendelson, Intel Corporation, Haifa, Israel
Avinoam Kolodny, Department of Electrical Engineering, Technion, Haifa, Israel
Michael Behar, Department of Electrical Engineering
<p>This paper presents a new technique for efficient usage of small trace caches. A trace cache can significantly increase the performance of wide out-of-order processors, but to be effective, the trace cache must be large.</p> <p>Power and timing considerations indicate that a small trace cache is desirable, with special mechanisms to increase its effectiveness despite the limited size. Hence several authors have proposed filtering methods that select "good traces" to keep in the trace cache from among the general population of traces.</p> <p>This paper presents a new filtering technique based on sampling. Instead of building all the traces and then trying to select the good ones among them, the technique makes a preliminary selection of traces, which is more efficient. This selection is based on a random sampling approach.</p> <p>We show that the Sampling Filter improves trace cache and overall system performance while reducing power dissipation. The Sampling Filter reduces the admission of traces that are never used before they are evicted from the cache, and increases the fraction of its time in the cache that a trace spends in its live phase. Moreover, the Sampling Filter reduces duplication between the trace cache and the instruction cache, and thus reduces the overall number of misses in the first level of the cache hierarchy.</p>
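The abstract does not give the mechanism in detail, so the following is only a minimal sketch of the general idea of admission by random sampling, not the authors' design. It assumes a fully associative trace cache with LRU replacement and a fixed sampling probability; the class name `SampledTraceCache` and the parameters `capacity`, `sample_prob`, and `seed` are hypothetical and chosen for illustration.

```python
import random
from collections import OrderedDict


class SampledTraceCache:
    """Toy model: a trace is built and admitted only when a miss is sampled.

    Assumptions (not from the paper): LRU replacement, a fixed sampling
    probability, and capacity >= 1.
    """

    def __init__(self, capacity, sample_prob, seed=0):
        self.capacity = capacity          # maximum number of resident traces
        self.sample_prob = sample_prob    # chance that a miss triggers a build
        self.rng = random.Random(seed)    # seeded for reproducible runs
        self.traces = OrderedDict()       # trace_id -> True, oldest first
        self.hits = 0
        self.misses = 0
        self.admitted = 0

    def access(self, trace_id):
        """Look up a trace; return True on a hit, False on a miss."""
        if trace_id in self.traces:
            self.traces.move_to_end(trace_id)  # mark as most recently used
            self.hits += 1
            return True

        self.misses += 1
        # Preliminary selection: only a random sample of misses leads to a
        # trace being built and inserted. Traces that recur often get more
        # chances to be sampled than traces that are seen only once.
        if self.rng.random() < self.sample_prob:
            if len(self.traces) >= self.capacity:
                self.traces.popitem(last=False)  # evict least recently used
            self.traces[trace_id] = True
            self.admitted += 1
        return False
```

In this sketch, setting `sample_prob` to 1.0 gives an unfiltered trace cache that admits every missed trace, while lower values admit fewer traces and so bias the contents toward traces that are requested repeatedly.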
Avi Mendelson, Avinoam Kolodny, Michael Behar, "Trace Cache Sampling Filter", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 255-266, 2005, doi:10.1109/PACT.2005.38