The Community for Technology Leaders
Green Image
<p><it>Abstract—</it>Cache coherence schemes that dynamically adapt to memory referencing patterns have been proposed to improve coherence enforcement in shared-memory multiprocessors. By using only run-time information, however, these existing schemes are incapable of looking ahead in the memory referencing stream. We present a combined hardware-software strategy that uses the predictive capability of the compiler to select updating or invalidating for each write reference. To determine the potential performance improvement that can be achieved with this optimization, three different levels of compiler capabilities are examined. Simulations using memory traces show that with an ideal compiler, this optimization can potentially reduce the miss ratio by 0.4% to 15% compared to an invalidating-only scheme, while reducing the generated network traffic by 13% to 94 % compared to an updating-only scheme. In addition, this optimization can potentially reduce the miss ratio by up to 13%, while reducing the generated network traffic by up to 92%, compared to a dynamic adaptive scheme. Furthermore, performance can be potentially improved even with a compiler capable of performing only imprecise array subscript analysis and no interprocedural analysis.</p><p><it>Index Terms—</it>Compiler optimization, update, invalidate, directory, cache coherence, shared-memory, multiprocessor.</p>

F. Mounes-Toussi and D. J. Lilja, "The Potential of Compile-Time Analysis to Adapt the Cache Coherence Enforcement Strategy to the Data Sharing Characteristics," in IEEE Transactions on Parallel & Distributed Systems, vol. 6, no. , pp. 470-481, 1995.
93 ms
(Ver 3.3 (11022016))