The Community for Technology Leaders
Green Image
<p><b>Abstract</b>—In the past few years, exception support for memory functions such as virtual memory, informing memory operations, software assist for shared memory protocols, or interactions with processors in memory has been advocated in various research papers. These memory traps may occur on a miss in the cache hierarchy or on a local or remote memory access. However, contemporary, dynamically scheduled processors only support memory exceptions detected in the TLB associated with the first-level cache. They do not support memory exceptions taken deep in the memory hierarchy. In this case, memory traps may be <it>late</it>, in the sense that the exception condition may still be undecided when a long-latency memory instruction reaches the retirement stage. In this paper we evaluate through simulation the overhead of memory traps in dynamically scheduled processors, focusing on the added overhead incurred when a memory trap is late. We also propose some simple mechanisms to reduce this added overhead while preserving the memory consistency model. With more aggressive memory access mechanisms in the processor we observe that the overhead of all memory traps—either early or late—is increased while the lateness of a trap becomes largely tolerated so that the performance gap between early and late memory traps is greatly reduced. Additionally, because of caching effects in the memory hierarchy, the frequency of memory traps usually decreases as they are taken deeper in the memory hierarchy and their overall impact on execution times becomes negligible. We conclude that support for memory traps taken throughout the memory hierarchy could be added to dynamically scheduled processors at low hardware cost and little performance degradation.</p>
Microarchitecture, memory system, exception, trap, simulations, instruction-level parallelism, memory consistency model, prefetching.

X. Qiu and M. Dubois, "Tolerating Late Memory Traps in Dynamically Scheduled Processors," in IEEE Transactions on Computers, vol. 53, no. , pp. 732-743, 2004.
86 ms
(Ver 3.3 (11022016))