Issue No.08 - August (2004 vol.15)
Mark Heinrich, IEEE
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2004.35
<p><b>Abstract</b>—Distributed shared memory (DSM) multiprocessors typically require disjoint networks for deadlock-free execution of cache coherence protocols. This is normally achieved by implementing virtual networks with the help of virtual channels or virtual lanes multiplexed on a single physical network. To keep the coherence protocol simple, messages are usually assigned to virtual lanes in a predefined static manner based on a cycle-free lane assignment dependence graph. However, this static split of virtual networks (such as request and reply networks) may lead to underutilization of certain virtual networks while saturating the other networks. In this paper, we explore different static and dynamic schemes to select the virtual lanes for outgoing messages and mix the load among them without restricting any particular type of message to be carried only by a particular virtual network. We achieve this by exposing the selection algorithms to the coherence protocol itself, so that it can inject messages into selected virtual lanes based on some local information, and still enjoy deadlock-freedom. Our execution-driven simulation on five applications from the SPLASH-2 suite shows that as the system scales, the virtual network selection algorithms play an important role. For 128-node systems, our dynamic selection algorithm speeds up parallel execution by as much as 22 percent over an optimized baseline system running a modified SGI Origin 2000 protocol. We also explore how network latency, the number of message buffers per virtual lane, and the depth of network interface output queues affect the relative performance of various virtual lane selection algorithms.</p>
Index Terms: Distributed shared memory, cache coherence protocol, virtual network, deadlock-freedom.
Mainak Chaudhuri, Mark Heinrich, "Exploring Virtual Network Selection Algorithms in DSM Cache Coherence Protocols", IEEE Transactions on Parallel & Distributed Systems, vol. 15, no. 8, pp. 699-712, August 2004, doi:10.1109/TPDS.2004.35