2007 IEEE 13th International Symposium on High Performance Computer Architecture
An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing
Scottsdale, AZ, USA
February 10-February 14
ISBN: 1-4244-0804-0
Liqun Cheng, University of Utah, legion@cs.utah.edu, retrac@cs.utah.edu
John B. Carter, University of Utah, legion@cs.utah.edu, retrac@cs.utah.edu
Donglai Dai, Silicon Graphics, Inc., dai@sgi.com
Shared memory multiprocessors play an increasingly important role in enterprise and scientific computing facilities. Remote misses limit the performance of shared memory applications, and their significance is growing as network latency increases relative to processor speeds. This paper proposes two mechanisms that improve shared memory performance by eliminating remote misses and/or reducing the amount of communication required to maintain coherence. We focus on improving the performance of applications that exhibit producer-consumer sharing. We first present a simple hardware mechanism for detecting producer-consumer sharing. We then describe a directory delegation mechanism whereby the "home node" of a cache line can be delegated to a producer node, thereby converting 3-hop coherence operations into 2-hop operations. We then extend the delegation mechanism to support speculative updates for data accessed in a producer-consumer pattern, which can convert 2-hop misses into local misses, thereby eliminating the remote memory latency. Both mechanisms can be implemented without changes to the processor. We evaluate our directory delegation and speculative update mechanisms on seven benchmark programs that exhibit producer-consumer sharing, using a cycle-accurate execution-driven simulator of a future 16-node SGI multiprocessor. We find that the mechanisms proposed in this paper reduce the average remote miss rate by 40%, reduce network traffic by 15%, and improve performance by 21%. Finally, we use Murphi to verify that each mechanism is error-free and does not violate sequential consistency.
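The hop counts mentioned in the abstract can be illustrated with a small toy model. The sketch below is not the paper's protocol or simulator. It only counts the inter-node messages on the critical path of a consumer read, assuming the newest copy of the line is dirty in the producer's cache. The names `Mode` and `consumer_read_hops`, and the assumption that a speculative update always arrives before the consumer's read, are hypothetical and chosen for illustration.

```python
from enum import Enum


class Mode(Enum):
    BASELINE = "baseline"            # directory stays at the original home node
    DELEGATED = "delegated"          # directory delegated to the producer node
    SPECULATIVE_UPDATE = "update"    # producer pushes new data to the consumer


def consumer_read_hops(mode, home, producer, consumer):
    """Count inter-node messages on the critical path of a consumer read.

    Toy assumption: the newest copy of the line is dirty in the producer's
    cache, and under SPECULATIVE_UPDATE the pushed data has already arrived.
    """
    if consumer == producer:
        return 0  # the reader already holds the data
    if mode is Mode.SPECULATIVE_UPDATE:
        return 0  # data was pushed ahead of the read, so the access is local
    # The request goes to whichever node currently holds the directory entry,
    # is forwarded to the producer, and the data returns to the consumer.
    directory = producer if mode is Mode.DELEGATED else home
    path = [consumer, directory, producer, consumer]
    # Only messages that actually cross between different nodes count as hops.
    return sum(1 for src, dst in zip(path, path[1:]) if src != dst)


if __name__ == "__main__":
    for mode in Mode:
        hops = consumer_read_hops(mode, home=0, producer=1, consumer=2)
        print(f"{mode.name}: {hops} hop(s)")
```

With three distinct nodes, the model gives 3 hops for the baseline, 2 hops once the directory is delegated to the producer, and 0 hops when the update has been pushed in advance, which matches the progression described in the abstract.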
Citation:
Liqun Cheng, John B. Carter, Donglai Dai, "An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing," hpca, pp.328-339, 2007 IEEE 13th International Symposium on High Performance Computer Architecture, 2007