An Efficient Tree Cache Coherence Protocol for Distributed Shared Memory Multiprocessors
March 1999 (vol. 48 no. 3)
pp. 352-360

Abstract—Directory schemes have long been used to solve the cache coherence problem for large scale shared memory multiprocessors. In addition, tree-based protocols have been employed to reduce the directory size and the invalidation latency for a large degree of data sharing in the system. However, the existing tree-based protocols involve a very high communication overhead for maintaining a balanced tree, especially when the degree of data sharing is low. This paper presents a new tree-based cache coherence protocol which is a hybrid of the limited directory and the linked list schemes. By utilizing a limited number of pointers in the directory, the proposed protocol connects the nodes caching a shared block in a tree fashion without incurring any communication overhead. In addition to the low communication overhead, the proposed scheme also possesses the advantages of the existing bit-map and tree-based linked list protocols, namely, scalable memory requirement and logarithmic invalidation latency. We evaluate the performance of our protocol by running four applications on the Proteus execution-driven simulator. Our simulation results show that the performance of the proposed protocol is very close to that of the full-map protocol.
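As a rough illustration of the "logarithmic invalidation latency" claim, the sketch below counts sequential forwarding steps needed to reach every sharer when invalidations travel down a linked list versus a balanced tree. It is not the protocol from the paper: the function names, the fan-out of 2, and the assumption of a perfectly balanced tree are all hypothetical choices made for the example.

```python
def chain_hops(sharers: int) -> int:
    """Sequential hops to invalidate every sharer in a linked list.

    Assumption: each sharer forwards the invalidation to exactly one successor.
    """
    return sharers


def tree_hops(sharers: int, fanout: int = 2) -> int:
    """Sequential hops to invalidate every sharer in a balanced tree.

    Assumption: the home node reaches `fanout` sharers in the first hop, and
    each sharer forwards to `fanout` children in the next hop.
    """
    hops = 0
    covered = 0
    level_size = fanout  # sharers reached by the first hop
    while covered < sharers:
        covered += level_size
        level_size *= fanout
        hops += 1
    return hops


if __name__ == "__main__":
    for n in (4, 64, 1024):
        print(n, chain_hops(n), tree_hops(n))
```

Under these assumptions the list grows linearly with the number of sharers while the tree grows with its logarithm, which is the property the abstract attributes to tree-based schemes.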

Index Terms:
Cache coherence, tree-based directory protocols, shared memory, large scale multiprocessors, execution-driven simulation.
Yeimkuan Chang, Laxmi N. Bhuyan, "An Efficient Tree Cache Coherence Protocol for Distributed Shared Memory Multiprocessors," IEEE Transactions on Computers, vol. 48, no. 3, pp. 352-360, March 1999, doi:10.1109/12.755001