2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (2016)
Sept. 11, 2016 to Sept. 15, 2016
Xiangyao Yu , MIT CSAIL, United States
Hongzhe Liu , Algonquin Regional High School, United States
Ethan Zou , Lexington High School, United States
Srinivas Devadas , MIT CSAIL, United States
Cache coherence scalability is a big challenge in shared memory systems. Traditional protocols do not scale due to the storage and traffic overhead of cache invalidation. Tardis, a recently proposed coherence protocol, removes cache invalidation using logical timestamps and achieves excellent scalability. The original Tardis protocol, however, only supports the Sequential Consistency (SC) memory model, limiting its applicability. Tardis also incurs extra network traffic on some benchmarks due to renew messages, and has suboptimal performance when the program uses spinning to communicate between threads. In this paper, we address these downsides of Tardis protocol and make it significantly more practical. Specifically, we discuss the architectural, memory system and protocol changes required in order to implement the TSO consistency model on Tardis, and prove that the modified protocol satisfies TSO. We also describe modifications for Partial Store Order (PSO) and Release Consistency (RC). Finally, we propose optimizations for better leasing policies and to handle program spinning. On a set of benchmarks, optimized Tardis improves on a full-map directory protocol in the metrics of performance, storage and network traffic, while being simpler to implement.
Protocols, Coherence, Scalability, Multicore processing, Program processors, Physiology, Benchmark testing
X. Yu, H. Liu, E. Zou and S. Devadas, "Tardis 2.0: Optimized time traveling coherence for relaxed consistency models," 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), Haifa, Israel, 2016, pp. 261-274.