2011 IEEE Sixth International Conference on Networking, Architecture, and Storage (2011)
Dalian, Liaoning, China
July 28, 2011 to July 30, 2011
ISBN: 978-0-7695-4509-7
pp: 140-148
As more threads are added to execute multi-threaded applications in the many-core era, memory contention among threads poses a severe challenge to both programmability and performance. Existing studies show that Transactional Memory (TM) can address the programmability problem and scales well on the fine-grained applications in the SPLASH-2 benchmark suite. However, further investigation of the coarse-grained applications in the STAMP benchmark suite shows that long-running transactions block parallelism among concurrent transactions and fail to yield performance gains once the number of threads exceeds 4. To address this problem, we propose TMTLS, which combines TM with Thread-Level Speculation (TLS). At runtime, TMTLS limits the number of concurrently executing transactions according to the memory contention, divides each coarse-grained transaction into several epochs, and assigns the epochs to the available threads to speculatively exploit the parallelism within the transaction. This proposal not only alleviates memory contention among threads but also shortens the execution time of coarse-grained transactions. Moreover, it reduces the serialization overhead caused by conflicts among transactions. Our evaluation shows that this method achieves an average speedup of 2.27 over the baseline TM system on four high-contention, coarse-grained applications selected from the STAMP benchmark suite, running on a 16-core CMP.
Parallel Programming, Transactional Memory, Thread Level Speculation
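
The two ideas named in the abstract, capping the number of concurrently running transactions and splitting one long transaction into epochs that run on spare threads, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `MAX_CONCURRENT_TX`, `EPOCHS_PER_TX`, `split_into_epochs`, and `run_transaction` are hypothetical, the fixed cap stands in for whatever contention-driven policy the paper uses, and the sketch omits the conflict detection and epoch squashing that a real TM/TLS system requires.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tuning knobs, chosen only for illustration.
MAX_CONCURRENT_TX = 2   # cap on transactions allowed to run at once
EPOCHS_PER_TX = 4       # epochs each long transaction is divided into

tx_slots = threading.Semaphore(MAX_CONCURRENT_TX)
epoch_pool = ThreadPoolExecutor(max_workers=8)  # threads that run epochs


def split_into_epochs(items, n):
    """Split items into at most n contiguous, near-equal chunks."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(items[start:end])
        start = end
    return chunks


def run_transaction(items, work, shared, lock):
    """Run one coarse-grained transaction as a sequence of epochs.

    Each epoch computes into a private buffer; buffers are then
    committed to `shared` in original program order. No dependence
    checking or rollback is modelled here.
    """
    with tx_slots:  # throttle: at most MAX_CONCURRENT_TX in flight
        epochs = split_into_epochs(items, EPOCHS_PER_TX)
        futures = [
            epoch_pool.submit(lambda e=e: [work(x) for x in e])
            for e in epochs
        ]
        results = []
        for f in futures:  # iterate in submission order, i.e. program order
            results.extend(f.result())
        with lock:  # single commit point for the whole transaction
            shared.extend(results)
```

Calling `run_transaction(list(range(10)), lambda x: x * x, shared, lock)` with an empty list `shared` and a `threading.Lock()` appends the ten squares to `shared` in their original order, because epoch buffers are committed in submission order even though the epochs themselves may finish out of order.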

Z. Yan, D. Feng and Y. Tan, "TMTLS: Combine TM with TLS to Limit the Memory Contentions and Exploit the Parallelism in the Long-Running Transactions," 2011 IEEE Sixth International Conference on Networking, Architecture, and Storage (NAS), Dalian, Liaoning, China, 2011, pp. 140-148.