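User-Level Implementations of Read-Copy Update
IEEE Transactions on Parallel & Distributed Systems, Issue No. 02, February 2012 (vol. 23)
pp. 375-382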
Mathieu Desnoyers, EfficiOS Inc., Montréal
Paul E. McKenney, IBM Linux Technology Center, Beaverton
Alan S. Stern, Rowland Institute at Harvard, Cambridge
Michel R. Dagenais, Ecole Polytechnique de Montréal, Montréal
Jonathan Walpole, Portland State University, Portland
ABSTRACT
Read-copy update (RCU) is a synchronization technique that often replaces reader-writer locking because RCU's read-side primitives are both wait-free and an order of magnitude faster than uncontended locking. Although RCU updates are relatively heavyweight, the importance of read-side performance is increasing as computing systems become more responsive to changes in their environments. RCU is heavily used in several kernel-level environments. Unfortunately, kernel-level implementations use facilities that are often unavailable to user applications. The few prior user-level RCU implementations either provided inefficient read-side primitives or restricted the application architecture. This paper fills this gap by describing efficient and flexible RCU implementations based on primitives commonly available to user-level applications. Finally, this paper compares these RCU implementations with each other and with standard locking, which enables choosing the best mechanism for a given workload. This work opens the door to widespread user-application use of RCU.
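For readers unfamiliar with the pattern, the following is a minimal illustrative sketch of the read-copy-update idea in C11. It is not one of the paper's implementations and does not use any real RCU library: the names `toy_read_lock`, `toy_read_unlock`, `toy_synchronize`, `struct config`, and `MAX_READERS` are all hypothetical. It assumes a single updater at a time, no nested read-side critical sections, and sequentially consistent atomics on the read side, which is far more expensive than the read-side primitives the abstract refers to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_READERS 8                 /* hypothetical fixed number of reader slots */

struct config { int value; };         /* hypothetical RCU-protected data */

static _Atomic(struct config *) current_cfg;  /* pointer that readers dereference */
static atomic_ulong gp_ctr = 1;               /* grace-period counter, starts nonzero */
static atomic_ulong reader_gp[MAX_READERS];   /* 0 = reader is outside a critical section */

/* Enter a read-side critical section: record the grace period we started in. */
static void toy_read_lock(int id)
{
    atomic_store(&reader_gp[id], atomic_load(&gp_ctr));
}

/* Leave the read-side critical section. */
static void toy_read_unlock(int id)
{
    atomic_store(&reader_gp[id], 0);
}

/* Wait for a grace period: every reader that might still hold a pointer
 * obtained before the call must have left its critical section. */
static void toy_synchronize(void)
{
    unsigned long new_gp = atomic_fetch_add(&gp_ctr, 1) + 1;

    for (int i = 0; i < MAX_READERS; i++) {
        for (;;) {
            unsigned long seen = atomic_load(&reader_gp[i]);
            /* Idle readers, and readers that started after the counter
             * was advanced, cannot reference the old version. */
            if (seen == 0 || seen >= new_gp)
                break;
            /* Otherwise busy-wait (a real implementation would not spin). */
        }
    }
}

/* Reader: look up the current value without blocking on any lock. */
static int config_read(int id)
{
    toy_read_lock(id);
    struct config *cfg = atomic_load(&current_cfg);
    int v = cfg ? cfg->value : -1;    /* -1 means nothing published yet */
    toy_read_unlock(id);
    return v;
}

/* Updater (single updater assumed): copy, modify the copy, publish,
 * wait for pre-existing readers, then reclaim the old version. */
static void config_update(int new_value)
{
    struct config *new_cfg = malloc(sizeof *new_cfg);
    assert(new_cfg != NULL);

    struct config *old_cfg = atomic_load(&current_cfg);
    if (old_cfg)
        *new_cfg = *old_cfg;          /* copy */
    new_cfg->value = new_value;       /* update the private copy */

    atomic_store(&current_cfg, new_cfg);  /* publish the new version */
    toy_synchronize();                    /* wait for a grace period */
    free(old_cfg);                        /* safe: no reader can still see it */
}
```

In this sketch each reader thread would call `config_read` with its own slot index, while one updater thread calls `config_update`. The updater bears the cost of copying and waiting, which mirrors the abstract's point that updates are relatively heavyweight while readers never block.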
INDEX TERMS
Synchronization, process management, operating systems, software/software engineering, threads, concurrency.
CITATION
Mathieu Desnoyers, Paul E. McKenney, Alan S. Stern, Michel R. Dagenais, Jonathan Walpole, "User-Level Implementations of Read-Copy Update," IEEE Transactions on Parallel & Distributed Systems, vol. 23, no. 2, pp. 375-382, February 2012, doi:10.1109/TPDS.2011.159