User-Level Implementations of Read-Copy Update
February 2012 (vol. 23 no. 2)
pp. 375-382
Mathieu Desnoyers, EfficiOS Inc., Montréal
Paul E. McKenney, IBM Linux Technology Center, Beaverton
Alan S. Stern, Rowland Institute at Harvard, Cambridge
Michel R. Dagenais, Ecole Polytechnique de Montréal, Montréal
Jonathan Walpole, Portland State University, Portland
Read-copy update (RCU) is a synchronization technique that often replaces reader-writer locking because RCU's read-side primitives are both wait-free and an order of magnitude faster than uncontended locking. Although RCU updates are relatively heavyweight, the importance of read-side performance is increasing as computing systems become more responsive to changes in their environments. RCU is heavily used in several kernel-level environments. Unfortunately, kernel-level implementations use facilities that are often unavailable to user applications. The few prior user-level RCU implementations either provided inefficient read-side primitives or restricted the application architecture. This paper fills this gap by describing efficient and flexible RCU implementations based on primitives commonly available to user-level applications. Finally, this paper compares these RCU implementations with each other and with standard locking, which enables choosing the best mechanism for a given workload. This work opens the door to widespread user-application use of RCU.

Index Terms:
Synchronization, process management, operating systems, software/software engineering, threads, concurrency.
Citation:
Mathieu Desnoyers, Paul E. McKenney, Alan S. Stern, Michel R. Dagenais, Jonathan Walpole, "User-Level Implementations of Read-Copy Update," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 2, pp. 375-382, Feb. 2012, doi:10.1109/TPDS.2011.159