Clustering Support and Replication Management for Scalable Network Services
November 2003 (vol. 14 no. 11)
pp. 1168-1179

Abstract: The ubiquity of the Internet and various intranets has brought about widespread availability of online services and applications accessible through the network. Cluster-based network services have been rapidly emerging due to their cost-effectiveness in achieving high availability and incremental scalability. This paper presents the design and implementation of the Neptune middleware system that provides clustering support and replication management for scalable network services. Neptune employs a loosely connected and functionally symmetric clustering architecture to achieve high scalability and robustness. It shields application developers from clustering complexities through simple programming interfaces. In addition, Neptune provides replication management with flexible replication consistency support at the clustering middleware level. Such support can be easily applied to a large number of applications with different underlying data management mechanisms or service semantics. The system has been implemented on Linux and Solaris clusters, where a number of applications have been successfully deployed. Our evaluations demonstrate the system performance and smooth failure recovery achieved by the proposed techniques.

Index Terms:
Network services, programming support, replication management, failure recovery, load balancing.
Kai Shen, Tao Yang, Lingkun Chu, "Clustering Support and Replication Management for Scalable Network Services," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 11, pp. 1168-1179, Nov. 2003, doi:10.1109/TPDS.2003.1247676