This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Improving Availability and Performance with Application-Specific Data Replication
January 2005 (vol. 17 no. 1)
pp. 106-120
The emerging edge services architecture promises to improve the availability and performance of Web services by replicating servers at geographically distributed sites. A key challenge in such systems is data replication and consistency, so that edge server code can manipulate shared data without suffering the availability and performance penalties that would be incurred by accessing a traditional centralized database. This article explores using a distributed object architecture to build an edge service data replication system for an e-commerce application, the TPC-W benchmark, which simulates an online bookstore. We take advantage of application-specific semantics to design distributed objects that each manages a specific subset of shared information using simple and effective consistency models. Our experimental results show that by slightly relaxing consistency within individual distributed objects, our application realizes both high availability and excellent performance. For example, in one experiment, we find that our object-based edge server system provides five times better response time over a traditional centralized cluster architecture and a factor of nine improvement over an edge service system that distributes code but retains a centralized database.

[1] Akamai Technologies, Inc., “Akamai— The Business Internet— A Predictable Platform for Profitable E-Business,” http://www.a kamai.com/BusinessInternetwhitepaper_business_internet.pdf , 2004.
[2] Akamai Technologies, Inc., “Turbo-Charging Dynamic Web Sites with Akamai EdgeSuite,” White paper, 2004.
[3] C. Amza, E. Cecchet, A. Chanda, A. Cox, S. Elnikety, R. Gil, J. Marguerite, K. Rajamani, and W. Zwaenepoel, “Bottleneck Characterization of Dynamic Web Site Benchmarks,” Technical Report TR02-391, Rice Univ., Feb. 2002.
[4] M. Arlitt, D. Krishnamurthy, and J. Rolia, “Characterizing the Scalability of a Large Web-Based Shopping System,” ACM Trans. Internet Technology, June 2001.
[5] A. Awadallah and M. Rosenblum, “The vMatrix: A Network of Virtual Machine Monitors for Dynamic Content Distribution,” Proc. Seventh Int'l Workshop Web Content Caching and Distribution, Aug. 2002.
[6] S. Bhattacharjee, K. Calvert, and E. Zegura, “Self-Organizing Wide Area Network Caches,” Technical Report GIT-CC-97/31, Georgia Tech., 1997.
[7] E. Brewer, “Lessons from Giant-Scale Services,” IEEE Internet Computing, July/Aug. 2001.
[8] P. Cao, J. Zhang, and K. Beach, “Active Cache: Caching Dynamic Contents on the Web,” Proc. Middleware '98 Conf., 1998.
[9] J. Challenger, P. Dantzig, and A. Iyengar, “A Scalable and Highly Available System for Serving Dynamic Data at Frequently Accessed Web Sites,” Proc. ACM/IEEE Conf. Supercomputing '98 (SC98), Nov. 1998.
[10] J. Challenger, P. Dantzig, A. Iyengar, “A Scalable System for Consistently Caching Dynamic Web Data,” Proc. IEEE Infocom Conf., Mar. 1999.
[11] J. Challenger, A. Iyengar, K. Witting, C. Ferstat, and P. Reed, “A Publishing System for Efficiently Creating Dynamic Web Content,” Proc. IEEE Infocom Conf., Mar. 2000.
[12] B. Chandra, “Web Workloads Influencing Disconnected Services Access,” master's thesis, Univ. of Texas at Austin, 2001.
[13] S. Cheung, M. Ahamad, and M.H. Ammar, “Optimizing Vote and Quorum Assignments for Reading and Writing Replicated Data,” IEEE Trans. Knowlegde and Data Eng., vol. 1, no. 3, pp. 387-397, Sept. 1989.
[14] IBM Corp., “MQSeries: An Introduction to Messaging and Queueing,” Technical Report GC33-0805-01, IBM Corp., July 1995, ftp://ftp.software.ibm.com/software/mqseries/ pdfhor aa101.pdf.
[15] Transaction Processing Performance Council, “TPC BENCHMARK W,” http://www.tpc.org/tpcw/spec/-tpcw_V1.8.pdf, 2002.
[16] M. Dahlin, B. Chandra, L. Gao, and A. Nayate, “End-to-End WAN Service Availability,” IEEE/ACM Trans. Networking, 2003.
[17] V. Duvvuri, P. Shenoy, and R. Tewari, “Adaptive Lease: A Strong Consistency Mechanism for the World Wide Web,” Proc. IEEE Infocom Conf., Mar. 2000.
[18] Z. Fei, S. Bhattacharjee, E. Zegura, and M. Ammar, “A Novel Server Selection Technique for Improving the Response Time of a Replicated Service,” Proc. IEEE Infocom Conf., Mar. 1998.
[19] M. Frigo, “The Weakest Reasonable Memory Model,” master's thesis, MIT, 1988.
[20] D. Garcia and J. Garcia, “TPC-W E-Commerce Benchmark Evaluation,” Computer, Feb. 2003.
[21] C. Gray and D. Cheriton, “Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency,” Proc. 12th ACM Symp. Operating Systems Principles, pp. 202-210, 1989.
[22] J. Gray, P. Helland, P.E. O'Neil, and D. Shasha, “Dangers of Replication and a Solution,” Proc. SIGMOD Conf., pp. 173-182, 1996.
[23] S. Gribble, E. Brewer, J. Hellerstein, and D. Culler, “Scalable, Distributed Data Structures for Internet Service Construction,” Proc. Fourth Symp. Operating Systems Design and Implementation, Oct. 2000.
[24] M. Herlihy, “A Quorum-Consensus Replication Method for Abstract Data Types,” ACM Trans. Computer Systems, vol. 4, no. 1, pp. 32-53, Feb. 1986.
[25] IBM, The Economic Value of Rapid Response Time, pp. 11-82, Number GE20-0752-0, 1982.
[26] Java Message Service (JMS), http://java.sun.com/productsjms, 2004.
[27] JORAM, http://www.objectWeb.orgjoram, 2004.
[28] R. Ladin, B. Liskov, L. Shrira, and S. Ghemawat, “Providing High Availability Using Lazy Replication,” ACM Trans. Computer Systems, vol. 10, no. 4, pp. 360-391, Nov. 1992.
[29] R. Lipton and J. Sandberg, “PRAM: A Scalable Shared Memory,” Technical Report CS-TR-180-88, Princeton Univ., 1988.
[30] W. Litwin, M-A. Neimat, and D. Schneider, “LH*— A Scalable, Distributed Data Structure,” ACM Trans. Database Systems, Dec. 1996.
[31] L. Mummert, M. Ebling, and M. Satyanarayanan, “Exploiting Weak Connectivity for Mobile File Access,” Proc. 15th ACM Symp. Operating Systems Principles, Dec. 1995.
[32] A. Nayate, M. Dahlin, and A. Iyengar, “Data Invalidation and Prefetching for Transparent Edge-Service Replication,” Technical Report TR-03-44, Univ. of Texas at Austin, Dept. of Computer Sciences, Nov. 2002.
[33] NIST Net Home Page, http://snad.ncsl.nist.gov/itgnistnet/, 2004.
[34] Oracle7 Server Distributed Systems: Replicated Data, http://www.oracle.com/products/oracle7/server/ whitepapers/replica tion/htmlindex , 1994.
[35] V. Paxson, “End-to-End Routing Behavior in the Internet,” Proc. ACM SIGCOMM '96 Conf. Applications, Technologies, Architectures, and Protocols for Computer Comm., Aug. 1996.
[36] K. Petersen, M. Spreitzer, D. Terry, M. Theimer, and A. Demers, “Flexible Update Propagation for Weakly Consistent Replication,” Proc. 16th ACM Symp. Operating Systems Principles, Oct. 1997.
[37] Y. Saito, B. Bershad, and H. Levy, “Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-Based Mail Service,” Proc. 17th ACM Symp. Operating Systems Principles, Dec. 1999.
[38] Y. Saito and H. Levy, “Optimistic Replication for Internet Data Services,” Proc. 14th Int'l Conf. Distributed Computing, Oct. 2000.
[39] M. Shapiro, “Structure and Encapsulation in Distributed Systems: The Proxy Principle,” Proc. Sixth Int'l Conf. Distributed Computing Systems, May 1986.
[40] C. Sterling, “Programming Best Practices with Microsoft Message Queuing Services (MSMQ),” http://msdn.microsoft. com/library/default.asp?url=/ library/en-us/dnmqqc/htmlmsmqbest.asp , 2004.
[41] A. Tanenbaum and M. van Steen, Distributed Systems: Principles and Paradigms, Consistency and Replication (chapter), Prentice Hall, 2002.
[42] D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer, and C. Hauser, “Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System,” Proc. 15th ACM Symp. Operating Systems Principles, pp. 172-183, Dec. 1995.
[43] The PHARM Project at the University of Wisconsin, http://www.ece.wisc.edu/pharmtpcw/, 2003.
[44] A. Vahdat, M. Dahlin, T. Anderson, and A. Aggarwal, “Active Naming: Flexible Location and Transport of Wide-Area Resources,” Proc. Second USENIX Symp. Internet Technologies and Systems, Oct. 1999.
[45] M. van Steen, P. Homburg, and S. Tanenbaum, “Globe: A Wide-Area Distributed System,” technical report, Vrije Universiteit, Mar. 1999.
[46] K. Walsh, A. Vahdat, and J. Yang, “Enabling Wide-Area Replication of Database Services with Continuous Consistency,” Unpublished Manuscript.
[47] A. Whitaker, M. Shaw, and S. Gribble, “Scale and Performance in the Denali Isolation Kernel,” Proc. Fifth USENIX Symp. Operating Systems Design and Implementation (OSDI '02), Dec. 2002.
[48] TPC-W Performance Result in Price/Performance, http://www.tpc.org/tpcw/resultstpcw_price_perf_results.asp , 2004.
[49] J. Yin, L. Alvisi, M. Dahlin, and A. Iyengar, “Eng. Server-Driven Consistency for Large Scale Dynamic Web Services,” Proc. 2001 Int'l World Wide Web Conf., May 2001.
[50] C. Yoshikawa, B. Chun, P. Eastham, A. Vahdat, T. Anderson, and D. Culler, “Using Smart Clients to Build Scalable Services,” Proc. 1997 USENIX Technical Conf., Jan. 1997.
[51] H. Yu and A. Vahdat, “The Costs and Limits of Availability for Replicated Services,” Proc. 18th ACM Symp. Operating Systems Principles, 2001.
[52] H. Yu and A. Vahdat, “Minimal Cost Replication for Availability,” Proc. 21 Symp. Principles of Distributed Computing, 2002.
[53] Y. Zhang, V. Paxson, and S. Shenkar, “The Stationarity of Internet Path Properties: Routing, Loss, and Throughput,” technical report, AT&T Center for Internet Research at ICSI, http:/www.aciri. org/, May 2000.

Index Terms:
Availability, data replication, distributed objects, edge services, performance, Wide Area Networks (WAN).
Citation:
Lei Gao, Mike Dahlin, Amol Nayate, Jiandan Zheng, Arun Iyengar, "Improving Availability and Performance with Application-Specific Data Replication," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 1, pp. 106-120, Jan. 2005, doi:10.1109/TKDE.2005.10
Usage of this product signifies your acceptance of the Terms of Use.