Issue No.04 - Fourth Quarter (2012 vol.5)
pp: 525-539
Zhou Wei, VU University Amsterdam, Amsterdam and Tsinghua University, Beijing
Guillaume Pierre, VU University Amsterdam
Chi-Hung Chi, Tsinghua University, Beijing
ABSTRACT
NoSQL cloud data stores provide scalability and high availability properties for web applications, but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager which guarantees full ACID properties for multi-item transactions issued by web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud.
INDEX TERMS
Cloud computing, Servers, Scalability, Distributed databases, Data models, Internet, NoSQL, web applications, transactions
CITATION
Zhou Wei, Guillaume Pierre, Chi-Hung Chi, "CloudTPS: Scalable Transactions for Web Applications in the Cloud", IEEE Transactions on Services Computing, vol. 5, no. 4, pp. 525-539, Fourth Quarter 2012, doi:10.1109/TSC.2011.18