Distributed View Divergence Control of Data Freshness in Replicated Database Systems
October 2009 (vol. 21 no. 10)
pp. 1403-1417
Takao Yamashita, Nippon Telegraph and Telephone Corporation, Tokyo
In this paper, we propose a distributed method to control the view divergence of data freshness for clients in replicated database systems whose replicas have equal facilitating or administrative roles. Our method provides clients with data of statistically defined freshness when updates are initially accepted by any of the replicas and then asynchronously propagated among the replicas, which are connected in a tree structure. To provide data with the freshness specified by a client, our method uses a distributed algorithm to select multiple replicas so that, statistically, they have together received all updates issued up to a specified time before the present. We evaluated the distributed replica-selection algorithm by simulation in terms of controlled data freshness and time, message, and computation complexity. The simulation showed that our method achieves an improvement of more than 36.9 percent in data freshness compared with epidemic-style update propagation.
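The replica-selection idea in the abstract can be illustrated with a small sketch. This is not the paper's distributed algorithm: it is a centralized greedy approximation, written under the assumptions that update delays are independent across replicas and that each replica already holds an estimate of the probability that every update from a given origin, issued before the client's freshness bound, has arrived. The names `coverage`, `select_replicas`, and `arrival_prob`, and all probability values, are hypothetical.

```python
def coverage(selected, origin, arrival_prob):
    """Probability that at least one selected replica holds all sufficiently
    old updates from `origin` (assumes independent delays; an assumption)."""
    miss = 1.0
    for replica in selected:
        miss *= 1.0 - arrival_prob[replica][origin]
    return 1.0 - miss


def select_replicas(arrival_prob, target):
    """Greedily add replicas until every origin is covered with probability
    at least `target`, or until no replicas remain."""
    replicas = list(arrival_prob)
    origins = list(next(iter(arrival_prob.values())))
    selected = []

    def worst(candidates):
        # Coverage of the least well covered origin.
        return min(coverage(candidates, o, arrival_prob) for o in origins)

    while worst(selected) < target and len(selected) < len(replicas):
        remaining = [r for r in replicas if r not in selected]
        # Pick the replica that most improves the worst-covered origin.
        best = max(remaining, key=lambda r: worst(selected + [r]))
        selected.append(best)
    return selected


# Hypothetical estimates: arrival_prob[replica][origin].
# A replica always holds the updates it accepted itself.
arrival_prob = {
    "A": {"A": 1.0, "B": 0.5, "C": 0.25},
    "B": {"A": 0.5, "B": 1.0, "C": 0.5},
    "C": {"A": 0.25, "B": 0.5, "C": 1.0},
}

print(select_replicas(arrival_prob, 0.5))
print(select_replicas(arrival_prob, 0.9))
```

A looser target is met by fewer replicas, while a stricter target forces the client to read from more of them; that is the trade-off between freshness and read cost that the method controls.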

Index Terms:
Data replication, weak consistency, freshness, delay, asynchronous update.
Citation:
Takao Yamashita, "Distributed View Divergence Control of Data Freshness in Replicated Database Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 10, pp. 1403-1417, Oct. 2009, doi:10.1109/TKDE.2008.230