Issue No.04 - April (2013 vol.24)
pp: 825-838
Yang Wang , IBM Center for Adv. Studies (CAS Atlantic), Univ. of New Brunswick, Fredericton, NB, Canada
B. Veeravalli , Dept. of Electr. & Comput. Eng., Nat. Univ. of Singapore, Singapore, Singapore
Chen-Khong Tham , Dept. of Electr. & Comput. Eng., Nat. Univ. of Singapore, Singapore, Singapore
In this paper, we study strategies for efficiently staging and caching data on a set of vantage sites in a cloud system at minimum cost. Unlike traditional research, we do not attempt to identify access patterns in order to facilitate future requests. Instead, assuming that this information is known in advance, our goal is to stage the shared data items at predetermined sites at the advocated time instants so as to align with the patterns while minimizing the monetary cost of caching and transmitting the requested data items. To this end, we follow the cost and network models in [1] and extend the analysis to multiple data items, each with single or multiple copies. Our results show that, under a homogeneous cost model, when the ratio of transmission cost to caching cost is low, a single copy of each data item can efficiently serve all the user requests. In the multicopy situation, we also consider the tradeoff between the transmission cost and the caching cost by controlling the upper bounds on transmissions and copies. The upper bound can be given either on a per-item basis or on an all-item basis. We present efficient optimal solutions based on dynamic programming techniques for all these cases, provided that the upper bound is polynomially bounded by the number of service requests and the number of distinct data items. In addition to the homogeneous cost model, we also briefly discuss the problem under a heterogeneous cost model with some simple yet practical restrictions and present a 2-approximation algorithm for the general case. We validate our findings by implementing a data staging solver and conducting extensive simulation studies of the algorithms' behavior.
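The tradeoff between transmission cost and caching cost can be illustrated with a minimal sketch. It is not the paper's dynamic-programming solver. It only evaluates two fixed, hand-picked schedules under an assumed homogeneous cost model in which every transfer costs `tx_cost` and every cached copy costs `cache_rate` per unit time. The function names, the request representation, and both strategies are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# A request is (site id, time instant); the list is assumed sorted by time.
Request = Tuple[int, float]


def single_copy_cost(requests: List[Request], start_site: int,
                     start_time: float, tx_cost: float,
                     cache_rate: float) -> float:
    """Cost of serving every request with one copy that migrates on demand.

    Assumed model: holding the copy costs cache_rate per unit time, and
    moving it to a different site costs tx_cost per move.
    """
    cost = 0.0
    site, now = start_site, start_time
    for req_site, req_time in requests:
        if req_time < now:
            raise ValueError("requests must be sorted by time")
        cost += cache_rate * (req_time - now)  # hold the copy until needed
        if req_site != site:
            cost += tx_cost                    # migrate the copy
            site = req_site
        now = req_time
    return cost


def replicate_cost(requests: List[Request], start_site: int,
                   start_time: float, tx_cost: float,
                   cache_rate: float) -> float:
    """Cost of a naive multicopy schedule (hypothetical baseline).

    The start site keeps its copy until the last request; every other site
    receives one transfer at its first request and caches until its last.
    """
    if not requests:
        return 0.0
    end_time = requests[-1][1]
    first, last = {}, {}
    for req_site, req_time in requests:
        first.setdefault(req_site, req_time)
        last[req_site] = req_time
    cost = cache_rate * (end_time - start_time)  # copy kept at the start site
    for s in first:
        if s != start_site:
            cost += tx_cost + cache_rate * (last[s] - first[s])
    return cost


if __name__ == "__main__":
    reqs = [(1, 1.0), (0, 2.0), (1, 3.0), (0, 4.0)]
    for tx in (0.1, 5.0):
        print(tx, single_copy_cost(reqs, 0, 0.0, tx, 1.0),
              replicate_cost(reqs, 0, 0.0, tx, 1.0))
```

In this toy instance the single migrating copy is cheaper when `tx_cost` is small relative to `cache_rate`, and the replicated schedule is cheaper when `tx_cost` is large, which mirrors the qualitative claim in the abstract without reproducing its optimal algorithms.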
dynamic programming, approximation theory, cache storage, cloud computing, data staging solver, data staging algorithms, data caching, cloud system, transmission cost, caching cost, dynamic programming techniques, homogeneous cost model, 2-approximation algorithm, Prediction algorithms, Distributed databases, Data models, Upper bound, Computational modeling, Bandwidth, data placement and migration, data staging and caching, resource constraints
Yang Wang, B. Veeravalli, Chen-Khong Tham, "On Data Staging Algorithms for Shared Data Accesses in Clouds", IEEE Transactions on Parallel & Distributed Systems, vol.24, no. 4, pp. 825-838, April 2013, doi:10.1109/TPDS.2012.178
[1] B. Veeravalli, "Network Caching Strategies for a Shared Data Distribution for a Predefined Service Demand Sequence," IEEE Trans. Knowledge and Data Eng., vol. 15, no. 6, pp. 1487-1497, Nov. 2003.
[2] K. Candan, B. Prabhakaran, and V. Subrahmanian, "Collaborative Multimedia Documents: Authoring and Presentation," Technical Report CS-TR-3596, UMIACS-TR-96-9, Computer Science Technical Series Report, Univ. of Maryland, College Park, Jan. 1996.
[3] B. Veeravalli and E. Yew, "Network Caching Strategies for Reservation-Based Multimedia Services on High-Speed Networks," Data and Knowledge Eng., vol. 41, no. 1, Apr. 2002.
[4] D. Arora, A. Feldmann, G. Schaffrath, and S. Schmid, "On the Benefit of Virtualization: Strategies for Flexible Server Allocation," Proc. USENIX Workshop Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services (Hot-ICE), 2011.
[5] X. Chen and X. Zhang, "A Popularity-Based Prediction Model for Web Prefetching," Computer, vol. 36, no. 3, pp. 63-70, Mar. 2003.
[6] Y. Bartal, M. Charikar, and P. Indyk, "On Page Migration and Other Relaxed Task Systems," Theoretical Computer Science, vol. 268, no. 1, pp. 43-66, 2001.
[7] A. Karlin, S. Phillips, and P. Raghavan, "Markov Paging," SIAM J. Computing, vol. 30, no. 3, pp. 906-922, 2000.
[8] D. Aksoy, M.J. Franklin, and S.B. Zdonik, "Data Staging for On-Demand Broadcast," Proc. 27th Int'l Conf. Very Large Data Bases (VLDB '01), pp. 571-580, 2001.
[9] A. Borodin, S. Irani, P. Raghavan, and B. Schieber, "Competitive Paging with Locality of Reference," J. Computer and System Sciences, vol. 50, pp. 244-258, 1995.
[10] C. Hopps, "Analysis of an Equal-Cost Multi-Path Algorithm," RFC 2992, Internet Eng. Task Force, 2000.
[11] S. Baase, A Gift of Fire: Social, Legal, and Ethical Issues in Computing. Prentice Hall, 1997.
[12] C.H. Papadimitriou, S. Ramanathan, and P.V. Rangan, "Optimal Information Delivery," Proc. Sixth Int'l Symp. Algorithms and Computation (ISAAC '95), pp. 181-187, 1995.
[13] M. Charikar, D. Halperin, and R. Motwani, "The Dynamic Servers Problem," Proc. Ninth Ann. ACM-SIAM Symp. Discrete Algorithms (SODA '98), pp. 410-419, 1998.
[14] C.H. Papadimitriou, S. Ramanathan, P.V. Rangan, and S.S. Kumar, "Multimedia Information Caching for Personalized Video-on-Demand," Computer Comm., vol. 18, no. 3, pp. 204-216, 1995.
[15] M. Manasse, L. McGeoch, and D. Sleator, "Competitive Algorithms for On-Line Problems," Proc. 20th Ann. ACM Symp. Theory of Computing, pp. 322-333, 1988.
[16] M. Chrobak, H. Karloff, T. Payne, and S. Vishwanathan, "New Results on Server Problems," SIAM J. Discrete Math., vol. 4, pp. 172-181, 1991.
[17] W. Shi and C. Su, "The Rectilinear Steiner Arborescence Problem is NP-Complete," Proc. 11th Ann. ACM-SIAM Symp. Discrete Algorithms (SODA '00), pp. 780-787, 2000.
[18] S. Rao, P. Sadayappan, F. Hwang, and P. Shor, "The Rectilinear Steiner Arborescence Problem," Algorithmica, vol. 7, pp. 277-288, 1992.
[19] M. Bazaraa, J. Jarvis, and H. Sherali, Linear Programming and Network Flows. Wiley-Interscience, 2004.
[20] E. Koutsoupias, "The K-Server Problem," Computer Science Rev., vol. 3, no. 2, pp. 105-118, 2009.
[21] Y. Chu and T. Liu, "On the Shortest Arborescence of a Directed Graph," Science Sinica, vol. 14, pp. 1396-1400, 1965.
[22] J. Edmonds, "Optimum Branchings," J. Research of the Nat'l Bureau of Standards, vol. 71B, pp. 233-240, 1967.
[23] L. Breslau, P. Cue, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web Caching and Zipf-Like Distributions: Evidence and Implications," Proc. IEEE INFOCOM, pp. 126-134, 1999.
[24] S. Wong, Y. Yuan, and S. Lu, "Characterizing Flows in Large Wireless Data Networks," Proc. ACM MOBICOM, pp. 174-186, 2004.
[25] A. Benoit, V. Rehn-Sonigo, and Y. Robert, "Replica Placement and Access Policies in Tree Network," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 12, pp. 1614-1627, Dec. 2008.
[26] H. Gupta and B. Tang, "Data Caching under Number Constraint," Proc. IEEE INFOCOM, 2006.
[27] K. Kalpakis, K. Dasgupta, and O. Wolfson, "Steiner-Optimal Data Replication in Tree Networks with Storage Costs," Proc. Int'l Symp. Database Eng. & Applications, pp. 285-293, 2001.
[28] B. Awerbuch, Y. Bartal, and A. Fiat, "Competitive Distributed File Allocation," Information and Computation, vol. 185, pp. 1-40, Aug. 2003.
[29] Y. Bartal, A. Fiat, and Y. Rabani, "Competitive Algorithms for Distributed Data Management (Extended Abstract)," Proc. 24th Ann. ACM Symp. Theory of Computing (STOC '92), pp. 39-50, 1992.
[30] L. Jackson, G. Rouskas, and M. Stallmann, "The Directional P-Median Problem: Definition, Complexity, and Algorithms," European J. Operational Research, vol. 179, pp. 1097-1108, 2007.
[31] L. Jackson, "The Directional P-Median Problem with Applications to Traffic Quantization and Multiprocessor Scheduling," PhD thesis, North Carolina State Univ., Raleigh, NC, Dec. 2003.
[32] M. Bienkowski, A. Feldmann, D. Jurca, W. Kellerer, G. Schaffrath, S. Schmid, and J. Widmer, "Competitive Analysis for Service Migration in Vnets," Proc. Second ACM SIGCOMM Workshop Virtualized Infrastructure System and Architectures (VISA), Sept. 2010.