Issue No. 06 - June (2013 vol. 24)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2013.20
Dong Yuan, Swinburne University of Technology, Melbourne
Yun Yang, Swinburne University of Technology, Melbourne
Xiao Liu, Swinburne University of Technology, Melbourne
Wenhao Li, Swinburne University of Technology, Melbourne
Lizhen Cui, Shandong University, Jinan
Meng Xu, Shandong University, Jinan
Jinjun Chen, University of Technology Sydney, Sydney
The massive computation power and storage capacity of cloud computing systems allow scientists to deploy computation- and data-intensive applications without infrastructure investment, and large application data sets can be stored in the cloud. Based on the pay-as-you-go model, storage strategies and benchmarking approaches have been developed for cost-effectively storing large volumes of generated application data sets in the cloud. However, they are either insufficiently cost-effective for storage or impractical to use at runtime. In this paper, toward achieving the minimum cost benchmark, we propose a novel, highly cost-effective and practical storage strategy that can automatically decide at runtime whether a generated data set should be stored in the cloud. The main focus of this strategy is local optimization of the trade-off between computation and storage, while secondarily taking users' (optional) storage preferences into consideration. Both theoretical analysis and simulations, conducted on general (random) data sets as well as specific real-world applications with Amazon's cost model, show that the cost-effectiveness of our strategy is close to or even the same as the minimum cost benchmark, and that its efficiency is very high for practical runtime use in the cloud.
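The computation-storage trade-off named in the abstract can be illustrated with a minimal sketch. This is not the paper's strategy: it treats each data set in isolation, ignores any dependencies among data sets and any user preferences, and every name and price below (`DataSet`, `STORAGE_PRICE`, `COMPUTE_PRICE`, `should_store`) is a hypothetical placeholder rather than Amazon's actual cost model. It only shows the basic comparison such a strategy rests on: keep a data set when storing it costs less per month than regenerating it each time it is needed.

```python
from dataclasses import dataclass

# Hypothetical pay-as-you-go prices (assumptions, not real provider rates).
STORAGE_PRICE = 0.10   # dollars per GB per month
COMPUTE_PRICE = 0.10   # dollars per CPU hour


@dataclass
class DataSet:
    name: str
    size_gb: float         # size of the generated data set
    compute_hours: float   # CPU hours needed to regenerate it
    uses_per_month: float  # expected number of accesses per month


def storage_cost_rate(d: DataSet) -> float:
    """Monthly cost of keeping the data set stored."""
    return d.size_gb * STORAGE_PRICE


def regeneration_cost_rate(d: DataSet) -> float:
    """Monthly cost of deleting the data set and recomputing it on each use."""
    return d.compute_hours * COMPUTE_PRICE * d.uses_per_month


def should_store(d: DataSet) -> bool:
    """Store when storage is no more expensive than repeated regeneration."""
    return storage_cost_rate(d) <= regeneration_cost_rate(d)


# A large, cheap-to-recompute, rarely used data set versus a small,
# expensive-to-recompute, frequently used one.
bulky = DataSet("bulky", size_gb=100.0, compute_hours=2.0, uses_per_month=1.0)
costly = DataSet("costly", size_gb=1.0, compute_hours=50.0, uses_per_month=4.0)
decisions = {d.name: should_store(d) for d in (bulky, costly)}
```

Under these assumed prices, `bulky` would be deleted and regenerated on demand, while `costly` would be kept in storage.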
Benchmark testing, Computational modeling, Materials, Runtime, Delay, Algorithm design and analysis, Cloud computing, Data sets storage, computation-storage tradeoff, computation- and data-intensive applications
D. Yuan et al., "A Highly Practical Approach toward Achieving Minimum Data Sets Storage Cost in the Cloud," in IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 6, pp. 1234-1244, 2013.