2013 IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC) (2013)
Dec. 21, 2013 to Dec. 22, 2013
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/DASC.2013.56
Data reuse is an important way to save storage capacity and improve query efficiency when managing massive data. A column-store architecture stores the values of each column contiguously, which greatly improves the performance of read-optimized applications and also makes data reuse more feasible and flexible. In this paper, we propose a novel data reuse method for column-store data warehouses. First, we propose an improved iMAP method, based on schema mapping, that generates as many candidate reusable columns as possible and then filters these candidates further, which greatly reduces the complexity of detecting reusable data. Building on the column-store architecture, we then implement reuse at the storage layer. Finally, we provide a method for executing queries over the reusable data. Experimental results on real data sets indicate that the proposed strategy effectively reduces both storage space and query execution time.
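The general idea of storage-layer column reuse can be illustrated with a minimal sketch. It is not the paper's improved iMAP method: candidate detection here is swapped for a plain content fingerprint, followed by an exact comparison as the filtering step. The names `ColumnStore`, `fingerprint`, `add_column`, `read_column`, and `physical_column_count` are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Dict, List, Sequence, Tuple


def fingerprint(values: Sequence[object]) -> str:
    """Hash a column's values in order (assumption: repr() is a stable encoding)."""
    h = hashlib.sha256()
    for v in values:
        h.update(repr(v).encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent values cannot run together
    return h.hexdigest()


class ColumnStore:
    """Toy column store in which identical columns share one physical block."""

    def __init__(self) -> None:
        self._blocks: List[List[object]] = []            # physical column data
        self._by_fp: Dict[str, List[int]] = {}           # fingerprint -> candidate block ids
        self._catalog: Dict[Tuple[str, str], int] = {}   # (table, column) -> block id

    def add_column(self, table: str, name: str, values: Sequence[object]) -> bool:
        """Register a logical column; return True if an existing block was reused."""
        data = list(values)
        fp = fingerprint(data)
        # Step 1: candidates are blocks with the same fingerprint.
        # Step 2: filter the candidates with an exact comparison.
        for block_id in self._by_fp.get(fp, []):
            if self._blocks[block_id] == data:
                self._catalog[(table, name)] = block_id
                return True
        # No reusable block found: store the data physically.
        self._blocks.append(data)
        block_id = len(self._blocks) - 1
        self._by_fp.setdefault(fp, []).append(block_id)
        self._catalog[(table, name)] = block_id
        return False

    def read_column(self, table: str, name: str) -> List[object]:
        """Resolve a logical column to its (possibly shared) physical block."""
        return self._blocks[self._catalog[(table, name)]]

    def physical_column_count(self) -> int:
        return len(self._blocks)


store = ColumnStore()
store.add_column("orders", "customer_id", [1, 2, 3])
store.add_column("invoices", "cust_id", [1, 2, 3])   # same content, shares the block
store.add_column("orders", "amount", [10, 20, 30])
```

In this sketch, queries on `invoices.cust_id` read the block first stored for `orders.customer_id`, so three logical columns occupy two physical blocks. A schema-mapping approach such as the one the abstract describes would presumably also consider columns related by a mapping rather than only exact copies; that case is outside this sketch.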
schema mapping, massive data, data reusing, column-store
M. Wang, J. Zhou, Y. Li, X. Xia and J. Le, "A Data Reusing Strategy Based on Column-Stores," 2013 IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC), Chengdu, China, 2013, pp. 163-168.