Issue No. 01 - Jan. (2014 vol. 63)
ISSN: 0018-9340
pp: 3-16
Yu Hua, Huazhong University of Science and Technology, Wuhan
Xue Liu, McGill University, Montreal
Dan Feng, Huazhong University of Science and Technology, Wuhan
ABSTRACT
The cloud is emerging as a platform for scalable and efficient services. To handle massive data and reduce data migration, the computation infrastructure requires efficient data placement and proper management of cached data. In this paper, we propose an efficient and cost-effective multilevel caching scheme, called MERCURY, as the computation infrastructure of the cloud. The idea behind MERCURY is to explore and exploit data similarity to support efficient data placement. To capture data similarity accurately and efficiently, we leverage low-complexity locality-sensitive hashing (LSH). In our design, we identify that a conventional LSH scheme suffers not only from space inefficiency but also from homogeneous data placement. To address these two problems, we design a novel multicore-enabled locality-sensitive hashing (MC-LSH) scheme that accurately captures the differentiated similarity across data. The similarity-aware MERCURY hence partitions data into the L1 cache, L2 cache, and main memory based on their distinct localities, which helps optimize cache utilization and minimize pollution in the last-level cache. Besides extensive evaluation through simulations, we also implemented MERCURY in a system. Experimental results based on real-world applications and data sets demonstrate the efficiency and efficacy of our proposed schemes.
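For readers unfamiliar with locality-sensitive hashing, the following minimal sketch illustrates the general idea of grouping similar data under the same hash signature. It uses a generic sign-random-projection LSH and is not the MC-LSH scheme proposed in the paper. All function names and parameters (make_hyperplanes, lsh_signature, bucket_by_signature, num_bits, seed) are hypothetical and chosen only for illustration.

```python
import random


def make_hyperplanes(dim, num_bits, seed=0):
    """Draw num_bits random hyperplanes in a dim-dimensional space (illustrative helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """Return one bit per hyperplane: 1 if the vector lies on its non-negative side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def bucket_by_signature(items, hyperplanes):
    """Group named vectors by signature; vectors pointing in similar directions tend to collide."""
    buckets = {}
    for name, vec in items.items():
        buckets.setdefault(lsh_signature(vec, hyperplanes), []).append(name)
    return buckets


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, num_bits=8, seed=1)
    data = {
        "a": [1.0, 2.0, 3.0],
        "a_scaled": [2.0, 4.0, 6.0],      # same direction as "a"
        "a_negated": [-1.0, -2.0, -3.0],  # opposite direction
    }
    for signature, names in bucket_by_signature(data, planes).items():
        print(signature, names)
```

In a similarity-aware placement scheme, items that share a bucket could be treated as candidates for co-location. How MERCURY actually maps such groups onto the L1 cache, L2 cache, and main memory is described in the paper itself and is not reproduced here.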
INDEX TERMS
Data similarity, cloud computing, multicore processor, cache management
CITATION

Y. Hua, X. Liu and D. Feng, "Data Similarity-Aware Computation Infrastructure for the Cloud," in IEEE Transactions on Computers, vol. 63, no. 1, pp. 3-16, Jan. 2014.
doi:10.1109/TC.2013.111