Issue No. 02 - February (2012 vol. 23)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2011.169
Yu Hua, Huazhong University of Science and Technology, Wuhan
Hong Jiang, University of Nebraska-Lincoln, Lincoln
Yifeng Zhu, University of Maine, Orono
Dan Feng, Huazhong University of Science and Technology, Wuhan
Lei Tian, University of Nebraska-Lincoln, Lincoln
Existing data storage systems based on the hierarchical directory-tree organization do not meet the scalability and functionality requirements of exponentially growing data sets and increasingly complex metadata queries in large-scale, exabyte-level file systems with billions of files. This paper proposes a novel decentralized, semantic-aware metadata organization called SmartStore, which exploits the semantics of files' metadata to judiciously aggregate correlated files into semantic-aware groups by using information retrieval tools. The key idea of SmartStore is to limit the search scope of a complex metadata query to a single group or a minimal number of semantically correlated groups, and thereby avoid or alleviate brute-force search over the entire system. The decentralized design of SmartStore can improve system scalability and reduce query latency for complex queries (including range and top-k queries). Moreover, it is also conducive to constructing semantic-aware caching and to supporting conventional filename-based point queries. We have implemented a prototype of SmartStore, and extensive experiments based on real-world traces show that SmartStore significantly improves system scalability and reduces query latency over database approaches. To the best of our knowledge, this is the first study on the implementation of complex queries in large-scale file systems.
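The abstract does not name the specific grouping tool or index structure, so the following is only a minimal sketch of the key idea, under stated assumptions: files are described by small numeric metadata vectors (here a hypothetical pair of size in MB and age in days), correlated files are grouped by a simple distance threshold standing in for the information retrieval tools, and each group keeps a bounding box so that a range query can skip groups it cannot match. The names `Group`, `build_groups`, and `range_query` are illustrative and do not come from the paper.

```python
import math


class Group:
    """Hypothetical semantic group: member files plus a bounding box
    over their metadata vectors."""

    def __init__(self, name, vec):
        self.seed = tuple(vec)              # first member, used as the group's reference point
        self.files = [(name, tuple(vec))]
        self.lo = list(vec)                 # per-attribute minimum
        self.hi = list(vec)                 # per-attribute maximum

    def add(self, name, vec):
        self.files.append((name, tuple(vec)))
        self.lo = [min(a, b) for a, b in zip(self.lo, vec)]
        self.hi = [max(a, b) for a, b in zip(self.hi, vec)]

    def overlaps(self, qlo, qhi):
        # True if the query box intersects this group's bounding box.
        return all(l <= qh and ql <= h
                   for l, h, ql, qh in zip(self.lo, self.hi, qlo, qhi))


def build_groups(files, threshold):
    """Greedy grouping (an assumption, not the paper's method): a file joins
    the first group whose seed lies within `threshold`, else starts a new group."""
    groups = []
    for name, vec in files:
        for g in groups:
            if math.dist(g.seed, vec) <= threshold:
                g.add(name, vec)
                break
        else:
            groups.append(Group(name, vec))
    return groups


def range_query(groups, qlo, qhi):
    """Return matching file names and the number of groups actually scanned."""
    matches, searched = [], 0
    for g in groups:
        if not g.overlaps(qlo, qhi):
            continue                        # prune: no member of this group can match
        searched += 1
        for name, vec in g.files:
            if all(a <= v <= b for a, v, b in zip(qlo, vec, qhi)):
                matches.append(name)
    return matches, searched


# Illustrative data: (file name, (size in MB, age in days)).
files = [
    ("a.log", (1.0, 2.0)),
    ("b.log", (2.0, 3.0)),
    ("c.log", (1.5, 2.5)),
    ("x.iso", (700.0, 400.0)),
    ("y.iso", (710.0, 390.0)),
]
groups = build_groups(files, threshold=50.0)
matches, searched = range_query(groups, (0, 0), (2, 3))
print(sorted(matches), "groups searched:", searched, "of", len(groups))
```

In this toy data set the query for small, recent files scans only one of the two groups, which is the effect the abstract describes as limiting the search scope rather than searching the entire system by brute force.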
File systems, metadata management, scalability, performance evaluation.
Y. Zhu, Y. Hua, H. Jiang, D. Feng and L. Tian, "Semantic-Aware Metadata Organization Paradigm in Next-Generation File Systems," in IEEE Transactions on Parallel & Distributed Systems, vol. 23, no. 2, pp. 337-344, 2011.