Issue No. 06 - June (2016 vol. 28)
ISSN: 1041-4347
pp: 1503-1517
Yang Hong, Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Qiwei Tang, Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Xiaofeng Gao, Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Bin Yao, Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Guihai Chen, Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Shaojie Tang, Naveen Jindal School of Management, University of Texas at Dallas, Richardson, Texas, U.S.
ABSTRACT
Cloud storage systems pose new challenges for supporting efficient concurrent queries in data-intensive applications, where indices play an important role. In this paper, we explore a practical method for constructing a two-layer indexing scheme for multi-dimensional data in diverse server-centric cloud storage systems. We first propose RT-HCN, an indexing scheme that integrates an R-tree based indexing structure with an HCN-based routing protocol. RT-HCN organizes storage and compute nodes into an HCN overlay, one of the recently proposed server-centric data center topologies. Based on the properties of HCN, we design a specific index mapping technique to maintain layered global indices, together with corresponding query processing algorithms to support efficient queries. We then extend the idea of RT-HCN to another server-centric data center topology, DCell, which suggests a generalized and feasible way of deploying two-layer indexing schemes on other server-centric networks. Furthermore, we prove theoretically that RT-HCN is both space-efficient and query-efficient: each node maintains a tolerable number of global indices, while highly concurrent queries can be processed with acceptable overhead. Finally, we conduct targeted experiments on Amazon's EC2 platform, comparing our design with RT-CAN, a similar indexing scheme for traditional P2P networks. The results validate the query efficiency of RT-HCN, especially its speedup for point queries, and indicate its potential applicability in future data centers.
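The general idea of a two-layer index can be illustrated with a minimal sketch. This is not the RT-HCN algorithm: the abstract does not give its mapping or routing details, so the sketch assumes a flat list in place of each node's local R-tree and a single in-memory table in place of the global indices distributed over the HCN overlay. All names (`StorageNode`, `GlobalIndex`, `point_query`) are hypothetical.

```python
from dataclasses import dataclass, field

Point = tuple[float, float]
Rect = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def contains(rect: Rect, p: Point) -> bool:
    """Return True if point p lies inside rect (borders included)."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax


def bounding_box(points: list[Point]) -> Rect:
    """Smallest axis-aligned rectangle covering all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


@dataclass
class StorageNode:
    """Lower layer: a node holding its own data and local index.

    Assumption: a plain list stands in for the node's local R-tree.
    """
    name: str
    points: list[Point] = field(default_factory=list)

    def published_entry(self) -> Rect:
        # The node publishes only a coarse summary of its data upward.
        return bounding_box(self.points)

    def local_lookup(self, p: Point) -> bool:
        return p in self.points


class GlobalIndex:
    """Upper layer: maps published rectangles to the nodes owning them.

    Assumption: one in-memory table instead of indices spread over nodes.
    """

    def __init__(self) -> None:
        self.entries: list[tuple[Rect, StorageNode]] = []

    def publish(self, node: StorageNode) -> None:
        self.entries.append((node.published_entry(), node))

    def point_query(self, p: Point) -> list[str]:
        # Step 1: the global layer narrows the search to candidate nodes.
        candidates = [n for rect, n in self.entries if contains(rect, p)]
        # Step 2: each candidate resolves the query with its local index.
        return [n.name for n in candidates if n.local_lookup(p)]


a = StorageNode("node-a", [(0.0, 0.0), (2.0, 2.0)])
b = StorageNode("node-b", [(1.0, 1.0), (5.0, 5.0)])
index = GlobalIndex()
index.publish(a)
index.publish(b)
hits = index.point_query((1.0, 1.0))
```

Here both nodes are candidates for the point (1.0, 1.0) because their published rectangles overlap, but only `node-b` actually stores it, so the local lookup filters out the false positive from the global layer.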
INDEX TERMS
Indexing, Topology, Network topology, Servers, Cloud computing, Query processing
CITATION

Y. Hong, Q. Tang, X. Gao, B. Yao, G. Chen and S. Tang, "Efficient R-Tree Based Indexing Scheme for Server-Centric Cloud Storage System," in IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 6, pp. 1503-1517, 2016.
doi:10.1109/TKDE.2016.2526006