Issue No.12 - December (2009 vol.58)
pp: 1585-1598
Yu Hua, Huazhong University of Science and Technology, China
Bin Xiao, The Hong Kong Polytechnic University, Hong Kong
Jianping Wang, City University of Hong Kong, Hong Kong
ABSTRACT
Multidimensional data indexing in centralized systems has recently received much research attention. However, providing an integrated structure that supports multiple query types on multidimensional data in a distributed environment remains a nascent research area. In this paper, we propose a new data structure, called the BR-tree (Bloom-filter-based R-tree), and implement a prototype of it in a distributed system. A BR-tree node, an extension of the traditional R-tree node structure, incorporates space-efficient Bloom filters to facilitate fast membership queries. The proposed BR-tree simultaneously supports not only the existing point and range queries but also cover and bound queries, which can potentially benefit various data indexing services. Compared with previous data structures, the BR-tree achieves space efficiency and provides quick response ($\le O(\log n)$) on these four types of queries. Our extensive experiments in a distributed environment further validate the practicality and efficiency of the proposed BR-tree structure.
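The node structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that an R-tree node is extended with a Bloom filter, so the class and function names (`BloomFilter`, `BRNode`, `make_leaf`, `make_parent`), the filter parameters `m` and `k`, and the choice to build a parent's filter as the bitwise OR of its children's filters are all assumptions made for illustration. Cover and bound queries are omitted because the abstract does not define them.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, ...]


class BloomFilter:
    """Small Bloom filter; m (bits) and k (hashes) are illustrative defaults."""

    def __init__(self, m: int = 256, k: int = 3):
        self.m, self.k = m, k
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: Point):
        # Normalise to floats so (1, 2) and (1.0, 2.0) hash identically.
        data = repr(tuple(float(x) for x in item)).encode()
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: Point) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item: Point) -> bool:
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def union(self, other: "BloomFilter") -> None:
        self.bits |= other.bits


@dataclass
class BRNode:
    """Hypothetical BR-tree node: a bounding box plus a Bloom filter."""
    lo: Point
    hi: Point
    bloom: BloomFilter = field(default_factory=BloomFilter)
    children: List["BRNode"] = field(default_factory=list)
    items: List[Point] = field(default_factory=list)  # used by leaves only


def make_leaf(points: List[Point]) -> BRNode:
    dims = range(len(points[0]))
    node = BRNode(tuple(min(p[d] for p in points) for d in dims),
                  tuple(max(p[d] for p in points) for d in dims),
                  items=list(points))
    for p in points:
        node.bloom.add(p)
    return node


def make_parent(children: List[BRNode]) -> BRNode:
    dims = range(len(children[0].lo))
    node = BRNode(tuple(min(c.lo[d] for c in children) for d in dims),
                  tuple(max(c.hi[d] for c in children) for d in dims),
                  children=list(children))
    for c in children:  # assumption: parent filter = OR of child filters
        node.bloom.union(c.bloom)
    return node


def point_query(node: BRNode, p: Point) -> bool:
    # A Bloom filter has no false negatives, so a miss prunes the subtree.
    if p not in node.bloom:
        return False
    if not node.children:
        return p in node.items  # exact check removes false positives
    return any(point_query(c, p) for c in node.children)


def range_query(node: BRNode, lo: Point, hi: Point) -> List[Point]:
    dims = range(len(lo))
    # Standard R-tree pruning: skip nodes whose box misses the query box.
    if any(node.hi[d] < lo[d] or node.lo[d] > hi[d] for d in dims):
        return []
    if not node.children:
        return [p for p in node.items
                if all(lo[d] <= p[d] <= hi[d] for d in dims)]
    return [p for c in node.children for p in range_query(c, lo, hi)]
```

In this sketch a point query consults the Bloom filter at each node before descending, while a range query relies on the bounding boxes alone, which is one plausible reading of how the two components divide the work. Building two leaves with `make_leaf`, joining them with `make_parent`, and calling `point_query` or `range_query` on the root exercises both paths.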
INDEX TERMS
BR-tree, multidimensional data, point query, range query, cover query, bound query.
CITATION
Yu Hua, Bin Xiao, Jianping Wang, "BR-Tree: A Scalable Prototype for Supporting Multiple Queries of Multidimensional Data," IEEE Transactions on Computers, vol. 58, no. 12, pp. 1585-1598, December 2009, doi:10.1109/TC.2009.97