Issue No.09 - September (2010 vol.22)
pp: 1313-1330
Filippo Furfaro , University of Calabria, Rende
Giuseppe Massimiliano Mazzeo , Institute of High Performance Computing and Networking of CNR National Council of Research (ICAR-CNR), Rende
Andrea Pugliese , University of Calabria, Rende
ABSTRACT
A P2P-based framework is proposed that supports the extraction of aggregates from historical multidimensional data and provides efficient and robust query evaluation. When a data population is published, the data are summarized in a synopsis consisting of an index built on top of a set of subsynopses, each storing a compressed representation of a distinct data portion. The index and the subsynopses are distributed across the network, and replication mechanisms that take the query workload and network conditions into account ensure appropriate coverage for both the index and the subsynopses.
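The synopsis idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes two-dimensional data, a fixed grid partition, and one uniform-density bucket per cell as the "compressed representation", and it uses a flat list in place of the index. All names (`SubSynopsis`, `build_synopsis`, `range_sum`) are hypothetical, and distribution and replication across peers are omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Box = Tuple[float, float, float, float]   # (x_lo, y_lo, x_hi, y_hi)
Record = Tuple[float, float, float]       # (x, y, measure value)


@dataclass
class SubSynopsis:
    """Hypothetical summary of one data portion: its region and total measure."""
    box: Box
    total: float

    def estimate_sum(self, query: Box) -> float:
        # Assumption: the measure is spread uniformly over the region, so the
        # estimate is the stored total scaled by the overlapping area fraction.
        ox = max(0.0, min(self.box[2], query[2]) - max(self.box[0], query[0]))
        oy = max(0.0, min(self.box[3], query[3]) - max(self.box[1], query[1]))
        area = (self.box[2] - self.box[0]) * (self.box[3] - self.box[1])
        return self.total * (ox * oy) / area


def build_synopsis(records: Iterable[Record], domain: Box, k: int) -> List[SubSynopsis]:
    """Partition the domain into a k x k grid and keep one total per cell."""
    x_lo, y_lo, x_hi, y_hi = domain
    w, h = (x_hi - x_lo) / k, (y_hi - y_lo) / k
    totals = [[0.0] * k for _ in range(k)]
    for x, y, value in records:
        i = min(int((x - x_lo) / w), k - 1)   # clamp points on the upper edge
        j = min(int((y - y_lo) / h), k - 1)
        totals[i][j] += value
    return [
        SubSynopsis((x_lo + i * w, y_lo + j * h,
                     x_lo + (i + 1) * w, y_lo + (j + 1) * h), totals[i][j])
        for i in range(k) for j in range(k)
    ]


def range_sum(synopsis: List[SubSynopsis], query: Box) -> float:
    """Approximate SUM over a query box by combining per-portion estimates."""
    return sum(s.estimate_sum(query) for s in synopsis)


records = [(0.5, 0.5, 10.0), (1.5, 0.5, 20.0), (0.5, 1.5, 30.0), (1.5, 1.5, 40.0)]
synopsis = build_synopsis(records, domain=(0.0, 0.0, 2.0, 2.0), k=2)
left_half = range_sum(synopsis, (0.0, 0.0, 1.0, 2.0))
```

In this sketch a query touches only the summaries, never the original records, which is what makes answers approximate but cheap. A real system would replace the flat list with an index that prunes non-overlapping portions.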
INDEX TERMS
P2P networks, multidimensional data management, data compression.
CITATION
Filippo Furfaro, Giuseppe Massimiliano Mazzeo, Andrea Pugliese, "Managing Multidimensional Historical Aggregate Data in Unstructured P2P Networks," IEEE Transactions on Knowledge & Data Engineering, vol. 22, no. 9, pp. 1313-1330, September 2010, doi:10.1109/TKDE.2009.160