Approximate Aggregations in Structured P2P Networks
November 2011 (vol. 23 no. 11)
pp. 1748-1752
Dalie Sun, Harbin Institute of Technology, Harbin
Sai Wu, National University of Singapore, Singapore
Shouxu Jiang, Harbin Institute of Technology, Harbin
Jianzhong Li, Harbin Institute of Technology, Harbin
In corporate networks, business data are generated daily in gigabytes or even terabytes, and processing aggregate queries over such systems is costly. In this paper, we propose PACA, a probably approximately correct aggregate query processing scheme for answering aggregate queries in structured Peer-to-Peer (P2P) networks. PACA retrieves random samples from peers' databases and uses the samples to process queries. Instead of scanning the entire database of each peer, PACA accesses only a small random sample of the data. Moreover, based on the query distribution, PACA publishes a precomputed synopsis and uses it to answer future queries. Most queries are expected to be answered partially or fully by the precomputed synopsis, which is adaptively tuned to follow the query distribution. Experiments on PlanetLab show the effectiveness of the approach.
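
The sampling idea in the abstract can be illustrated with a minimal sketch. This is not PACA itself: the abstract does not give the estimator, the sampling protocol, or the synopsis format, so everything below is an assumption made for illustration. Peers are modeled as in-memory lists of numbers, values are assumed to lie in an interval of known width `value_range`, and the sample size comes from the standard Hoeffding inequality rather than from the paper. The function names `sample_size` and `approximate_avg` are hypothetical.

```python
import math
import random


def sample_size(epsilon, delta, value_range):
    """Samples needed so that the sample mean is within `epsilon` of the
    true mean with probability at least 1 - delta (Hoeffding inequality),
    assuming every value lies in an interval of width `value_range`."""
    return math.ceil(value_range ** 2 * math.log(2 / delta) / (2 * epsilon ** 2))


def approximate_avg(peers, epsilon, delta, value_range, rng):
    """Estimate the average over all peers' values from a random sample.

    `peers` is a list of non-empty lists (an in-memory stand-in for the
    peers' databases; an assumption of this sketch). Returns the estimate
    and the number of samples drawn.
    """
    sizes = [len(p) for p in peers]
    n = sample_size(epsilon, delta, value_range)
    # Choosing a peer with probability proportional to its size and then a
    # value uniformly inside it gives a uniform draw over all values.
    chosen = rng.choices(range(len(peers)), weights=sizes, k=n)
    total = sum(rng.choice(peers[i]) for i in chosen)
    return total / n, n


if __name__ == "__main__":
    data_rng = random.Random(7)
    # 20 hypothetical peers, 500 values each, all in [0, 100].
    peers = [[data_rng.uniform(0, 100) for _ in range(500)] for _ in range(20)]
    estimate, n = approximate_avg(peers, epsilon=2.0, delta=0.01,
                                  value_range=100.0, rng=random.Random(1))
    exact = sum(map(sum, peers)) / sum(map(len, peers))
    print(f"samples={n} estimate={estimate:.2f} exact={exact:.2f}")
```

The required sample size depends only on the error bound, the confidence level, and the value range, not on how much data the peers hold, which is why sampling pays off as the databases grow. A SUM estimate follows by multiplying the average by the total number of values, assuming that count is known.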

Index Terms:
Peer-to-Peer, BATON, approximate query processing.
Dalie Sun, Sai Wu, Shouxu Jiang, Jianzhong Li, "Approximate Aggregations in Structured P2P Networks," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 11, pp. 1748-1752, Nov. 2011, doi:10.1109/TKDE.2010.198