Scalable Execution of Continuous Aggregation Queries over Web Data
January/February 2012 (vol. 16 no. 1)
pp. 43-51
Rajeev Gupta, IBM Research
Krithi Ramamritham, Indian Institute of Technology Bombay

Data delivered over the Internet is increasingly being used to provide dynamic and personalized user experiences. Queries over fast-changing data from distributed data sources are executed to create content to be delivered to users. Because these queries require data from multiple sources, they're executed at intermediate proxies or data aggregators. The authors discuss various techniques for executing aggregation queries over distributed data to minimize the number of message exchanges between data sources, aggregators, and users. They carefully examine the problem in terms of different types of queries, aggregation functions, query imprecisions, and whether the aggregators get data from sources using pull- or push-based mechanisms.

Index Terms:
continuous queries, aggregation networks, threshold queries, entity queries, value-based queries, data refresh, push-based mechanism, pull-based mechanism
Rajeev Gupta, Krithi Ramamritham, "Scalable Execution of Continuous Aggregation Queries over Web Data," IEEE Internet Computing, vol. 16, no. 1, pp. 43-51, Jan.-Feb. 2012, doi:10.1109/MIC.2012.13