A Cost-Based Approach to Adaptive Resource Management in Data Stream Systems
February 2008 (vol. 20 no. 2)
pp. 230-245
Data stream management systems need to control their resources adaptively because stream characteristics and query workload vary over time. In this paper, we investigate an approach to adaptive resource management for continuous sliding window queries that adjusts window sizes and time granularities to keep resource usage within bounds. These two novel techniques differ from standard load shedding approaches based on sampling in that they ensure exact query answers for given user-defined Quality of Service specifications, even under query re-optimization. To quantify the effects of both techniques on the various operations in a query plan, we develop a cost model for estimating operator resource allocation in terms of memory usage and processing costs. A thorough experimental study not only validates the accuracy of our cost model but also demonstrates the efficacy and scalability of the proposed techniques.
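The idea of shrinking window sizes within user-defined bounds to meet a resource limit can be illustrated with a minimal sketch. This is not the paper's cost model: it assumes a deliberately simple estimate (memory of a window equals input rate times window size times tuple size), ignores time granularity and processing cost, and all names (`WindowQuery`, `memory_estimate`, `fit_to_budget`) and numbers are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WindowQuery:
    name: str
    rate: float        # assumed input rate, tuples per second
    tuple_bytes: int   # assumed average tuple size, bytes
    window: float      # window size currently in use, seconds
    min_window: float  # smallest window the QoS specification allows, seconds


def memory_estimate(q, window=None):
    """Toy estimate: bytes held by a time-based sliding window."""
    w = q.window if window is None else window
    return q.rate * w * q.tuple_bytes


def fit_to_budget(queries, budget_bytes):
    """Shrink windows toward their QoS minimum until the estimate fits."""
    total = sum(memory_estimate(q) for q in queries)
    if total <= budget_bytes:
        return list(queries)  # nothing to adjust

    floor = sum(memory_estimate(q, q.min_window) for q in queries)
    if floor > budget_bytes:
        raise ValueError("budget cannot be met within the QoS bounds")

    # The estimate is linear in the window size, so one common fraction f
    # of each query's slack (window - min_window) hits the budget exactly.
    f = (budget_bytes - floor) / (total - floor)
    return [
        replace(q, window=q.min_window + f * (q.window - q.min_window))
        for q in queries
    ]


# Hypothetical workload: query "a" has slack, query "b" is already at its minimum.
queries = [
    WindowQuery("a", rate=100.0, tuple_bytes=10, window=60.0, min_window=20.0),
    WindowQuery("b", rate=50.0, tuple_bytes=20, window=40.0, min_window=40.0),
]
adjusted = fit_to_budget(queries, budget_bytes=80_000)
```

Because every adjusted window stays at or above its `min_window`, the answers remain exact for the (smaller) window actually evaluated, which is the contrast with sampling-based load shedding drawn in the abstract.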

Index Terms:
Query processing, Systems
Citation:
Michael Cammert, J. Krämer, Bernhard Seeger, Sonny Vaupel, "A Cost-Based Approach to Adaptive Resource Management in Data Stream Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 2, pp. 230-245, Feb. 2008, doi:10.1109/TKDE.2007.190686