Using Punctuation Schemes to Characterize Strategies for Querying over Data Streams
September 2007 (vol. 19 no. 9)
pp. 1227-1240
Many systems and strategies have been proposed for processing non-terminating data streams. Each approach has advantages and disadvantages, including the kinds of queries that can be executed. We present a framework for characterizing the kinds of queries that can be executed over streams based on a notion of compact sets from topology. We first apply our framework to queries over punctuated data streams. Previous work on punctuations focused primarily on the behavior of individual query operators. We use our framework to determine if an entire query can benefit from punctuations available from stream sources. We then consider other common strategies proposed in the literature for executing queries over streams, and we discuss how our framework can characterize the kinds of queries each strategy can answer.


Index Terms:
Data streams, Query execution, Punctuation
Citation:
Peter Tucker, David Maier, Tim Sheard, Paul Stephens, "Using Punctuation Schemes to Characterize Strategies for Querying over Data Streams," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 9, pp. 1227-1240, Sept. 2007, doi:10.1109/TKDE.2007.1052