AdaptWID: An Adaptive, Memory-Efficient Window Aggregation Implementation
November/December 2008 (vol. 12 no. 6)
pp. 22-29
Jin Li, Portland State University
Kristin Tufte, Portland State University
David Maier, Portland State University
Vassilis Papadimos, Microsoft
Memory efficiency is important for processing high-volume data streams. Previous stream-aggregation methods can exhibit excessive memory overhead in the presence of skewed data distributions. Further, data skew is a common feature of massive data streams. The authors introduce the AdaptWID algorithm, which uses adaptive processing to cope with time-varying data skew. AdaptWID models the memory usage of alternative aggregation algorithms and selects between them at runtime on a group-by-group basis. The authors' experimental study using the NiagaraST stream system verifies that the adaptive algorithm improves memory usage while maintaining execution cost and latency comparable to existing implementations.
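
The per-group selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two strategies ("buffer" the raw tuples of a group, or keep one partial aggregate "per_window"), the byte sizes in `CostModel`, and all function names are assumptions made for illustration only. The sketch estimates the memory each strategy would need for a group and picks the cheaper one, so sparse and dense groups can end up with different choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    # Both sizes are illustrative assumptions, not values from the article.
    tuple_bytes: int = 32     # memory to buffer one raw tuple
    partial_bytes: int = 16   # memory for one per-window partial aggregate


def open_windows(range_len, slide):
    """Number of sliding windows that overlap at any instant: ceil(range / slide)."""
    return -(-range_len // slide)


def buffered_cost(tuples_in_range, model):
    """Estimated bytes if the group's tuples are buffered for one window range."""
    return tuples_in_range * model.tuple_bytes


def per_window_cost(range_len, slide, model):
    """Estimated bytes if the group keeps one partial aggregate per open window."""
    return open_windows(range_len, slide) * model.partial_bytes


def choose_strategy(tuples_in_range, range_len, slide, model=CostModel()):
    """Pick the strategy with the smaller estimated footprint for one group.

    Ties go to "per_window" (an arbitrary choice in this sketch).
    """
    if buffered_cost(tuples_in_range, model) < per_window_cost(range_len, slide, model):
        return "buffer"
    return "per_window"


def plan(group_counts, range_len, slide, model=CostModel()):
    """Choose a strategy independently for every group.

    group_counts maps a group key to the number of tuples seen for that
    group within one window range (a hypothetical runtime statistic).
    """
    return {
        key: choose_strategy(count, range_len, slide, model)
        for key, count in group_counts.items()
    }


if __name__ == "__main__":
    # Skewed input: one heavy group and two sparse ones.
    counts = {"sensor_a": 1000, "sensor_b": 5, "sensor_c": 2}
    print(plan(counts, range_len=600, slide=10))
```

With a range of 600 and a slide of 10, 60 windows overlap, so under the assumed sizes the per-window strategy costs 960 bytes per group. The heavy group (`sensor_a`) is therefore assigned `per_window`, while the two sparse groups are assigned `buffer`. Re-running `plan` as the counts change is one simple way to picture adapting to time-varying skew.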


Index Terms:
databases, query processing, data stream management
Jin Li, Kristin Tufte, David Maier, Vassilis Papadimos, "AdaptWID: An Adaptive, Memory-Efficient Window Aggregation Implementation," IEEE Internet Computing, vol. 12, no. 6, pp. 22-29, Nov.-Dec. 2008, doi:10.1109/MIC.2008.116