Issue No.06 - November/December (2008 vol.12)
pp: 50-60
Ying Li, IBM T.J. Watson Research Center
Geetika T. Lakshmanan, IBM T.J. Watson Research Center
ABSTRACT
Optimally assigning streaming tasks to network machines is a key factor that influences a large data-stream-processing system's performance. Although researchers have prototyped and investigated various algorithms for task placement in data stream management systems, taxonomies and surveys of such algorithms are currently unavailable. To fill this knowledge gap, the authors identify a set of core placement design characteristics and use them to compare eight placement algorithms. They also present a heuristic decision tree that can help designers judge how suitable a given placement solution might be for a specific problem.
INDEX TERMS
stream processing, system performance, task-placement algorithms, data-management systems, data stream management
CITATION
Ying Li, Geetika T. Lakshmanan, "Placement Strategies for Internet-Scale Data Stream Systems," IEEE Internet Computing, vol. 12, no. 6, pp. 50-60, November/December 2008, doi:10.1109/MIC.2008.129