Issue No.10 - Oct. (2012 vol.24)
pp: 1833-1847
Albert Yu, Duke University, Durham
Pankaj K. Agarwal, Duke University, Durham
Jun Yang, Duke University, Durham
We study the problem of assigning subscribers to brokers in a wide-area content-based publish/subscribe system. A good assignment should consider both subscriber interests in the event space and subscriber locations in the network space, and balance multiple performance criteria including bandwidth, delay, and load balance. The resulting optimization problem is NP-complete, so systems have turned to heuristics and/or simpler algorithms that ignore some performance criteria. Evaluating these approaches has been challenging because optimal solutions remain elusive for realistic problem sizes. To enable proper evaluation, we develop a Monte Carlo approximation algorithm with good theoretical properties and robustness to workload variations. To make it computationally feasible, we combine ideas from linear programming, randomized rounding, coresets, and iterative reweighted sampling. We demonstrate how to use this algorithm as a yardstick to evaluate other algorithms, and why it is better than other choices of yardsticks. With its help, we show that a simple greedy algorithm works well for a number of workloads, including one generated from publicly available statistics on Google Groups. We hope that our algorithms will be useful in their own right, and that our principled approach toward evaluation will also be useful in future evaluation of solutions to similar problems in content-based publish/subscribe.
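As a rough illustration of the randomized-rounding idea named in the abstract, the sketch below turns a fractional subscriber-to-broker assignment (such as an LP relaxation might produce) into an integral one by sampling one broker per subscriber. It is not the authors' algorithm: the function names, the input layout, and the assumption that each row is nonnegative and sums to 1 are all hypothetical and chosen only for illustration.

```python
import random


def round_assignment(fractional, seed=0):
    """Round a fractional assignment to one broker per subscriber.

    fractional[s][b] is a hypothetical LP weight for assigning subscriber s
    to broker b. Each row is assumed to be nonnegative and to sum to 1.
    """
    rng = random.Random(seed)
    assignment = []
    for weights in fractional:
        brokers = range(len(weights))
        # Sample one broker with probability proportional to its weight;
        # brokers with zero weight are never selected.
        chosen = rng.choices(brokers, weights=weights, k=1)[0]
        assignment.append(chosen)
    return assignment


def broker_loads(assignment, num_brokers):
    """Count how many subscribers each broker received (a simple load measure)."""
    loads = [0] * num_brokers
    for broker in assignment:
        loads[broker] += 1
    return loads


if __name__ == "__main__":
    # Three subscribers, three brokers; values are made up for the example.
    fractional = [
        [1.0, 0.0, 0.0],
        [0.0, 0.5, 0.5],
        [0.25, 0.75, 0.0],
    ]
    rounded = round_assignment(fractional, seed=1)
    print(rounded, broker_loads(rounded, 3))
```

Because the rounding is random, a single run can produce unbalanced loads; a Monte Carlo approach would typically repeat the sampling and keep the best outcome under whatever cost criteria are in use.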
Bismuth, Bandwidth, Filtering algorithms, Optimization, Complexity theory, Approximation algorithms, Materials, Wide-area networks, Network architecture and design
Albert Yu, Pankaj K. Agarwal, Jun Yang, "Subscriber Assignment for Wide-Area Content-Based Publish/Subscribe", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 10, pp. 1833-1847, Oct. 2012, doi:10.1109/TKDE.2012.65