Issue No.04 - April (2009 vol.20)
pp: 446-459
Ramsés Morales, University of Illinois at Urbana-Champaign, Urbana
ABSTRACT
This paper proposes building overlays that help monitor the long-term availability histories of hosts, with a focus on large-scale distributed settings where hosts may be selfish or colluding. Concretely, we target the important problems of selecting and discovering an availability monitoring overlay. We motivate six significant goals. The first three are consistency, verifiability, and randomness in selecting the availability monitors of nodes, so as to be probabilistically resilient to selfish and colluding nodes. The next three are discoverability, load balancing, and scalability in finding these monitors. We present AVMON, the first availability monitoring overlay to satisfy these six requirements. Our core algorithmic contribution is a range of protocols for discovering the availability monitoring overlay scalably and efficiently, given any monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON's discovery protocols with respect to scalability and monitor discovery time. Most interestingly, we derive optimal (and practical) variants of AVMON that minimize different combinations of memory, bandwidth, computation, and monitor discovery time. Finally, our extensive experimental evaluations using three types of availability traces (synthetic, from PlanetLab, and from the Overnet p2p system) demonstrate AVMON's practicality in a variety of distributed systems.
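To make the three selection goals concrete, the following is a minimal sketch of one possible monitor selection rule that is consistent, verifiable, and pseudo-random. It is an illustration only and is not taken from the abstract: the function names (`pair_hash`, `is_monitor`, `monitors_of`), the use of SHA-256, the node identifier format, and the parameters `k` (target number of monitors per node) and `n` (system size, assumed known here) are all assumptions.

```python
import hashlib


def pair_hash(monitor: str, target: str) -> float:
    """Map an ordered (monitor, target) pair to a value in [0, 1).

    Hypothetical helper: any well-mixed hash that every node can
    compute would serve the same purpose.
    """
    digest = hashlib.sha256(f"{monitor}|{target}".encode("utf-8")).digest()
    # Keep 53 bits so the result is exactly representable and below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def is_monitor(monitor: str, target: str, k: int, n: int) -> bool:
    """Decide whether `monitor` should monitor `target`.

    Consistent: the answer depends only on the two identifiers, k, and n.
    Verifiable: any third party can recompute the same hash and check it.
    Random: neither node chooses the hash value, so a node cannot pick
    its own monitors (assumption: identifiers cannot be chosen freely).
    """
    return monitor != target and pair_hash(monitor, target) < k / n


def monitors_of(target: str, nodes: list[str], k: int) -> list[str]:
    """Return every node in `nodes` selected to monitor `target`."""
    n = len(nodes)
    return [m for m in nodes if is_monitor(m, target, k, n)]


if __name__ == "__main__":
    # Hypothetical node identifiers of the form "ip:port".
    nodes = [f"10.0.0.{i}:4000" for i in range(200)]
    print(monitors_of(nodes[0], nodes, k=8))
```

Under this rule each node has about `k` monitors on average, and a claimed monitoring relationship can be checked by recomputing a single hash. The sketch covers only selection; discovering these monitors scalably, which the abstract names as the core contribution, is not shown.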
INDEX TERMS
Distributed Systems, Churn, Availability, Monitoring, Overlay, Consistency, Scalability, Optimality
CITATION
Ramsés Morales, "AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems", IEEE Transactions on Parallel & Distributed Systems, vol. 20, no. 4, pp. 446-459, April 2009, doi:10.1109/TPDS.2008.84