Topology Dissemination for Reliable One-Hop Distributed Hash Tables
May 2009 (vol. 20 no. 5)
pp. 680-694
John Risson, University of New South Wales, Rowville
Aaron Harwood, University of Melbourne, Melbourne
Tim Moors, University of New South Wales, Sydney
Many distributed hash tables (DHTs) resolve lookups in $O(\log n)$ hops, where $n$ is the number of nodes. One-hop DHTs give lower lookup latencies and lower lookup failure rates. However, it is hard to maintain large, wide-area one-hop topologies. We contribute aecast, a new topology dissemination algorithm for one-hop DHTs. It avoids expensive repair mechanisms and critical points of failure in existing one-hop DHTs. When a node discovers by anti-entropy that it has missed a topology update, it initiates "controlled flooding," sending the update to nodes in the multicast tree that also missed the update. We compare aecast with a widely cited epidemic multicasting algorithm, pbcast, by analysis and simulation. Aecast gives at least fivefold fewer out-of-date nodes on average within one round of a topology update. We support it with a fault-tolerant topology agreement protocol, so that only legitimate topology changes propagate throughout the overlay. Consequently, we argue that one-hop DHTs deserve greater attention for Internet applications in which reasonably reliable nodes carry high lookup loads.
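
The abstract describes the repair step only in outline, so the following Python sketch is a toy illustration of the general idea and not the authors' aecast algorithm. It assumes a hypothetical `Node` class, a fixed multicast tree, and integer update ids. Reading a child's state directly stands in for whatever digest exchange a real protocol would use. A node compares its known updates with a peer (anti-entropy) and then forwards any update it missed only to tree descendants that also lack it (controlled flooding).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical overlay node; all names here are illustrative assumptions."""
    name: str
    children: list = field(default_factory=list)  # downstream nodes in the multicast tree
    updates: set = field(default_factory=set)     # ids of topology updates already seen


def anti_entropy(node, peer):
    """Return the update ids that `node` lacks but `peer` has."""
    return peer.updates - node.updates


def controlled_flood(node, update_id):
    """Apply the update at `node`, then forward it only to descendants that
    also missed it. Returns the number of messages sent."""
    node.updates.add(update_id)
    sent = 0
    for child in node.children:
        if update_id not in child.updates:  # skip subtrees that are already up to date
            sent += 1
            sent += controlled_flood(child, update_id)
    return sent


# Toy tree: a -> (b, c), b -> (d, e). Update 1 reached a and c,
# but the message to b was lost, so b, d and e are out of date.
a, b, c, d, e = (Node(n) for n in "abcde")
a.children = [b, c]
b.children = [d, e]
a.updates.add(1)
c.updates.add(1)

missing = anti_entropy(b, c)                            # b learns from c what it missed
messages = sum(controlled_flood(b, u) for u in missing)  # repair b's subtree
```

In this toy run the flood is confined to the subtree that missed the update, which is the property the abstract attributes to controlled flooding. Peer selection, timing, and failure handling are left out because the abstract does not specify them.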


Index Terms:
Distributed hash tables, epidemics, gossip, anti-entropy, multicast, reliability, wide-area networks, de Bruijn graph.
Citation:
John Risson, Aaron Harwood, Tim Moors, "Topology Dissemination for Reliable One-Hop Distributed Hash Tables," IEEE Transactions on Parallel and Distributed Systems, vol. 20, no. 5, pp. 680-694, May 2009, doi:10.1109/TPDS.2008.145