Issue No. 11 - November (2009 vol. 58)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2009.111
Vincent Gramoli, EPFL and University of Neuchâtel, Switzerland
Ymir Vigfusson, Cornell University, New York
Ken Birman, Cornell University, New York
Anne-Marie Kermarrec, INRIA, Rennes Bretagne Atlantique
Robbert van Renesse, Cornell University, New York
Peer-to-peer (P2P) architectures are popular for tasks such as collaborative download, VoIP telephony, and backup. To maximize performance in the face of widely variable storage capacities and bandwidths, such systems typically need to shift work from poor nodes to richer ones. Similar requirements are seen in today's large data centers, where machines may have widely variable configurations, loads, and performance. In this paper, we consider the slicing problem, which involves partitioning the participating nodes into k subsets using a one-dimensional attribute, and updating the partition as the set of nodes and their associated attributes change. The mechanism thus facilitates the development of adaptive systems. We begin by motivating this problem statement and reviewing prior work. Existing algorithms are shown to have problems with convergence, manifesting as inaccurate slice assignments, and to adapt slowly as conditions change. Our protocol, Sliver, has provably rapid convergence, is robust under stress and is simple to implement. We present both theoretical and experimental evaluations of the protocol.
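To make the slicing problem concrete, the following is a minimal centralized sketch in Python. It is not the Sliver protocol, which is distributed and is not specified in this abstract; it only illustrates the target partition. It assumes that the k slices are equal-sized groups by attribute rank, and the names `slice_index`, `assign_slices`, and the sample node attributes are hypothetical, chosen for illustration.

```python
def slice_index(rank, n, k):
    """Map a 0-based rank among n nodes to a slice in 0..k-1.

    Assumption: slices are equal-sized by rank, with slice 0 holding
    the nodes with the lowest attribute values.
    """
    return rank * k // n


def assign_slices(attributes, k):
    """Return {node_id: slice} for a dict {node_id: attribute value}.

    This is a centralized reference computation with global knowledge,
    which a distributed protocol would only approximate.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort by attribute value; break ties by node id for determinism.
    ordered = sorted(attributes, key=lambda node: (attributes[node], node))
    n = len(ordered)
    return {node: slice_index(rank, n, k) for rank, node in enumerate(ordered)}


# Hypothetical nodes with a one-dimensional attribute (e.g. bandwidth).
nodes = {"a": 10, "b": 250, "c": 40, "d": 990, "e": 120, "f": 75}
slices = assign_slices(nodes, k=3)

# When the node set changes, the partition is recomputed.
nodes_after_join = dict(nodes, g=500)
slices_after_join = assign_slices(nodes_after_join, k=3)
```

In this sketch, the arrival of node `g` shifts node `f` from slice 1 to slice 0 even though the attribute of `f` is unchanged, which shows why the partition must be updated as membership changes.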
Distributed systems, fault tolerance, performance evaluation of algorithms and systems.
Vincent Gramoli, Ymir Vigfusson, Ken Birman, Anne-Marie Kermarrec, Robbert van Renesse, "Slicing Distributed Systems", IEEE Transactions on Computers, vol. 58, no. 11, pp. 1444-1455, November 2009, doi:10.1109/TC.2009.111