On the Correctness of Gossip-based Membership Protocols
The importance of scalability and fault-tolerance in modern distributed systems has led to considerable research in multi-cast protocols using gossip. In a gossip protocol, each node forwards messages to a small set of ``gossip partners'' chosen at random from the entire group membership. By discarding the strong reliability guarantees of traditional protocols in favor of probabilistic guarantees, gossip protocols can deliver greater scalability and fault tolerance. In early gossip algorithms, partners were chosen uniformly at random from the entire membership, limiting scalability because of the resources required to store and maintain complete membership views at each node. Later protocols avoided this issue by storing much smaller random subsets of the membership at each node, and choosing gossip partners only from these local views. Such protocols are subtle: at least some local views must change in response to group membership changes in order to preserve connectivity and performance guarantees. While these protocols have been the subject of much simulation and analysis, formal proofs of key properties - in particular the probability of partitioning - have remained elusive. In this thesis we give a new scalable gossip-based algorithm for local view maintenance, together with a proof that the expected time until a network partition is at least exponential in the view size and the size of the departing set. We develop probabilistic bounds on the in-degree (hence the load) of individual nodes, and argue that protocols lacking our reinforcement component eventually converge to star-like networks, whose connectivity depends on a small set of overloaded nodes. We argue that the undirected connectivity graph is an expander, for which application-level gossip multi-cast protocols will converge rapidly. An analysis of the membership system under heavy churn yields a lower bound on the amount of communications required per round. Finally, we offer some arguments supporting the experimental fact that the elements of the local views - although not a uniformly random sampling of the set of nodes in the system - have a high degree of randomness and suggesting that the state of the system after O(ln n) iterations is independent of the initial state.
gossip; random graph; distributed system; membership; peer-to-peer; scalability; reliability; epidemic; probabilistic multi-cast; group communication
dissertation or thesis