Approximating Perfect Failure Detectors in Asynchronous Distributed Systems

dc.contributor.authorSabel, Lauraen_US
dc.description.abstractSystems that provide group membership services (e.g., Isis) can be used to solve many canonical problems in distributed systems, such as Consensus and Leader Election. However, solving these problems has been shown to require {\em failure detectors} of various strengths, all of which are impossible to implement in an asynchronous system. It has been shown that it is possible to solve Consensus in an asynchronous distributed system using a very weak failure detector. However, existing group membership systems do not appear to implement this type of failure detector. Furthermore, there are other problems that can be solved with group membership services, such as Leader Election, that are harder than Consensus and cannot be solved without a perfect failure detector. This leads to an apparent contradiction: how can provably impossible problems be solved in existing group membership systems? In this dissertation, we investigate {\em approximations} to perfect failure detectors in order to obtain acceptable and practical solutions to otherwise-impossible problems such as Election. We argue that approximating a perfect failure detector yields approximate solutions to these problems, and define a notion of approximation that yields solutions that are useful in practice. We give a formal specification of an approximately-perfect failure detector, give two protocols that implement our approximation, and derive upper bounds on the number of tolerable failures for any such protocol. The approximately-perfect failure detector presented in this dissertation is similar to that used in Isis; this implies that the failure detector implemented in group membership services such as Isis are actually approximately-perfect failure detectors, and that the resulting solutions to Consensus, Election, and other problems are approximate solutions. Furthermore, the protocols that implement our specification are very simple and can be easily implemented. Such an implementation could be used in existing or future group membership services, or by other systems programmers for whom group membership services are not necessary but for whom failure detection would be useful.en_US
dc.format.extent721314 bytes
dc.format.extent673884 bytes
dc.publisherCornell Universityen_US
dc.subjectcomputer scienceen_US
dc.subjecttechnical reporten_US
dc.titleApproximating Perfect Failure Detectors in Asynchronous Distributed Systemsen_US
dc.typetechnical reporten_US


Original bundle
Now showing 1 - 2 of 2
Thumbnail Image
704.41 KB
Adobe Portable Document Format
No Thumbnail Available
658.09 KB
Postscript Files