Virtually-Synchronous Communication Based on a Weak Failure Suspector
Schiper, Andre; Ricciardi, Aleta M.
Failure detectors (or, more accurately, Failure Suspectors, FS) appear to be a fundamental service upon which to build fault-tolerant distributed applications. This paper shows that an FS with very weak semantics (i.e., one that delivers failure and recovery information in no specific order) suffices to implement virtually-synchronous communication (VSC) in an asynchronous system subject to process crash failures and network partitions. The VSC paradigm is particularly useful in asynchronous systems and greatly simplifies building fault-tolerant applications that mask failures by replicating processes. We suggest a three-component architecture to implement VSC: 1) at the lowest level, the FS component; on top of it, 2a) a component that defines new views, and 2b) a component that reliably multicasts messages within a view. The issues covered in this paper also lead to a better understanding of the various membership service semantics proposed in recent literature.
computer science; technical report