Determining the Last Process to Fail

Author
Skeen, Dale
Abstract
A total failure occurs whenever all processes cooperatively executing a distributed task fail before the task's completion. A frequent prerequisite for recovery from a total failure is the identification of the last group (LAST) of processes concurrently failing. Herein, we derive necessary and sufficient conditions for computing LAST from the local failure data of recovered processes. These conditions are easily translated into decision procedures for LAST membership using either complete or incomplete failure data. The choice of failure data itself is dictated by two requirements: (1) it can be cheaply maintained, and (2) maximum fault-tolerance is afforded in the sense that the expected number of recoveries required for identifying LAST is minimized.
Date Issued
1982-02
Publisher
Cornell University
Subject
computer science; technical report
Previously Published As
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR82-496
Type
technical report