Detecting Distributed Termination When Processors Can Fail
Lermen, C. W.; Schneider, Fred B.
A collection of protocols for detecting the termination of a computation on a distributed system is developed. Communication is assumed to take place by asynchronous broadcast. It is argued that this is a reasonable assumption for a distributed system in light of advances in local networking. All of the protocols presented are robust with respect to processor failures. They differ in their requirements: some make heavy use of the communications network at the end of a computation, while others spread the communication cost throughout the computation. Problems of restarting failed processors are also addressed.
computer science; technical report