On the Reliability of Fault-Tolerant Distributed Computing Systems

Author
Babaoglu, Ozalp
Abstract
The designer of a fault-tolerant distributed system faces numerous alternatives. Using a stochastic model of processor failure times, we investigate design choices such as replication level, protocol running time, randomized versus deterministic protocols, fault detection and authentication. We use the probability with which a system produces the correct output as our evaluation criterion. This contrasts with previous fault-tolerance results that guarantee correctness only if the percentage of faulty processors in the system can be bounded. Our results reveal some subtle and counterintuitive interactions between the design parameters and system reliability.
Date Issued
1986-02
Publisher
Cornell University
Subject
computer science; technical report
Previously Published As
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR86-738
Type
technical report