Engineering Fault-Tolerant Distributed Computing Systems
We view the design of fault-tolerant computing systems as an engineering endeavor. As such, it requires understanding both the theoretical limitations and the scope of feasible designs. We survey the impact that various environment characteristics and design choices have on the resulting system properties. We propose a single metric, system reliability, as an appropriate measure for exploring tradeoffs across a potentially large design space.
computer science; technical report