Providing Design Abstractions in Distributed Systems
Neiger, Gilbert A.; Toueg, Sam
The design of protocols for distributed systems is more complex than for centralized systems because coordination and cooperation between processors are difficult to achieve. Among the factors complicating this design are the failure of processors and the lack of processor synchronization. In this paper, we show how to simplify the design of fault-tolerant protocols using methods that automatically translate protocols tolerant of benign failures into ones tolerant of more severe failures. Such methods provide the abstraction of restricted faulty behavior. We also show how to circumvent the lack of processor synchronization by using logical clocks, which provide the abstraction of perfectly synchronized clocks in solutions to a large class of problems, both in asynchronous and partially synchronized systems.
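As an illustration of the logical-clock idea mentioned in the abstract, the following is a minimal sketch of one well-known construction, Lamport-style logical clocks. The abstract does not state which construction the report uses, so this is an assumption, and the names `Process`, `local_event`, `send`, and `receive` are hypothetical. The sketch shows only the basic property that timestamps respect causality: a message is always received at a later logical time than it was sent, without any synchronized physical clocks.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """A process with a Lamport-style logical clock (illustrative assumption)."""
    name: str
    clock: int = 0

    def local_event(self) -> int:
        # Every local event advances the logical clock by one tick.
        self.clock += 1
        return self.clock

    def send(self) -> int:
        # Sending is an event; the returned timestamp travels with the message.
        self.clock += 1
        return self.clock

    def receive(self, msg_timestamp: int) -> int:
        # Jump past the sender's timestamp so the receive is ordered after the send.
        self.clock = max(self.clock, msg_timestamp) + 1
        return self.clock


# Two processes with no shared physical clock.
p, q = Process("p"), Process("q")
a = p.local_event()    # event on p
ts = p.send()          # p sends a message stamped with its clock
q.local_event()        # unrelated event on q
b = q.receive(ts)      # q receives the message from p

# Causally related events are ordered by their logical timestamps.
assert a < ts < b
```

The `max(...) + 1` rule in `receive` is the design choice that matters here: it is what lets processes agree on an ordering of causally related events even when their local clocks advance at different rates.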
computer science; technical report