Understanding the Message Logging Paradigm for Masking Process Crashes
Abstract
Message logging is a popular technique for building systems that can tolerate process crashes and transient channel failures. The technique, which was first developed in the mid-1980s, is popular because message-logging protocols are relatively simple and require process replication only when a process fails. Surprisingly, however, very little attention has been given to the formal specification of the consistency property that these protocols implement in order to recover failed processes to a consistent state. This dissertation presents the first such formal specification. From this specification, the two major classes of message-logging protocols, namely optimistic and pessimistic, are characterized. A third, new class of message-logging protocols, called causal, is introduced. A notion of optimality, based on three important performance metrics, is proposed, and it is shown that optimal implementations of causal message-logging protocols exist. In particular, it is shown that causal message-logging protocols combine the positive aspects of optimistic and pessimistic message logging. A subclass of causal message-logging protocols, called family-based logging, is developed. Family-based logging protocols are optimal and have the additional attractive characteristic that the smaller the maximum number of concurrent failures, the lower their overhead. Furthermore, several compression techniques can be used to reduce this overhead. Finally, it is shown that family-based logging protocols can be implemented to take advantage of the different patterns of communication that systems exhibit.