Log-Based Recovery in Asychronous Distributed Systems
Kane, Kenneth P.
Replication has been shown to be an important tool in the design of high- performance and highly-available distributed systems. When applied to data, however, replication significantly complicates the problem of maintaining consistency within a system. This problem is further complicated when repositories of the data can potentially fail and recover. In this dissertation, we describe a log-based mechanism for restoring consistent states to replicated data objects after failures. A variety of techniques have been proposed for implementing consistency in a system. Most of these techniques focus on preserving a form of consistency based on serialization of updates. Although serializable consistency is useful for building a large number of applications, there are also many applications that do not require the full strength of consistency that serializability provides. For these applications, the cost of implementing serializable consistency can be prohibitive. A number of weaker and less expensive consistency forms have therefore been proposed for building such applications. In this dissertation, we focus on preserving a causal form of consistency based on the notion of virtual time. Causal consistency has been shown to apply to a variety of applications, including distributed simulation, task decomposition, and mail delivery systems. Several mechanisms have been proposed for implementing causally consistent recovery, most notably those of Strom and Yemini, and Johnson and Zwaenepoel. Our mechanism differs from these in two major respects. First, we implement a roll-forward style of recovery. A functioning process is never required to roll-back its state in order to achieve consistency with a recovering process. Second, our mechanism does not require any explicit information about the causal dependencies between updates. Instead, all necessary dependency information is inferred from the orders in which updates are logged by the object servers. Our basic recovery technique appears to be applicable to forms of consistency other than causal consistency. In particular, we show how our recovery technique can be modified to support an atomic form of consistency that we call grouping consistency. By combining grouping consistency with causal consistency, it may even be possible to implement serializable consistency within our mechanism.
computer science; technical report
Previously Published As