Trade-offs Between Replication and Availability in Distributed Databases
Shah, Amitabh; Marzullo, Keith
Distributed databases are generally built on top of standard communication facilities such as leased phone lines. Often several applications, running in different databases, share the same network for their communication. But the applications that run in these databases have grown increasingly complex and demanding in their availability requirements, often with temporal constraints (as in real-time databases). We argue in this paper that there is a need for dedicated communication networks designed for specific applications. To design such networks, we argue that it is necessary to analyze the data access patterns of the applications that run on top of them. Such analysis would give a network designer insight into where and how much to replicate the data in the system. Replication can increase availability of data, but too much replication can also hamper it. Thus, for a given application, there exists a right balance between replication and availability. Our goal is to find this balance and show how to design the lowest-cost network that achieves it. In this paper, we take the first step towards the design problem by precisely characterizing the trade-offs between replication and availability and suggesting a network design strategy to exploit these trade-offs.
computer science; technical report