Building a Scalable Shared Log
No Access Until
Permanent Link(s)
Collections
Other Titles
Author(s)
Abstract
The shared log paradigm is at the heart of modern distributed applications in the growing cloud computing industry. Often, application logs must be stored durably for analytics, regulations, or failure recovery, and their smooth operation depends closely on how the log is implemented. An ideal implementation of the shared log abstraction should be capable of growing elastically in response to the needs of its client applications, without compromising availability; recover quickly from failures; adopt the data layout that best matches the performance requirement of its clients; scale write throughput without giving up on total order; and offer a low latency that satisfies applications' requirements. Unfortunately, no single shared log today can offer this combination of features. In particular, no shared log provides both total order and seamless reconfiguration, i.e., the capability to reconfigure the service without compromising its global availability. Additionally, no shared log provides both total order and low latency. In this dissertation, we analyze the challenges of achieving both total order and seamless reconfiguration, and of attaining both total order and low latency. Based on this analysis, we have built and evaluated two systems, Scalog and Ziplog, to demonstrate how to address these challenges. Scalog is a new implementation of the shared log abstraction that offers an unprecedented combination of features for continuous smooth delivery of service: Scalog allows applications to customize data placement, supports reconfiguration with no loss in availability, and recovers quickly from failures. At the same time, Scalog provides high throughput and total order. At its core is a novel ordering protocol that (1) efficiently merges the order in which records are stored at each shard to produce a single global order across shards, and (2) collates and processes records in the same batch that may originate from any client and be stored at any shard. This novel design enables Scalog to scale to thousands of shards while providing seamless reconfiguration, flexible data placement, and quick failure recovery. Ziplog is a new implementation of a totally ordered shared log that achieves latency and throughput comparable to what today can only be delivered by systems that optimize only one of these metrics at the expense of the other. Ziplog achieves these results through a new API that, instead of adding new records to the log through a linearizable Append operation, relies on a linearizable InsertAfter operation that specifies the log position past which the new record should be inserted. This new API allows Ziplog to totally order records across shards without needing cross-shard coordination and with an average latency of fewer than three message delays.
Journal / Series
Volume & Issue
Description
Sponsorship
Date Issued
Publisher
Keywords
Location
Effective Date
Expiration Date
Sector
Employer
Union
Union Local
NAICS
Number of Workers
Committee Chair
Van Renesse, Robbert
Committee Co-Chair
Committee Member
Chen, Li