On Consistent and Efficient Graph Data Management
This dissertation describes techniques to store and process large graphs in modern datacenters with high performance and strong consistency guarantees. Graph-structured data is ubiquitous: social networks, content networks, cryptocurrency transaction histories, and business analytics routinely store and manipulate large graphs. For reasons of scale, in both data size and workload volume, it is necessary to store such large graphs in a distributed fashion. Moreover, graph workloads have unique characteristics, such as long-running read queries interspersed with shorter updates, that naturally lead to a programming interface consisting of a hybrid of transactions and analytics. Providing efficient and consistent access to graph-structured data is a significant challenge. This dissertation makes three contributions. First, it describes a novel technique to order distributed transactions by introducing the concept of an ordering service. An ordering service seeks to simplify the design of modern distributed systems by factoring out the task of ordering from the core system into a separate service. Second, it details techniques that scale up the performance of a centralized ordering service by combining it with a lightweight timestamping mechanism. Third, it describes a full implementation of Weaver, a new distributed, transactional graph store that includes mechanisms for practical and efficient graph data management, such as dynamic resharding of graph partitions and caching of query results. Overall, these techniques lead to a scalable and consistent graph store that is capable of supporting modern distributed applications with high performance.
Graphs; Distributed systems; Computer science; Databases
Sirer, Emin G.
Kleinberg, Jon M.; Foster, John N.
Ph. D., Computer Science
Doctor of Philosophy
dissertation or thesis