dc.contributor.author: Geng, Haoyan
dc.date.accessioned: 2016-04-04T18:05:22Z
dc.date.available: 2021-02-01T07:00:35Z
dc.date.issued: 2016-02-01
dc.identifier.other: bibid: 9597057
dc.identifier.uri: https://hdl.handle.net/1813/43618
dc.description.abstract: Topic-based publish-subscribe systems have become an increasingly critical part of the infrastructure that supports today's cloud-based services. Such systems collect, store, and disseminate log records over many datacenters across the globe, each containing thousands of inexpensive fault-prone machines. Reliability, scalability, and high performance are all desirable. Achieving all these properties at the same time is a significant research challenge. Scaling out data collection and dissemination naively could bring about unconventionally high bandwidth over-subscription and network congestion. Storing data for reliability comes with the cost of storage capacity and potential system slowdown. This thesis seeks to meet the challenge by providing novel building blocks for topic-based publish-subscribe systems. We introduce the Sprinkler reliable broadcast facility that scales out data dissemination over geo-distributed datacenters. We propose a storage framework that supports the concept of rediversification to scale up the storage. We present a structure called funnelling trees to scale out data collection. We show the design and implementation of a novel form of garbage collection, a technique that can be incorporated with all three tasks to reduce stress brought by high workload. Under typical web caching workloads, the benefit of garbage collection is significant. Together with these components, this thesis provides a complete picture including frameworks, protocols, and implementations. We address all three elements in a topic-based publish-subscribe service: data collection, storage, and dissemination. Sprinkler achieves both reliability and scalability across geo-distributed datacenters under sustained high workload.
dc.language.iso: en_US
dc.title: Towards Efficient And Reliable Publish-Subscribe For Geo-Distributed Datacenters
dc.type: dissertation or thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph. D., Computer Science
dc.contributor.chair: Van Renesse, Robbert
dc.contributor.committeeMember: Orman, Levent V.
dc.contributor.committeeMember: Foster, John N.

