Analysis And Design Of Advanced Caching Solutions For The Modern Web
The rise of modern Web applications has seen a surge in the quantity of digital content - photos, videos, and user interactions - stored, accessed, and transmitted by Internet services. To better handle such content, popular Web services, such as Facebook, have deployed large cache tiers within their serving stacks to lessen the load on backend systems and to decrease the data-request latency for users. Designing such cache infrastructures exposes challenges across three dimensions: (1) Modern Web workloads differ from earlier traditional workloads, due to the large amount of user-created content, as well as the frequency of updates due to user interactions; therefore, more advanced cache-replacement policies are required to provide sustained high hit-ratios, the key metric for caching performance. (2) Flash devices are extensively used due to their cost advantage over DRAM and their significantly higher I/O performance than magnetic disks; however, flash often yields low performance and high wearing costs with the small random writes that advanced caching algorithms tend to generate. (3) Load balance is critical for the scalability of an entire cache-server tier; however, different content within the modern Web may have disparate popularities, and the dependent data-partition mechanism is often used to co-locate relative data in favor of advanced application queries. Both scenarios exacerbate the imbalance. In this dissertation, we use two existing cache systems within the Facebook infrastructure - the photo-cache as a representative of static-content cache, and the Tao cache as an example of in-memory cache solutions - to address the three challenges men- tioned above. First, we examine the workload caused by accessing photos on Facebook and the effectiveness of the many layers of photo caches that have been deployed. By analyzing an event stream covering almost 80 million requests for more than 1 million unique photos, we are able to study cache-access patterns, evaluate current cache efficacy, and explore the potential performance benefits of certain advanced eviction algorithms at multiple cache layers. Second, when building advanced cache on flash devices, in order to address the performance degradation caused by small random writes, we propose the novel RIPQ (Restricted Insertion Priority Queue) framework. RIPQ allows for advanced caching algorithms with large cache sizes, high throughput, and long device lifetime. RIPQ maintains an approximate priority queue efficiently on flash by aggregating small random writes into large writes to a restricted set of insertion points, lazily moving items, and co-locating items with similar priorities. We show that two families of advanced caching algorithms, Segmented-LRU and GDSF (Greedy-Dual-Size-Frequency), can be easily implemented with RIPQ. Our evaluation builds on traces from Facebook's photoserving stack shows that GDSF algorithms with RIPQ can improve cache hit ratios by 20% over the current production system, while incurring low overhead and achieving high throughput. Third, we investigate the principal causes of load imbalance - including data colocation, non-ideal hashing scenario, and hot-spot temporal effects. We analyze Facebook's runtime traffic against the partitioned cache tier in front of Tao, a subsystem that stores objects associated with Facebook's social graph. As part of this investigation, we also employ trace-drive analytics to study the benefits and limitations of current loadbalancing methods - including consistent hashing and hot-partition replication (with front-end caching as a special case) - and we suggest areas for future research.
Birman, Kenneth Paul
Van Renesse, Robbert
Orman, Levent V.; Snavely, Keith Noah
Ph.D. of Computer Science
Doctor of Philosophy
dissertation or thesis