Optimizing Foundational System Building Blocks of Datacenter Applications

Zhou, Zhuangzhuang

dc.contributor.author	Zhou, Zhuangzhuang
dc.contributor.chair	Delimitrou, Christina	en_US
dc.contributor.committeeMember	Martinez, Jose	en_US
dc.contributor.committeeMember	Weatherspoon, Hakim	en_US
dc.date.accessioned	2025-01-14T20:01:26Z
dc.date.issued	2024-08
dc.description	158 pages	en_US
dc.description.abstract	Cloud computing has become the prevailing computing infrastructure for the majority of the world's computation. Computing platforms for cloud computing and large internet services are hosted in datacenters, and optimizing the performance of datacenter applications can result in significant cost savings. Given the diversity of datacenter workloads, optimizing a single application may not yield substantial improvements in the total system efficiency, as costs are spread across numerous independent workloads. In contrast, optimizing the foundational system building blocks of datacenter applications, ranging from high-level system infrastructures to underlying system software libraries, can significantly improve the productivity of the datacenter fleet, since entire classes of datacenter applications can benefit from such optimizations. This dissertation proposes a series of optimizations in foundational system building blocks of datacenter applications. Applications running in datacenters are often built as collections of loosely coupled services that are deployed and executed through high-level system building blocks such as serverless workflow engines and microservice frameworks. First, we focus on optimizing such a system building block at the top of the computing stack, the serverless computing framework. Despite the benefits of ease of programming, fast elasticity, and fine-grained billing, serverless computing suffers from resource inefficiency. We designed Aquatope, a QoS-and-uncertainty-aware resource scheduler for end-to-end serverless workflows that takes into account the inherent uncertainty present in FaaS platforms, and improves performance predictability and resource efficiency.
Aquatope uses a set of scalable and validated Bayesian models to create prewarmed containers ahead of function invocations, and to allocate appropriate resources at function granularity to meet a complex workflow’s end-to-end QoS, while minimizing resource cost. Aquatope demonstrates that a joint solution to cold start and resource management, taking into account uncertainty, can effectively improve the resource efficiency of serverless applications. However, serverless workflows still suffer from significant control plane and inter-function communication overheads, which make them unsuitable for latency-critical applications. We also designed Meteion, a fast and efficient serverless workflow engine for latency-critical interactive applications. Meteion decouples the control plane from the workflow execution, and leverages lightweight per-function engines to enable decentralized workflow orchestration and direct inter-function communication. Meteion's DAG scheduler utilizes the workflow's latency distribution and graph structure to provision containers promptly, ensuring that functions can execute seamlessly on worker servers without falling back to the control plane. Second, we delve into a foundational system library, the memory allocator. Datacenter applications typically share certain low-level software libraries, and memory allocation constitutes a substantial component of datacenter computation. Optimizing the memory allocator can improve application performance, leading to significant cost savings. We present the first comprehensive characterization of TCMalloc at warehouse scale. Our characterization reveals profound diversity in the memory allocation patterns of large-scale datacenter workloads, including allocated object sizes and lifetimes, as well as in their performance on heterogeneous hardware platforms. Based on these insights, we optimize TCMalloc for warehouse-scale environments.
Specifically, we propose optimizations for each level of its cache hierarchy that include usage-based dynamic sizing of allocator caches, leveraging hardware topology to mitigate inter-core communication overhead, and improving allocation packing algorithms based on statistical data. Evaluation results show that these optimizations significantly improve the productivity of the datacenter fleet.	en_US
dc.description.embargo	2025-09-03
dc.identifier.doi	https://doi.org/10.7298/rhcv-wh67
dc.identifier.other	Zhou_cornellgrad_0058F_14444
dc.identifier.other	http://dissertations.umi.com/cornellgrad:14444
dc.identifier.uri	https://hdl.handle.net/1813/116641
dc.language.iso	en
dc.subject	Cloud Computing	en_US
dc.subject	Datacenter	en_US
dc.subject	Memory Allocator	en_US
dc.subject	Memory Management	en_US
dc.subject	Resource Management	en_US
dc.subject	Serverless Computing	en_US
dc.title	Optimizing Foundational System Building Blocks of Datacenter Applications	en_US
dc.type	dissertation or thesis	en_US
dcterms.license	https://hdl.handle.net/1813/59810.2
thesis.degree.discipline	Electrical and Computer Engineering
thesis.degree.grantor	Cornell University
thesis.degree.level	Doctor of Philosophy
thesis.degree.name	Ph. D., Electrical and Computer Engineering

Files

Original bundle

Name: Zhou_cornellgrad_0058F_14444.pdf
Size: 8.36 MB
Format: Adobe Portable Document Format

Collections

Cornell Theses and Dissertations