
Optimizing Foundational System Building Blocks of Datacenter Applications

Access Restricted

No Access Until

2025-09-03
Abstract

Cloud computing has become the prevailing infrastructure for the majority of the world's computation. The platforms behind cloud computing and large internet services are hosted in datacenters, and optimizing the performance of datacenter applications can result in significant cost savings. Given the diversity of datacenter workloads, optimizing a single application rarely yields substantial improvements in total system efficiency, as costs are spread across numerous independent workloads. In contrast, optimizing the foundational system building blocks of datacenter applications, from high-level system infrastructures down to the underlying system software libraries, can significantly improve the productivity of the datacenter fleet, since entire classes of applications benefit from such optimizations. This dissertation proposes a series of optimizations to foundational system building blocks of datacenter applications.

Applications running in datacenters are often built as collections of loosely coupled services that are deployed and executed through high-level system building blocks such as serverless workflow engines and microservice frameworks. First, we focus on optimizing such a building block at the top of the computing stack: the serverless computing framework. Despite the benefits of ease of programming, fast elasticity, and fine-grained billing, serverless computing suffers from resource inefficiency. We designed Aquatope, a QoS- and uncertainty-aware resource scheduler for end-to-end serverless workflows that accounts for the inherent uncertainty present in FaaS platforms and improves performance predictability and resource efficiency. Aquatope uses a set of scalable and validated Bayesian models to create prewarmed containers ahead of function invocations and to allocate appropriate resources at function granularity, meeting a complex workflow's end-to-end QoS while minimizing resource cost. Aquatope demonstrates that a joint, uncertainty-aware solution to cold starts and resource management can effectively improve the resource efficiency of serverless applications.

However, serverless workflows still suffer from significant control-plane and inter-function communication overheads, which make them unsuitable for latency-critical applications. We therefore also designed Meteion, a fast and efficient serverless workflow engine for latency-critical interactive applications. Meteion decouples the control plane from workflow execution and leverages lightweight per-function engines to enable decentralized workflow orchestration and direct inter-function communication. Meteion's DAG scheduler uses the workflow's latency distribution and graph structure to provision containers promptly, ensuring that functions can execute seamlessly on worker servers without falling back to the control plane.

Second, we delve into a foundational system library: the memory allocator. Datacenter applications typically share a common set of low-level software libraries, and memory allocation constitutes a substantial component of datacenter computation, so optimizing the memory allocator can improve application performance and lead to significant cost savings. We present the first comprehensive characterization of TCMalloc at warehouse scale. Our characterization reveals a profound diversity in memory allocation patterns, allocated object sizes, and object lifetimes across large-scale datacenter workloads, as well as in their performance on heterogeneous hardware platforms. Based on these insights, we optimize TCMalloc for warehouse-scale environments. Specifically, we propose optimizations for each level of its cache hierarchy, including usage-based dynamic sizing of the allocator caches, leveraging hardware topology to mitigate inter-core communication overhead, and improving allocation packing algorithms based on statistical data. Evaluation results show that these optimizations significantly improve the productivity of the datacenter fleet.
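
To make the DAG-driven container provisioning idea concrete, the following is a minimal sketch (not Meteion's actual implementation) of how a scheduler might combine a workflow's graph structure with per-function latency estimates to decide when to prewarm each function's container. The example DAG, latency values, and cold-start cost are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical workflow DAG: each function maps to its downstream functions.
# Latencies are illustrative per-function execution-time estimates (e.g. a
# high percentile drawn from the workflow's observed latency distribution).
EDGES = {"frontend": ["resize", "tag"], "resize": ["store"], "tag": ["store"], "store": []}
LATENCY_MS = {"frontend": 40, "resize": 300, "tag": 150, "store": 50}
COLD_START_MS = 100  # assumed container startup cost

def earliest_start_times(edges, latency):
    """Compute, via longest paths from the roots, when each function is
    expected to be invoked after the workflow starts."""
    indeg = defaultdict(int)
    for dsts in edges.values():
        for dst in dsts:
            indeg[dst] += 1
    ready = [f for f in edges if indeg[f] == 0]
    start = {f: 0 for f in ready}
    while ready:
        f = ready.pop()
        for dst in edges[f]:
            start[dst] = max(start.get(dst, 0), start[f] + latency[f])
            indeg[dst] -= 1
            if indeg[dst] == 0:
                ready.append(dst)
    return start

def prewarm_schedule(edges, latency, cold_start_ms):
    """Launch each container early enough that it is warm by invocation time."""
    return {f: max(0, t - cold_start_ms)
            for f, t in earliest_start_times(edges, latency).items()}

if __name__ == "__main__":
    for fn, t in sorted(prewarm_schedule(EDGES, LATENCY_MS, COLD_START_MS).items(),
                        key=lambda kv: kv[1]):
        print(f"prewarm {fn:8s} at t = {t:3d} ms after the workflow starts")
```

In practice, a scheduler of this kind would work with latency distributions rather than point estimates and refresh the schedule as upstream functions complete; the sketch only shows how graph structure and latency estimates together determine prewarm times.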
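Similarly, usage-based dynamic sizing of allocator caches can be illustrated with a small policy sketch. This is not TCMalloc's code; the class names, thresholds, and byte budget below are assumptions chosen only to show a grow/shrink feedback loop driven by observed cache misses.

```python
from dataclasses import dataclass

# Illustrative policy parameters; a real allocator tunes these per size class,
# per CPU, and per machine configuration.
GROW_THRESHOLD = 64             # misses per interval that justify more capacity
SHRINK_THRESHOLD = 4            # misses per interval below which we reclaim it
MIN_OBJECTS, MAX_OBJECTS = 16, 2048
TOTAL_BYTE_BUDGET = 256 * 1024  # bound on one CPU's cache footprint

@dataclass
class SizeClassCache:
    object_size: int   # bytes per object in this size class
    capacity: int = 64 # max objects the cache may hold
    misses: int = 0    # underflows/overflows seen since the last adjustment

def adjust_capacities(caches):
    """Periodic, usage-based resizing: grow hot size classes, shrink cold ones,
    and keep the total footprint within the per-CPU byte budget."""
    for c in caches:
        if c.misses >= GROW_THRESHOLD and c.capacity < MAX_OBJECTS:
            c.capacity = min(MAX_OBJECTS, c.capacity * 2)
        elif c.misses <= SHRINK_THRESHOLD and c.capacity > MIN_OBJECTS:
            c.capacity = max(MIN_OBJECTS, c.capacity // 2)
        c.misses = 0
    # If over budget, steal capacity from the classes using the most bytes.
    while sum(c.capacity * c.object_size for c in caches) > TOTAL_BYTE_BUDGET:
        victim = max(caches, key=lambda c: c.capacity * c.object_size)
        if victim.capacity <= MIN_OBJECTS:
            break
        victim.capacity //= 2

if __name__ == "__main__":
    caches = [SizeClassCache(32), SizeClassCache(256), SizeClassCache(4096)]
    caches[0].misses, caches[2].misses = 500, 2  # one hot class, one cold class
    adjust_capacities(caches)
    for c in caches:
        print(f"size class {c.object_size:5d} B -> capacity {c.capacity} objects")
```

The point of the sketch is the feedback loop: per-size-class capacity follows observed demand, while a global budget bounds the total cache memory, mirroring the usage-based sizing described in the abstract.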

Description

158 pages

Date Issued

2024-08

Keywords

Cloud Computing; Datacenter; Memory Allocator; Memory Management; Resource Management; Serverless Computing

Committee Chair

Delimitrou, Christina

Committee Member

Martinez, Jose
Weatherspoon, Hakim

Degree Discipline

Electrical and Computer Engineering

Degree Name

Ph.D., Electrical and Computer Engineering

Degree Level

Doctor of Philosophy

Types

dissertation or thesis
