Cornell University
Library

eCommons


HOST CONGESTION CONTROL

File(s)
Agarwal_cornellgrad_0058F_14630.pdf (8.99 MB)
Permanent Link(s)
https://doi.org/10.7298/614t-x320
https://hdl.handle.net/1813/116373
Collections
Cornell Theses and Dissertations
Author
Agarwal, Saksham
Abstract

The conventional wisdom in the systems and networking communities is that congestion in datacenter networks happens primarily within the network fabric (i.e., at network links and/or switches). This dissertation has three core contributions: (1) it presents a new problem of "host congestion"—congestion within the datapath between peripheral devices and compute/memory—and presents evidence of host congestion both in production datacenters and in experimental lab setups; (2) it builds an in-depth understanding of the root causes of host congestion and of its impact on application-level performance; and (3) it explores the implications of host congestion for the design of network protocols, network stacks, and operating systems. We define host congestion in the context of networked applications as follows: the receiver-side host network interface card (NIC) receives data from the network at a rate faster than it can transfer that data to compute/memory. This reduces the available NIC-to-memory bandwidth, resulting in queueing and eventual packet drops at hosts. We demonstrate that, even with state-of-the-art network protocols and network stacks, host congestion leads to significant degradation in throughput, orders-of-magnitude inflation in tail latency, and a surprisingly large fraction of packet drops at the host, even when the access link bandwidth is far from fully utilized. We present evidence and characterization of the host congestion phenomenon both in large-scale production clusters (running the Swift congestion control protocol with a userspace network stack) and in experimental testbeds (running the DCTCP congestion control protocol with the Linux network stack). Several recent studies have built upon our work to show that hardware-offloaded network stacks also suffer from similar or worse host congestion. Host congestion, and the resulting queueing and packet drops at the host, are new to the community.
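The definition above can be captured in a toy fluid model. The sketch below is illustrative only (the rates, buffer size, and function names are assumptions, not from the dissertation): when memory contention throttles the NIC-to-memory drain rate below the arrival rate, the NIC buffer fills and packets drop at the host even though the access link is far from saturated.

```python
def simulate_nic_buffer(arrivals_gbps, drain_gbps, buffer_gbits, dt=1e-6):
    """Track NIC buffer occupancy (Gbit) and drops over per-step rates."""
    queue = 0.0
    dropped = 0.0
    for arr, drain in zip(arrivals_gbps, drain_gbps):
        queue += arr * dt            # bits arriving from the network this step
        served = min(queue, drain * dt)
        queue -= served              # bits DMA'd into host memory this step
        if queue > buffer_gbits:     # buffer overflow -> packet drops at host
            dropped += queue - buffer_gbits
            queue = buffer_gbits
    return queue, dropped

# Access link only 60% utilized at 100 Gbps, yet host-side contention limits
# the NIC-to-memory drain rate to 40 Gbps: the host itself drops packets.
steps = 1000
q, d = simulate_nic_buffer([60.0] * steps, [40.0] * steps, buffer_gbits=0.004)
```

The point of the toy model is only that host-side drops are decoupled from access-link utilization; the dissertation characterizes the real phenomenon in production clusters and testbeds.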
To that end, this thesis also builds an in-depth understanding of the root causes of the host congestion phenomenon. We demonstrate that host congestion is caused by nanosecond-scale latency inflation within the NIC-to-CPU/memory datapath. Such latency inflation is rooted in the poor interplay between processor, memory, and peripheral interconnects within the host; this interplay, in turn, leads to contention at host resources (e.g., memory bandwidth, IO memory management units, etc.) and manifests itself as underutilization of peripheral interconnect (PCIe) bandwidth, queueing and packet drops at the NIC, and a subsequent drop in application-level performance. We also discuss how unfavorable technology trends for host resources will make this problem worse over time: access link bandwidths and PCIe bandwidths are expected to increase by 8-16x over the next few years, whereas trends for essentially all other host resources (e.g., memory bandwidth per core, IO memory management unit caches, NIC buffer sizes, etc.) are relatively stagnant. Thus, we expect higher degrees of resource imbalance and contention within the host. Host congestion alters many assumptions entrenched within the design of modern networked systems and operating systems. For instance, existing datacenter congestion control (CC) protocols were not designed to efficiently detect and respond to host congestion—they operate at a network round-trip-time (RTT) granularity (typically tens to hundreds of microseconds), while host congestion can change dynamically at sub-microsecond granularity. We present hostCC, a new congestion control architecture that handles both host congestion and network fabric congestion. HostCC introduces new "host-local" congestion signals and a sub-RTT-granularity "host-local" congestion response to handle host congestion.
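The two-granularity idea described above can be sketched as follows. This is a hedged illustration, not the hostCC implementation: the class name, the memory-bandwidth threshold, and the adjustment constants are all assumptions. The effective send rate is the minimum of a network allocation, updated once per RTT by the existing CC protocol, and a host allocation, updated at sub-RTT granularity from a host-local congestion signal.

```python
class HostCC:
    """Toy model: combine RTT-scale network CC with a sub-RTT host response."""

    def __init__(self, line_rate_gbps):
        self.line_rate = line_rate_gbps
        self.network_rate = line_rate_gbps  # updated once per network RTT
        self.host_rate = line_rate_gbps     # updated every few microseconds

    def on_rtt_signal(self, ecn_marked_fraction):
        # Existing network CC loop (DCTCP-style multiplicative decrease),
        # invoked at RTT granularity based on fabric congestion feedback.
        self.network_rate *= 1 - ecn_marked_fraction / 2

    def on_host_signal(self, mem_bw_utilization):
        # Host-local response, invoked at sub-RTT granularity: back off when
        # memory-bandwidth contention threatens the NIC-to-memory drain rate.
        if mem_bw_utilization > 0.9:
            self.host_rate *= 0.8
        else:
            self.host_rate = min(self.line_rate, self.host_rate * 1.05)

    def send_rate(self):
        # The sender respects both network and host congestion.
        return min(self.network_rate, self.host_rate)
```

The design point this illustrates is that the host loop runs orders of magnitude more often than the RTT loop, so host congestion that changes at sub-microsecond granularity is handled without waiting for network feedback.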
We demonstrate that hostCC can be integrated with existing CC protocols to efficiently perform host and network resource allocation among competing entities. As another example of host congestion altering assumptions entrenched within operating systems, we explore IO memory protection mechanisms. Such mechanisms are used in real-world production datacenters to prevent malicious and/or faulty NICs from executing errant transfers into host memory. Modern hosts achieve this using an IOMMU—NICs operate on virtual addresses, and the IOMMU translates virtual addresses to physical addresses (potentially speeding up translations using a cache called the IOTLB) before executing memory transfers. We demonstrate that existing state-of-the-art IO memory protection mechanisms can provide only one of two desirable properties: (1) a strong safety property, which results in unavoidable IOTLB misses and subsequent host congestion; or (2) high performance. We present "Fast Safe" (FS), a simple modification to existing memory protection mechanisms that enables them to provide the strongest safety property and yet near-completely eliminates their overheads. The key insight in the FS design is that, rather than focusing on achieving high IOTLB hit rates, we should focus on reducing the cost of each translation upon an IOTLB miss. FS reduces this cost by exploiting modern IOMMU hardware, along with novel mechanisms for contiguous virtual address allocation and batched unmapping and invalidation of used virtual addresses. We demonstrate that the FS design requires no modifications to host hardware and minimal modifications within the operating system, and yet near-completely alleviates all overheads of the strongest form of memory protection. Finally, using the example of storage stacks, we demonstrate that host congestion has implications that reach beyond networked applications.
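The two FS mechanisms named above can be sketched in miniature. This is an illustrative model only (the class name, bump-allocator layout, and batch size are assumptions, not the FS implementation): (a) IO virtual addresses are allocated contiguously, so each mapped buffer occupies one hole-free region; and (b) unmaps are deferred and their IOTLB invalidations issued once per batch, amortizing the per-buffer invalidation cost.

```python
class FastSafeIovaAllocator:
    """Toy model of contiguous IOVA allocation with batched invalidation."""

    PAGE = 4096

    def __init__(self, batch_size=64):
        self.next_iova = 0x1000       # contiguous bump allocator, no holes
        self.pending_unmaps = []      # unmapped IOVAs awaiting invalidation
        self.batch_size = batch_size
        self.invalidations_issued = 0 # expensive IOTLB invalidation commands

    def map(self, num_pages):
        # Contiguity: every allocation extends one region, so resolving a
        # miss reduces to arithmetic over a single base address.
        iova = self.next_iova
        self.next_iova += num_pages * self.PAGE
        return iova

    def unmap(self, iova):
        # Defer the expensive IOTLB invalidation; flush once per batch.
        self.pending_unmaps.append(iova)
        if len(self.pending_unmaps) >= self.batch_size:
            self.flush()

    def flush(self):
        # One invalidation command covers the entire pending batch.
        if self.pending_unmaps:
            self.invalidations_issued += 1
            self.pending_unmaps.clear()
```

With a batch size of 64, the model issues one invalidation per 64 unmaps instead of one per unmap, which is the amortization the batched-invalidation mechanism is after; the real safety argument (no stale IOTLB entry is usable by the device) is established in the dissertation, not by this sketch.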
Specifically, modern storage stacks exploit the high throughput offered by storage devices like SSDs by maintaining multiple in-flight IO requests. Typically, the number of in-flight requests is set to the "knee point" of the device's latency-load curve. This minimizes latency while maximizing throughput, under the assumption that the bottleneck is at the CPU and/or at the storage device. However, under host congestion, the latency-load curve (and its knee point) can change dynamically at sub-microsecond granularity. We demonstrate that, under such host congestion, existing storage stacks keep a sub-optimal number of in-flight requests, resulting in significant application-level performance degradation. We present storageCC, a new storage stack architecture to handle host congestion. Our key insight is that the end-to-end storage datapath (i.e., the SSD-to-CPU/memory datapath) is conceptually a (lossless) computer network. StorageCC builds upon this insight to detect host congestion by collecting congestion signals within the host and adapting the number of in-flight requests based on host congestion. Evaluation of storageCC over a wide variety of scenarios demonstrates that it converges to the optimal parallelism for each individual scenario, maintaining near-hardware latency and throughput performance. Our thesis suggests that host congestion may have wide-reaching implications for the design of network protocols, network stacks, operating systems, and even host hardware. To that end, we close by summarizing the many new research questions our thesis opens up in computer networking, operating systems, and computer architecture.
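The adaptation described above can be sketched as a congestion-control-style loop over the IO queue depth. This is a minimal illustration under stated assumptions (the function name, the additive-increase/multiplicative-decrease policy, and the bounds are illustrative, not the storageCC algorithm): instead of a static queue depth pinned to an offline-measured knee point, the number of in-flight requests is adjusted each control interval from a host-local congestion signal.

```python
def adapt_inflight(inflight, host_congested, min_inflight=1, max_inflight=256):
    """One control-loop step over the IO queue depth.

    Additive increase while the host is uncongested; multiplicative
    decrease when a host-local congestion signal fires, since host
    congestion shifts the knee of the latency-load curve to the left.
    """
    if host_congested:
        return max(min_inflight, inflight // 2)
    return min(max_inflight, inflight + 1)
```

Treating the SSD-to-memory datapath as a lossless network is what licenses reusing this style of control loop: the queue depth plays the role of a congestion window, and the host-local signal plays the role of congestion feedback.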

Description
162 pages
Date Issued
2024-08
Committee Chair
Agarwal, Rachit
Committee Member
Foster, John
Shmoys, David
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/16611859
