Cornell University Library

eCommons


Hardware-Software Techniques for Improving Resource Efficiency in Datacenters

File(s)
Kulkarni_cornellgrad_0058F_12036.pdf (2.15 MB)
Permanent Link(s)
https://doi.org/10.7298/jevv-h547
https://hdl.handle.net/1813/70334
Collections
Cornell Theses and Dissertations
Author
Kulkarni, Neeraj
Abstract

Cloud multi-tenancy, a major contributor to cost efficiency, leads to unpredictable performance due to interference in shared resources, especially for interactive services. As a result, multi-tenancy is either disallowed altogether, degrading utilization, or, at best, interactive services are co-scheduled with low-priority, best-effort workloads whose performance can be sacrificed when necessary. In this dissertation, we propose to improve server utilization by co-scheduling latency-critical applications with batch applications, and to mitigate interference in shared resources using various hardware and software techniques. We specifically explore leveraging approximation, resource partitioning (viz. core relocation, and LLC and memory capacity partitioning), and reconfigurable cores. Approximate computing applications offer the opportunity to enable tighter colocation among multiple applications whose performance is important. We present Pliant, a lightweight cloud runtime that leverages the ability of approximate computing applications to tolerate some loss in their output quality to boost the utilization of shared servers. During periods of high resource contention, Pliant employs incremental, interference-aware approximation to reduce contention in shared resources and prevent QoS violations for co-scheduled interactive, latency-critical services. Reconfigurable cores allow fine-grained power and performance adjustments and open up more opportunities for colocation, as they can adapt to the dynamic needs of a specific mix of co-scheduled applications. Additionally, reconfigurable cores are an attractive solution for managing power among the applications executing on a server node that share the node-wide power budget.
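The incremental, interference-aware approximation described above can be pictured as a simple feedback loop. The sketch below is purely illustrative — the function name, thresholds, and discrete "approximation level" are assumptions for exposition, not the dissertation's actual mechanism:

```python
def control_step(measured_latency, target_latency, level, max_level=5):
    """One step of a hypothetical Pliant-style controller.

    When the latency-critical service nears a QoS violation, the
    approximation level of co-scheduled batch work is raised one step
    (trading output quality for reduced resource contention); when there
    is ample latency slack, quality is incrementally restored.
    """
    if measured_latency > 0.9 * target_latency:    # approaching a QoS violation
        level = min(level + 1, max_level)          # approximate more aggressively
    elif measured_latency < 0.7 * target_latency:  # ample slack
        level = max(level - 1, 0)                  # restore output quality
    return level
```

The key property the sketch captures is incrementality: approximation is adjusted one step at a time in response to measured interference, rather than switched on wholesale.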
We propose CuttleSys, an online resource manager that combines scalable machine learning with fast design space exploration to determine the performance and power of an application across all possible core and cache configurations, and to effectively navigate the large design space to a high-performing solution. CuttleSys combines performance and power inference based on Stochastic Gradient Descent with a highly parallelized design space exploration algorithm geared towards high-dimensional searches. Together, these two techniques efficiently identify a per-core configuration and cache partition that meet QoS for interactive services and maximize throughput for co-scheduled batch workloads, while operating under a power budget.
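At its core, the CuttleSys formulation is a constrained search: maximize predicted performance over (core configuration, cache partition) pairs subject to a power budget. The minimal sketch below enumerates a toy space exhaustively; the dict-based models stand in for the SGD-based inference, and all names are assumptions rather than the system's actual interface:

```python
import itertools

def best_config(perf_model, power_model, core_cfgs, cache_ways, budget):
    """Pick the highest-performing feasible point in a small design space.

    perf_model / power_model map (core_cfg, ways) pairs to predicted
    performance and power; real systems predict these values rather than
    tabulate them, and search the space heuristically rather than
    exhaustively.
    """
    best, best_perf = None, float("-inf")
    for cfg in itertools.product(core_cfgs, cache_ways):
        if power_model[cfg] <= budget and perf_model[cfg] > best_perf:
            best, best_perf = cfg, perf_model[cfg]
    return best
```

In the real design space (per-core settings times cache partitions across many co-scheduled applications), exhaustive enumeration is intractable, which is why the dissertation pairs the learned models with a parallelized exploration algorithm instead.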

Description
163 pages
Date Issued
2020-05
Keywords
approximation • datacenter • latency-critical applications • power management • reconfigurable • resource efficiency
Committee Chair
Albonesi, David
Delimitrou, Christina
Committee Member
Martinez, Jose
Degree Discipline
Electrical and Computer Engineering
Degree Name
Ph.D., Electrical and Computer Engineering
Degree Level
Doctor of Philosophy
Type
dissertation or thesis
Link(s) to Catalog Record
https://catalog.library.cornell.edu/catalog/13254331
