Show simple item record

dc.contributor.author	Ismail, Mohamed
dc.identifier.other	bibid: 11050706
dc.description.abstract	As software becomes more complex and the costs of developing and maintaining code increase, dynamic programming languages such as Python are becoming more desirable alternatives to traditional static languages such as C. Programmers can express more functionality with fewer lines of code and can spend less time debugging low-level bugs such as buffer overflows and memory leaks. Unfortunately, programs written in a dynamic language often execute significantly slower than an equivalent program written in a static language, sometimes by orders of magnitude. This dissertation investigates the following question: How can dynamic languages achieve high performance through HW/SW co-optimization? The first part of the dissertation studies inefficiencies in dynamic languages through a detailed quantitative analysis of the overhead in Python. The study identifies a new major source of overhead, C function calls, for the Python interpreter. Additionally, studying the interaction of the runtime with the underlying processor hardware shows that the performance of Python with JIT depends heavily on the cache hierarchy and memory system. Proper nursery sizing is necessary for each application to optimize the trade-off between cache performance and garbage collection overhead. Based on insights from the study, the software and hardware are co-optimized to improve memory management performance. In the second part of the dissertation, a cache-aware optimization for single-application memory management is presented. Performance and memory bandwidth usage are improved by co-optimizing garbage collection overhead and cache performance for newly initialized and dead objects. Further study shows that less frequent garbage collection results in a large number of cache misses for initial stores to new objects. The problem is solved by directly placing uninitialized objects into on-chip caches without off-chip memory accesses.
Cache performance is further optimized by reducing unnecessary cache pollution and write-backs through a partial tracing algorithm that invalidates dead objects between full garbage collections. The dissertation then focuses on the case of multiple applications running concurrently on a multi-core processor with shared caches. It is shown that the performance of dynamic languages can degrade significantly due to cache contention among multiple concurrent applications that share a cache. To address this problem, program memory access patterns are reshaped by adjusting the nursery size. Both a static and a dynamic scheme are presented that determine good nursery sizes for multiple programs running concurrently.
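The nursery-sizing trade-off described in the abstract can be illustrated, only loosely, with standard CPython: CPython's garbage collector is threshold-based rather than nursery-based, so in this sketch the generation-0 allocation threshold (`gc.set_threshold`) stands in for nursery size. A smaller threshold (smaller "nursery") triggers more frequent minor collections over the same allocation workload; the function names and workload are illustrative, not from the dissertation.

```python
import gc

def churn(n):
    """Allocate n small container objects and keep them alive,
    so CPython's generation-0 allocation counter actually grows."""
    keep = []
    for _ in range(n):
        keep.append([0] * 8)  # small object, typical dynamic-language churn
    return keep

def gen0_collections(threshold0, n=50_000):
    """Count generation-0 collections that occur while allocating
    n objects under the given gen-0 threshold (our nursery stand-in)."""
    old = gc.get_threshold()
    gc.collect()  # start from a clean state
    before = gc.get_stats()[0]["collections"]
    gc.set_threshold(threshold0, old[1], old[2])
    try:
        churn(n)
    finally:
        gc.set_threshold(*old)  # restore the default thresholds
    return gc.get_stats()[0]["collections"] - before

small_nursery = gen0_collections(100)     # small "nursery": frequent, cheap collections
large_nursery = gen0_collections(10_000)  # large "nursery": rare collections, more work each
print(small_nursery, large_nursery)
```

Running this shows many more generation-0 passes under the small threshold, which is the frequency side of the trade-off; the cache-locality side (young objects staying resident in on-chip caches) is what the hardware/software co-design in the dissertation targets and is not visible from Python-level counters.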
dc.rights	Attribution 4.0 International
dc.subject	Computer engineering
dc.subject	dynamic languages
dc.subject	memory management
dc.subject	Computer science
dc.title	Hardware-Software Co-optimization for Dynamic Languages
dc.type	dissertation or thesis
dc.description.degree	Doctor of Philosophy, Electrical and Computer Engineering
dc.contributor.chair	Suh, Gookwon Edward
dc.contributor.committeeMember	Martinez, Jose F.
dc.contributor.committeeMember	Weatherspoon, Hakim


Except where otherwise noted, this item's license is described as Attribution 4.0 International