Trace-Based Learning for Agile Hardware Design and Design Automation

Other Titles



Modern computational platforms are becoming increasingly complex to meet the stringent constraints on performance and power. With the larger design spaces and new design trade-offs brought by the complexity of modern hardware platforms, the productivity of designing high-performance hardware is facing significant challenges. The recent advances in machine learning provide us with powerful tools for modeling and design automation, but current machine learning models require a large amount of training data. In the digital design flow, simulation traces are a rich source of information that contains a lot of details about the design such as state transitions and signal values. The analysis of traces is usually manual, but it is difficult for humans to effectively learn from traces that are often millions of cycles long. With state-of-the-art machine learning techniques, we have a great opportunity to collect information from the abundant simulation traces that are generated during evaluation and verification, build accurate estimation models, and assist hardware designers by automating some of the critical design optimization steps. In this dissertation, we propose three trace-based learning techniques for digital design and design automation. These techniques automatically learn from simulation traces and provide assistance to designers at early stages of the design flow. We first introduce PRIMAL, a machine-learning-based power estimation technique that enables fast, accurate, and fine-grained power modeling of IP cores at both register-transfer level and cycle-level. Compared with gate-level power analysis, PRIMAL achieves an average error within 5% while offering an average speedup of over 50x. Secondly, we present Circuit Distillation, a machine-learning-based methodology that automatically derives combinational logic modules from cycle-level simulation for applications with stringent constraints on latency and area. In our case study on network-on-chip packet arbitration, the learned arbitration logic is able to achieve performance close to an oracle policy under the training traffic, improving the average packet latency by 64x over the baselines while only consuming area comparable to three eight-bit adders. Finally, we discuss TraceBanking, a graph-based learning algorithm that leverages functional-level simulation traces to search for efficient memory partitioning solutions for software-programmable FPGAs. TraceBanking is used to partition an image buffer of a face detection accelerator, and the generated banking solution significantly improves the resource utilization and frequency of the accelerator.

Journal / Series

Volume & Issue


141 pages


Date Issued





Effective Date

Expiration Date




Union Local


Number of Workers

Committee Chair

Zhang, Zhiru

Committee Co-Chair

Committee Member

Albonesi, David H.
Sampson, Adrian
Ren, Haoxing

Degree Discipline

Electrical and Computer Engineering

Degree Name

Ph. D., Electrical and Computer Engineering

Degree Level

Doctor of Philosophy

Related Version

Related DOI

Related To

Related Part

Based on Related Item

Has Other Format(s)

Part of Related Item

Related To

Related Publication(s)

Link(s) to Related Publication(s)


Link(s) to Reference(s)

Previously Published As

Government Document




Other Identifiers


Rights URI


dissertation or thesis

Accessibility Feature

Accessibility Hazard

Accessibility Summary

Link(s) to Catalog Record