Architectural Support for Accelerating Machine Learning Inference
Abstract
Machine learning has become ubiquitous in recent years, prompting many studies of architectures for accelerating these algorithms. At the same time, the algorithms themselves are rapidly evolving, which means that any accelerator must be efficient not only on today's algorithms but also on future algorithms yet to be developed. This thesis explores the question, 'Specialised accelerators clearly have their place in this landscape, but is there a case for an architecture that provides competitive performance at low power while remaining highly programmable and future-proof?' This work answers that question with a 'Yes', presenting VIP (Versatile Inference Processor), a system employing the well-understood concepts of vector processing and near-data processing with modest but key modifications that are critical to its performance. Through detailed microarchitecture simulations, it shows that VIP achieves competitive performance on a range of machine learning algorithms, including belief propagation (BP) on Markov random fields (MRFs) and deep neural networks (DNNs) such as convolutional neural networks (CNNs), multi-layer perceptrons (MLPs), and recurrent neural networks (RNNs). Through synthesis of RTL code for a VIP processing engine (PE), it shows that VIP's hardware requirements are modest: VIP's 128 PEs occupy 18 mm² of area and consume 3.4 W to 4.8 W of power.
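For readers unfamiliar with the MRF workload named in the abstract, the following is a minimal NumPy sketch of the standard sum-product belief propagation message update that such a system evaluates repeatedly. This is the textbook formulation, not code from the thesis; the function name, array shapes, and normalisation step are illustrative assumptions.

```python
import numpy as np

def bp_message(phi_i, psi_ij, incoming):
    """Sum-product message m_{i->j}(x_j) on a pairwise MRF.

    phi_i    : (S,)   unary potential of node i over its S states
    psi_ij   : (S, S) pairwise potential between nodes i and j
    incoming : (K, S) messages m_{k->i} from i's K neighbours k != j
    """
    # Combine the unary potential with the product of incoming messages,
    # then marginalise over x_i: a dense vector-matrix product, the kind
    # of data-parallel kernel a vector processor executes efficiently.
    belief = phi_i * np.prod(incoming, axis=0)   # shape (S,)
    msg = belief @ psi_ij                        # shape (S,)
    return msg / msg.sum()                       # normalise for numerical stability

# Example: 4-state node with 3 uniform incoming messages.
m = bp_message(np.full(4, 0.25), np.random.rand(4, 4), np.full((3, 4), 1.0))
```

Each update reads only a node's local potentials and its neighbours' messages, which suggests why a near-data, vector-oriented design of the kind the thesis proposes would map well onto this workload.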
Committee Member
Zhang, Zhiru