Architectural Support for Accelerating Machine Learning Inference
Machine learning has become ubiquitous in recent years, prompting many studies of architectures for accelerating these algorithms. At the same time, the algorithms themselves are evolving rapidly, which means that any accelerator must be efficient not just at the algorithms used today, but also at algorithms that may be developed in the future. This thesis explores the question, ‘Surely specialised accelerators have their place in this landscape, but is there a case to be made for an architecture that can provide competitive performance at low power, while remaining highly programmable and future-proof?’ This work answers that question with a ‘Yes’, presenting VIP (Versatile Inference Processor), a system that employs the well-understood concepts of vector processing and near-data processing, with modest but key modifications that are critical to its performance. Through detailed microarchitecture simulations, it shows that VIP achieves competitive performance on a number of machine learning algorithms, such as belief propagation (BP) on Markov random fields (MRFs), and on deep neural networks (DNNs), including convolutional neural networks (CNNs), multi-layer perceptrons (MLPs) and recurrent neural networks (RNNs). Through synthesis of RTL code for a VIP processing engine (PE), it shows that the hardware requirements for VIP are modest: VIP’s 128 PEs require 18 mm² in area and consume 3.4 W to 4.8 W of power.
Martinez, Jose F.
Batten, Christopher; Zhang, Zhiru
Electrical and Computer Engineering
Ph.D., Electrical and Computer Engineering
Doctor of Philosophy
dissertation or thesis