Training Paradigms For Deep Residual Networks
Convolutional networks are the current state of the art for image tasks. It has long been known that depth is key for increasing their expressive power, but many challenges rendered training difficult. With the advent of deep residual networks , the feasible depth of networks has increased from a few dozen to several hundred. Nevertheless several traditional machine learning problems persist such as overfitting, vanishing gradients, and diminishing feature reuse. Additionally, the training time for large networks is still measured in weeks. This thesis will detail two novel approaches for training deep residual networks that address the aforementioned persistent difficulties and present experimental evidence of their efficacy.
deep learning; machine learning; deep residual networks
Van Loan,Charles Francis
M.S. of Computer Science
Master of Science
dissertation or thesis