Learning Deep Models with Linguistically-Inspired Structure

Author
Niculae, Vlad
Abstract
Many applied machine learning tasks involve structured representations. This is particularly the case in natural language processing (NLP), where the discrete, compositional nature of words and sentences leads to natural combinatorial representations such as trees, sequences, segments, or alignments, among others. It is no surprise that structured output models have been successful and popular in NLP applications since their inception. At the same time, deep, hierarchical neural networks with latent representations are increasingly and successfully applied to language tasks. As compositions of differentiable building blocks, deep models conventionally perform smooth, soft computations, resulting in dense hidden representations. In this work, we focus on models with structure and sparsity in both their outputs and their latent representations, without sacrificing differentiability for end-to-end gradient-based training. We develop methods for sparse and structured attention mechanisms, for differentiable sparse structure inference, for latent neural network structure, and for sparse structured output prediction. We find our methods to be empirically useful on a wide range of applications including sentiment analysis, natural language inference, neural machine translation, sentence compression, and argument mining.
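As a minimal illustration of the sparsity-with-differentiability theme described above (not the dissertation's exact formulation), a sparsemax-style mapping projects attention scores onto the probability simplex and can assign exact zeros to irrelevant inputs while remaining almost-everywhere differentiable. The NumPy sketch below assumes the standard sparsemax projection (Martins and Astudillo, 2016) purely as an example of a sparse alternative to softmax:

```python
import numpy as np

def sparsemax(scores):
    """Euclidean projection of a score vector onto the probability simplex.

    Unlike softmax, the output can contain exact zeros, yielding a sparse
    attention distribution that is still differentiable almost everywhere.
    Illustrative sketch only; not the thesis's specific method.
    """
    z = np.asarray(scores, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    # largest k with 1 + k * z_sorted[k-1] > cumsum[k-1] defines the support
    k_max = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max  # threshold subtracted from scores
    return np.maximum(z - tau, 0.0)

# Peaked scores produce exact zeros, whereas softmax would stay dense.
print(sparsemax([2.0, 1.0, -1.0]))  # -> [1. 0. 0.]
```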
Date Issued
2018-08-30
Subject
Computer science; ML; NLP; SparseMAP; sparsity; structure
Committee Chair
Cardie, Claire T.
Committee Member
Haenni, Sabine; Sridharan, Karthik
Degree Discipline
Computer Science
Degree Name
Ph.D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution 4.0 International
Rights URI
https://creativecommons.org/licenses/by/4.0/
Type
dissertation or thesis