eCommons


Learning Deep Models with Linguistically-Inspired Structure

dc.contributor.author: Niculae, Vlad
dc.contributor.chair: Cardie, Claire T.
dc.contributor.committeeMember: Haenni, Sabine
dc.contributor.committeeMember: Sridharan, Karthik
dc.date.accessioned: 2018-10-23T13:33:13Z
dc.date.available: 2018-10-23T13:33:13Z
dc.date.issued: 2018-08-30
dc.description.abstract: Many applied machine learning tasks involve structured representations. This is particularly the case in natural language processing (NLP), where the discrete, compositional nature of words and sentences leads to natural combinatorial representations such as trees, sequences, segments, or alignments, among others. It is no surprise that structured output models have been successful and popular in NLP applications since their inception. At the same time, deep, hierarchical neural networks with latent representations are increasingly widely and successfully applied to language tasks. As compositions of differentiable building blocks, deep models conventionally perform smooth, soft computations, resulting in dense hidden representations. In this work, we focus on models with structure and sparsity in both their outputs and their latent representations, without sacrificing differentiability for end-to-end gradient-based training. We develop methods for sparse and structured attention mechanisms, for differentiable sparse structure inference, for latent neural network structure, and for sparse structured output prediction. We find our methods to be empirically useful on a wide range of applications including sentiment analysis, natural language inference, neural machine translation, sentence compression, and argument mining.
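The sparse attention mechanisms the abstract refers to build on sparsemax (Martins & Astudillo, 2016), a Euclidean projection onto the probability simplex that, unlike softmax, can assign exactly zero weight to some inputs. A minimal NumPy sketch of that projection (an illustration of the underlying operation, not code from the dissertation):

```python
import numpy as np

def sparsemax(z):
    """Project score vector z onto the probability simplex.

    Unlike softmax, the result can contain exact zeros, which is what
    makes the attention weights sparse and more interpretable.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]          # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum  # indices kept in the support
    k_z = k[support][-1]                 # size of the support
    tau = (cumsum[support][-1] - 1) / k_z  # threshold
    return np.maximum(z - tau, 0.0)
```

For example, `sparsemax([1.0, 0.5, -1.0])` yields `[0.75, 0.25, 0.0]`: the weights still sum to one, but the lowest-scoring input is zeroed out entirely, whereas softmax would give it a small positive weight.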
dc.identifier.doi: https://doi.org/10.7298/X4SJ1HVQ
dc.identifier.other: Niculae_cornellgrad_0058F_11047
dc.identifier.other: http://dissertations.umi.com/cornellgrad:11047
dc.identifier.other: bibid: 10489636
dc.identifier.uri: https://hdl.handle.net/1813/59540
dc.language.iso: en_US
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computer science
dc.subject: ML
dc.subject: NLP
dc.subject: SparseMAP
dc.subject: sparsity
dc.subject: structure
dc.title: Learning Deep Models with Linguistically-Inspired Structure
dc.type: dissertation or thesis
dcterms.license: https://hdl.handle.net/1813/59810
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph. D., Computer Science

Files

Original bundle
Name: Niculae_cornellgrad_0058F_11047.pdf
Size: 1.8 MB
Format: Adobe Portable Document Format