Cornell University Library

eCommons


Learning Deep Representations for Low-Resource Cross-Lingual Natural Language Processing

File(s)
Chen_cornellgrad_0058F_11362.pdf (2.12 MB)
Permanent Link(s)
https://doi.org/10.7298/agpn-g679
https://hdl.handle.net/1813/67326
Collections
Cornell Theses and Dissertations
Author
Chen, Xilun
Abstract

Large-scale annotated datasets are an indispensable ingredient of modern Natural Language Processing (NLP) systems. Unfortunately, most labeled data is only available in a handful of languages; for the vast majority of human languages, few or no annotations exist to empower automated NLP technology. Cross-lingual transfer learning enables the training of NLP models using labeled data from other languages, which has become a viable technique for building NLP systems for a wider spectrum of world languages without the prohibitive need for data annotation. Existing methods for cross-lingual transfer learning, however, require cross-lingual resources (e.g. machine translation systems) to transfer models across languages. These methods are hence futile for many low-resource languages without such resources. This dissertation proposes a deep representation learning approach for low-resource cross-lingual transfer learning, and presents several models that (i) progressively remove the need for cross-lingual supervision, and (ii) go beyond the standard bilingual transfer case into the more realistic multilingual setting. By addressing key challenges in two important sub-problems, namely multilingual lexical representation and model transfer, the proposed models in this dissertation are able to transfer NLP models across multiple languages with no cross-lingual resources.

Date Issued
2019-05-30
Keywords
natural language processing
Deep Learning
Artificial intelligence
Computer science
Adversarial Neural Networks
Cross-Lingual
Multilingual
Transfer Learning
Committee Chair
Cardie, Claire T.
Committee Member
Kleinberg, Jon M.
Hopcroft, John E.
Degree Discipline
Computer Science
Degree Name
Ph.D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution 4.0 International
Rights URI
https://creativecommons.org/licenses/by/4.0/
Type
dissertation or thesis
