Extracting Opinions And Events From Text: Joint Inference Approaches

With the rapid growth of text data on the Web and on personal devices, there is an increasing need to automatically process text and unlock different types of information from it. Opinions and events are two important types of information that appear ubiquitously in text. One represents subjective information, concerning a person's attitudes, beliefs, sentiment, judgements and evaluations, and the other represents factual information concerning what happens in the real world. The ability to extract and interpret opinions and events is essential for many natural language processing (NLP) applications such as news summarization, open-domain question answering, social media analysis, and government document management. While NLP has made great progress on information extraction tasks such as named entity recognition (entities like persons, organizations and locations) and named entity resolution (determining references of entities), much less progress has been made on the extraction of complex information such as opinions and events. Existing methods mostly extract individual components and attributes of opinions and events without accounting for their dependencies. Moreover, they often make phrase- or sentence-level predictions without considering the larger discourse context, such as a document or a conversation. This dissertation presents models that address these two shortcomings. To capture the interdependencies among different information elements, we propose models that can perform joint inference across different but related extraction subtasks, including joint opinion entity extraction and relation extraction, and joint opinion segmentation and attribute classification. Extensive experiments show that joint inference yields significant improvements when compared to standard approaches that combine the subtasks in a pipeline, and achieves state-of-the-art performance on the extraction subtasks.
To facilitate global discourse understanding, we explore machine learning techniques that allow the integration of linguistic evidence at multiple levels of context (at the word, sentence, and document level) into coherent probabilistic models. Specifically, we develop a structured learning approach that can leverage intra- and inter-sentential cues in fine-grained sentiment analysis, and a Bayesian clustering model for event coreference resolution within a document and across documents. In both applications, we demonstrate the advantages of learning from multiple levels of contextual evidence.

Keywords

Opinion extraction; Event extraction; Joint inference


Committee Chair

Cardie, Claire T.

Committee Member

Gehrke, Johannes E.

Degree Discipline

Computer Science

Degree Name

Ph.D., Computer Science

Degree Level

Doctor of Philosophy

dissertation or thesis
