Fine-Grained Opinion Analysis: Structure-Aware Approaches
Natural language reflects the affective nature of the human mind. Accordingly, expressions of affect and opinion appear profusely in natural language utterances, either explicitly or implicitly. Recognizing and interpreting subjective information, beyond factual information such as topics and events, thereby constitutes an important aspect of natural language understanding. Indeed, recent years have seen a great surge of research interest in helping computers understand the subjective side of natural language. In this dissertation, we explore computational methods that can push the envelope for sentiment analysis in text. There are two distinctive themes in our contributions. First, we focus on fine-grained opinion analysis, which has been less explored than coarse-grained analysis (e.g., document-level classification). Second, the approaches developed in our work are structure-aware in that we design inference and/or learning algorithms that reflect the task-specific linguistic structure. We tackle five different sets of problems under these themes, and the key results are summarized in the paragraphs below.
Joint Extraction of Opinion Elements and Relations: In this work, we present a system for extracting fine-grained opinion elements, such as opinion expressions and the sources of opinions, together with the relations among those elements, using machine learning techniques and integer linear programming. The extracted opinion elements can then be used as building blocks for various opinion applications, such as opinion summarization or opinion-oriented question answering.
Joint Extraction of Opinions and their Attributes: We recognize that the task of determining polarity is related to the task of determining intensity. Based on this observation, we develop a hierarchical sequential learning technique to extract opinion expressions and their attributes (polarity and intensity) simultaneously.
Polarity Inference in Light of Compositional Semantics: In this work, we investigate methods for fine-grained polarity classification by drawing a connection to compositional semantics, one of the classic branches of research across linguistics and logic. This work attempts to bridge the gap between theories in compositional semantics and practical approaches based on machine learning techniques, by incorporating simple compositional rules based on syntactic patterns as structural inference for the learning algorithm.
Lexicon Adaptation as Constraint Optimization: Although there has been plentiful research on the creation of lexical resources for sentiment analysis, most of it is conducted in isolation from actual applications. As a result, a purportedly better lexical resource might not lead to better performance when utilized for a specific natural language application. To address this problem, we develop a method that adapts a general-purpose polarity lexicon into a domain-specific one in the context of a specific NLP task, by casting the problem as a constraint optimization problem using integer linear programming.
Structured Local Training for Coreference Resolution: Once we have identified fine-grained opinion elements in text, we need to determine whether some of the extracted phrases refer to the same entity, a task known as coreference resolution. In this work, we develop "structured local training", a machine learning technique based on Conditional Random Fields (CRFs) that directly incorporates the interaction between local decisions and global decisions into the learning procedure. We also propose "biased potential functions" that can empirically drive CRFs towards performance improvements with respect to the preferred evaluation measure.
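As an illustration of the compositional view of polarity mentioned above, the following is a minimal sketch and not the method developed in the dissertation. It assumes a toy polarity lexicon and a toy list of negators (`LEXICON`, `NEGATORS`, and `compose_polarity` are hypothetical names introduced here), and it applies one hand-written rule, namely that each negator flips the polarity of the expression, in place of the learned, syntax-based rules the abstract refers to.

```python
# Toy resources, assumed for illustration only; not taken from the dissertation.
LEXICON = {"good": 1, "great": 1, "bad": -1, "terrible": -1, "problem": -1}
NEGATORS = {"not", "never", "no", "eliminated"}


def compose_polarity(tokens):
    """Return +1, -1, or 0 for a short expression.

    Rule (an assumption of this sketch): take the polarity of the first
    lexicon word in the expression, then flip it once per negator.
    """
    flips = 0
    polarity = 0
    for tok in tokens:
        word = tok.lower()
        if word in NEGATORS:
            flips += 1
        elif word in LEXICON and polarity == 0:
            polarity = LEXICON[word]
    # An odd number of negators reverses the polarity; an even number keeps it.
    return polarity if flips % 2 == 0 else -polarity


if __name__ == "__main__":
    for phrase in (["not", "good"], ["eliminated", "the", "problem"], ["great"]):
        print(" ".join(phrase), "->", compose_polarity(phrase))
```

Under these assumptions, "eliminated the problem" comes out positive even though its only lexicon word is negative, which is the kind of expression-level effect that a bag-of-words classifier can miss.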