Exploiting Structure For Sentiment Classification
This thesis studies the problem of sentiment classification at both the document and sentence level using statistical learning methods. In particular, we develop computational models that capture useful structure-based intuitions for each task, treating these intuitions as latent representations to be discovered and exploited during learning. For document-level sentiment classification, we exploit structure in the form of informative sentences: those sentences that exhibit the same sentiment as the document and thus explain or support the document's sentiment label. We first show that incorporating automatically discovered informative sentences as additional constraints for the learner improves performance on the document-level sentiment classification task. Next, we explore joint structured models for this task: our final proposed model does not need sentence-level sentiment labels and directly optimizes document classification accuracy using inferred sentence-level information. Our empirical evaluation on two publicly available datasets shows improved performance over strong baselines. For phrase-level sentiment classification, we investigate the compositional linguistic structure of phrases. Specifically, we study compositional matrix-space models, which learn matrix-space word representations and model composition as matrix multiplication. Using a publicly available dataset, we show that the matrix-space model outperforms the standard bag-of-words model on the phrase-level sentiment classification task.
sentiment classification; sentiment analysis; natural language processing
Cardie, Claire T
Hopcroft, John E; Hale, John T
Ph.D., Computer Science
Doctor of Philosophy
dissertation or thesis