Summarization And Sentiment Analysis For Understanding Socially-Generated Content
Over the past decades, we have witnessed the emergence of large amounts of socially-generated content, enabled by the widespread use of the Internet and especially of social media websites. Efficiently and effectively extracting useful information and learning knowledge from this content has become a challenging task. Progress has been made in natural language processing to help users understand and absorb knowledge from large volumes of text documents. This dissertation proposes broadly applicable natural language processing techniques that extract key information from massive amounts of heterogeneous textual data in response to users' information queries and present it in a comprehensible way. Concretely, novel automatic summarization approaches are proposed to generate concise and informative responses from large amounts of text to address users' requests. We study textual data ranging from eloquent news articles written by professionals in traditional media, to massive user-generated content in popular social media, to spontaneous conversations containing disfluencies and interruptions. Furthermore, sentiment analysis methods are presented for studying social interactions in online discussions. We aim to discover useful knowledge from informal text and thus obtain a deeper understanding of socially-generated content.
Natural Language Processing; Summarization; Sentiment Analysis
Turnbull,Bruce William; Gehrke,Johannes E.
Ph.D. of Computer Science
Doctor of Philosophy
dissertation or thesis