
dc.contributor.author: Jo, Yookyung [en_US]
dc.date.accessioned: 2012-12-17T13:51:07Z
dc.date.available: 2016-12-30T06:46:54Z
dc.date.issued: 2011-08-31 [en_US]
dc.identifier.other: bibid: 7955567
dc.identifier.uri: https://hdl.handle.net/1813/30740
dc.description.abstract: As large-scale digital text collections become abundant, the need to summarize text data automatically by discovering topics and their evolution is well justified, and there is a surge of research interest in the task. We use graphs for topic discovery and topic evolution discovery by mining the statistical properties of graphs associated with the text data. Because an increasing number of text collections have some kind of network associated with the data (text data in social network services, research paper collections, digital text with user browsing history), there is great potential in using graphs for text mining. Our work on topic and topic evolution discovery yields results that differ qualitatively from those of existing approaches: the discovered topics are concrete and vary in size and time dynamics, and the rich topology of topic evolution is captured in the result. We discover topics by mining the correlation between topic terms and the citation graph. This is done by developing a statistical measure, associated with terms, for the connectivity of a document graph. In topic evolution discovery, we capture the inherent topology of topic evolution in a corpus by discovering quantized units of evolutionary change in content and connecting them by summarizing the underlying document network. We note that topic words and nontopic words differ in their distributional properties, and we use this observation to discover topics by constructing a document network. We use the same observation to enhance the quality of topics obtained by Latent Dirichlet Allocation. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: topic [en_US]
dc.subject: evolution [en_US]
dc.subject: network [en_US]
dc.title: Using Graphs For Topic Discovery [en_US]
dc.type: dissertation or thesis [en_US]
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Cornell University [en_US]
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph. D., Computer Science
dc.contributor.chair: Hopcroft, John E [en_US]
dc.contributor.committeeMember: Joachims, Thorsten [en_US]
dc.contributor.committeeMember: Shmoys, David B [en_US]

