
dc.contributor.author: Ruch, Alexander Martin
dc.date.accessioned: 2021-09-09T17:41:00Z
dc.date.available: 2021-09-09T17:41:00Z
dc.date.issued: 2021-05
dc.identifier.other: Ruch_cornellgrad_0058F_12453
dc.identifier.other: http://dissertations.umi.com/cornellgrad:12453
dc.identifier.uri: https://hdl.handle.net/1813/109793
dc.description: 173 pages
dc.description.abstract: This dissertation presents three papers demonstrating how integrating graph (network) and language (text) data in machine learning models can enhance computational social science models. These two types of data are ubiquitous across many contexts in which computational social scientists work (e.g., social media platforms, online markets, and the Web as a whole). Relatively little research has analyzed how to model network and text data together at scale, partly since models for these data are often computationally expensive but also because statistical models for them require expert-driven decisions on feature engineering and how they are related within models. The first paper in this dissertation combines node and text embeddings in a downstream classification model to study mental health dynamics on Reddit. The second paper cascades knowledge graph classifications to a text clustering model to study how demographic confounding causes extreme instances of lifestyle politics using aggregated Facebook interest data. Finally, the third paper uses graph and language data from Amazon to study the spread of political and lifestyle polarization in the large online market and tests how network and morality features explain the presence of lifestyle polarization. Together, the three studies show how integrating graph and language data in machine learning models can facilitate computational social science not only by improving such models’ power, efficiency, and ease of use but also by allowing us to test new hypotheses and explain black box models. The conclusion contextualizes findings for academia and industry.
dc.language.iso: en
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computational Social Science
dc.subject: Graphs
dc.subject: Machine Learning
dc.subject: Natural Language Processing
dc.subject: Networks
dc.subject: Text Analysis
dc.title: INTEGRATING GRAPH AND LANGUAGE DATA IN MACHINE LEARNING MODELS: APPLICATIONS FOR COMPUTATIONAL SOCIAL SCIENCE
dc.type: dissertation or thesis
thesis.degree.discipline: Sociology
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph. D., Sociology
dc.contributor.chair: Macy, Michael W.
dc.contributor.committeeMember: Mimno, David
dc.contributor.committeeMember: Gilovich, Tom
dcterms.license: https://hdl.handle.net/1813/59810
dc.identifier.doi: http://doi.org/10.7298/pyqe-xp82



Except where otherwise noted, this item's license is described as Attribution 4.0 International
