eCommons

 

INTEGRATING GRAPH AND LANGUAGE DATA IN MACHINE LEARNING MODELS: APPLICATIONS FOR COMPUTATIONAL SOCIAL SCIENCE

Abstract

This dissertation presents three papers demonstrating how integrating graph (network) and language (text) data in machine learning models can enhance computational social science. These two types of data are ubiquitous in the contexts where computational social scientists work (e.g., social media platforms, online markets, and the Web as a whole). Relatively little research has examined how to model network and text data together at scale, partly because models for these data are often computationally expensive and partly because statistical models for them require expert-driven decisions about feature engineering and about how the two data types are related within models. The first paper combines node and text embeddings in a downstream classification model to study mental health dynamics on Reddit. The second paper cascades knowledge graph classifications into a text clustering model to study how demographic confounding causes extreme instances of lifestyle politics, using aggregated Facebook interest data. The third paper uses graph and language data from Amazon to study the spread of political and lifestyle polarization in a large online market and tests how network and morality features explain the presence of lifestyle polarization. Together, the three studies show that integrating graph and language data in machine learning models can facilitate computational social science not only by improving such models’ power, efficiency, and ease of use but also by allowing us to test new hypotheses and explain black box models. The conclusion contextualizes the findings for academia and industry.

Description

173 pages

Date Issued

2021-05

Keywords

Computational Social Science; Graphs; Machine Learning; Natural Language Processing; Networks; Text Analysis

Committee Chair

Macy, Michael W.

Committee Member

Mimno, David
Gilovich, Tom

Degree Discipline

Sociology

Degree Name

Ph. D., Sociology

Degree Level

Doctor of Philosophy

Rights

Attribution 4.0 International

Types

dissertation or thesis
