Complex Networks as an Analytical Framework: Scientific Collaboration Networks and Persistence of Endemic Disease in Heterogeneous Populations

Complex networks have proven to be a versatile framework for understanding systems across many disciplines. This dissertation will use networks in two different contexts to answer a variety of questions.
The first chapter will focus on data-driven studies of scientific publishing practices. The recent availability of large electronic publication data sets has made it possible to perform large-scale empirical studies of science. The first section of this chapter will discuss patterns of text re-use among articles in the arXiv, a large scientific corpus. We show that habitual text re-use is restricted to a minority of authors, and that articles containing large quantities of re-used text tend to be cited less frequently. The second section of the first chapter will study the assembly of scientific co-authorship networks. Previous studies of co-authorship networks have found topological transitions in which the networks coalesce to form a densely connected community. Such studies have relied on manual annotation of publishing data sets, which has restricted their size and scope to only a handful of disciplines. We overcome these limitations by using techniques from natural language processing and machine learning to generate a large population of co-authorship networks representing many different disciplines. Consistent with earlier findings, we observe a similar global topological transition across many different scientific disciplines, suggesting that this is a general property of the development of scientific communities.
The second chapter will use mathematical models to study the persistence of endemic disease in a heterogeneous population. Endemic disease occurs when infection continues to affect a population over an extended period of time instead of dying out following the initial outbreak. Infectious disease modeling can provide important insights into what factors contribute to the persistence of endemic disease. In particular, what role does population heterogeneity play in the persistence of endemic disease? Since the propagation of infectious disease relies on transmission of a pathogen through direct or indirect contact, networks provide an intuitive mathematical framework for modeling the connections between different hosts in a population. Here, we use the stochastic SIRS model to explore the properties of the endemic disease state, and to understand how a population's underlying contact network affects the persistence of endemic disease. Using a combination of computer simulations and analytical techniques, we determine how different model parameters affect the properties of the endemic state. We also uncover a simple phenomenological relationship between the statistical properties of the endemic state and the persistence lifetime that appears to remain robust for a wide range of model parameters and contact networks.
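As an illustration of the kind of model the abstract describes, the following is a minimal sketch of a stochastic SIRS process on a contact network, simulated with a Gillespie-style event loop. It is not the dissertation's code. The function names (`ring_lattice`, `simulate_sirs`), the parameter names (`beta` for the per-contact infection rate, `gamma` for recovery, `rho` for loss of immunity), and the ring-lattice contact network are all assumptions made for this example.

```python
import random

# Each compartment's successor in the S -> I -> R -> S cycle.
NEXT_STATE = {"S": "I", "I": "R", "R": "S"}


def ring_lattice(n, k):
    """Hypothetical toy contact network: node i is linked to its k nearest
    neighbours on each side of a ring (assumes n > 2 * k)."""
    return {i: [(i + d) % n for d in range(-k, k + 1) if d != 0]
            for i in range(n)}


def simulate_sirs(adj, beta, gamma, rho, initial_infected, t_max, seed=None):
    """Run one stochastic SIRS realisation on the network `adj`.

    Returns (t_end, extinct, history), where `extinct` is True if the
    infection died out before t_max and `history` is a list of
    (time, number_infected) pairs.
    """
    rng = random.Random(seed)
    state = {node: "S" for node in adj}
    for node in initial_infected:
        state[node] = "I"

    t = 0.0
    n_inf = sum(1 for s in state.values() if s == "I")
    history = [(t, n_inf)]

    while n_inf > 0:
        # Collect every possible event with its rate.
        events = []
        for node, s in state.items():
            if s == "S":
                # Infection pressure grows with the number of infected contacts.
                rate = beta * sum(1 for nb in adj[node] if state[nb] == "I")
            elif s == "I":
                rate = gamma
            else:
                rate = rho
            if rate > 0:
                events.append((rate, node, NEXT_STATE[s]))
        if not events:
            # Nothing can happen (e.g. gamma == 0 and no susceptible contacts).
            return t_max, False, history

        total = sum(rate for rate, _, _ in events)
        t_next = t + rng.expovariate(total)  # exponential waiting time
        if t_next > t_max:
            return t_max, False, history
        t = t_next

        # Choose one event with probability proportional to its rate.
        pick = rng.random() * total
        chosen = events[-1]  # fallback guards against rounding error
        acc = 0.0
        for event in events:
            acc += event[0]
            if pick < acc:
                chosen = event
                break

        _, node, new_state = chosen
        if new_state == "I":
            n_inf += 1
        elif state[node] == "I":
            n_inf -= 1
        state[node] = new_state
        history.append((t, n_inf))

    return t, True, history


if __name__ == "__main__":
    network = ring_lattice(50, 2)
    t_end, extinct, history = simulate_sirs(
        network, beta=1.0, gamma=0.5, rho=0.2,
        initial_infected=[0, 1], t_max=50.0, seed=0)
    print("extinct:", extinct, "end time:", t_end, "events:", len(history) - 1)
```

Repeating such a run over many seeds and recording the extinction times would give an empirical estimate of the persistence lifetime for a given network and parameter set. The full rescan of all nodes at every step keeps the sketch short but scales poorly; it is a simplification chosen for readability.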

Physics; Complex networks; Computational social science; Infectious disease dynamics; Scientometrics; Stochastic modeling


Committee Chair

Myers, Christopher R.

Committee Member

Ginsparg, Paul Henry
McEuen, Paul L.

Degree Name

Ph. D., Physics

Degree Level

Doctor of Philosophy

Attribution-ShareAlike 2.0 Generic


dissertation or thesis
