ESSAYS IN POLITICAL ECONOMICS AND
NETWORKS

A Dissertation

Presented to the Faculty of the Graduate School

of Cornell University

in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

by

Julien Manuel Neves

May 2023


© 2023 Julien Manuel Neves

ALL RIGHTS RESERVED


ESSAYS IN POLITICAL ECONOMICS AND NETWORKS

Julien Manuel Neves, Ph.D.

Cornell University 2023

This dissertation consists of three essays on political economics and the role

of social connections. In Chapter 1, I examine how competitive elections, as

measured by margin of victory and fundraising outcomes, affect legislator be-

havior during the following legislative session. Focusing on the U.S. State leg-

islatures, I use two measures of productivity: the number of new bills a legis-

lator sponsored, and the number of new bills excluding bills copied from other

legislatures or model legislation. This allows me to study the extent to which

legislators introduce these bills to appear productive while putting in minimal

effort. I use an instrumental variable approach and find that legislators who win

elections more comfortably introduce fewer copycat bills and more substantive

bills.

In Chapter 2, my coauthors and I propose a new theory of network forma-

tion where agents are not myopic and where their decisions over their social

links and behavior are endogenous. Using a new equilibrium concept, we show

that our model can be estimated using a modified Approximate Bayesian com-

putation method. We showcase our approach with three distinct empirical ex-

amples. The first example focuses on the legislative effectiveness of politicians

in the 111th and 112th U.S. Congresses. The second example looks at R&D

expenditures in the Chemicals and Allied Products industry. The third example, using

the National Longitudinal Study of Adolescent to Adult Health (Add Health)

dataset, looks at peer effects on the educational achievement of adolescents.


In Chapter 3, I study the extent to which judicial influence depends on the

judges’ social connections. Guided by a theoretical model that formalizes the

role of social connection, I document that social connections are a significant

determinant of judge influence. I use the flow of law clerks between judges

from 1995 to 2004 as a measure of social connections, total citations as a proxy

for influence, and I address network endogeneity by using novel data on the

judges’ alumni connections. The results also provide new insights into how

social connectedness interacts with judges’ demographic characteristics.


BIOGRAPHICAL SKETCH

Julien Neves grew up in Montreal, Canada. He graduated from McGill Univer-

sity with a B.A. Joint Honours in Economics and Mathematics in 2015, and an

M.A. in Economics in 2016. From 2017 to 2022, he attended Cornell University to

pursue a Ph.D. in Economics. After his doctoral studies, he will start working

as an economist for Amazon.


This document is dedicated to my better half, Émilie Chiasson, and my

parents, Stéphane Raymond and Marlyn Neves.


ACKNOWLEDGEMENTS

I am grateful to my committee: Marco Battaglini, Eleonora Patacchini, and

Giulia Brancaccio. Their guidance, feedback, and support were instrumental in

shaping this dissertation.

I would also like to thank Aviv Caspi, Matthew Comey, Christa Deneault,

Giulia Olivero, Camille Portier, David Wasser, and many others for their help

and comments. I’m especially indebted to Aviv for his constant support through

the last months of this endeavor.

I thank Joe Walsh for providing data on text reuse in the U.S. State

legislatures, Denise Roth Barber for providing expanded access to the National

Institute on Money In State Politics dataset, and Derek Stafford for providing

data on law clerks’ movements.

I’m grateful to my family and friends for their moral support, kindness,

and encouragement. In particular, I thank my parents, Stéphane Raymond and

Marlyn Neves, for their unwavering support. Lastly, I thank my fiancée, Émilie

Chiasson, who has agreed to embark with me on this long journey.


TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Electoral Experiences and Legislator Behavior
1.1 Introduction
1.2 Data
1.2.1 Background
1.2.2 Data
1.3 Research Design
1.3.1 Instrumenting for Competitiveness Measures
1.3.2 Alternative Instrument
1.4 Results
1.5 Conclusion

2 Dynamic network formation with forward looking agents
2.1 Introduction
2.2 Model
2.2.1 Setup
2.2.2 Equilibrium analysis
2.2.3 Network competitive equilibrium
2.3 Estimation Method
2.3.1 Model Specification
2.3.2 Approximate Bayesian Computation (ABC)
2.4 Simulations
2.5 Empirical Evidence
2.5.1 Legislative Effectiveness in the U.S. Congress
2.5.2 R&D expenditures in the Chemical Industry
2.5.3 Adolescent behavior
2.6 Conclusion

3 Judge Influence and Judicial Networks
3.1 Introduction
3.2 Data
3.2.1 Social Network
3.2.2 Citations data
3.2.3 Characteristics
3.2.4 Alumni Network
3.3 Theory and Empirical Strategy
3.3.1 Model
3.3.2 Estimation
3.4 Estimation Results
3.4.1 Alternative First Step
3.5 Conclusion

A Appendices to Chapter 1
A.1 Additional Figures
A.2 Additional Tables
A.2.1 First Stage Results
A.2.2 Summary Statistics
A.2.3 Alternative Instrument Results

B Appendices to Chapter 2
B.1 Proof of Proposition 1
B.2 Setup of the Simulations in Section 2.4
B.3 Additional Figures
B.4 Additional Tables

C Appendices to Chapter 3
C.1 Additional Figures
C.2 Additional Tables


LIST OF TABLES

1.1 Effect of Being Opposed on Number of Bills Introduced
1.2 Effect of Vote Margin on Number of Bills Introduced
1.3 Effect of Campaign Contributions on Number of Bills Introduced
1.4 Effect of Campaign Contributions Split by Source on Number of Bills Introduced

2.1 Network Formation for the Legislature Example
2.2 Estimation Results for the Legislature Example
2.3 Estimation Results for the Legislature Example - Comparison of Nested Models -
2.4 Counterfactual Analysis - Alumni Connections -
2.5 Counterfactual Analysis - Ideological Extremism -
2.6 Network Formation for the R&D Example
2.7 Estimation Results for the R&D Example
2.8 Estimation Results for the R&D Example - Comparison of Nested Models -
2.9 Network Formation for the Adolescent Behavior Example
2.10 Estimation Results for the Adolescent Behavior Example
2.11 Estimation Results for the Adolescent Behavior Example - Comparison of Nested Models -

3.1 Network Formation
3.2 Estimation Results

A.1 First Stage Relationship for Electoral Competitiveness Measures
A.2 Summary Statistics
A.3 Effect of Being Opposed on Number of Bills Introduced: Alternative Instrument
A.4 Effect of Vote Margin on Number of Bills Introduced: Alternative Instrument
A.5 Effect of Campaign Contributions on Number of Bills Introduced: Alternative Instrument
A.6 Effect of Campaign Contributions Split by Source on Number of Bills Introduced: Alternative Instrument

B.1 Summary Statistics for the Legislature Example
B.2 Summary Statistics for the R&D Example
B.3 Summary Statistics for the Adolescent Behavior Example

C.1 Estimation Results Decomposed by Categories
C.2 Estimation Results for the Mean-Log Citations
C.3 Estimation Results for Different Alumni Networks
C.4 Horse Race of Centrality Measures
C.5 Summary Statistics


LIST OF FIGURES

1.1 Example of Model Legislation from ALEC
1.2 Legislative Influence Detector Example: Wisconsin Senate Bill 179 (2015) v. Louisiana Senate Bill 593 (2012) (Burgess et al., 2016)
1.3 Number of Copycat Bills by State
1.4 Share of Copycat Bills by State
1.5 Distribution of Vote Margin Conditional on Being Opposed

2.1 Estimated Posterior Distributions for the Simulated Example - Key Variables -
2.2 Estimated Bias by Number of Agents n

3.1 Law Clerks Movements Network Between 1995 and 2004
3.2 Total Number of Citations by Judge
3.3 Alumni Network - One-Year Window -

A.1 Distribution of Total Campaign Contributions Received by Legislators
A.2 Distribution of Vote Margin

B.1 Estimated Posterior Distributions for the Legislative Example - Key Variables -
B.2 Estimated Posterior Distributions for the Legislative Example - Control Variables -
B.3 Estimated Posterior Distributions for Legislative Example - Control Variables (Continued) -
B.4 Estimated Posterior Distributions for the R&D Example - Key Variables -
B.5 Estimated Posterior Distributions for the R&D Example - Control Variables -
B.6 Estimated Posterior Distributions for the R&D Example - Control Variables (Continued) -
B.7 Estimated Posterior Distributions for the Adolescent Behavior Example - Key Variables -
B.8 Estimated Posterior Distributions for the Adolescent Behavior Example - Control Variables -
B.9 Estimated Posterior Distributions for the Adolescent Behavior Example - Control Variables (Continued) -

C.1 Alumni Network - Same Graduating Class -
C.2 Alumni Network - Four-Year Window -
C.3 Estimated Network Effects Using EGRM


CHAPTER 1

ELECTORAL EXPERIENCES AND LEGISLATOR BEHAVIOR: CAMPAIGN

DONATIONS, ELECTION CLOSENESS, AND COPYCAT LEGISLATION

1.1 Introduction

Elected officials balance time between their substantive work and preparing for

reelection. Ideally, representatives would “run on their record,” incentivizing

them to focus on the needs of their constituents. However, in reality, reelection

tasks including fundraising, political advertising, and campaigning distort the

time elected officials spend on their official duties. A key question is thus what

determines how elected officials balance their time between substantively leg-

islating and preparing for reelection. Answering this question is complicated

by the difficulty of measuring legislator productivity: some measures are con-

taminated by broader forces outside the individual’s control, and others can be

manipulated by legislators seeking to appear productive.

In this paper, I study the extent to which election margin of victory and

fundraising outcomes affect legislator behavior during the following legisla-

tive session. I use the number of new bills a legislator sponsored or introduced

as a measure of productive behavior. Crucially, I separately identify bills that

were copied from other state legislatures or model legislation (commonly called

“copycat bills”) and study the extent to which legislators introduce these bills to

appear productive with minimal effort. I use an instrumental variable approach

and find that legislators who win elections more comfortably – measured by

running unopposed, having larger vote margins, and raising donations success-

fully – introduce more substantive bills and fewer copycat bills. Legislators who

win on closer margins or struggle to raise funds from donors introduce more

copycat bills and fewer substantive ones. The natural explanation is that these

legislators rely on copycat bills to signal productivity while actually focusing

their efforts on solidifying their reelection prospects. I find evidence consistent

with this hypothesis: these vulnerable legislators, who struggled with fundrais-

ing in the previous election, raise more money in their first year of office than

their peers who won more comfortably.

To study this question, I construct a new dataset linking measures of electoral

closeness to measures of individual legislator productivity, which I describe in

Section 1.2. I link data on over 500,000 bills submitted to U.S. state legislatures

between 2009 and 2016 compiled by Burgess et al. (2016), who identify 45,405

instances of copycat legislation, with metadata on the bills themselves from

LegiScan. I match these data with information about sponsoring legislators from

Vote Smart, campaign contributions from the Center for Responsive Politics, and

electoral outcomes from the State Legislative Election Returns dataset compiled

by Klarner (2018). These linkages allow me to separately analyze details that

prove key, including the effects on sponsoring different types of bills and the

effects of donations from different sources.

The second advantage of this paper is its empirical strategy, which enables

causal estimates of the impact of election closeness. The ideal experiment

would randomly vary how comfortably a given legislator won their election and

then compare, in the following legislative session, the behavior of legislators

who won comfortably with that of legislators who won narrowly. My instrumental

variable (IV) strategy, which I detail in Section 1.3, approximates this

experiment by exploiting variation in electoral closeness coming from partisan

waves within a given state. I measure election closeness in three ways: first,

whether a candidate ran unopposed; second, the candidate’s margin of victory in

votes; and third, the amount of money the candidate raised. I ask to what extent

each of these instrumented measures of closeness causes legislators to propose

more substantive bills or more copycat bills.
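
To illustrate the logic of an instrumental variable estimate, the following is a minimal sketch on simulated data. It is not the specification estimated in this chapter: the variable names (wave, margin, bills), the coefficients, and the single-instrument ratio estimator are all assumptions chosen for illustration. An unobserved confounder biases the naive regression slope, while the ratio of the reduced form to the first stage recovers the simulated effect.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate an instrument, an endogenous regressor, and an outcome.

    All magnitudes are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    wave, margin, bills = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # instrument: partisan wave (hypothetical scale)
        u = rng.gauss(0, 1)  # unobserved confounder, e.g. legislator quality
        x = 0.8 * z + u + rng.gauss(0, 1)  # endogenous closeness measure
        y = true_effect * x - 3.0 * u + rng.gauss(0, 1)  # outcome: bills introduced
        wave.append(z)
        margin.append(x)
        bills.append(y)
    return wave, margin, bills


wave, margin, bills = simulate()

# Naive OLS slope: biased because the confounder moves both margin and bills.
ols_estimate = covariance(margin, bills) / covariance(margin, margin)

# Just-identified IV estimator: reduced form divided by first stage.
iv_estimate = covariance(wave, bills) / covariance(wave, margin)

print(f"OLS: {ols_estimate:.2f}  IV: {iv_estimate:.2f}  simulated effect: 2.00")
```

In this simulation the instrument is valid by construction, since it affects the outcome only through the closeness measure. In the actual application, that exclusion restriction is an assumption that has to be argued rather than checked.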

I report the results in Section 1.4. I find that legislators who face closer

elections introduce fewer substantive bills and more copycat bills in the following

legislative session. These results are consistent regardless of whether I measure

election closeness by the likelihood of running opposed or the margin of votes.

Legislators who won closer elections sponsor fewer substantive bills and sub-

stitute towards copycat bills more than their peers who won comfortably.

Candidates might infer electoral vulnerability not only from their most

recent vote margin or ability to clear the field of opponents, but also from their

success in raising donations. I find that raising an additional $100K in an election

leads to a legislator introducing 5.7 more substantive bills and 0.2 fewer copycat

bills in the following legislative session. Observing the identity of donors allows

me to disaggregate these effects by the type of dollar raised. As one might ex-

pect, higher donations only lead to the effects associated with electoral security

when those dollars flow from external donors. Higher contributions from the

candidate themselves lead to fewer bills being introduced. Insofar as having to

contribute one’s own resources to an election campaign is a sign of weakness,

this is unsurprising. These results suggest that legislators react to close elections

by focusing less on substantively legislating in the following session. It appears

that they mask this substitution by sponsoring bills copied from other sources.

Overall, this is a troubling picture of electoral politics. Many decry the rates

at which incumbents run unopposed and win reelection as a failure of civic en-

gagement. How can representatives be held accountable to their constituents if

they do not face vigorous challenges? However, these results show that close

electoral contests lead to less productive legislators. One silver lining of this

work is the extent to which new machine learning methods allow for better

identification of shirking legislators, specifically those that rely on copied legis-

lation to signal productivity.

Related Literature

This paper is closest in spirit to a strand of literature that studies the

incentives behind the legislative production process. Starting from the electoral

connection theory of Mayhew (1974), it can be argued that politicians are mainly

motivated by re-election and that producing legislation is a tool for signalling

their competence. In that vein, Giommoni et al. (2022) propose a model of

legislative production and show how restricting the possibility of cosponsoring

can induce overproduction of low-quality legislation. Using a reform in the

96th Congress that removed the hard cap on the number of cosponsors for a bill

in the House of Representatives, they find that limited cosponsorship

opportunities caused an overproduction of poor-quality legislation. Similarly,

I find that legislators who faced a competitive race tend to produce more

low-quality legislation. Furthermore, Gratton et al. (2021) propose a model

showing that legislators experiencing political instability tend to introduce

a larger number of low-quality laws, which can lead to a decrease in

bureaucratic efficiency. This is also consistent with the results of this

paper if we take electoral competitiveness as a near individual-level measure

of electoral instability.

Another closely related literature uses term limits to study the impact of

electoral incentives on legislative behavior (see, for example, Fouirnaies and

Hall (2022) and Dal Bó and Rossi (2011)). Dal Bó and Rossi (2011) find

that longer terms promote more effort, but are skeptical that elections are the

primary mechanism. They conclude that job stability promotes effort. I arrive

at a similar conclusion through a different set of findings. As mentioned pre-

viously, I find that elections do, in fact, impact behavior – specifically that elec-

tions which signal increased job security promote legislator effort. Fouirnaies

and Hall (2022) find evidence that legislators in U.S. State legislatures who can

no longer seek reelection are overall less productive, e.g., sponsor fewer bills.

More broadly, this paper relates to the literature on legislative effectiveness

and cosponsorship activity, e.g., Volden and Wiseman (2014), Battaglini, Sciabo-

lazza and Patacchini (2020), Battaglini, Patacchini and Rainone (2022), Miquel

and Snyder (2006), Anderson, Box-Steffensmeier and Sinclair-Chapman (2003),

Frantzich (1979), Bratton (2005), Campbell (1982). While multiple measures of

effectiveness have been proposed, I differ from the rest of this literature by look-

ing at the number of bills that had sections copied from elsewhere.

One measure of electoral competitiveness I use in this paper is the vote

margin. A growing literature tries to relate the share of votes a legislator

garners to legislative productivity (see, among others, Barber and Schmidt

(2019) and Schmidt and Young (2017)), but there is no clear consensus. Schmidt

and Young (2017) find that an absence of electoral competition results in a

13 percent drop in overall legislative productivity. On the other hand, Barber

and Schmidt (2019) find significant evidence of a positive relationship between

primary vote share and legislative effectiveness among members of the U.S.

House of Representatives. This paper contributes to the debate by finding a

relationship similar to that of Barber and Schmidt (2019).

My paper is also related to the vast literature on the role of money in politics

(see, for example, Stratmann (2005) and Ansolabehere, de Figueiredo and Sny-

der (2003) for a meta-analysis). While a large portion of this literature tries to

evaluate how politicians can be influenced,¹ I instead focus on using campaign

contributions to evaluate how legislators might feel about the precarity of their

position and the effect on their productivity.

This paper is also related to the text reuse literature, particularly work on

copycat legislation. As mentioned previously, thanks to machine learning,

researchers have been able to identify examples of text reuse in state

legislatures; see Garrett and Jansa (2015), Hertel-Fernandez and Kashin (2015),

Hertel-Fernandez (2014), and Burgess et al. (2016). Another example is Pagliari

and Young (2020), who look at instances of text reuse among the comment letters

submitted by special interest groups on policy proposals in the EU. Closer to

this paper, Linder et al. (2018) use the bill-to-bill text reuse data from

Burgess et al. (2016) and find that ideologically close legislators tend to

exhibit a high degree of text reuse.

¹ For example, Battaglini and Patacchini (2018) show that campaign donations are correlated
with how central some politicians are, and Bertrand et al. (2020) find evidence that
charitable giving by corporations is correlated with politicians getting seats on committees.

1.2 Data

1.2.1 Background

While legislators wear many hats, the primary role they serve is to introduce

and pass laws in their respective legislatures. For a piece of legislative text to

be introduced to a house, senate, or assembly, it usually needs one or many

sponsors. Depending on the type and nature of the bill introduced, the bill will

usually be sent to a committee for evaluation and possible editing. Afterward,

the bill may be advanced to a vote.

While legislators are the ones introducing bills by sponsoring them, the true

authorship of a bill is generally unclear. For instance, a bill can be written

directly by the legislator, by their staff, by some other member of the party, by

the staff of a bipartisan legislative drafting service, by a lobbyist or special

interest group (SIG), or by a legislator in another state. In this paper, I pay attention to

two particular sources of legislation: model legislation and copycat legislation.

Model legislation is usually a piece of legislation written by a third party

with the intent that it be included in bills proposed by legislators. While the

intent behind some model legislation is to ensure some uniformity in the law

across states, its primary use is to provide a ready-to-use piece of legislation

tailored to help certain businesses and SIGs. Therefore, model legislation is

usually written by lobbyists or SIGs, and it is not usually disseminated to the

public. However, groups like the American Legislative Exchange Council (ALEC)

or the American Legislative and Issue Campaign Exchange (ALICE) have decided to

make some of their model legislation publicly available on their websites.

Figure 1.1 shows an example of such model legislation, taken directly from

ALEC’s website. This particular excerpt concerns a tax credit for long-term

care, but the range of topics that model legislation covers is vast.

Figure 1.1: Example of Model Legislation from ALEC

Copycat legislation is legislation that was copied directly from model

legislation or from a bill introduced in another state. Using machine learning,

Burgess et al. (2016) developed the Legislative Influence Detector, a tool to

identify pieces of legislative text that closely match. Matches are based on the

Smith-Waterman local alignment algorithm (see Smith and Waterman (1981) for

more details). Because the alignment is local, the tool can match segments of a

bill with segments of model legislation rather than having to compare entire

documents. Figure 1.2 shows how the tool from Burgess et al. (2016) can match

text from two different bills. Note that while the wording looks similar, it is

not identical. For example, the word “hydranencephaly” is misspelled in one of

the bills in Figure 1.2.

Figure 1.2: Legislative Influence Detector Example: Wisconsin Senate Bill 179
(2015) v. Louisiana Senate Bill 593 (2012) (Burgess et al., 2016)
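
To make the idea of local alignment concrete, the following is a minimal sketch of Smith-Waterman scoring applied to word tokens. It is not the Legislative Influence Detector's implementation: the scoring parameters (match, mismatch, gap), the word-level tokenization, and the example sentences are all assumptions made for illustration, so the scores are not comparable to those reported by Burgess et al. (2016).

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between token sequences a and b.

    Scoring parameters are illustrative assumptions, not those of any
    particular tool. Only two rows of the dynamic programming table are
    kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                    # start a fresh local alignment
                prev[j - 1] + step,   # align the two tokens
                prev[j] + gap,        # skip a token of a
                curr[j - 1] + gap,    # skip a token of b
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical snippets of bill text, split into word tokens.
bill_a = "the attending physician shall inform the patient of the diagnosis".split()
bill_b = "a physician shall inform the patient in writing of the diagnosis".split()
unrelated = "this act takes effect on the first day of july".split()

score_similar = smith_waterman_score(bill_a, bill_b)
score_unrelated = smith_waterman_score(bill_a, unrelated)
print(score_similar, score_unrelated)
```

Because every cell is floored at zero, a long shared passage scores highly even when the surrounding text differs or when a few words are inserted, which is what makes the method suitable for detecting reused sections rather than whole-document copies.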

There are multiple reasons why a state legislator would resort to copying

legislation word for word, but one key aspect of this practice is that it is easy

to do. For instance, a politician who wants to pass a bill on reproductive rights

could spend time and resources writing a bill tailored to their liking. This is

an expensive process for politicians, especially for those who are

resource-constrained and have no background in law. According to the National

Conference of State Legislatures (NCSL), only four states can be classified as

having a legislature with full-time politicians who are well-paid and have a

large staff: California, Michigan, New York, and Pennsylvania (National

Conference of State Legislatures, 2017). This is in stark contrast to the

federal level where, for instance, each legislator has access to the Office of

the Legislative Counsel, which offers legislative drafting services. It is,

therefore, reasonable to expect state legislators to rely on model legislation

or copycat legislation.

1.2.2 Data

Text reuse

I use two main measures of productivity in the legislative process during a

session: the number of bills sponsored, and the share of those bills that are

copied in part from either model legislation or another state. For the latter, I

use the data compiled by Burgess et al. (2016). Covering the period from 2009 to

2016, the authors collected more than 2,400 pieces of model legislation written

by lobbyists and analyzed 500,000 state bills for matches. Among these bills,

Burgess et al. (2016) found 45,405 instances of text reuse between state bills,

and 14,137 bills that were directly copied from model legislation. The authors

graciously provided these data, which pair bills with other bills and with model

legislation and report the Smith-Waterman local alignment score of each pair.

The higher the score, the more similar the matched passages are.²

There is a lot of variation in the use of copycat legislation across states. For

example, Figure 1.3 shows how many times a bill introduced in one state had

previously been introduced in another state over the sample period. Legislators

from Mississippi are the most likely to rely on copycat legislation.

² I use a threshold of 1000 for the alignment score to classify whether a bill contains copied parts.

Figure 1.3: Number of Copycat Bills by State

One issue with Figure 1.3 is that it does not account for how many bills each

state introduces. For instance, in the 2015-2016 regular session, New York

introduced 18,534 bills, while Kansas introduced only 1,459. Figure 1.4

therefore plots the number of instances where a bill was flagged as having been

copied from another state as a share of the total amount of legislation

introduced.
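
The difference between counting copycat bills and normalizing by the number of bills introduced can be sketched as follows. All records and totals below are hypothetical, and treating the alignment-score threshold of 1000 as inclusive is an assumption; the sketch only shows how a state can rank low by count yet high by share.

```python
THRESHOLD = 1000  # alignment-score cutoff; treating it as inclusive is an assumption

# HYPOTHETICAL records: (state, bill identifier, best alignment score).
best_scores = [
    ("Kansas", "KS-1", 1500),
    ("Kansas", "KS-2", 400),
    ("New York", "NY-1", 1200),
    ("New York", "NY-2", 2500),
    ("New York", "NY-3", 90),
]

# HYPOTHETICAL totals of bills introduced in each state.
total_bills = {"Kansas": 4, "New York": 40}


def count_flagged(records, threshold=THRESHOLD):
    """Count, by state, the bills whose best score reaches the threshold."""
    counts = {}
    for state, _bill_id, score in records:
        counts.setdefault(state, 0)
        if score >= threshold:
            counts[state] += 1
    return counts


flagged = count_flagged(best_scores)
shares = {state: flagged.get(state, 0) / total for state, total in total_bills.items()}

ranking_by_count = sorted(flagged, key=flagged.get, reverse=True)
ranking_by_share = sorted(shares, key=shares.get, reverse=True)
print(ranking_by_count, ranking_by_share)
```

With these made-up numbers the larger state leads on the raw count while the smaller state leads on the share, mirroring the change in ranking between the two figures.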

Figure 1.4: Share of Copycat Bills by State

Figure 1.4 shows that, for example, while Mississippi is still high up in

the ranking in terms of relying on copycat legislation, with roughly 13% of its

bills having sections copied from another bill, it is surpassed by another state:

Kansas.

Bills and Legislators

I supplement the data provided by Burgess et al. (2016) by matching the bill

numbers from that dataset to the metadata on state legislative bills from LegiS-

can. LegiScan is a legislative tracking and reporting service with information on

most legislation passed or presented in each of the 50 states from 2007 to today.

I collect all the information available on more than 1.3 million bills introduced

in the different states’ legislatures. The data includes information such as the

status of a bill, the date it was introduced, its general content, etc. More im-

portantly, the data also contains information on the cosponsorship of bills and

roll-call details.

I am able to match roughly 80% of the Burgess et al. (2016) dataset with the

LegiScan data. The main source of discrepancy is that some bills introduced

before 2010 are not present in LegiScan.

Note that while LegiScan has a lot of information on legislators, it is

missing some important variables (age, race, incumbency, etc.). To remedy this

situation, I obtain state legislators’ individual characteristics from data provided

by Vote Smart, an organization looking to provide unbiased information on candidates

to all Americans. Using their API, I collected information on more than

15,000 legislators in the United States and merged it with the LegiScan data previously

described.

The Vote Smart data has some gaps; in particular, there are numerous missing

values for the birth year of legislators. To patch some of the missing values,

I scrape data from Wikipedia for every legislator. This is particularly impor-

tant to keep the bulk of legislators in my regressions. Table A.2 provides some

summary statistics on the dataset.

Campaign Contribution

For campaign contributions, I use data from the Center for Responsive Politics

and the National Institute on Money in Politics on every individual contribu-

tion to state legislators from 2008 to 2020. This amounts to roughly 1.8 million

observations for about 35,000 political races. I aggregated over each candidate



by taking the total amount of donations over a given election cycle. However,

I distinguish donations from the party and from the candidate themselves. Fig-

ure A.1 in the appendix shows the distribution of donations across candidates.

This donation data is matched to the legislative data discussed previously.

Election Results

To evaluate the closeness of a race in terms of votes, I use the data collected by

Klarner (2018). This dataset contains general election results from 1976 to 2016

in all 50 states. It also contains results from some party primaries. From this

data, I create margins by comparing the share of the vote of every candidate that

won an election to the vote share of the closest loser. For example, this means

that if a candidate wins an election with a share of 50% against candidates with

30% of the vote and 20% of the vote, respectively, the margin would be 20%. In

other cases, for instance, in West Virginia House of Delegates District 19, where

the top four candidates get a seat, it becomes a little more challenging. In 2010,

in that election, six candidates ran for those four positions. The top four were

Democrats with vote shares ranging from 21.6% to 18.1%. The candidates who

lost had 10.7% and 10.1% of the vote shares, respectively. In this case, the margin

I would measure for the winners would range from 10.9% to 7.4%. If a candidate

runs unopposed in the general election, I set the winning margin equal to 100%.
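The margin rule described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function name `winning_margins`, the candidate labels, and the two middle vote shares in the West Virginia example (20.0 and 19.0) are hypothetical, since the text reports only the highest and lowest winning shares.

```python
def winning_margins(vote_shares, seats=1):
    """Map each winner to (own vote share - vote share of the closest loser).

    vote_shares: dict of candidate -> vote share in percent.
    seats: number of seats filled in the race (more than 1 in multi-member districts).
    A race with no losing candidate is treated as unopposed (margin of 100).
    """
    ranked = sorted(vote_shares.items(), key=lambda kv: kv[1], reverse=True)
    winners, losers = ranked[:seats], ranked[seats:]
    if not losers:
        return {name: 100.0 for name, _ in winners}
    closest_loser_share = losers[0][1]
    return {name: round(share - closest_loser_share, 1) for name, share in winners}


# Single-member example from the text: 50% against 30% and 20%.
single = winning_margins({"A": 50.0, "B": 30.0, "C": 20.0})

# Four-seat example; W2 and W3 shares are hypothetical placeholders.
multi = winning_margins(
    {"W1": 21.6, "W2": 20.0, "W3": 19.0, "W4": 18.1, "L1": 10.7, "L2": 10.1},
    seats=4,
)
```

In the four-seat case every winner is compared with the same closest loser (10.7%), which reproduces the 10.9% to 7.4% range reported above.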

For districts that are safe for either Republicans or Democrats, the results of the general

election do not convey how competitive the races really are. The real challenge

usually happens in the primaries. Therefore, I use the minimum winning mar-

gin in an election cycle for each candidate, comparing primaries to general elec-

tions. This means that a candidate running unopposed in the general election



but who won their primary by 4% would have a vote margin of 4%. I create

a dummy variable for whether a candidate faced any opposition, equal to one if the

candidate’s winning vote margin is below 90%.
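As a small sketch of the two rules above, assuming margins are already expressed in percent: the function names and the handling of a missing primary (passed as `None`) are illustrative assumptions, while the 90% cutoff and the 4% example come from the text.

```python
def cycle_margin(primary_margin, general_margin):
    """Minimum winning margin over the primary and general election of one cycle.

    A margin of None means that no result is available for that contest
    (an assumption made for this sketch).
    """
    margins = [m for m in (primary_margin, general_margin) if m is not None]
    if not margins:
        raise ValueError("no election result available for this cycle")
    return min(margins)


def opposed(margin, cutoff=90.0):
    """Dummy equal to 1 if the winning margin is below the cutoff."""
    return int(margin < cutoff)


# Unopposed in the general election (100%) but won the primary by 4%.
margin = cycle_margin(primary_margin=4.0, general_margin=100.0)
flag = opposed(margin)
```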

Figure 1.5 shows the distribution of winning margins across legislators in

my sample conditional on being opposed.

Figure 1.5: Distribution of Vote Margin Conditional on Being Opposed

A companion figure in the appendix plots the distribution of vote margin across

legislators including unopposed races. While it is clear from Figure 1.5

that the frequency goes down as the vote margin increases, the appendix figure

shows that around a 90% vote margin, the frequency goes back up.



1.3 Research Design

My empirical approach is to regress legislative behavior (e.g., the number of

bills cosponsored or copied) on different measures of election closeness (e.g.,

total contributions, winning vote margin, or excess donations). For each legis-

lator i in party p in state s at a time t, I consider the following specification for

the outcome of interest Yitps:

$$Y_{itps} = \alpha + \phi \cdot \text{Competitiveness Measure}_{itps} + X_{itps}\beta + \lambda_t + \delta_p + \xi_s + \epsilon_{itps} \tag{1.1}$$

where Xitps is a vector of controls including incumbency status, gender, role,

tenure, age, and if the legislator has a law degree, and ϵitps is the error term. The

period t is defined as the election cycle. Most variables are measured during

that particular election cycle, except the outcome, Yitps, which I measure over the

next two years. For example, if an election happened in 2012, I measure the

legislative activity over 2013 and 2014. I do this to keep Yitps consistent across

different legislatures with varying term lengths3.

1.3.1 Instrumenting for Competitiveness Measures

The main source of concern about using the aforementioned specification is that

bill sponsorship is correlated with donations or margin of victory through other

channels unrelated to competitiveness. While I control for some important

drivers of legislative productivity and electoral success, such as incumbency,

I cannot rule out that both variables are affected by some unobserved

confounders. This means that the OLS estimate of Equation (1.1) could be biased.
3The most common term length is 2 years for house members (44 states), but it can go up to 4

years in some states (6 states). The reverse holds true for state senators: 31 states have term
lengths of 4 years, 12 states have 2 years, and 7 states have varying lengths.



I seek an instrument that predicts margin of victory and total donations, but

that is unrelated to other determinants of bill sponsorship. I propose using the

average margin of victory, average donations, and average excess donations for

other candidates of the same party within the same state, excluding legislator i,

as an instrument for these same variables for i, i.e.,

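$$\widetilde{\text{Competitiveness Measure}}_{itps} = \frac{1}{n-1} \sum_{j \neq i} \text{Competitiveness Measure}_{jtps} \tag{1.2}$$

where the sum runs over the other legislators j of party p in state s at time t, and n is the number of such legislators including i.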
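The leave-one-out average in Equation (1.2) can be computed without looping over pairs, by subtracting each legislator's own value from the group total. The sketch below assumes the grouping is by state, party, and election cycle, as described in the text; the record layout, field names, and example values are hypothetical.

```python
from collections import defaultdict


def leave_one_out_means(rows):
    """Return, for each row, the mean of `value` over the other rows in its group.

    rows: list of dicts with keys "state", "party", "cycle", "value".
    Groups with a single member have no peers, so their instrument is None.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        key = (r["state"], r["party"], r["cycle"])
        totals[key] += r["value"]
        counts[key] += 1

    instrument = []
    for r in rows:
        key = (r["state"], r["party"], r["cycle"])
        n = counts[key]
        if n > 1:
            instrument.append((totals[key] - r["value"]) / (n - 1))
        else:
            instrument.append(None)
    return instrument


# Hypothetical vote margins (in percent) for four legislators.
rows = [
    {"state": "NY", "party": "D", "cycle": 2012, "value": 10.0},
    {"state": "NY", "party": "D", "cycle": 2012, "value": 20.0},
    {"state": "NY", "party": "D", "cycle": 2012, "value": 30.0},
    {"state": "KS", "party": "R", "cycle": 2012, "value": 15.0},
]
instrument = leave_one_out_means(rows)
```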

By taking the average margin of victory and donations in the same state, I

break the relationship between an individual lawmaker’s own experience and

instead capture variations in donations/margin of victory that are due to com-

mon party factors, such as influence by common interest groups and state cau-

cus support structures.

Using this instrument, I estimate Equation (1.1) using a Two-stage Least

Squares (2SLS) approach, where the first stage uses the following specification:

$$CM_{itps} = \tilde{\alpha} + \varphi \cdot \widetilde{CM}_{itps} + X_{itps}\beta + \lambda_t + \delta_p + \xi_s + \epsilon_{itps} \tag{1.3}$$

where CMitps is the competitiveness measure and C̃Mitps is the constructed instrument.

Table A.1 reports the results of estimating Equation (1.3) for the different

competitiveness measures and instruments. For every instrument, the null

hypothesis of a weak instrument is rejected, and each instrument is highly predictive of its

respective competitiveness measure.
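To illustrate why the two-stage procedure matters, the following toy simulation compares OLS and 2SLS when an unobserved confounder drives both the regressor and the outcome. It is a deliberately simplified sketch with one regressor, one instrument, and no controls or fixed effects; the data-generating process, variable names, and the true coefficient of 2 are all assumptions made for the example and have no connection to the estimates in this chapter.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope from a univariate regression of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)


def tsls_slope(z, x, y):
    """Two-stage least squares with one instrument z for one regressor x."""
    first_stage = ols_slope(z, x)                       # stage 1: x on z
    mz, mx = mean(z), mean(x)
    x_hat = [mx + first_stage * (zi - mz) for zi in z]  # fitted values of x
    return ols_slope(x_hat, y)                          # stage 2: y on fitted x


rng = random.Random(0)
n = 20000
u = [rng.gauss(0, 1) for _ in range(n)]                  # unobserved confounder
z = [rng.gauss(0, 1) for _ in range(n)]                  # instrument, independent of u
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]  # endogenous regressor
y = [2.0 * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

ols_estimate = ols_slope(x, y)     # biased upward by the confounder
iv_estimate = tsls_slope(z, x, y)  # close to the true coefficient of 2
```

In this design the OLS slope converges to 3 rather than 2, because the confounder enters both equations, while the 2SLS slope uses only the variation in x that comes from z.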



1.3.2 Alternative Instrument

Since the instrument I propose is defined at the party, state and time t level, it

is reasonable to expect that there might not be enough variation to exploit. To

remedy this situation, I also propose an alternative instrument based on exploit-

ing neighboring legislative districts. Let i be the legislator that wins the election

at time t in district d. Using state legislative district shapefiles from the Census

Bureau, I create a set Nitps of candidates in districts that share a boundary with

to d. I then take the average value of the competitiveness measures of those

candidates, i.e.,

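$$\widetilde{\text{Competitiveness Measure}}_{itps} = \frac{1}{n} \sum_{j \in N_{itps}} \text{Competitiveness Measure}_{jt} \tag{1.4}$$

where n is the size of the set Nitps and the summand is the competitiveness measure of neighboring candidate j at time t. The advantage of this instrument compared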

to Equation (1.2) is that it provides more variation. The main drawback is that

it is not robust to regional unobserved factors that could drive both legislative

outcomes and the electoral process. I provide the results of the analysis with

this particular instrument in the appendix.
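Once district adjacency is known, the neighbor-based instrument in Equation (1.4) reduces to a simple average. The sketch below assumes that adjacency has already been extracted from the shapefiles into a dictionary; the district labels, margins, and function name are hypothetical.

```python
def neighbor_means(values, neighbors):
    """Average the competitiveness measure over each district's neighbors.

    values: dict of district -> competitiveness measure of its winner.
    neighbors: dict of district -> list of districts sharing a boundary.
    Districts whose neighbors have no recorded value get None.
    """
    instrument = {}
    for district, adjacent in neighbors.items():
        observed = [values[j] for j in adjacent if j in values]
        instrument[district] = sum(observed) / len(observed) if observed else None
    return instrument


# Hypothetical vote margins (in percent) and adjacency for three districts.
values = {"d1": 12.0, "d2": 30.0, "d3": 6.0}
neighbors = {"d1": ["d2", "d3"], "d2": ["d1"], "d3": ["d1"]}
instrument = neighbor_means(values, neighbors)
```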

1.4 Results

In Table 1.1, I show that running opposed leads to a legislator introducing nearly

21 fewer substantive bills and 1.4 more copycat bills. It is also interesting to note

that incumbents, compared to challengers, are more likely to rely on copycat

legislation and introduce more substantive bills.



Table 1.1: Effect of Being Opposed on Number of Bills Introduced

                                    Number of copycat bills     Number of bills
                                    OLS           IV            OLS           IV
                                    (1)           (2)           (3)           (4)

Opposed (Yes = 1)                   0.024         1.418***      −2.696**      −21.281**
                                    (0.025)       (0.185)       (1.354)       (9.478)
Incumbency Status (Incumbent)       0.111**       0.413***      9.421***      5.402*
                                    (0.047)       (0.064)       (2.546)       (3.262)
Incumbency Status (Open)            0.0003        0.108**       −0.071        −1.508
                                    (0.050)       (0.055)       (2.711)       (2.815)
Party (Independent)                 −0.803***     −0.810***     −32.513***    −32.414***
                                    (0.196)       (0.206)       (10.551)      (10.586)
Party (Republican)                  −0.077***     −0.132***     −6.935***     −6.201***
                                    (0.023)       (0.025)       (1.220)       (1.279)
Party (Third-party)                 −0.126        −0.229        −10.726       −9.356
                                    (0.181)       (0.191)       (9.755)       (9.811)
Role (Senator = 1)                  0.183***      0.178***      1.770         1.835
                                    (0.024)       (0.025)       (1.301)       (1.306)
JD                                  0.078***      0.069**       0.989         1.104
                                    (0.028)       (0.030)       (1.515)       (1.521)
Gender (Male = 1)                   −0.074***     −0.041        −8.971***     −9.415***
                                    (0.025)       (0.026)       (1.337)       (1.360)
Tenure                              −0.005***     0.001         0.329***      0.248**
                                    (0.002)       (0.002)       (0.095)       (0.103)
(Intercept)                         1.455         0.390         48.727        62.916
                                    (1.033)       (1.096)       (55.591)      (56.231)

State Fixed Effects                 Yes           Yes           Yes           Yes
Year Fixed Effects                  Yes           Yes           Yes           Yes

Observations                        28,639        28,639        28,639        28,639
R2                                  0.283         0.206         0.530         0.527
Adjusted R2                         0.281         0.204         0.529         0.526
Residual Std. Error (df = 28571)    1.751         1.843         94.254        94.565

Note: IV denotes the instrumental variable estimate. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the
2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and ***
indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

In Table 1.2, I show that the effect persists even among legislators who ran

opposed. Winning by an additional percentage point of vote share means that a legislator

introduces 1.9 more substantive bills in the following session. The effect of vote

margin on copycat bills is imprecisely estimated, but the coefficient is negative,

as would be expected.



Table 1.2: Effect of Vote Margin on Number of Bills Introduced

                                    Number of copycat bills     Number of bills
                                    OLS           IV            OLS           IV
                                    (1)           (2)           (3)           (4)

% Margin                            0.001         −0.003        0.158***      1.924***
                                    (0.001)       (0.003)       (0.036)       (0.191)

Controls                            Yes           Yes           Yes           Yes
State Fixed Effects                 Yes           Yes           Yes           Yes
Year Fixed Effects                  Yes           Yes           Yes           Yes

Observations                        19,099        19,099        19,099        19,099
R2                                  0.294         0.293         0.540         0.483
Adjusted R2                         0.291         0.290         0.538         0.481
Residual Std. Error (df = 19031)    1.731         1.732         90.374        95.777

Note: IV denotes the instrumental variable estimate. Legislators with no competitor are dropped. Columns (1) and (3) report the OLS estimates using the electoral competitiveness
measure directly. Columns (2) and (4) report the 2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients
are reported in parentheses. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

In Table 1.3, I show that raising an additional $100K in an election leads to

a legislator introducing 5.8 more substantive bills and 0.2 fewer copycat bills

in the following legislative session. Donations can come from many sources,

though. In Table 1.4, I disaggregate the sources of donations between dollars

coming from outside donors, state parties, and candidates themselves. I find

that raising additional dollars from oneself does not carry the same effect as

raising from donors. Specifically, an additional $100K from outside donors in-

creases the number of substantive bills introduced by 13.2, but $100K from one-

self decreases the number of substantive bills by 19.0. This is what we would

expect: raising more money from oneself is no signal of electoral security. In

fact, it is likely the opposite since the candidate was unable to raise money externally.

Thus, it is no surprise that success in raising money from donors leads

candidates to dedicate more time to substantive legislating, while having to contribute

from their own wealth leads candidates to legislate less, allowing them

to focus on reelection.



Table 1.3: Effect of Campaign Contributions on Number of Bills Introduced

                                    Number of copycat bills     Number of bills
                                    OLS           IV            OLS           IV
                                    (1)           (2)           (3)           (4)

Total Contributions ($100,000)      −0.013***     −0.189***     −0.167        5.765***
                                    (0.004)       (0.037)       (0.202)       (1.930)

Controls                            Yes           Yes           Yes           Yes
State Fixed Effects                 Yes           Yes           Yes           Yes
Year Fixed Effects                  Yes           Yes           Yes           Yes

Observations                        28,639        28,639        28,639        28,639
R2                                  0.283         0.228         0.530         0.516
Adjusted R2                         0.281         0.226         0.529         0.515
Residual Std. Error (df = 28571)    1.751         1.817         94.260        95.672

Note: IV denotes the instrumental variable estimate. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the
2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and ***
indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

Table 1.4: Effect of Campaign Contributions Split by Source on Number of Bills
Introduced

                                                Number of copycat bills     Number of bills
                                                OLS           IV            OLS           IV
                                                (1)           (2)           (3)           (4)

Total contributions from donors ($100,000)      −0.013**      −0.057        −0.070        13.242***
                                                (0.005)       (0.046)       (0.278)       (2.438)
Total contributions from party ($100,000)       −0.032***     −0.491***     −2.627***     −6.580
                                                (0.010)       (0.102)       (0.564)       (5.459)
Total contributions from candidate ($100,000)   0.006         −0.273        2.017***      −18.972*
                                                (0.013)       (0.199)       (0.704)       (10.645)

Controls                                        Yes           Yes           Yes           Yes
State Fixed Effects                             Yes           Yes           Yes           Yes
Year Fixed Effects                              Yes           Yes           Yes           Yes

Observations                                    28,639        28,639        28,639        28,639
R2                                              0.283         0.210         0.531         0.488
Adjusted R2                                     0.281         0.209         0.529         0.486
Residual Std. Error (df = 28569)                1.750         1.837         94.218        98.436

Note: IV denotes the instrumental variable estimate. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the 2SLS results using
the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and *** indicate statistical significance at
the 10, 5 and 1 percent levels, based on the p-value.

Table A.3 and Table A.4 in the Appendix report the results of running the

analysis on being opposed and vote margin using the alternative instrument

construction. The effects I find are consistent with Table 1.1 and Table 1.2. Likewise,

Table A.5 and Table A.6 in the Appendix show the results for the remaining analyses using

the alternative instrument. The results are in line with Table 1.3 and Table

1.4.

1.5 Conclusion

In this paper, I find evidence that legislators who faced a competitive election

tend to be less productive in a meaningful way in the subsequent session.

I find that these legislators rely more on copycat legislation and introduce

fewer substantive bills overall. This holds whether we measure the competitiveness

of an election through vote margin, whether the candidate ran unopposed, or

the amount of money raised for that particular election cycle.

These findings paint a somewhat bleak picture of American politics; one

where a desirable outcome, such as having competitive races, leads to less ef-

fective legislators. To make policy recommendations to remedy the situation,

we need to understand how legislators spend their time while in the legislative

session. For instance, does the threat of losing the next election drive legislators

to spend more time fundraising instead of writing substantive bills, doing constituent

service work, or establishing themselves as key figures in their party?

Answering this question is crucial to better understand the bottlenecks of the

legislative process and how we can improve it.



CHAPTER 2

DYNAMIC NETWORK FORMATION WITH FORWARD LOOKING

AGENTS

2.1 Introduction

Social connections play a crucial role in shaping relevant observed outcomes.

For example, previous work finds that social connections shape how adolescents

partake in risky behavior, how effective legislators are as lawmakers in

Congress, and labor outcomes. However, social networks are rarely fully

observable in the data: accurate record keeping of friends, colleagues, foes, and

the intensity of those connections is usually unavailable. In some cases, infor-

mation on social connections is completely nonexistent. Furthermore, estab-

lishing social connections is inherently a dynamic process, e.g., being friends

today makes it easier to maintain that relationship tomorrow. The traditional

approach of the literature has been to focus on static models of social networks

and to assume that the true social network can be approximated by some prox-

ies, such as having attended the same school or sharing some observable char-

acteristics such as gender or race. In this paper, we build and estimate a model

that addresses both of these limitations.

To do so, we propose a new theory of network formation where agents are

not myopic and where their decisions over their social links and behavior are endogenous.

Under some conditions, our model has a unique equilibrium prediction

that depends solely on a finite set of structural parameters, which sidesteps

the curse of dimensionality of recovering every link in the network separately.

Building on the method proposed by Battaglini, Patacchini and Rainone (2022),



we show that our model can be estimated and that we can recover the true value

of the structural parameters under simulations. To illustrate the applicability of

our approach, we estimate three distinct empirical examples. The first example

focuses on the 111th and 112th U.S. Congress and the legislative effectiveness

of lawmakers. The second application looks at R&D expenditures from 2006 to

2011 in the Chemicals And Allied Products industry. Lastly, we look at the edu-

cational achievement of adolescents using the National Longitudinal Study of

Adolescent to Adult Health (Add Health) dataset.

Our model is divided into T periods. Each of these periods has two stages.

In the first stage, each player decides how much costly effort to spend on establishing

social connections. One contribution of this paper is to assume

that this cost also depends on the links formed in the previous period, t − 1. In the second

stage of the game, players choose how much effort to exert to achieve some

outcome, for example, passing bills for legislators or achieving good grades for

adolescents, taking the social links established in the previous stage as given.

To solve and evaluate this model, we rely on two methodological concepts.

First, we introduce a new equilibrium concept, i.e., the Dynamic Network Com-

petitive Equilibrium (DNCE). In our model, when agents change their level of

effort, there is not only a direct effect for that particular change but also indirect

spillover effects through their connections. For instance, in the case of the U.S.

Congress, a legislator who decides to spend more time writing bills will de facto

increase their effectiveness in the legislative production process. This change

in effectiveness will trickle down to increasing the effectiveness of their social

connections in Congress, which will, in turn, add to the original increase in effectiveness

for the legislator, and so on. This cascading renders the



game almost impossible to solve. This is further complicated in our setting since

efforts in one period will affect other periods indirectly. To address these issues,

we extend the concept of Network Competitive Equilibrium (NCE) introduced by

Battaglini, Patacchini and Rainone (2022). The idea of the NCE is that agents act

as “price-takers”, i.e., that players do not internalize the indirect spillover effects

on other players of their choice. We show how this allows us to characterize the

equilibrium of our game with a system of nonlinear equations that depends on

the structural parameters of the model and outcome levels.

Second, using the system of nonlinear equations we derived, we use

Bayesian methods to estimate our model. Because we cannot derive a closed-

form likelihood function, standard Bayesian methods are not available to us.

We instead use a modified version of the Approximate Bayesian Computation (ABC)

method described by Battaglini, Patacchini and Rainone (2022). To test the

validity of our approach, we simulate our model with a simple data-generating

process. We show that we can recover the key structural parameters. We also

show that as the number of agents, n, increases, our posterior distributions concentrate

around the true values of our parameters.
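For readers unfamiliar with ABC, the basic idea can be sketched with a textbook rejection sampler: draw parameters from the prior, simulate data, and keep the draws whose simulated summary statistic falls close to the observed one. The example below is not the modified ABC procedure of Battaglini, Patacchini and Rainone (2022) and does not simulate the network model; the Gaussian toy model, the uniform prior, the tolerance, and all names are assumptions chosen for illustration.

```python
import random


def abc_rejection(observed, simulate, summary, prior_draw, tol, n_draws, rng):
    """Keep prior draws whose simulated summary is within tol of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        if abs(summary(simulate(theta, rng)) - s_obs) <= tol:
            accepted.append(theta)
    return accepted


def simulate(theta, rng, n=50):
    """Toy data-generating process: n draws from a normal with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic: the sample mean."""
    return sum(data) / len(data)


rng = random.Random(1)
TRUE_THETA = 1.5  # hypothetical parameter used to generate the "observed" data
observed = simulate(TRUE_THETA, rng)

posterior_draws = abc_rejection(
    observed,
    simulate,
    summary,
    prior_draw=lambda r: r.uniform(-5.0, 5.0),
    tol=0.1,
    n_draws=5000,
    rng=rng,
)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

No likelihood is evaluated at any point, which is what makes the approach usable when only simulation from the model is feasible. A tighter tolerance gives a better approximation at the cost of fewer accepted draws.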

As mentioned previously, we apply our method to three distinct empirical

settings. To keep our model tractable, we focus only on two periods in each

example. The first example estimates peer effects on lawmakers’ legislative ef-

fectiveness in the 111th and 112th U.S. Congress. Controlling for a set of legisla-

tors’ individual characteristics, we find evidence that social connections matter

in driving effectiveness, as measured by the Legislative Effectiveness Scores de-

veloped by Volden and Wiseman (2014). Additionally, we find that the dynamic

component of our model matters and that it improves the fit significantly com-



pared to the static approach of Battaglini, Patacchini and Rainone (2022). We

also provide two counterfactual exercises. First, we evaluate the model when

shutting down the enduring “old boy” networks as measured by the alumni

connections in the network formation process. We show that eliminating alumni

connections produces no meaningful change in the network centralities of

legislators across different groups (party, race, gender, etc.). We also propose

an exercise where we cull ideologically extreme legislators in the 112th

U.S. Congress. To accomplish this, we change the political ideology, as measured

by the DW-NOMINATE score, of the most extreme legislators to the median

DW-NOMINATE score and measure the effect of that change on legislative effectiveness.

Perhaps surprisingly, this exercise does not imply a significant difference

in legislative effectiveness.

Our second application exploits R&D investment from 2006 to 2011 in the

Chemicals And Allied Products industry. Following the approach of Hsieh, König and Liu

(2022), we estimate our model using R&D expenditures as our outcome

variable and the productivity measured by the lagged stock of R&D expenditures

as one of the characteristics we control for. Our estimates show no evidence

of network effects in this particular example.

And lastly, we look at the behaviors of adolescents using the National Lon-

gitudinal Study of Adolescent to Adult Health (Add Health) dataset. In this

paper, we focus on educational achievement. We use the average GPA in a

given year as the relevant outcome in our setting. First, we reaffirm the strong

evidence that social network effects matter in the context of adolescent educa-

tional achievement. Second, we show that having a dynamic model is crucial in

explaining GPA differences among adolescents.



The remainder of this paper is organized as follows. Section 2 introduces

our model of behavior and formation of social connections. Section 3 presents

the econometric specification and the estimation method used to estimate our

model. Section 4 presents a simulation to showcase the performance of the pro-

posed approach. Section 5 applies the model to three distinct empirical appli-

cations. Section 6 concludes. In the rest of this section, we review the related

literature.

Related literature

The literature on estimating network effects in economics is extensive, but

the vast majority of the research has two limitations: models are usually static

and social networks are taken as exogenous. For the former limitation, a few recent

papers have tried to tackle the problem of myopic agents (see, e.g., Ozgur, Bisin

and Bramoullé (2018), Arduini et al. (2019)). Ozgur, Bisin and Bramoullé (2018)

demonstrate various theoretical results for social interactions models with lin-

ear dynamic economies. With a more general network topology setting, Arduini

et al. (2019) propose a model with forward-looking agents to jointly study social

network effects and smoking addiction. We contribute by incorporating

both the agent’s behavior and choice of social links endogenously. Our ap-

proach entirely avoids having to rely on observing an exogenous network to re-

cover social interactions effects in a dynamic setting. Essentially, the only thing

we need to make inference is a vector of observable outcomes.

Assuming social networks are exogenous has two core issues. First, as de-

scribed before, there is an issue of endogeneity between the behavioral decisions

that determine the outcome we are interested in and the social network. There

has been a growing literature in recent years trying to address this issue of endogeneity

when estimating network effects (see, e.g., Auerbach (2022), Battaglini,

Patacchini and Rainone (2022), Battaglini, Sciabolazza and Patacchini (2020),

Canen, Jackson and Trebbi (2022), de Paula, Rasul and Souza (2018), Goldsmith-

Pinkham and Imbens (2013), Hsieh, König and Liu (2022), Rose (2019), Johnsson

and Moon (2021)). Second, social links are often unobservable. The usual

approach has been to rely on proxies for the social links, e.g., cosponsorship

network for legislators, friendship nominations for adolescents, and law clerk

movements for federal judges. Battaglini et al. (2020), Battaglini, Patacchini and

Rainone (2022), De Paula, Richards-Shubik and Tamer (2018), and Rose (2019),

among others, have proposed alternatives. For instance, Battaglini et al. (2020),

De Paula, Richards-Shubik and Tamer (2018), and Rose (2019) propose high-

dimensional estimation techniques to estimate social networks. They first rely

on assuming an exogenous linear model of behavior, which assumes away the

endogeneity of the network. They also require that the network is sufficiently

sparse and fixed over many repeated observations, which in our context is too

limiting since we estimate our dynamic model over two periods. Therefore, to

avoid those limitations, this paper extends the work of Battaglini, Patacchini

and Rainone (2022) to now allow for non-myopic agents.

In this paper, we apply our method to three different empirical settings.

Our first example relates to the recent literature on studying how social net-

works influence legislative behavior (see, e.g., Fowler (2006), Kirkland (2011),

Battaglini and Patacchini (2018), Battaglini, Sciabolazza and Patacchini (2020),

Battaglini, Patacchini and Rainone (2022), Canen, Jackson and Trebbi (2022)).

Fowler (2006) and Kirkland (2011), using cosponsorship as the proxy for the

social ties between legislators, showed that centrality correlates with effectiveness.

Battaglini, Sciabolazza and Patacchini (2020) follow up on that work by



providing a two-step method to control for endogeneity between cosponsorship

and legislative effectiveness by using a Heckman-type correction where the in-

strument for the cosponsorship network is the alumni network. This approach

has the shortfall of ignoring key structural network properties of the underly-

ing social network. Battaglini, Patacchini and Rainone (2022) improve on this

aspect with the proposed model of endogenous network formation1. We further

expand on this by introducing dynamic effects in this paper and show that they

are relevant and improve the fit of the model.

Our second empirical example relates to the literature exploring network

effects from R&D collaborations (see, e.g., König, Liu and Zenou (2019), Konig,

Liu and Hsieh (2021), Hsieh, König and Liu (2022), Wang and Yang (2022), Cam-

inati (2021), Zacchia (2020), Dawid and Hellmann (2020), Arqué-Castells and

Spulber (2022)). Closest to our approach is Hsieh, König and Liu (2022) who de-

rive a structural model for the coevolution of networks and behavior and apply

it in the context of R&D spending and joint venture decisions in the chemical

and pharmaceutical industry. Our paper differs mainly in two ways. First,

in our setting, the true network G is unobservable. Second, firms are forward-

looking players making decisions over two periods.

Our final empirical setting is based on the vast literature on estimating peer

effects on adolescent behaviors. Among others, Hanushek et al. (2003), Angrist

(2004), Kang (2007), Boucher et al. (2014), Calvó-Armengol, Patacchini and

Zenou (2009), and Patacchini, Liu and Rainone (2013) use networks based on

school membership, classroom membership or self-reported friendship nomi-

1Canen, Jackson and Trebbi (2022) also propose and estimate a model of endogenous network
formation based on the model proposed by Cabrales, Calvó-Armengol and Zenou (2011).
The main difference is that in their approach social efforts are not targeted, i.e., a legislator
chooses a single level of effort to connect with any other legislator, regardless of party identity,
for instance.



nations to evaluate the peer effects on behaviors ranging from smoking to school

performance. In our context, we are closest to the setting of Calvó-Armengol,

Patacchini and Zenou (2009), who look at school performance using friendship

nominations as the social connections network. Using proxies for the underly-

ing social network has issues. First, these networks may be omitting important

links. This is particularly salient when defining the network using classroom or

school membership because it restricts friendship to only one setting. But, even

with the friendship nominations, we might be missing key links if the self-report

does not disclose every friendship. Moreover, it is almost impossible to report

the strength of a connection. The second issue is that using only one network

as our proxy for social connections is limiting. Researchers might want to com-

bine different sources of information to infer social connections, e.g., classroom

membership, friendship nominations, gender, and neighbors, but it is hard to

know the optimal combination. Our paper, apart from having a model that

allows for endogeneity and forward-looking agents, also provides a way to in-

clude information from multiple adjacency matrices to endogenously choose

the importance of particular data characteristics in the formation of the social

networks.

Finally, for our estimation method, it is important to mention the work of

König (2016) and Boucher (2020) who also rely on ABC methods to estimate

models of network formation. Unlike this paper, König (2016) and Boucher

(2020) estimate probability distributions over networks with the underlying as-

sumption that a particular network realization is observed. To achieve this,

König (2016) uses the ABC method with a particular set of summary statistics

and estimates spillover effects using patent and coauthorship data in physics

and economics, while Boucher (2020) takes a similar approach but applied to

30


network effects to explain homophily among high school friends.

2.2 Model

2.2.1 Setup

Consider a set N = {1, ..., n} of n agents. Agents live

for T periods, and in each period each agent cares about a particular outcome

denoted as Yi,t for t = 1, ...,T . The goal of each agent is to maximize the utility

generated from that outcome. The type of outcome the agents care about can

vary depending on the setting. For example, in the U.S. Congress, legislators

care about the number of bills they pass. Another setting would be a researcher

trying to maximize the number of papers they produce every year.

To maximize its utility, the agent chooses an amount of effort to exert to

produce Yi,t. We assume that Yi,t is an increasing function of this effort and the Yi,t

of all the agents with whom i is socially connected at t. Specifically, we assume

the following “production function” for Yi,t:

\[ Y_{i,t} = \rho \cdot (s_{i,t})^{\alpha} (l_{i,t})^{1-\alpha} + \varepsilon_{i,t} \tag{2.1} \]

The Cobb-Douglas form in (2.1) captures the effects of agent i’s effort li,t and level of

“social connectedness” si,t. We assume that i’s social connectedness is

\[ s_{i,t} = \sum_{j \in N} g_{i,j,t} Y_{j,t}, \tag{2.2} \]

where gi, j,t is a measurement of the social link between i and j. The implication

of (2.2) is that the level of effort exerted by j affects i through the degree of social



connection of i to j at t. The second term, εi,t, is some individual idiosyncratic

factor that contributes to Yi,t efficacy independently from any social connections

or effort choice. Players observe εi,t, but we do not. In the analysis below, we

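assume \(g_{i,i,t} = 0\), \(g_{i,j,t} \in [0, \bar{g}]\) with \(\bar{g} > 0\), \(\varepsilon_{i,t} \in [\underline{\varepsilon}, \bar{\varepsilon}]\) with \(\underline{\varepsilon} > 0\), \(\bar{\varepsilon} \in (0, 1)\), and \(l_{i,t} \in [0, \bar{l}]\) with \(\bar{l} > 0\). Additionally, we will maintain the following assumption

that bounds Yi,t between 0 and 1:

Assumption 1. \(\rho \cdot \bar{g}^{\alpha} \cdot \bar{l}^{1-\alpha} + \bar{\varepsilon} < 1\).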

These assumptions on the parameters and functional form are only made for

convenience. Other functional forms can be considered.
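As a minimal illustration of (2.1) and (2.2), the sketch below evaluates the social connectedness and the outcome of one agent. The function names and the numerical values are hypothetical and chosen only for illustration; they are not taken from the estimation code.

```python
def social_connectedness(g_row, outcomes):
    """s_{i,t} in (2.2): link-weighted sum of the agents' outcomes."""
    return sum(g * y for g, y in zip(g_row, outcomes))


def production(rho, alpha, s, effort, eps):
    """Y_{i,t} in (2.1): Cobb-Douglas in connectedness and effort, plus a shock."""
    return rho * (s ** alpha) * (effort ** (1.0 - alpha)) + eps


# Hypothetical three-agent example for agent i = 0 (note g_{0,0} = 0).
g_row = [0.0, 0.5, 0.2]
outcomes = [0.3, 0.4, 0.6]
s = social_connectedness(g_row, outcomes)  # 0.5 * 0.4 + 0.2 * 0.6
y = production(rho=0.3, alpha=0.5, s=s, effort=0.25, eps=0.1)
```

Holding the links and the other agents' outcomes fixed, raising `effort` raises `y`, which is the monotonicity assumed in the text.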

In this model, the agents’ effort levels \(l = \{l_{1,t}, ..., l_{n,t}\}\), outcomes \(Y = \{Y_{1,t}, ..., Y_{n,t}\}\), and the social adjacency matrices \(G_t = (g_{i,j,t})_{i,j \in N}\) are endogenous variables. In each period t, an agent i is forward looking and selects \(l_{i,t}\) and \(g^i_t = (g_{i,1,t}, ..., g_{i,n,t})\) to maximize the expected discounted outcomes over the T periods. At t = T, agent i selects \(l_{i,T}\) and \(g^i_T\) to maximize \(Y_{i,T}(l_{i,T}, g^i_T)\), i.e., the outcome in period T only. At t < T, the agent selects \(l_{i,t}\) and \(g^i_t\) to maximize:

\[ u_{i,t}(l_{i,t}, g^i_t) = Y_{i,t}(l_{i,t}, g^i_t) + E\left[ \sum_{\tau=t+1}^{T} \beta_d^{\tau-1} \, Y_{i,\tau}\left( l_{i,\tau}(G_{\tau-1}), g^i_{\tau}(G_{\tau-1}) \right) \right] \tag{2.3} \]

where \(l_{i,\tau}(G_{\tau-1})\) and \(g^i_{\tau}(G_{\tau-1})\) are the equilibrium values of effort and network connections at τ given the network formed in the previous period, \(G_{\tau-1}\). The key feature of our model is that players are not myopic, i.e., players recognize that decisions at time t affect future periods.

In every period, \(l_{i,t}\), \(g^i_t\), and \(Y_{i,t}\) for all i are determined in two stages. At t.2,

the agents choose their efforts li,t, taking the social connections Gt as given. The

cost of effort is assumed to be represented by a linear function Li(li,t) = c · li,t,

where c is some cost parameter. At t.1, agents link with other agents to increase



the social component of their production function for the outcome of interest.

At this stage, the agents simultaneously choose the social links gi, j,t. We assume

that at t.1, agent i decides with which other agent j ∈ N\i he or she wishes to

establish a link gi, j,t. A link with j at time t depends exclusively on i’s effort. The

cost of establishing this link with intensity gi, j,t is given by the following:

\[ C(g_{i,j,t}, g_{i,j,t-1}, \theta_{i,j,t}) = \frac{\lambda}{1+\lambda} \left( \frac{g_{i,j,t}}{\zeta(g_{i,j,t-1}, \theta_{i,j,t-1}) + \theta_{i,j,t}} \right)^{1+\frac{1}{\lambda}}, \tag{2.4} \]

where \(\zeta(g_{i,j,t-1}, \theta_{i,j,t-1})\) is a function increasing in \(g_{i,j,t-1}\) and \(\theta_{i,j,t-1}\), so that links with higher values at t−1 decrease the cost at t; and \(\theta_{i,j,t}\) is a variable that captures the degree to which the types of i and j are socially “compatible” (the more i and j are socially compatible, the lower the cost for i to establish a link with intensity \(g_{i,j,t}\) with j). This cost may be interpreted as, for example, the cost of the time spent socializing with j, the number of meetings between companies to establish a joint venture, or the time that legislator i’s staff needs to spend with legislator j’s staff to coordinate actions. We assume here (although this is not required) that \(\zeta(g_{i,j,0}) = \zeta_0\), i.e., the cost in the first period, when we have no preceding network, is constant; and \(\zeta(g_{i,j,t}, \theta_{i,j,t}) = \zeta_1 \cdot (g_{i,j,t} + \theta_{i,j,t})\) for t ≥ 1. If \(\zeta(\cdot) = 0\), then we are back to the static version of this model.
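The cost function (2.4) and the carry-over term ζ can be sketched as follows. This is only an illustration under the functional forms assumed above; the function names, the handling of a zero denominator, and the numbers are hypothetical choices made for the example.

```python
def zeta(g_prev, theta_prev, zeta0, zeta1, first_period):
    """Carry-over term: zeta_0 with no preceding network, else zeta_1 * (g + theta)."""
    if first_period:
        return zeta0
    return zeta1 * (g_prev + theta_prev)


def link_cost(g, carry, theta, lam):
    """Cost (2.4) of a link of intensity g given carry-over `carry` and compatibility `theta`."""
    base = carry + theta
    if base <= 0.0:
        # Assumption of this sketch: with no carry-over and no compatibility,
        # only a zero-intensity link is free.
        return 0.0 if g == 0.0 else float("inf")
    return lam / (1.0 + lam) * (g / base) ** (1.0 + 1.0 / lam)


# Same target intensity, with and without a link in the previous period.
new_cost = link_cost(0.2, zeta(0.0, 0.0, 0.0, 0.5, False), theta=1.0, lam=1.0)
kept_cost = link_cost(0.2, zeta(0.3, 1.0, 0.0, 0.5, False), theta=1.0, lam=1.0)
```

In this example `kept_cost` is smaller than `new_cost`, which mirrors the idea that maintaining an existing connection is cheaper than forming a new one.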

A key feature of the evolution of social connectedness and the observed out-

come over periods in this model is that they are interconnected, thus making the

network formation model dynamic. Specifically, we assume that it is cheaper

to maintain a social connection than to form a new one. The cost, moreover,

may be heterogeneous. The variable θi, j is taken as exogenous in the theoretical

analysis, and it may comprise several factors affecting the likelihood to observe

a link—for example, similarity along various characteristics (gender, social, or

educational background). We assume that the matrix \(\Theta_t = (\theta_{i,j,t})_{i,j}\) is symmetric and that for each agent i there is a set \(M_{i,t}\) of other agents such that \(\theta_{i,j,t} > 0\) for \(j \in M_{i,t}\) and zero otherwise. This implies that agent i is compatible with at most a subset \(M_{i,t}\), with cardinality \(m_{i,t} = |M_{i,t}|\), of other agents. We denote by \(\bar{m} = \max_{i,t} m_{i,t}\) the maximal cardinality of the subsets of connections over time.

The following assumption guarantees that we will not have a corner solution

in which an agent chooses \(l_{i,t} = \bar{l}\) for some i ∈ N and t.²

Assumption 2. \(\bar{l} > ((1-\alpha)\rho/c)^{1/\alpha}\).

Note that if the social spillover ρ is sufficiently small, Assumption 1 and

Assumption 2 are automatically satisfied.

The type \(\omega_{i,t}\) at time t of an agent i is defined by all the variables describing his/her preferences and social connections, so \(\omega_{i,t} = (\varepsilon_{i,t}, (\theta_{i,k,t})_{k \in N}, M_{i,t})\). We denote by Ω the space of types, with typical element \(\omega_t \in \Omega\). A pure strategy for an agent is described by a socialization strategy \(g : \Omega \to [0, \bar{g}]^{n-1}\), mapping the agent’s type to a vector of intensities \(g^i_t = \{g_{i,j,t}\}_{j \neq i}\) for each of the n − 1 other agents, and an effort strategy \(l : \Omega \times G \to [0, \bar{l}]\), mapping the social network and i’s type to an effort level.

² A formal proof of this fact is provided in the proof of Proposition 1.


2.2.2 Equilibrium analysis

2.2.3 Network competitive equilibrium

The previous section describes, for every period t, a relatively simple

two-stage model that can be solved by backward induction. At t.2, the agents

choose effort levels taking the social network Gt as given; at t.1, agents choose

their social links. Solving for those levels of efforts and links becomes compli-

cated because any action taken by i has not only a direct effect but also an indi-

rect effect on their state. For example, consider the choice at t.1, when agent i

chooses the link to j, gi, j,t: here a change in gi, j,t has a direct effect on Yi,t described

by (2.1), but it may also have a complex set of indirect effects: the change in Yi,t

given Gt changes all other Y j,ts of js who are connected to i at t. The change in

gi, j,t, moreover, has dynamic effects: the choice of connections by i at t affects the

distribution of the cost of connecting at t + 1, and so on. This in turn affects the

networks and their effectiveness in the following periods. With a large set of

agents, these indirect effects add a lot of complexity to the analysis.

To address these complications, we apply and extend the concept of Net-

work Competitive Equilibrium introduced by Battaglini, Patacchini and Rain-

one (2022) to our dynamic environment. The key idea is to assume that, as in

a competitive equilibrium for prices, players are “price takers” with respect to

the levels of the outcome Y of the other players. We can therefore introduce a

Dynamic Network Competitive Equilibrium (henceforth, DNCE) as follows:

Definition 1. Agents’ effort levels \(l = \{l_{1,t}, ..., l_{n,t}\}_{t=1}^{T}\), outcomes \(Y = \{Y_{1,t}, ..., Y_{n,t}\}_{t=1}^{T}\), and the social matrices \(G_t = (g_{i,j,t})_{i,j \in N}\) for t = 1, ..., T constitute a Dynamic Network Competitive Equilibrium (DNCE) if:

• network connections \(g^i_t = (g_{i,1,t}, ..., g_{i,n,t})\) are optimal for i at t given \(Y_t\) and the expected \(Y_{t+\tau}\) for τ = 1, ..., T − t in equilibrium;

• effort levels \(l_{i,t}\) are optimal for agent i at time t given \(Y_t\) and \(G_t = (g^i_t)_{i \in N}\) for t = 1, ..., T;

• the vector of outcome levels Y satisfies the production function (2.1) given l and \(G_t\).

The first two conditions are simply saying that agents are optimizing given

others’ level of outcome, Yt without internalizing how a change in Yi,t could

ripple down indirectly. Essentially, players act as “price-takers” where prices

in this scenario are denoted by Yt. The last condition is akin to market clearing

conditions; the levels of efforts, and social connections, need to produce the

level of outcome Yt we observe.

In the following subsections, we apply this DNCE concept to find a unique

equilibrium prediction to our setting. For simplicity and tractability, we assume

T = 2. Conceptually, the approach extends to any T, but in practice only small values of T are feasible. In the

same vein, we assume that θi, j,t can either be 0 or 1. This boils down to having a

dummy indicator of the underlying exogenous compatibility of i and j at time

t.

The choice of effort at t = 2

At stage 2.2, i.e., the second stage of the second period, the agents select effort

given the network G2. Substituting the optimal effort levels into (2.1), we obtain that the equilibrium level of Y for a type i ∈ N is given by:

\[ Y_{i,2} = \delta \cdot \sum_{j=1}^{n} g_{i,j,2} Y_{j,2} + \varepsilon_{i,2}, \tag{2.5} \]

where \(\delta = \rho \left( (1-\alpha)\rho/c \right)^{\frac{1-\alpha}{\alpha}}\). These equations can be expressed in matrix form as:

[I − δ ·G2] · Y2 = ε2 (2.6)

where \(\varepsilon_2\) is the vector \((\varepsilon_{i,2})_{i \in N}\). If [I − δ ·G2] is invertible, then we can solve for

Y2 as a function of the idiosyncratic ε2.
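To make the linear system (2.6) concrete, the sketch below solves Y = δ·G·Y + ε by fixed-point iteration in plain Python. It is a hypothetical illustration rather than the solver used in the paper, and it assumes that δ times the largest row sum of G is below one, which is sufficient for the iteration to converge.

```python
def solve_outcomes(G, eps, delta, tol=1e-12, max_iter=10_000):
    """Solve Y = delta * G @ Y + eps by fixed-point iteration.

    Assumes delta * (max row sum of G) < 1, a sufficient condition for convergence.
    """
    n = len(eps)
    Y = list(eps)
    for _ in range(max_iter):
        Y_new = [
            delta * sum(G[i][j] * Y[j] for j in range(n)) + eps[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(Y_new, Y)) < tol:
            return Y_new
        Y = Y_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical three-agent line network: 0 - 1 - 2.
G_example = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
eps_example = [0.1, 0.2, 0.1]
Y_example = solve_outcomes(G_example, eps_example, delta=0.2)
```

The iteration is equivalent to summing the Neumann series of the inverse of [I − δ·G], so it returns the same vector as a direct linear solve whenever that series converges.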

The formation of the network at t = 2

At stage 2.1, the agents choose their social links to maximize the expected utility

net of the cost of establishing the links. The expected continuation utility at 2.1

of an agent i is easily determined by substituting the optimal effort levels and

\(Y_{i,2}(G_2, \varepsilon_2)\) in (2.3):

\[ U^i(G_2, \varepsilon) = \alpha\delta \sum_{j=1}^{n} g_{i,j,2} Y_{j,2}(G_2, \varepsilon_2) + \varepsilon_{i,2} \tag{2.7} \]

Agent i will choose the links \(g^i_2 = (g_{i,1,2}, ..., g_{i,n,2})\) that maximize (2.7) net of the link costs, with the additional constraint that \(g_{i,j,2} \in [0, \bar{g}]\), i.e., agent i chooses his/her links by solving:

\[ \max_{g^i_2 \in [0,\bar{g}]^n} \sum_{j=1}^{n} \left[ \alpha\delta \cdot g_{i,j,2} Y_{j,2}(G_2, \varepsilon_2) - \frac{\lambda}{1+\lambda} \left( \frac{g_{i,j,2}}{\zeta(g_{i,j,1}, \theta_{i,j,1}) + \theta_{i,j,2}} \right)^{1+\frac{1}{\lambda}} \right] \tag{2.8} \]

taking \(G_1 = \{g_{i,j,1}\}_{i,j \in N}\) and \(Y_{j,2}(G_2, \varepsilon_2)\) as given. Because of the “price taking” behavior set by our equilibrium concept, player i takes \(Y_{j,2}(G_2, \varepsilon_2)\) as constant, not as a function of G2.

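Combining the solution of (2.8) with (2.6), a DNCE requires a vector \(Y_2(\theta_2)\) and a matrix \(G_2(\theta_2)\) that solve the system:

\[ Y_{i,2} = \delta \cdot \sum_{l \in N} \left( g_{i,l,2} \cdot Y_{l,2} \right) + \varepsilon_{i,2} \tag{2.9} \]

and

\[ g_{i,j,2} \leq \left( \zeta(g_{i,j,1}, \theta_{i,j,1}) + \theta_{i,j,2} \right)^{1+\lambda} \left( \alpha\delta Y_{j,2} \right)^{\lambda} \quad (= \text{ for } g_{i,j,2} \leq \bar{g}) \tag{2.10} \]

for any i, j ∈ N.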
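The interior solution in (2.10) follows from the first-order condition of (2.8) with Y taken as given. A small numerical check, with hypothetical function names and parameter values, is sketched below: the candidate link should yield a payoff at least as high as nearby alternatives.

```python
def optimal_link(carry, theta, y_j, alpha, delta, lam, g_bar):
    """Interior solution of (2.8), capped at g_bar as in (2.10)."""
    interior = (carry + theta) ** (1.0 + lam) * (alpha * delta * y_j) ** lam
    return min(g_bar, interior)


def link_payoff(g, carry, theta, y_j, alpha, delta, lam):
    """One summand of the objective in (2.8), with Y_{j,2} taken as given."""
    benefit = alpha * delta * g * y_j
    cost = lam / (1.0 + lam) * (g / (carry + theta)) ** (1.0 + 1.0 / lam)
    return benefit - cost


# Hypothetical values; `carry` stands for zeta(g_{i,j,1}, theta_{i,j,1}).
params = dict(carry=0.3, theta=1.0, y_j=0.5, alpha=0.5, delta=0.4, lam=1.0)
g_star = optimal_link(g_bar=10.0, **params)
best = link_payoff(g_star, **params)
```

Because the cost is convex in the link intensity, the payoff is concave and the stationary point is the unique maximizer whenever the cap is not binding.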

The choice of effort at t = 1

Again, in the second stage of a given period, the agents select their optimal level of effort given G1. Substituting those into (2.1), we get that Yi,1 follows:

\[ Y_{i,1} = \delta \cdot \sum_{j=1}^{n} g_{i,j,1} Y_{j,1} + \varepsilon_{i,1}, \tag{2.11} \]

where \(\delta = \rho \left( (1-\alpha)\rho/c \right)^{\frac{1-\alpha}{\alpha}}\). These equations can be expressed in matrix form as:

[I − δ ·G1] · Y1 = ε1 (2.12)

where ε1 is the vector of idiosyncratic base levels for Yi,1.
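The linear forms in (2.5) and (2.11), and the coefficient αδ in (2.7), can be traced back to the effort choice. The derivation below is a sketch that assumes the stage-2 objective is the outcome net of the linear effort cost, with social connectedness \(s_{i,t}\) taken as given:

```latex
\begin{aligned}
&\max_{l_{i,t}} \; \rho\, s_{i,t}^{\alpha}\, l_{i,t}^{1-\alpha} + \varepsilon_{i,t} - c\, l_{i,t}
\;\Longrightarrow\; (1-\alpha)\rho\, s_{i,t}^{\alpha}\, l_{i,t}^{-\alpha} = c
\;\Longrightarrow\; l_{i,t}^{*} = s_{i,t}\left(\frac{(1-\alpha)\rho}{c}\right)^{1/\alpha}, \\
&Y_{i,t} = \rho\, s_{i,t}^{\alpha}\,(l_{i,t}^{*})^{1-\alpha} + \varepsilon_{i,t}
= \rho\left(\frac{(1-\alpha)\rho}{c}\right)^{\frac{1-\alpha}{\alpha}} s_{i,t} + \varepsilon_{i,t}
= \delta\, s_{i,t} + \varepsilon_{i,t}, \\
&Y_{i,t} - c\, l_{i,t}^{*} = \delta\, s_{i,t} - (1-\alpha)\,\delta\, s_{i,t} + \varepsilon_{i,t}
= \alpha\delta\, s_{i,t} + \varepsilon_{i,t}.
\end{aligned}
```

The last line uses \(c\, l_{i,t}^{*} = (1-\alpha)\,\delta\, s_{i,t}\), which follows from substituting the expression for \(l_{i,t}^{*}\) and the definition of δ.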

The network formation at t = 1

Assuming that the constraints for G2 are not binding and that we have an interior solution for (2.10), then given Yi,2 and Yi,1 as functions of G1 and the idiosyncratic ε, we can solve for the links G1. The expected continuation utility at 1.1 of a type i follows:

\[ U^i(G_t, \varepsilon) = \alpha\delta \sum_{j=1}^{n} g_{i,j,1} Y_{j,1} + \beta_d \cdot \alpha^{\lambda} \delta^{1+\lambda} \cdot \sum_{l \in N} \left[ \left( \zeta(g_{i,l,1}, \theta_{i,l,1}) \cdot Y_{l,2} \right)^{1+\lambda} \cdot \left( 1 - p_{i,l,2} \right) + \left( \left( \zeta(g_{i,l,1}, \theta_{i,l,1}) + 1 \right) \cdot Y_{l,2} \right)^{1+\lambda} \cdot p_{i,l,2} \right] + \varepsilon_{i,1} + \beta_d \varepsilon_{i,2} \tag{2.13} \]


where βd < 1 is the “discount factor” of future Y levels, and pi,l,2 = P(θi,l,2 = 1 | ϵ).

Again, the agent i chooses his/her links maximizing utility:

\[ \max_{g^i_1} \sum_{j=1}^{n} \left[ \alpha\delta\, g_{i,j,1} Y_{j,1} + \alpha^{\lambda} \delta^{1+\lambda} \beta_d \left( \left( \zeta(g_{i,j,1}, \theta_{i,j,1}) \cdot Y_{j,2} \right)^{1+\lambda} \left( 1 - p_{i,j,2} \right) + \left( \left( \zeta(g_{i,j,1}, \theta_{i,j,1}) + 1 \right) \cdot Y_{j,2} \right)^{1+\lambda} p_{i,j,2} \right) - \frac{\lambda}{1+\lambda} \left( \frac{g_{i,j,1}}{\zeta_0 + \theta_{i,j,1}} \right)^{1+\frac{1}{\lambda}} \right] \tag{2.14} \]

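Assuming that \(\zeta(g_{i,j,0}) = \zeta_0\) and \(\zeta(g_{i,j,1}, \theta_{i,j,1}) = \zeta_1 \cdot (g_{i,j,1} + \theta_{i,j,1})\), from the FOCs we need to have:

\[ \frac{g_{i,j,1}}{(\zeta_0 + \theta_{i,j,1})^{1+\lambda}} = \left[ \alpha\delta Y_{j,1} + \alpha^{\lambda} \delta^{1+\lambda} (1+\lambda) \beta_d \left[ \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) Y_{j,2} \right)^{\lambda} \left( 1 - p_{i,j,2} \right) + \left( \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + 1 \right) Y_{j,2} \right)^{\lambda} p_{i,j,2} \right] \zeta_1 Y_{j,2} \right]^{\lambda} \tag{2.15} \]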

where pi, j,2 = P(θi, j,2 = 1 | ϵ).

To ensure that gi, j,t is an interior solution, we make the following assumption.

Assumption 3. \(\bar{g} > (\zeta_0 + 1)^{1+\lambda} (\alpha\delta)^{\lambda}\).

Under Assumption 3, gi, j,t < ḡ and we have the following proposition.

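Proposition 1. A Dynamic Network Competitive Equilibrium (DNCE) exists, and it is characterized by a vector Y and matrices G1 and G2 that solve

\[ Y_{i,1} = \delta \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) + \varepsilon_{i,1} \tag{2.16} \]

\[ Y_{i,2} = \alpha^{\lambda} \delta^{1+\lambda} \sum_{l \in N} \left( \left( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \right) \cdot Y_{l,2} \right)^{1+\lambda} + \varepsilon_{i,2} \tag{2.17} \]

\[ \frac{g_{i,j,1}}{(\zeta_0 + \theta_{i,j,1})^{1+\lambda}} = \left[ \alpha\delta Y_{j,1} + \alpha^{\lambda} \delta^{1+\lambda} (1+\lambda) \beta_d \left[ \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) Y_{j,2} \right)^{\lambda} \left( 1 - p_{i,j,2} \right) + \left( \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + 1 \right) Y_{j,2} \right)^{\lambda} p_{i,j,2} \right] \zeta_1 Y_{j,2} \right]^{\lambda} \tag{2.18} \]

\[ g_{i,j,2} = \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + \theta_{i,j,2} \right)^{1+\lambda} \left( \alpha\delta Y_{j,2} \right)^{\lambda} \tag{2.19} \]

for any i, j ∈ N.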

Therefore, the competitive equilibrium collapses to a system of 2N ×N equa-

tions and 2N × N variables given by (2.19), (2.18), (2.16), and (2.17).

Under Assumptions 1-3, for the static model, i.e., ζ(·, ·) = 0, a sufficient (but not necessary) condition for the existence of a unique equilibrium is that δ is sufficiently small, i.e., \(\delta < \left[ \frac{1}{(1+\lambda)\alpha^{\lambda}\bar{m}} \right]^{\frac{1}{1+\lambda}}\) (see Battaglini, Patacchini and Rainone (2022) for a proof). We maintain that condition in our setting.

Special cases

For estimating our model, solving a system of 2N × N nonlinear equations is impractical. We therefore look at two special cases: λ → 0 and λ = 1.

When λ → 0, the system of equations given by (2.18) boils down to having gi,j,1 → (ζ0 + θi,j,1). This in turn means that gi,j,2 → (ζ0 · ζ1 + ζ1θi,j,1 + θi,j,2). In essence, the social connection matrices depend entirely on the exogenous θi,j,t. This means that social connections are not endogenously determined anymore. Furthermore, if we let ζ0 = ζ1 = 0, then we are back to the simple static model. In this case, if θi,j,t is such that [I − δ · Θt] is invertible, then \(Y_t = [I - \delta \cdot \Theta_t]^{-1} \varepsilon_t\), i.e., the outcome Y is determined by the standard weighted Bonacich centrality where the weights are given by εi,t.

Now, let λ = 1, meaning that we are back to having G be endogenously deter-

mined. In this scenario, (2.18) reduces to the following simpler set of equations:

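\[ g_{i,j,1} - (\zeta_0 + \theta_{i,j,1})^2 \left[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + p_{i,j,2} \right) \zeta_1 Y_{j,2}^2 \right] = 0 \]

where again we let pi,j,2 = P(θi,j,2 = 1 | ϵ).

This means that gi,j,1 is given by:

\[ g_{i,j,1} = (\zeta_0 + \theta_{i,j,1})^2 \left[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + p_{i,j,2} \right) \zeta_1 Y_{j,2}^2 \right] \tag{2.20} \]

\[ \Rightarrow \; g_{i,j,1} = \frac{(\zeta_0 + \theta_{i,j,1})^2 \left[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d\, \theta_{i,j,1} \zeta_1^2 Y_{j,2}^2 + 2\alpha\delta^2 \beta_d\, p_{i,j,2} \zeta_1 Y_{j,2}^2 \right]}{1 - 2\alpha\delta^2 \beta_d (\zeta_0 + \theta_{i,j,1})^2 \zeta_1^2 Y_{j,2}^2} \tag{2.21} \]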

This closed-form solution for gi, j,1 avoids the computational problem of solv-

ing for gi, j,1 through a nonlinear system of equations. Again, in the DNCE, the

level of Yi,t solves the following:

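\[ Y_{i,1} - \delta \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) - \varepsilon_{i,1} = 0 \tag{2.22} \]

\[ Y_{i,2} - \alpha\delta^2 \sum_{l \in N} \left( \left( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \right) \cdot Y_{l,2} \right)^2 - \varepsilon_{i,2} = 0 \tag{2.23} \]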

Combining (2.21) with (2.23) and (2.22) yields a system of 2N equations and

2N variables. This is one of the main advantages of setting λ = 1, i.e., we avoid

the curse of dimensionality of dealing with the full system of 2N × N equations.

Therefore, for the rest of this paper, we will assume that λ = 1 to avoid the added

computational strain of recovering λ empirically.
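The step from the implicit equation (2.20) to the closed form (2.21) can be checked numerically. The sketch below, with hypothetical function names and parameter values, evaluates the closed form and verifies that it is a fixed point of the right-hand side of (2.20).

```python
def g1_closed_form(y1_j, y2_j, theta1, p2, alpha, delta, beta_d, zeta0, zeta1):
    """Closed form (2.21) for g_{i,j,1} when lambda = 1."""
    a = (zeta0 + theta1) ** 2
    k = 2.0 * alpha * delta ** 2 * beta_d * zeta1 * y2_j ** 2  # common factor
    numerator = a * (alpha * delta * y1_j + k * zeta1 * theta1 + k * p2)
    denominator = 1.0 - a * k * zeta1
    return numerator / denominator


def g1_implicit_rhs(g1, y1_j, y2_j, theta1, p2, alpha, delta, beta_d, zeta0, zeta1):
    """Right-hand side of (2.20), which still depends on g_{i,j,1}."""
    a = (zeta0 + theta1) ** 2
    inner = zeta1 * (g1 + theta1) + p2
    return a * (alpha * delta * y1_j
                + 2.0 * alpha * delta ** 2 * beta_d * inner * zeta1 * y2_j ** 2)


# Hypothetical parameter values, for illustration only.
values = dict(y1_j=0.4, y2_j=0.5, theta1=1.0, p2=0.3, alpha=0.5,
              delta=0.4, beta_d=0.9, zeta0=0.2, zeta1=0.5)
g1 = g1_closed_form(**values)
gap = abs(g1 - g1_implicit_rhs(g1, **values))
```

The closed form is only meaningful when the denominator in (2.21) is positive, which holds for the small values of δ and ζ1 used here.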



2.3 Estimation Method

2.3.1 Model Specification

We assume that we observe two periods of data (t = {1, 2}), with n agents i, and

their level of outcome Yi,t. Additionally, we assume that we can observe some

set of characteristics for each agent, \(\{X_{i,1,t}, \ldots, X_{i,K,t}\}\). While we don’t observe the

underlying endogenous network Gt nor the “social compatibility” measure θi, j,t,

we make the assumption that θi, j,t follows some given functional form based

on the agents’ characteristics, and in some cases, an observable exogenous net-

work, Hi, j,t, that might be relevant to θi, j,t.

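To bring our model to data, we let εi,t = Xi,tβ + ϵi,t, where ϵi,t is some unobservable error term and β is a K × 1 vector. Then, the solution to our model assuming λ = 1 is given by:

\[ Y_{i,1} = \delta \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) + X_{i,1}\beta + \epsilon_{i,1} \tag{2.24} \]

\[ Y_{i,2} = \alpha\delta^2 \sum_{l \in N} \left( \left( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \right) Y_{l,2} \right)^2 + X_{i,2}\beta + \epsilon_{i,2} \tag{2.25} \]

\[ g_{i,j,1} = (\zeta_0 + \theta_{i,j,1})^2 \left[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d \left( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + p_{i,j,2} \right) \zeta_1 Y_{j,2}^2 \right] \tag{2.26} \]

where \(\delta = \rho \left( (1-\alpha)\rho/c \right)^{\frac{1-\alpha}{\alpha}}\). Note again that setting ζ1 = 0 is equivalent to having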

the non-dynamic version of the model while setting ρ = 0 is equivalent to the

simple linear model with no network effect. Therefore, in our context, we are

essentially interested in seeing if ζ1 and ρ are different from 0.

Finally, we model θi,j,t as a random realization of the following logistic function:

\[ \Pr(\theta_{i,j,t} = 1 \mid Z, H) = \frac{\exp\left( \iota + \gamma H_{i,j,t} + \sum_{l} m(z^l_{i,t}, z^l_{j,t}) \psi_l \right)}{1 + \exp\left( \iota + \gamma H_{i,j,t} + \sum_{l} m(z^l_{i,t}, z^l_{j,t}) \psi_l \right)} \tag{2.27} \]

where \(z^l_{i,t}\) is some relevant observable characteristic of i and m(·, ·) is some distance function. Throughout this paper, we assume that \(m(z^l_i, z^l_j) = |z^l_i - z^l_j|\).
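A direct transcription of (2.27) with the absolute-difference distance might look as follows. The function name and the example inputs are hypothetical; the sketch only illustrates how the exogenous network and the characteristic distances enter the link probability.

```python
import math


def link_probability(iota, gamma, h_ij, z_i, z_j, psi):
    """Pr(theta_{i,j,t} = 1 | Z, H) from (2.27) with m(a, b) = |a - b|."""
    index = iota + gamma * h_ij + sum(
        weight * abs(a - b) for weight, a, b in zip(psi, z_i, z_j)
    )
    # exp(x) / (1 + exp(x)) written in the equivalent form 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + math.exp(-index))


# Two agents with two characteristics each (illustrative values).
p_linked = link_probability(-1.0, 0.8, 1, [3.0, 0.0], [5.0, 1.0], [-0.2, -0.5])
p_unlinked = link_probability(-1.0, 0.8, 0, [3.0, 0.0], [5.0, 1.0], [-0.2, -0.5])
```

With a positive γ, a link in the exogenous network H raises the probability of compatibility, and negative ψ coefficients make dissimilar agents less likely to be compatible.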

2.3.2 Approximate Bayesian Computation (ABC)

To estimate our model, we apply a variation of the Approximate Bayesian Com-

putation (ABC) method. The goal, as with any Bayesian approach, is to recover

the distribution of some parameters, ω, based on the observed data, i.e., P(ω | Y), also commonly referred to as the posterior. To achieve this, we rely on two core components: the likelihood function of the data given ω, P(Y | ω), and our prior belief on the distribution of ω, π(ω). In our context, a closed form for the likelihood function P(Y | ω) is unavailable. The ABC approach avoids this limitation by relying on simulating the data instead. The ABC method proposed by Marjoram et al. (2003) offers a variation on the Metropolis-Hastings algorithm to recover the posterior distribution P(ω | Y):

A1. Draw ω′ based on some transition kernel q(ω → ω′).

A2. Draw Y′ by simulating the model with parameters ω′.

A3. Compute some distance, d(Y′,Y), between Y′ and Y. If d(Y′,Y) < ν, where ν is some tolerance, move to the next step; otherwise, stay at ω and return to A1.

A4. Compute \(h = \min\left( 1, \frac{\pi(\omega') q(\omega' \to \omega)}{\pi(\omega) q(\omega \to \omega')} \right)\).

A5. Move to ω′ with probability h, and stay at ω with probability 1−h; go back to A1.

The result of this algorithm is a Markov Chain with a stationary distribution

equal to P(ω | d(Y′,Y) < ν). In the limit, as ν goes to 0, P(ω | d(Y′,Y) < ν) should



converge to P(ω | Y) under some regularity conditions. The choice for distance

function d(Y′,Y) depends on the context of the problem. If a set of sufficient

statistics is available to the researcher, then reducing the distance between those

statistics is obviously the right choice. Usually, a low-dimensional sufficient

statistic is not available. Instead, researchers have to pick a set of summary

statistics that would minimize the loss of information. Blum et al. (2013) provide

a review of the literature on the methods available to choose the set of

summary statistics.
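Steps A1 to A5 can be sketched on a toy problem. The example below is not the model of this chapter: it estimates the mean of a normal sample, uses the sample mean as the summary statistic, a uniform prior, and a symmetric random-walk kernel (so that q cancels in h). All names and tuning values are assumptions made for the illustration, and a proposal rejected in A3 is recorded as staying at the current value.

```python
import random
import statistics


def abc_mcmc(observed, n_steps, tol, step, prior_lo, prior_hi, seed=0):
    """Toy ABC-MCMC for the mean of a unit-variance normal sample."""
    rng = random.Random(seed)
    target = statistics.fmean(observed)
    n = len(observed)
    omega = target  # assumption: start the chain at a value consistent with the data
    chain = []
    for _ in range(n_steps):
        # A1: propose from a symmetric random-walk kernel.
        proposal = omega + rng.gauss(0.0, step)
        # A2: simulate data under the proposal.
        simulated = [rng.gauss(proposal, 1.0) for _ in range(n)]
        # A3: compare summary statistics; a rejection leaves the chain in place.
        if abs(statistics.fmean(simulated) - target) < tol:
            # A4: with a symmetric kernel and a uniform prior, h is 1 inside the
            # prior support and 0 outside.
            h = 1.0 if prior_lo <= proposal <= prior_hi else 0.0
            # A5: move with probability h.
            if rng.random() < h:
                omega = proposal
        chain.append(omega)
    return chain


data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
chain = abc_mcmc(observed, n_steps=2000, tol=0.2, step=0.3,
                 prior_lo=-5.0, prior_hi=5.0)
posterior_mean = statistics.fmean(chain)
```

Shrinking `tol` tightens the approximation to the posterior at the cost of a lower acceptance rate, which is the trade-off governed by ν in the text.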

While this method is technically available to estimate our model, in our con-

text, simulating Y′ is particularly challenging. In fact, it would involve solving

our set of nonlinear equations coming from (2.21), (2.23), and (2.22) at every iteration (equations (2.18), (2.16), and (2.17) when λ ≠ 1). Instead, we rely on the following set of equations:

\[ Y_{i,1} - \delta \cdot \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) - \varepsilon_{i,1} = H_i(\omega; Y) \tag{2.28} \]

\[ Y_{i,2} - \alpha\delta^2 \cdot \sum_{l \in N} \left( \left( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \right) \cdot Y_{l,2} \right)^2 - \varepsilon_{i,2} = L_i(\omega; Y) \tag{2.29} \]

where gi,l,1 is given by (2.20). If ω is the true set of parameters, and Yi,1 and Yi,2 are the empirical values, then Hi(ω; Y) = 0 and Li(ω; Y) = 0 for all i. If another set of parameters ω′ would generate a vector Y′ such that Y′ = Y, then Hi(ω′; Y) = 0 and Li(ω′; Y) = 0 for all i. This means that we can circumvent simulating a new vector Y′ at every iteration by simply evaluating how far Hi(ω′; Y) and Li(ω′; Y) are from 0.

Let λ(ω,Y) be the vector resulting from stacking Li(ω; Y) and Hi(ω; Y) for all i (this stacked vector is distinct from the cost parameter λ). Then, we can modify the previous algorithm in the following way:

B1. Draw ω′ based on some transition kernel q(ω → ω′).

B2. Compute the norm of λ(ω′,Y), ∥λ(ω′,Y)∥. If ∥λ(ω′,Y)∥ < ν, where ν is some tolerance, move to the next step; otherwise, stay at ω and return to B1.

B3. Compute \(h = \min\left( 1, \frac{\pi(\omega') q(\omega' \to \omega)}{\pi(\omega) q(\omega \to \omega')} \right)\).

B4. Move to ω′ with probability h, and stay at ω with probability 1−h; go back to B1.

The result of this new algorithm is a Markov chain with a stationary distribution equal to P(ω | ∥λ(ω′,Y)∥ < ν) under some regularity conditions (see Battaglini, Patacchini and Rainone (2022) for a detailed proof). In the limit, as ν goes to 0, P(ω | ∥λ(ω′,Y)∥ < ν) converges to P(ω | Y). The logic behind this result is that ∥λ(ω′,Y)∥ = 0 implies that Li(ω′; Y) = 0 and Hi(ω′; Y) = 0 for all i. In turn, by definition, for ω′, the equilibrium outcome Y′ is characterized by Li(ω′; Y′) = 0 and Hi(ω′; Y′) = 0 for all i. This means that requiring Y′ = Y is equivalent to having ∥λ(ω′,Y)∥ = 0, implying that P(ω | Y) = P(ω | ∥λ(ω′,Y)∥ = 0).
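The residual-based steps B1 to B4 can be illustrated on a deliberately simplified, noise-free static model Y = δ·G·Y + ε in which only δ is unknown. This is a hypothetical toy version, not the estimator used in the empirical sections: the network, the shocks, the flat prior on [0, 1], and the tuning values are all assumptions of the sketch.

```python
import math
import random


def residual_norm(delta, G, Y, eps):
    """Euclidean norm of the stacked residuals Y_i - delta * sum_j G_ij Y_j - eps_i."""
    n = len(Y)
    res = [
        Y[i] - delta * sum(G[i][j] * Y[j] for j in range(n)) - eps[i]
        for i in range(n)
    ]
    return math.sqrt(sum(r * r for r in res))


def residual_abc(G, Y, eps, start, n_steps, tol, step, lo, hi, seed=0):
    """Chain over delta that never simulates Y, only evaluates residuals at the data."""
    rng = random.Random(seed)
    omega, chain = start, []
    for _ in range(n_steps):
        proposal = omega + rng.gauss(0.0, step)        # B1: symmetric kernel
        if residual_norm(proposal, G, Y, eps) < tol:   # B2: residuals close to zero
            h = 1.0 if lo <= proposal <= hi else 0.0   # B3: flat prior on [lo, hi]
            if rng.random() < h:                       # B4
                omega = proposal
        chain.append(omega)
    return chain


# Three-agent line network; Y is the exact equilibrium at true_delta = 0.2.
G = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
eps = [0.1, 0.2, 0.1]
true_delta = 0.2
y_mid = 0.24 / 0.92             # closed-form solution for the middle agent
y_end = true_delta * y_mid + 0.1
Y = [y_end, y_mid, y_end]
chain = residual_abc(G, Y, eps, start=0.1, n_steps=5000, tol=0.01,
                     step=0.05, lo=0.0, hi=1.0)
```

Each iteration costs one residual evaluation instead of one equilibrium solve, which is the computational advantage described above.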

2.4 Simulations

We use Monte Carlo simulations to assess how well our model performs. The

goal is to show that the estimation method described in the previous section can

recover the structural parameters of the model.

For simplicity, in the remainder of this paper, we set \(\zeta(g_{i,j,0}) = 0\) (this is the cost term in the first period, when there is no preceding network gi,j,0) and \(\zeta(g_{i,j,1}, \theta_{i,j,1}) = \zeta_1 \cdot (g_{i,j,1} + \theta_{i,j,1})\) for t ≥ 1. We also set c = 1, α = 1/2, βd = 0.9, and λ = 1.

This allows us to focus on the two core parameters of interest of our model: the

network effect, ρ, and the dynamic effect, ζ1.



Now, to evaluate our method, we need to generate data to test it on. To

do so, we assume that we have two periods (t = {1, 2}), with n = 400 agents

i. Additionally, we assume that we can observe some set of characteristics for

each agent, \(\{X_{i,1,t}, \ldots, X_{i,K,t}\}\), and an observable exogenous network, Hi,j,t, that is

relevant to θi, j,t. In this section, we generate Hi, j,t by assuming it follows the

Erdős-Rényi model, i.e., Hi, j,t = 1 with some probability p, and otherwise Hi, j,t =

0.
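For concreteness, an Erdős-Rényi network of this kind can be drawn as in the sketch below. It assumes an undirected network without self-links, which the text does not state explicitly, and the function name and seed are hypothetical.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Symmetric 0/1 adjacency matrix: each pair i < j is linked with probability p."""
    rng = random.Random(seed)
    H = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                H[i][j] = 1
                H[j][i] = 1
    return H


H = erdos_renyi(200, 0.1)
n_links = sum(sum(row) for row in H) // 2
density = n_links / (200 * 199 / 2)
```

The realized density fluctuates around p, with smaller fluctuations as n grows.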

Finally, we can set our structural parameters ω = (ρ, ζ1, σϵ , ι, γ, ψ, β) to some

given value, generate a level of outcome Y by solving for the equilibrium out-

come given ω, X and H, and then estimate the parameters given that data. Ap-

pendix B.2 provides the full detail on how the data is generated.

The only thing missing to run our estimation is a prior for ω =

(ρ, ζ1, σϵ , ι, γ, ψ, β). While it is the case that for all parameters we can use un-

informative priors, i.e., uniforms over all possible values, it is a good idea to

restrict some of them to something more sensible. For instance, we let the prior

for β be normal distributions with mean and variance equal to the estimated

mean and variance from running an OLS on our outcome variable without any

network effects.

Figure 2.1 shows the results of this simulation. The line represents the pos-

terior distribution of our parameters, the dashed line represents the priors, the

dotted line gives the median of the posterior distributions, and the dark line

shows the true values of ω.



Figure 2.1: Estimated Posterior Distributions for the Simulated Example - Key
Variables -

The two parameters of interest are ρ (network effect) and ζ1 (dynamic effect).

Clearly, from Figure 2.1, we are updating our priors for most of our parameters, especially ρ. Not only that, but our posterior distributions for both ρ and ζ1 concentrate around their respective true values. This evidence suggests that our method can recover the structural parameters of our model accurately.

Sensitivity Analysis

To get a better understanding of our method, we explore its performance by

varying the number of agents n. To do so, we simulate our model in the same way as described in Appendix B.2, but we set n to different values, i.e., n = 100, 200, 300, 400. Figure 2.2 shows the difference between the true and the estimated parameters.


Figure 2.2: Estimated Bias by Number of Agents n

It is interesting to note three things from Figure 2.2. First, as n increases,

the dispersion around the true value decreases, which suggests that our model

works better in a setting with a larger set of players or at least that we can re-

cover the true parameters with more precision. Second, we can recover ρ pretty

well at any level of n. This is in line with Battaglini, Patacchini and Rainone

(2022), who find that having n = 150 in their particular setting is sufficient to

recover ρ with precision. Finally, for ζ, Figure 2.2 shows that having a large n

is crucial. At n = 100, while the box plot contains 0, the spread is considerable.

This points to the fact that our methodology might be limited in cases where n

is too small.



2.5 Empirical Evidence

In this section, we present three empirical settings that showcase the value of

our approach.

2.5.1 Legislative Effectiveness in the U.S. Congress

The first application studies the importance of social connections on U.S. legis-

lators’ productivity, allowing for a dynamic network formation.

Data

Following the approach of Battaglini, Patacchini and Rainone (2022), we measure the productivity of members of Congress using the Legislative Effectiveness Scores (LESs) developed by Volden and Wiseman (2014). The score is based on how many bills a legislator introduces and how far these bills progress (referred to committee, receive action on the floor, passed, and so on). The Legislative Effectiveness Project (http://www.thelawmakers.org) provides this data for the 93rd-110th Congress directly on their website. In this paper, we focus on two election cycles: the 111th Congress (election cycle 2008) and the 112th Congress

(election cycle 2010). One advantage of choosing the 111th Congress and the

112th Congress for our context is that they are only separated by a midterm

election during Obama’s first term as President of the United States of Amer-

ica. This results in a large overlap between the House representatives present in

both congresses, i.e., those reelected in 2010.



As for the set of controls Xi,t, we include the party, gender, race, the number of years spent in Congress and its squared term, D-W ideology, margin of victory and its squared term, age, state, majority and minority party leadership, and previous legislative experience. Additionally, we include the main area of policy interest, following the approach of Battaglini, Patacchini and Rainone (2022).³

To model the underlying cost θi, j,t for the network formation, we include a

subset of our controls Xi,t: the number of years spent in Congress, age, state,

most recurrent policy subtopic, majority or minority party leadership, gender, race, and party. In addition, we include the network of alumni connections between legislators, Ht. We construct this network using information on the educational background of legislators from the Biographical Directory of the United States Congress.⁴ We set Hi,j = 1 if i and j graduated from the same institution within four years of each other, and 0 otherwise. Battaglini and Patacchini

(2018) provide more details on how to construct this network.
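The alumni rule can be sketched as follows. This simplified illustration assumes one (institution, graduation year) record per legislator and treats "within four years" as an inclusive bound; both are assumptions of the sketch, and the records are made up.

```python
def alumni_network(records):
    """records: one (institution, graduation_year) tuple per legislator."""
    n = len(records)
    H = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same_school = records[i][0] == records[j][0]
            close_years = abs(records[i][1] - records[j][1]) <= 4
            if same_school and close_years:
                H[i][j] = H[j][i] = 1
    return H


# Hypothetical records for four legislators.
records = [("School A", 1980), ("School A", 1983),
           ("School A", 1990), ("School B", 1981)]
H_alumni = alumni_network(records)
```

Legislators with several degrees would require comparing every pair of records, which this sketch leaves out.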

To calibrate our model, we also make use of the co-sponsorship network.⁵

Note that the co-sponsorship network cannot be used directly as a component

of θi,j,t, unlike Hi,j, since sponsorship decisions are endogenous to the legislative

activity. In fact, the LES is even built using co-sponsorship data.

Table B.1 provides summary statistics for the different characteristics over

the 111th Congress and the 112th Congress.

³ For each Congress member, Battaglini, Patacchini and Rainone (2022) identify the main policy interest in the following way. Using the data provided by the Congressional Bills Project (http://congressionalbills.org), which categorizes the bills using the policy topic coding system provided by the Policy Agendas Project (PAP) (www.comparativeagendas.net/us), for each Congress member i, they count the bills where the Congress member i was an original sponsor or cosponsor in each policy subtopic, and identify her/his most recurrent policy subtopic.

⁴ (http://bioguide.Congress.gov/biosearch/biosearch.asp)

⁵ The co-sponsorship network, Ci,j,t, is built by setting Ci,j,t = 1 if more than 2% of the bills sponsored by i were also co-sponsored by j, and Ci,j,t = 0 otherwise.


Empirical results

To estimate the importance of the dynamic network, we use the methodology

described in Section 2.3. One caveat is that for the Bayesian estimation, we need

to specify the priors of our parameters. For most parameters, we use uninformative priors, but we calibrate the priors of the parameters of the controls in the data generating process of the LES, β, and of the parameters of the network formation of θi,j,t, namely ι, γ, and ψ. For the former, we simply let the priors of β follow normal distributions, where we calibrate the mean and variance by first running an OLS of Yi,t on the set of controls Xi,t. For the latter, we essentially do the same thing, but instead run a logit regression with the co-sponsorship network as our dependent variable.⁶

Table 2.1 and Table 2.2 report the median values of the estimated posterior distributions. In parentheses, we report the empirical p-value for the null hypothesis of zero. Table 2.1 concerns only the network formation parameters, ι, γ, and ψ. Table 2.2 reports the main parameters of interest, ρ and ζ, and the control parameters, β.

⁶ The priors are set to the following values:

ρ ∼ U(0, 1)
βd = 0.9 (fixed)
ζ1 ∼ U(0, 1)
σϵ ∼ U(0, 1)
β ∼ N(β̂ols, Σols)
(ι, γ, ψ) ∼ N((ι̂logit, γ̂logit, ψ̂logit), Σlogit)

where β̂ols, Σols are obtained by running an OLS regression of Y on the legislators’ characteristics,
and (ι̂logit, γ̂logit, ψ̂logit), Σlogit are obtained by running a logit regression with the co-sponsorship
network as the dependent variable.


Table 2.1: Network Formation for the Legislature Example

Probability that θ_{i,j,t} = 1

Variable                                         Median     (p-value)
Link in alumni network                           0.063      (0.236)
Seniority [1 = same quartile]                    0.130***   (0.000)
Seniority i                                      0.046***   (0.000)
Seniority j                                      0.232***   (0.000)
Same state [1 = yes]                             0.041***   (0.000)
Same topic [1 = yes]                             0.122      (0.904)
Leader [1 = both leaders]                       −0.019***   (0.000)
Same gender [1 = yes]                            0.088***   (0.000)
Same race [1 = both white or both non white]     0.088***   (0.000)
Same party [1 = yes]                             0.119      (0.476)
Age [1 = same quartile]                         −0.013***   (0.000)
(Intercept)                                      0.077***   (0.000)

Observations                                     777,924

Note: Estimates of the parameters in equation (2.27). The medians of the
posterior distributions estimated with the ABC algorithm are reported. The
empirical p-values for the null hypothesis of zero are reported in brackets.
*, **, and *** indicate statistical significance at the 10, 5, and 1 percent
levels, based on the empirical p-value.



Table 2.2: Estimation Results for the Legislature Example

Dependent variable: LES

Variable                                                           Median      (p-value)
ρ                                                                  0.242***    (0.000)
ζ                                                                  0.398***    (0.000)
Party                                                             −0.103***    (0.000)
Gender                                                             0.051***    (0.000)
Non white                                                         −0.090***    (0.000)
Seniority                                                         −0.042***    (0.000)
Seniority²                                                         0.004***    (0.000)
DW ideology                                                       −0.080***    (0.000)
Margin                                                             0.004***    (0.000)
Margin²                                                            0.00002**   (0.012)
Committee chair                                                    3.787***    (0.000)
Delegation size                                                   −0.119***    (0.000)
Leader                                                             0.222***    (0.000)
State legislative experience                                      −0.070       (0.352)
State legislative experience * State legislative professionalism   0.474***    (0.000)
Age                                                               −0.002***    (0.008)
(Intercept)                                                        1.705***    (0.000)

State fixed effects                                                Yes
Congress fixed effects                                             No
Major topic fixed effects                                          Yes

Observations                                                       882

Note: Estimates of the parameters in equations (2.24), (2.25), and (2.26). The medians of
the posterior distributions estimated with the ABC algorithm are reported. The empirical
p-values for the null hypothesis of zero are reported in brackets. *, **, and *** indicate
statistical significance at the 10, 5, and 1 percent levels, based on the empirical p-value.



In the appendix, Figure B.2 plots the posterior distributions of the model for
the two parameters of interest, ρ (network effect) and ζ (dynamic effect). Figure
B.2 plots the posterior distributions for β, and Figure B.3 those for the network
formation parameters ι, γ, and ψ.

In Table 2.3, we compare the model proposed in this paper with simple OLS
results (ρ = 0 and ζ = 0) and with the model without dynamic effects (ζ = 0).
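
Table 2.3 summarizes each step of this nested comparison with a partial F test. The textbook least-squares form of that statistic can be sketched as follows. The function name `partial_f` and the numbers in the example are hypothetical and are not taken from Table 2.3, and this section does not state how the statistic is computed for the ABC-estimated models, so the sketch should be read as the standard formula rather than the exact procedure used here.

```python
def partial_f(ssr_restricted, ssr_full, n_restrictions, n_obs, n_params_full):
    """Partial F statistic for a restricted model nested in a fuller one.

    F = ((SSR_r - SSR_f) / q) / (SSR_f / (n - k)),
    where q is the number of restrictions (e.g. q = 1 for zeta = 0),
    n the number of observations, and k the number of parameters
    in the full model.
    """
    if ssr_full <= 0 or n_obs <= n_params_full or n_restrictions <= 0:
        raise ValueError("invalid inputs for the partial F statistic")
    numerator = (ssr_restricted - ssr_full) / n_restrictions
    denominator = ssr_full / (n_obs - n_params_full)
    return numerator / denominator

# Hypothetical example: one restriction, 105 observations, 5 parameters.
f_stat = partial_f(ssr_restricted=120.0, ssr_full=100.0,
                   n_restrictions=1, n_obs=105, n_params_full=5)
```

A large value of the statistic indicates that the restriction (for example ζ = 0) substantially worsens the fit relative to the fuller model.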



Table 2.3: Estimation Results for the Legislature Example: Comparison of Nested Models

Dependent variable: LES

                                                                   (1)          (2)          (3)

ρ                                                                               0.202***     0.242***
                                                                                (0.000)      (0.000)
ζ                                                                                            0.398***
                                                                                             (0.000)
Party                                                             −0.113       −0.122***    −0.103***
                                                                   (0.716)      (0.000)      (0.000)
Gender                                                             0.047        0.034***     0.051***
                                                                   (0.704)      (0.000)      (0.000)
Non white                                                         −0.075       −0.083***    −0.090***
                                                                   (0.656)      (0.000)      (0.000)
Seniority                                                         −0.039       −0.038***    −0.042***
                                                                   (0.174)      (0.000)      (0.000)
Seniority²                                                         0.004***     0.004***     0.004***
                                                                   (0.006)      (0.000)      (0.000)
DW ideology                                                       −0.070       −0.099***    −0.080***
                                                                   (0.822)      (0.000)      (0.000)
Margin                                                             0.004        0.008***     0.004***
                                                                   (0.908)      (0.000)      (0.000)
Margin²                                                            0.00001     −0.00003**    0.00002**
                                                                   (0.960)      (0.012)      (0.012)
Committee chair                                                    3.813***     3.820***     3.787***
                                                                   (0.000)      (0.000)      (0.000)
Delegation size                                                   −0.120       −0.121***    −0.119***
                                                                   (0.313)      (0.000)      (0.000)
Leader                                                             0.230        0.241***     0.222***
                                                                   (0.233)      (0.000)      (0.000)
State legislative experience                                      −0.037       −0.023       −0.070
                                                                   (0.832)      (0.352)      (0.352)
State legislative experience * State legislative professionalism   0.402        0.294***     0.474***
                                                                   (0.414)      (0.000)      (0.000)
Age                                                               −0.001       −0.001***    −0.002***
                                                                   (0.793)      (0.008)      (0.008)
(Intercept)                                                        1.785        1.510***     1.705***
                                                                   (0.219)      (0.000)      (0.000)

State fixed effects                                                Yes          Yes          Yes
Congress fixed effects                                             No           No           No
Major topic fixed effects                                          Yes          Yes          Yes

Partial F test                                                                  671.72       4.99
p-value                                                                         0.000        0.026
MSE                                                                2.201        1.188        1.18