ESSAYS IN POLITICAL ECONOMICS AND NETWORKS

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Julien Manuel Neves
May 2023

© 2023 Julien Manuel Neves
ALL RIGHTS RESERVED

ESSAYS IN POLITICAL ECONOMICS AND NETWORKS
Julien Manuel Neves, Ph.D.
Cornell University 2023

This dissertation consists of three essays on political economics and the role of social connections.

In Chapter 1, I examine how competitive elections, as measured by margin of victory and fundraising outcomes, affect legislator behavior during the following legislative session. Focusing on U.S. state legislatures, I use two measures of productivity: the number of new bills a legislator sponsored, and the number of new bills excluding bills copied from other legislatures or model legislation. This allows me to study the extent to which legislators introduce copied bills to appear productive while putting in minimal effort. I use an instrumental variable approach and find that legislators who win elections more comfortably introduce fewer copycat bills and more substantive bills.

In Chapter 2, my coauthors and I propose a new theory of network formation where agents are not myopic and where their decisions over their social links and behavior are endogenous. Using a new equilibrium concept, we show that our model can be estimated using a modified Approximate Bayesian Computation method. We showcase our approach with three distinct empirical examples. The first example focuses on the legislative effectiveness of politicians in the 111th and 112th U.S. Congress. The second example looks at R&D expenditures in the Chemicals And Allied Products industry. The third example, using the National Longitudinal Study of Adolescent to Adult Health (Add Health) dataset, looks at peer effects on the educational achievement of adolescents.

In Chapter 3, I study the extent to which judicial influence depends on the judges’ social connections. Guided by a theoretical model that formalizes the role of social connection, I document that social connections are a significant determinant of judge influence. I use the flow of law clerks between judges from 1995 to 2004 as a measure of social connections and total citations as a proxy for influence, and I address network endogeneity by using novel data on the judges’ alumni connections. The results also provide new insights into how social connectedness interacts with judges’ demographic characteristics.

BIOGRAPHICAL SKETCH

Julien Neves grew up in Montreal, Canada. He graduated from McGill University with a B.A. Joint Honours in Economics and Mathematics in 2015, and an M.A. in Economics in 2016. From 2017 to 2022, he attended Cornell University to pursue a Ph.D. in Economics. After his doctoral studies, he will start working as an economist for Amazon.

This document is dedicated to my better half, Émilie Chiasson, and my parents, Stéphane Raymond and Marlyn Neves.

ACKNOWLEDGEMENTS

I am grateful to my committee: Marco Battaglini, Eleonora Patacchini, and Giulia Brancaccio. Their guidance, feedback, and support were instrumental in shaping this dissertation.

I would also like to thank Aviv Caspi, Matthew Comey, Christa Deneault, Giulia Olivero, Camille Portier, David Wasser, and many others for their help and comments. I’m especially indebted to Aviv for his constant support through the last months of this endeavor.

I thank Joe Walsh for providing data on text reuse in the U.S. state legislatures, Denise Roth Barber for providing expanded access to the National Institute on Money In State Politics dataset, and Derek Stafford for providing data on law clerks’ movements.

I’m grateful to my family and friends for their moral support, kindness, and encouragement.
In particular, I thank my parents, Stéphane Raymond and Marlyn Neves, for their unwavering support. Lastly, I thank my fiancée, Émilie Chiasson, who has agreed to embark with me on this long journey.

TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Electoral Experiences and Legislator Behavior
  1.1 Introduction
  1.2 Data
    1.2.1 Background
    1.2.2 Data
  1.3 Research Design
    1.3.1 Instrumenting for Competitiveness Measures
    1.3.2 Alternative Instrument
  1.4 Results
  1.5 Conclusion

2 Dynamic network formation with forward looking agents
  2.1 Introduction
  2.2 Model
    2.2.1 Setup
    2.2.2 Equilibrium analysis
    2.2.3 Network competitive equilibrium
  2.3 Estimation Method
    2.3.1 Model Specification
    2.3.2 Approximate Bayesian Computation (ABC)
  2.4 Simulations
  2.5 Empirical Evidence
    2.5.1 Legislative Effectiveness in the U.S. Congress
    2.5.2 R&D expenditures in the Chemical Industry
    2.5.3 Adolescent behavior
  2.6 Conclusion

3 Judge Influence and Judicial Networks
  3.1 Introduction
  3.2 Data
    3.2.1 Social Network
    3.2.2 Citations data
    3.2.3 Characteristics
    3.2.4 Alumni Network
  3.3 Theory and Empirical Strategy
    3.3.1 Model
    3.3.2 Estimation
  3.4 Estimation Results
    3.4.1 Alternative First Step
  3.5 Conclusion

A Appendices to Chapter 1
  A.1 Additional Figures
  A.2 Additional Tables
    A.2.1 First Stage Results
    A.2.2 Summary Statistics
    A.2.3 Alternative Instrument Results

B Appendices to Chapter 2
  B.1 Proof of Proposition 1
  B.2 Setup of the Simulations in Section 2.4
  B.3 Additional Figures
  B.4 Additional Tables

C Appendices to Chapter 3
  C.1 Additional Figures
  C.2 Additional Tables

LIST OF TABLES

1.1 Effect of Being Opposed on Number of Bills Introduced
1.2 Effect of Vote Margin on Number of Bills Introduced
1.3 Effect of Campaign Contributions on Number of Bills Introduced
1.4 Effect of Campaign Contributions Split by Source on Number of Bills Introduced
2.1 Network Formation for the Legislature Example
2.2 Estimation Results for the Legislature Example
2.3 Estimation Results for the Legislature Example - Comparison of Nested Models -
2.4 Counterfactual Analysis - Alumni Connections -
2.5 Counterfactual Analysis - Ideological Extremism -
2.6 Network Formation for the R&D Example
2.7 Estimation Results for the R&D Example
2.8 Estimation Results for the R&D Example - Comparison of Nested Models -
2.9 Network Formation for the Adolescent Behavior Example
2.10 Estimation Results for the Adolescent Behavior Example
2.11 Estimation Results for the Adolescent Behavior Example - Comparison of Nested Models -
3.1 Network Formation
3.2 Estimation Results
A.1 First Stage Relationship for Electoral Competitiveness Measures
A.2 Summary Statistics
A.3 Effect of Being Opposed on Number of Bills Introduced: Alternative Instrument
A.4 Effect of Vote Margin on Number of Bills Introduced: Alternative Instrument
A.5 Effect of Campaign Contributions on Number of Bills Introduced: Alternative Instrument
A.6 Effect of Campaign Contributions Split by Source on Number of Bills Introduced: Alternative Instrument
B.1 Summary Statistics for the Legislature Example
B.2 Summary Statistics for the R&D Example
B.3 Summary Statistics for the Adolescent Behavior Example
C.1 Estimation Results Decomposed by Categories
C.2 Estimation Results for the Mean-Log Citations
C.3 Estimation Results for Different Alumni Networks
C.4 Horse Race of Centrality Measures
C.5 Summary Statistics

LIST OF FIGURES

1.1 Example of Model Legislation from ALEC
1.2 Legislative Influence Detector Example: Wisconsin Senate Bill 179 (2015) v. Louisiana Senate Bill 593 (2012) (Burgess et al., 2016)
1.3 Number of Copycat Bills by State
1.4 Share of Copycat Bills by State
1.5 Distribution of Vote Margin Conditional on Being Opposed
2.1 Estimated Posterior Distributions for the Simulated Example - Key Variables -
2.2 Estimated Bias by Number of Agents n
3.1 Law Clerks Movements Network Between 1995 and 2004
3.2 Total Number of Citations by Judge
3.3 Alumni Network - One-Year Window -
A.1 Distribution of Total Campaign Contributions Received by Legislators
A.2 Distribution of Vote Margin
B.1 Estimated Posterior Distributions for the Legislative Example - Key Variables -
B.2 Estimated Posterior Distributions for the Legislative Example - Control Variables -
B.3 Estimated Posterior Distributions for Legislative Example - Control Variables (Continued) -
B.4 Estimated Posterior Distributions for the R&D Example - Key Variables -
B.5 Estimated Posterior Distributions for the R&D Example - Control Variables -
B.6 Estimated Posterior Distributions for the R&D Example - Control Variables (Continued) -
B.7 Estimated Posterior Distributions for the Adolescent Behavior Example - Key Variables -
B.8 Estimated Posterior Distributions for the Adolescent Behavior Example - Control Variables -
B.9 Estimated Posterior Distributions for the Adolescent Behavior Example - Control Variables (Continued) -
C.1 Alumni Network - Same Graduating Class -
C.2 Alumni Network - Four-Year Window -
C.3 Estimated Network Effects Using EGRM

CHAPTER 1
ELECTORAL EXPERIENCES AND LEGISLATOR BEHAVIOR: CAMPAIGN DONATIONS, ELECTION CLOSENESS, AND COPYCAT LEGISLATION

1.1 Introduction

Elected officials balance time between their substantive work and preparing for reelection. Ideally, representatives would “run on their record,” incentivizing them to focus on the needs of their constituents. However, in reality, reelection tasks including fundraising, political advertising, and campaigning distort the time elected officials spend on their official duties. A key question is thus what determines how elected officials balance their time between substantively legislating and preparing for reelection. Answering this question is complicated by the difficulty of measuring legislator productivity: some measures are contaminated by broader forces outside the individual’s control, and others can be manipulated by legislators seeking to appear productive.

In this paper, I study the extent to which election margin of victory and fundraising outcomes affect legislator behavior during the following legislative session. I use the number of new bills a legislator sponsored or introduced as a measure of productive behavior. Crucially, I separately identify bills that were copied from other state legislatures or model legislation (commonly called “copycat bills”) and study the extent to which legislators introduce these bills to appear productive with minimal effort. I use an instrumental variable approach and find that legislators who win elections more comfortably – measured by running unopposed, having larger vote margins, and raising donations successfully – introduce more substantive bills and fewer copycat bills. Legislators who win on closer margins or struggle to raise funds from donors introduce more copycat bills and fewer substantive ones.
The natural explanation is that these legislators rely on copycat bills to signal productivity while actually focusing their efforts on solidifying their reelection prospects. I find evidence consistent with this hypothesis: these vulnerable legislators, who struggled with fundraising in the previous election, raise more money in their first year of office than their peers who won more comfortably.

To study this question, I construct a new dataset linking measures of electoral closeness to measures of individual legislator productivity, which I describe in Section 1.2. I link data on over 500,000 bills submitted to U.S. state legislatures between 2009 and 2016 compiled by Burgess et al. (2016), which identifies 45,405 instances of copycat legislation, with metadata on the bills themselves from LegiScan. I match these data with information about sponsoring legislators from Vote Smart, campaign contributions from the Center for Responsive Politics, and electoral outcomes from the State Legislative Election Returns dataset compiled by Klarner (2018). This set of linkages allows me to separately analyze details that prove key, including effects on the sponsorship of different types of bills and the effects of donations from different sources.

The second advantage of this paper is its empirical strategy, which enables causal estimates of the impacts of election closeness. The ideal experiment would be to randomly vary how comfortably a given legislator won their election, then compare the behavior in the upcoming legislative session of legislators who won comfortably to those who won narrowly. My instrumental variable (IV) strategy, which I detail in Section 1.3, approximates this experiment by exploiting variation in electoral closeness coming from partisan waves within a given state. I measure election closeness in three ways: first, whether a candidate ran unopposed; second, the margin of votes by which the candidate won; third, the amount of money the candidate raised.
I ask to what extent each of these instrumented measures of closeness causes legislators to propose more substantive bills or copycat bills.

I report the results in Section 1.4. I find that legislators who face closer elections introduce fewer substantive bills and more copycat bills in the following legislative session. These results are consistent regardless of whether I measure election closeness by the likelihood of running opposed or the margin of votes. Legislators who won closer elections sponsor fewer substantive bills and substitute towards copycat bills more than their peers who won comfortably.

Candidates might infer electoral vulnerability not only from their most recent vote margin or ability to clear the field of opponents, but also from their success in raising donations. I find that raising an additional $100K in an election leads to a legislator introducing 5.7 more substantive bills and 0.2 fewer copycat bills in the following legislative session. Observing the identity of donors allows me to disaggregate these effects by the source of the dollars raised. As one might expect, higher donations only lead to the effects associated with electoral security when those dollars flow from external donors. Higher contributions from the candidates themselves lead to fewer bills being introduced. Insofar as having to contribute one’s own resources to an election campaign is a sign of weakness, this is unsurprising. These results suggest that legislators react to close elections by focusing less on substantively legislating in the following session. It appears that they mask this substitution by sponsoring bills copied from other sources.

Overall, this is a troubling picture of electoral politics. Many decry the rates at which incumbents run unopposed and win reelection as a failure of civic engagement. How can representatives be held accountable to their constituents if they do not face vigorous challenges?
However, these results show that close electoral contests lead to less productive legislators. One silver lining of this work is the extent to which new machine learning methods allow for better identification of shirking legislators, specifically those who rely on copied legislation to signal productivity.

Related Literature

This paper is closest in spirit to a strand of literature that studies the incentives behind the legislative production process. Starting from the electoral connection theory of Mayhew (1974), it can be argued that politicians are mainly motivated by reelection and that the production of legislation is a tool for signaling their competence. In that vein, Giommoni et al. (2022) propose a model of legislative production and show how restricting the possibility of cosponsoring can induce overproduction of low-quality legislation. Using a reform in the 96th Congress that removed the hard cap on the number of cosponsors for a bill in the House of Representatives, they find that limited cosponsorship opportunities caused an overproduction of poor-quality legislation. Similarly, I find that legislators who faced a competitive race tend to produce more low-quality legislation. Furthermore, Gratton et al. (2021) propose a model showing that legislators experiencing political instability tend to introduce a larger number of low-quality laws, which can lead to a decrease in bureaucratic efficiency. This is also consistent with the results of this paper if we treat electoral competitiveness as, in effect, an individual-level measure of electoral instability.

Another closely related literature uses term limits to study the impacts of electoral incentives on legislative behavior (see, for example, Fouirnaies and Hall (2022) and Dal Bó and Rossi (2011)). Dal Bó and Rossi (2011) find that longer terms promote more effort, but are skeptical that elections are the primary mechanism.
They conclude that job stability promotes effort. I arrive at a similar conclusion through a different set of findings. As mentioned previously, I find that elections do, in fact, impact behavior, specifically that elections which signal increased job security promote legislator effort. Fouirnaies and Hall (2022) find evidence that legislators in U.S. state legislatures who can no longer seek reelection are overall less productive, e.g., they sponsor fewer bills.

More broadly, this paper relates to the literature on legislative effectiveness and cosponsorship activity, e.g., Volden and Wiseman (2014), Battaglini, Sciabolazza and Patacchini (2020), Battaglini, Patacchini and Rainone (2022), Miquel and Snyder (2006), Anderson, Box-Steffensmeier and Sinclair-Chapman (2003), Frantzich (1979), Bratton (2005), Campbell (1982). While multiple measures of effectiveness have been proposed, I differ from the rest of this literature by looking at the number of bills that had sections copied from elsewhere.

For electoral competitiveness, one measure I use in this paper is the vote margin. There is a growing literature trying to relate the share of votes a legislator can garner to legislative productivity (see, among others, Barber and Schmidt (2019) and Schmidt and Young (2017)), but there is no clear consensus. Schmidt and Young (2017) find that an absence of electoral competition results in a 13 percent drop in overall legislative productivity. On the other hand, Barber and Schmidt (2019) find significant evidence of a positive relationship between primary vote share and legislative effectiveness among U.S. House representatives. This paper contributes to the debate by finding a relationship similar to that of Barber and Schmidt (2019).

My paper is also related to the vast literature on the role of money in politics (see, for example, Stratmann (2005) and Ansolabehere, de Figueiredo and Snyder (2003) for a meta-analysis).
While a large portion of this literature tries to evaluate how politicians can be influenced,¹ I instead focus on using campaign contributions to evaluate how legislators might feel about the precarity of their position and the effect on their productivity.

This paper is also related to the text reuse literature, particularly the part concerning copycat legislation. As mentioned previously, thanks to machine learning, researchers have been able to identify examples of text reuse in state legislatures; see Garrett and Jansa (2015), Hertel-Fernandez and Kashin (2015), Hertel-Fernandez (2014), and Burgess et al. (2016). Another example is Pagliari and Young (2020), who look at instances of text reuse among the comment letters submitted by special interest groups on policy proposals in the EU. Closer to this paper, Linder et al. (2018) use the bill-to-bill text reuse data from Burgess et al. (2016) and find that ideologically close legislators tend to exhibit a high degree of text reuse.

¹ For example, Battaglini and Patacchini (2018) show that campaign donations are correlated with how central some politicians are, and Bertrand et al. (2020) find evidence that charitable giving by corporations is correlated with politicians getting seats on committees.

1.2 Data

1.2.1 Background

While legislators wear many hats, the primary role they serve is to introduce and pass laws in their respective legislatures. For a piece of legislative text to be introduced to a house, senate, or assembly, it usually needs one or more sponsors. Depending on the type and nature of the bill introduced, the bill will usually be sent to a committee for evaluation and possible editing. Afterward, the bill may be advanced to a vote.

While legislators are the ones introducing bills by sponsoring them, the true authorship of a bill is generally unclear.
For instance, a bill can be written directly by the legislator, by their staff, by some other member of the party, by the staff of a bipartisan legislative drafting service, by a lobbyist or special interest group (SIG), or by a legislator in another state. In this paper, I pay attention to two particular sources of legislation: model legislation and copycat legislation.

Model legislation is usually a piece of legislation written by some third party with the intent that it be included in bills proposed by legislators. While the intent behind some of the available model legislation is to ensure some uniformity in the code of law between states, its primary use is to provide a ready-to-use piece of legislation tailored to help certain businesses and SIGs. Therefore, model legislation is usually written by lobbyists or SIGs, and it is not usually disseminated to the public. However, groups like the American Legislative Exchange Council (ALEC) or the American Legislative and Issue Campaign Exchange (ALICE) have decided to make some of their model legislation publicly available on their websites. Figure 1.1 shows an example of such model legislation, taken directly from ALEC’s website. This particular excerpt concerns a tax credit for long-term care, but as I mentioned previously, the range of topics that model legislation covers is vast.

Figure 1.1: Example of Model Legislation from ALEC

Copycat legislation is legislation that was copied directly from model legislation or from some other bill introduced in another state. Using machine learning, Burgess et al. (2016) have developed the Legislative Influence Detector, a tool to identify pieces of legislative text that closely match. Matches are based on the Smith-Waterman local alignment algorithm (see Smith and Waterman (1981) for more details). This allows matching segments of a bill with model legislation instead of having to deal with the entire document.
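
To make the idea of local alignment concrete, the following is a minimal sketch of a Smith-Waterman score computed over word tokens. It is not the Legislative Influence Detector's implementation: the function name, the whitespace tokenization, and the scoring values (match, mismatch, gap) are assumptions chosen purely for illustration, and the two text snippets are invented.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two token lists.

    The scoring values are illustrative assumptions, not the ones
    used by Burgess et al. (2016).
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # restart: alignments are local
                prev[j - 1] + step,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # skip a token of a
                curr[j - 1] + gap,   # skip a token of b
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Invented snippets: the second one misspells "allowed" and adds a preamble.
bill_a = "a tax credit shall be allowed for long term care premiums".split()
bill_b = ("the state hereby finds a tax credit shall be allowd "
          "for long term care").split()
print(smith_waterman_score(bill_a, bill_b))
```

Because the score restarts at zero whenever an alignment becomes unprofitable, a long shared passage is rewarded even when the surrounding text differs, and a single misspelled word only costs one mismatch penalty instead of breaking the match.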
Figure 1.2 shows how the tool from Burgess et al. (2016) can match text from two different bills. Note that while the wording looks similar, it is not identical. For example, the word “hydranencephaly” is misspelled in one of the bills in Figure 1.2.

Figure 1.2: Legislative Influence Detector Example: Wisconsin Senate Bill 179 (2015) v. Louisiana Senate Bill 593 (2012) (Burgess et al., 2016)

There are multiple reasons why a state legislator would resort to copying legislation word for word, but one key aspect of this practice is that it is easy to do. For instance, a politician who wants to pass a bill on reproductive rights could spend time and resources writing a bill tailored to their liking. This is an expensive process for politicians, especially for those who are resource-constrained and have no background in law. According to the National Conference of State Legislatures (NCSL), only four states can be classified as having a legislature with full-time politicians who are well-paid and have a large staff: California, Michigan, New York, and Pennsylvania (National Conference of State Legislatures, 2017). This is a stark contrast to the federal level where, for instance, each legislator has access to the Office of the Legislative Counsel, which offers legislative drafting services. It is, therefore, reasonable to expect state legislators to rely on model legislation or copycat legislation.

1.2.2 Data

Text reuse

I use two main measures of productivity in the legislative process during a session: the number of bills sponsored, and the share of those bills that are copied in part from either model legislation or another state. For the latter, I use the data compiled by Burgess et al. (2016). In their paper, the authors collected more than 2,400 pieces of model legislation written by lobbyists over the period 2009 to 2016 and analyzed 500,000 state bills for any matches. Out of all the bills, Burgess et al. (2016) found 45,405 instances of text reuse between state bills, and 14,137 bills that were directly copied from model legislation. The authors have been gracious enough to provide this data, which contains pairings across different bills and model legislation and the respective Smith-Waterman local alignment scores of those pairs. The higher the score, the more similar the matched parts of the text are.²

There is a lot of variation in the use of copycat legislation across states. For example, Figure 1.3 shows how many times a bill introduced in one state had previously been introduced in another state over the sample period. Legislators from Mississippi are the most likely to rely on copycat legislation.

² I use a threshold of 1000 for the alignment score to classify whether a bill contains copied parts.

Figure 1.3: Number of Copycat Bills by State

One issue with Figure 1.3 is that it does not account for how many bills a state introduces overall. For instance, in the 2015-2016 regular session, New York introduced 18,534 bills, while Kansas only introduced 1,459 bills. Figure 1.4 plots the number of instances where a bill was flagged as having been copied from another state as a share of the total number of bills introduced.

Figure 1.4: Share of Copycat Bills by State

Figure 1.4 shows that, for example, while Mississippi is still high up in the ranking in terms of relying on copycat legislation, with roughly 13% of its bills having sections copied from another bill, it is surpassed by another state: Kansas.

Bills and Legislators

I supplement the data provided by Burgess et al. (2016) by matching the bill numbers from that dataset to the metadata on state legislative bills from LegiScan. LegiScan is a legislative tracking and reporting service with information on most legislation passed or presented in each of the 50 states from 2007 to today.
I collect all the information available on more than 1.3 million bills introduced in the different states’ legislatures. The data includes information such as the status of a bill, the date it was introduced, its general content, etc. More importantly, the data also contains information on the cosponsorship of bills and roll-call details. I am able to match roughly 80% of the Burgess et al. (2016) dataset with the LegiScan data. The main source of discrepancy is that some bills introduced before 2010 are not present in LegiScan.

Note that while LegiScan has a lot of information on legislators, it is missing some important variables (age, race, incumbency, etc.). To remedy this, I obtain the state legislators’ individual characteristics from Vote Smart, an organization that aims to provide unbiased information on candidates to all Americans. Using their API, I collected information on more than 15,000 legislators in the United States and merged it with the LegiScan data previously described.

The Vote Smart data has some gaps; in particular, there are numerous missing values for the birth year of legislators. To patch some of the missing values, I scrape data from Wikipedia for every legislator. This is particularly important to keep the bulk of legislators in my regressions. Table A.2 provides some summary statistics on the dataset.

Campaign Contribution

For campaign contributions, I use data from the Center for Responsive Politics and the National Institute on Money in Politics on every individual contribution to state legislators from 2008 to 2020. This amounts to roughly 1.8 million observations for about 35,000 political races. I aggregate over each candidate by taking the total amount of donations over a given election cycle, while separately tracking donations from the party and from the candidates themselves. Figure A.1 in the appendix shows the distribution of donations across candidates.
This donation data is matched to the legislative data discussed previously.

Election results

To evaluate the closeness of a race in terms of votes, I use the data collected by Klarner (2018). This dataset contains general election results from 1976 to 2016 in all 50 states. It also contains results from some party primaries.

From this data, I create margins by comparing the vote share of every candidate that won an election to the vote share of the closest loser. For example, if a candidate wins an election with a share of 50% against candidates with 30% and 20% of the vote, respectively, the margin is 20%. Multi-member districts are a little more challenging. For instance, in West Virginia House of Delegates District 19, the top four candidates get a seat. In the 2010 election, six candidates ran for those four positions. The top four were Democrats with vote shares ranging from 21.6% to 18.1%. The candidates that lost had 10.7% and 10.1% of the vote, respectively. In this case, the margins I measure for the winners range from 10.9% to 7.4%. If a candidate runs unopposed in the general election, I set the winning margin equal to 100%.

For districts that are safe for either Republicans or Democrats, the results of the general election don’t convey how competitive the races really are. The real challenge usually happens in the primaries. Therefore, I use the minimum winning margin in an election cycle for each candidate, comparing primaries to general elections. This means that a candidate who ran unopposed in the general election but won his or her primary by 4% has a vote margin of 4%. I create a dummy variable indicating that a candidate faced opposition, equal to one if the candidate’s winning vote margin is below 90%. Figure 1.5 shows the distribution of winning margins across legislators in my sample conditional on being opposed.
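As an illustration only, and not the code used for the analysis, the margin rule described above can be sketched in Python. The function names, the rounding to one decimal place, and the two middle vote shares in the West Virginia example (20.0 and 19.0) are hypothetical; only the top share, the fourth share, and the losers’ shares come from the text.

```python
def winning_margins(vote_shares, seats=1):
    """Margin (in percentage points) of each winner over the closest loser.

    vote_shares: vote shares in percent, one per candidate.
    seats: number of seats filled in the district.
    Unopposed winners (no losing candidate) get a margin of 100.
    """
    ranked = sorted(vote_shares, reverse=True)
    winners, losers = ranked[:seats], ranked[seats:]
    if not losers:
        return [100.0 for _ in winners]
    closest_loser = losers[0]  # highest share among losing candidates
    # Rounding is only for display; it is an assumption of this sketch.
    return [round(w - closest_loser, 1) for w in winners]


def cycle_margin(general_margin, primary_margin=None):
    """Minimum winning margin over the primary and the general election."""
    if primary_margin is None:
        return general_margin
    return min(general_margin, primary_margin)


def opposed(margin, cutoff=90.0):
    """Dummy equal to 1 if the winning margin is below the 90% cutoff."""
    return 1 if margin < cutoff else 0


single_seat = winning_margins([50, 30, 20])
# Middle shares 20.0 and 19.0 are placeholders, not values from the data.
multi_seat = winning_margins([21.6, 20.0, 19.0, 18.1, 10.7, 10.1], seats=4)
safe_seat = cycle_margin(general_margin=100.0, primary_margin=4.0)
```

Under these assumptions, the first and last entries of `multi_seat` correspond to the largest and smallest winner margins in the multi-member example, and `safe_seat` takes the primary margin because it is the smaller of the two.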
Figure 1.5: Distribution of Vote Margin Conditional on Being Opposed

A companion figure in the appendix plots the distribution of vote margin across legislators including unopposed races. While it is clear from Figure 1.5 that the frequency goes down as the vote margin increases, the appendix figure shows that the frequency goes back up around a 90% vote margin.

1.3 Research Design

My empirical approach is to regress legislative behavior (e.g., the number of bills cosponsored or copied) on different measures of election closeness (e.g., total contributions, winning vote margin, or excess donations). For each legislator $i$ in party $p$ in state $s$ at time $t$, I consider the following specification for the outcome of interest $Y_{itps}$:

\[
Y_{itps} = \alpha + \phi \cdot \text{Competitiveness Measure}_{itps} + X_{itps}\beta + \lambda_t + \delta_p + \xi_s + \epsilon_{itps} \tag{1.1}
\]

where $X_{itps}$ is a vector of controls including incumbency status, gender, role, tenure, age, and whether the legislator has a law degree, $\lambda_t$, $\delta_p$, and $\xi_s$ are election-cycle, party, and state fixed effects, and $\epsilon_{itps}$ is the error term. The period $t$ is defined as the election cycle. Most variables are measured during that particular election cycle, except the outcome $Y_{itps}$, which I measure over the next two years. For example, if an election happened in 2012, I measure the legislative activity over 2013 and 2014. I do this to keep $Y_{itps}$ consistent across different legislatures with varying term lengths.³

1.3.1 Instrumenting for Competitiveness Measures

The main source of concern about using the aforementioned specification is that bill sponsorship is correlated with donations or margin of victory through other channels unrelated to competitiveness. While I control for some important drivers of legislative productivity and electoral success, such as incumbency, I’m unable to assert that both variables are not affected by some unobserved confounders. This means that the OLS estimate of Equation (1.1) could be biased.
³ The most common term length is 2 years for house members (44 states), but it can go up to 4 years in some states (6 states). The reverse holds true for state senators, i.e., 31 states with term lengths of 4 years, 12 states with 2 years, and 7 states with varying lengths.

I seek an instrument that predicts margin of victory and total donations, but that is unrelated to other determinants of bill sponsorship. I propose using the average margin of victory, average donations, and average excess donations for other candidates of the same party within the same state, excluding legislator $i$, as an instrument for these same variables for $i$, i.e.,

\[
\widetilde{\text{Competitiveness Measure}}_{itps} = \frac{1}{n-1} \sum_{j \neq i} \text{Competitiveness Measure}_{jtps} \tag{1.2}
\]

where $n$ is the number of legislators of party $p$ in state $s$ at time $t$.

By taking the average margin of victory and donations in the same state, I break the relationship between an individual lawmaker’s own experience and instead capture variations in donations/margin of victory that are due to common party factors, such as influence by common interest groups and state caucus support structures.

Using this instrument, I estimate Equation (1.1) using a Two-Stage Least Squares (2SLS) approach, where the first stage uses the following specification:

\[
CM_{itps} = \tilde{\alpha} + \varphi \cdot \widetilde{CM}_{itps} + X_{itps}\beta + \lambda_t + \delta_p + \xi_s + \epsilon_{itps} \tag{1.3}
\]

where $CM_{itps}$ is the competitiveness measure and $\widetilde{CM}_{itps}$ is the constructed instrument. Table A.1 reports the results of estimating Equation (1.3) for the different competitiveness measures and instruments. For every instrument, the null hypothesis of a weak instrument is rejected, and each instrument is highly predictive of its respective competitiveness measure.

1.3.2 Alternative Instrument

Since the instrument I propose is defined at the party, state and time $t$ level, it is reasonable to expect that there might not be enough variation to exploit.
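To make the construction in Equation (1.2) concrete, the following is a minimal Python sketch of a leave-one-out group average. It is an illustration under stated assumptions rather than the code used in the analysis: the record layout (keys `party`, `state`, `cycle`, `cm`) and the function name are hypothetical, and legislators who are alone in their party-state-cycle cell are assigned `None` because the average over "others" is undefined for them.

```python
from collections import defaultdict


def leave_one_out_instrument(rows):
    """Average competitiveness measure of the other legislators in the same
    party, state, and election cycle, excluding the legislator's own value.

    rows: list of dicts with (hypothetical) keys 'party', 'state', 'cycle', 'cm'.
    Returns one instrument value per row, in the same order.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        key = (r["party"], r["state"], r["cycle"])
        totals[key] += r["cm"]
        counts[key] += 1

    instrument = []
    for r in rows:
        key = (r["party"], r["state"], r["cycle"])
        n = counts[key]
        if n < 2:
            instrument.append(None)  # no other legislator to average over
        else:
            # (group total - own value) / (n - 1)
            instrument.append((totals[key] - r["cm"]) / (n - 1))
    return instrument


# Toy data: three legislators in one cell, one legislator alone in another.
toy = [
    {"party": "D", "state": "NY", "cycle": 2012, "cm": 10.0},
    {"party": "D", "state": "NY", "cycle": 2012, "cm": 20.0},
    {"party": "D", "state": "NY", "cycle": 2012, "cm": 30.0},
    {"party": "R", "state": "NY", "cycle": 2012, "cm": 40.0},
]
loo = leave_one_out_instrument(toy)
```

Within a cell, the instrument values differ across legislators only through the excluded own value, which illustrates why the variation available to this instrument can be limited.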
To remedy this situation, I also propose an alternative instrument based on exploiting neighboring legislative districts. Let $i$ be the legislator that wins the election at time $t$ in district $d$. Using state legislative district shapefiles from the Census Bureau, I create a set $N_{itps}$ of candidates in districts with a common boundary to $d$. I then take the average value of the competitiveness measures of those candidates, i.e.,

\[
\widetilde{\text{Competitiveness Measure}}_{itps} = \frac{1}{n} \sum_{j \in N_{itps}} \text{Competitiveness Measure}_{jtps} \tag{1.4}
\]

where $n$ is the size of the set $N_{itps}$. The advantage of this instrument compared to Equation (1.2) is that it provides more variation. The main drawback is that it is not robust to regional unobserved factors that could drive both legislative outcomes and the electoral process. I provide the results of the analysis with this particular instrument in the appendix.

1.4 Results

In Table 1.1, I show that running opposed leads to a legislator introducing nearly 21 fewer substantive bills and 1.4 more copycat bills. It is also interesting to note that incumbents, compared to challengers, are more likely to rely on copycat legislation and introduce more substantive bills.
Table 1.1: Effect of Being Opposed on Number of Bills Introduced

                                   Number of copycat bills      Number of bills
                                   OLS          IV              OLS           IV
                                   (1)          (2)             (3)           (4)
Opposed (Yes = 1)                  0.024        1.418***        −2.696**      −21.281**
                                   (0.025)      (0.185)         (1.354)       (9.478)
Incumbency Status (Incumbent)      0.111**      0.413***        9.421***      5.402*
                                   (0.047)      (0.064)         (2.546)       (3.262)
Incumbency Status (Open)           0.0003       0.108**         −0.071        −1.508
                                   (0.050)      (0.055)         (2.711)       (2.815)
Party (Independent)                −0.803***    −0.810***       −32.513***    −32.414***
                                   (0.196)      (0.206)         (10.551)      (10.586)
Party (Republican)                 −0.077***    −0.132***       −6.935***     −6.201***
                                   (0.023)      (0.025)         (1.220)       (1.279)
Party (Third-party)                −0.126       −0.229          −10.726       −9.356
                                   (0.181)      (0.191)         (9.755)       (9.811)
Role (Senator = 1)                 0.183***     0.178***        1.770         1.835
                                   (0.024)      (0.025)         (1.301)       (1.306)
JD                                 0.078***     0.069**         0.989         1.104
                                   (0.028)      (0.030)         (1.515)       (1.521)
Gender (Male = 1)                  −0.074***    −0.041          −8.971***     −9.415***
                                   (0.025)      (0.026)         (1.337)       (1.360)
Tenure                             −0.005***    0.001           0.329***      0.248**
                                   (0.002)      (0.002)         (0.095)       (0.103)
(Intercept)                        1.455        0.390           48.727        62.916
                                   (1.033)      (1.096)         (55.591)      (56.231)
State Fixed Effects                Yes          Yes             Yes           Yes
Year Fixed Effects                 Yes          Yes             Yes           Yes
Observations                       28,639       28,639          28,639        28,639
R2                                 0.283        0.206           0.530         0.527
Adjusted R2                        0.281        0.204           0.529         0.526
Residual Std. Error (df = 28571)   1.751        1.843           94.254        94.565

Note: IV denotes Instrumental Variable. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the 2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

In Table 1.2, I show that the effect persists even among legislators who ran opposed. Winning by an additional percentage point of the vote share means that a legislator introduces 1.9 more substantive bills in the following session.
The effect of vote margin on copycat bills is imprecisely estimated, but the coefficient is negative, as would be expected.

Table 1.2: Effect of Vote Margin on Number of Bills Introduced

                                   Number of copycat bills      Number of bills
                                   OLS          IV              OLS           IV
                                   (1)          (2)             (3)           (4)
% Margin                           0.001        −0.003          0.158***      1.924***
                                   (0.001)      (0.003)         (0.036)       (0.191)
Controls                           Yes          Yes             Yes           Yes
State Fixed Effects                Yes          Yes             Yes           Yes
Year Fixed Effects                 Yes          Yes             Yes           Yes
Observations                       19,099       19,099          19,099        19,099
R2                                 0.294        0.293           0.540         0.483
Adjusted R2                        0.291        0.290           0.538         0.481
Residual Std. Error (df = 19031)   1.731        1.732           90.374        95.777

Note: IV denotes Instrumental Variable. We drop legislators where there is no competitor. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the 2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

In Table 1.3, I show that raising an additional $100K in an election leads to a legislator introducing 5.8 more substantive bills and 0.2 fewer copycat bills in the following legislative session.

Donations can come from many sources, though. In Table 1.4, I disaggregate the sources of donations between dollars coming from outside donors, state parties, and candidates themselves. I find that raising additional dollars from oneself does not carry the same effect as raising from donors. Specifically, an additional $100K from outside donors increases the number of substantive bills introduced by 13.2, but $100K from oneself decreases the number of substantive bills by 19.0. This is what we would expect: raising more money from oneself is no signal of electoral security. In fact, it is likely the opposite since the candidate was unable to raise money externally.
Thus, it is no surprise that success in raising money from donors leads candidates to dedicate more time to substantive legislating, while having to contribute from their own wealth leads candidates to legislate less, allowing them to focus on reelection.

Table 1.3: Effect of Campaign Contributions on Number of Bills Introduced

                                   Number of copycat bills      Number of bills
                                   OLS          IV              OLS           IV
                                   (1)          (2)             (3)           (4)
Total Contributions ($100,000)     −0.013***    −0.189***       −0.167        5.765***
                                   (0.004)      (0.037)         (0.202)       (1.930)
Controls                           Yes          Yes             Yes           Yes
State Fixed Effects                Yes          Yes             Yes           Yes
Year Fixed Effects                 Yes          Yes             Yes           Yes
Observations                       28,639       28,639          28,639        28,639
R2                                 0.283        0.228           0.530         0.516
Adjusted R2                        0.281        0.226           0.529         0.515
Residual Std. Error (df = 28571)   1.751        1.817           94.260        95.672

Note: IV denotes Instrumental Variable. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the 2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

Table 1.4: Effect of Campaign Contributions Split by Source on Number of Bills Introduced

                                                Number of copycat bills      Number of bills
                                                OLS          IV              OLS           IV
                                                (1)          (2)             (3)           (4)
Total contributions from donors ($100,000)      −0.013**     −0.057          −0.070        13.242***
                                                (0.005)      (0.046)         (0.278)       (2.438)
Total contributions from party ($100,000)       −0.032***    −0.491***       −2.627***     −6.580
                                                (0.010)      (0.102)         (0.564)       (5.459)
Total contributions from candidate ($100,000)   0.006        −0.273          2.017***      −18.972*
                                                (0.013)      (0.199)         (0.704)       (10.645)
Controls                                        Yes          Yes             Yes           Yes
State Fixed Effects                             Yes          Yes             Yes           Yes
Year Fixed Effects                              Yes          Yes             Yes           Yes
Observations                                    28,639       28,639          28,639        28,639
R2                                              0.283        0.210           0.531         0.488
Adjusted R2                                     0.281        0.209           0.529         0.486
Residual Std. Error (df = 28569)                1.750        1.837           94.218        98.436

Note: IV denotes Instrumental Variable. Columns (1) and (3) report the OLS estimates using the electoral competitiveness measure directly. Columns (2) and (4) report the 2SLS results using the instrument described in Section 1.3.1. Standard errors for the coefficients are reported in parentheses. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the p-value.

Table A.3 and Table A.4 in the Appendix report the results of running the analysis on being opposed and vote margin using the alternative instrument construction. The effects I find are consistent with Table 1.1 and Table 1.2. Likewise, Table A.5 and Table A.6 in the Appendix show the results for the campaign contribution analysis using the alternative instrument. The results are in line with Table 1.3 and Table 1.4.

1.5 Conclusion

In this paper, I find evidence that legislators who faced a competitive election tend to be less productive in a meaningful way in the subsequent session. Overall, I find that these legislators rely more on copycat legislation and introduce fewer substantive bills. This holds whether we measure the competitiveness of an election through vote margin, whether the candidate ran unopposed, or the amount of money raised for that particular election cycle.

These findings paint a somewhat bleak picture of American politics: one where a desirable outcome, such as having competitive races, leads to less effective legislators. To make policy recommendations to remedy the situation, we need to understand how legislators spend their time while in the legislative session. For instance, does the threat of losing the next election drive legislators to spend more time fundraising instead of writing substantive bills, doing constituent service work, or establishing themselves as key figures of their party? Answering this question is crucial to better understand the bottlenecks of the legislative process and how we can improve it.
CHAPTER 2

DYNAMIC NETWORK FORMATION WITH FORWARD LOOKING AGENTS

2.1 Introduction

Social connections play a crucial role in shaping relevant observed outcomes. For example, previous work finds that social connections shape how adolescents partake in risky behavior and how effective legislators are as lawmakers in Congress, and that they drive labor outcomes. However, social networks are rarely fully observable in the data: accurate record keeping of friends, colleagues, foes, and the intensity of those connections is usually unavailable. In some cases, information on social connections is completely nonexistent. Furthermore, establishing social connections is inherently a dynamic process, e.g., being friends today makes it easier to maintain that relationship tomorrow. The traditional approach of the literature has been to focus on static models of social networks and to assume that the true social network can be approximated by some proxies, such as having attended the same school or sharing some observable characteristics such as gender or race. In this paper, we build and estimate a model that addresses both of these limitations.

To do so, we propose a new theory of network formation where agents are not myopic and where their decisions over their social links and behavior are endogenous. Our model, under some conditions, has a unique equilibrium prediction that depends solely on a finite set of structural parameters, which sidesteps the curse of dimensionality of recovering every link in the network separately. Building on the method proposed by Battaglini, Patacchini and Rainone (2022), we show that our model can be estimated and that we can recover the true value of the structural parameters under simulations. To illustrate the applicability of our approach, we estimate three distinct empirical examples. The first example focuses on the 111th and 112th U.S. Congress and the legislative effectiveness of lawmakers.
The second application looks at R&D expenditures from 2006 to 2011 in the Chemicals And Allied Products industry. Lastly, we look at the educational achievement of adolescents using the National Longitudinal Study of Adolescent to Adult Health (Add Health) dataset.

Our model is divided into T periods. Each of these periods has two stages. In the first stage, the player decides how much effort to spend on establishing social connections, which is costly. One contribution of this paper is to assume that this cost in period t also depends on the links formed in period t − 1. In the second stage of the game, players choose how much effort to exert to achieve some outcome, for example, passing bills for legislators or achieving good grades for adolescents, taking the social links established in the previous stage as given.

To solve and evaluate this model, we rely on two methodological concepts. First, we introduce a new equilibrium concept, i.e., the Dynamic Network Competitive Equilibrium (DNCE). In our model, when agents change their level of effort, there is not only a direct effect for that particular change but also indirect spillover effects through their connections. For instance, in the case of the U.S. Congress, a legislator that decides to spend more time writing bills will de facto increase their effectiveness in the legislative production process. This change in effectiveness will trickle down to increasing the effectiveness of their social connections in Congress, which will, in turn, add to the original increase in effectiveness for the legislator, and so on. This cascading renders the game almost impossible to solve. This is further complicated in our setting since efforts in one period will affect other periods indirectly. To address these issues, we extend the concept of Network Competitive Equilibrium (NCE) introduced by Battaglini, Patacchini and Rainone (2022).
The idea of the NCE is that agents act as “price-takers”, i.e., that players do not internalize the indirect spillover effects of their choices on other players. We show how this allows us to characterize the equilibrium of our game with a system of nonlinear equations that depends on the structural parameters of the model and outcome levels.

Second, using the system of nonlinear equations we derived, we use Bayesian methods to estimate our model. Because we cannot derive a closed-form likelihood function, standard Bayesian methods are not available to us. We instead use a modified version of the Approximate Bayesian Computation (ABC) method as described by Battaglini, Patacchini and Rainone (2022). To test the validity of our approach, we simulate our model with a simple data-generating process. We show that we can recover the key structural parameters. We also show that as the number of agents, n, increases, our posterior distributions converge around the true values of our parameters.

As mentioned previously, we apply our method to three distinct empirical settings. To keep our model tractable, we focus only on two periods in each example. The first example estimates peer effects on lawmakers’ legislative effectiveness in the 111th and 112th U.S. Congress. Controlling for a set of legislators’ individual characteristics, we find evidence that social connections matter in driving effectiveness, as measured by the Legislative Effectiveness Scores developed by Volden and Wiseman (2014). Additionally, we find that the dynamic component of our model matters and that it improves the fit significantly compared to the static approach of Battaglini, Patacchini and Rainone (2022). We also provide two counterfactual exercises. First, we evaluate the model when shutting down the enduring “old boy” networks as measured by the alumni connections in the network formation process.
We show that eliminating alumni connections does not materially change the network centralities of legislators across different groups (party, race, gender, etc.). We also propose an exercise where we cull ideologically extreme legislators in the 112th U.S. Congress. To accomplish this, we change the political ideology, as measured by the DW-NOMINATE score, of the most extreme legislators to the median DW-NOMINATE score and measure the effect of that change on legislative effectiveness. Perhaps surprisingly, this exercise does not imply a significant difference in legislative effectiveness.

Our second application exploits R&D investment from 2006 to 2011 in the Chemicals And Allied Products industry. Following the approach of Hsieh, König and Liu (2022), we estimate our model using R&D expenditures as our outcome variable and productivity, measured by the lagged stock of R&D expenditures, as one of the characteristics we control for. We show that our model estimates null results for the network effects in this particular example.

Lastly, we look at the behaviors of adolescents using the National Longitudinal Study of Adolescent to Adult Health (Add Health) dataset. In this paper, we focus on educational achievement. We use the average GPA in a given year as the relevant outcome in our setting. First, we reaffirm the strong evidence that social network effects matter in the context of adolescent educational achievement. Second, we show that having a dynamic model is crucial in explaining GPA differences among adolescents.

The remainder of this paper is organized as follows. Section 2 introduces our model of behavior and formation of social connections. Section 3 presents the econometric specification and the estimation method used to estimate our model. Section 4 presents a simulation to showcase the performance of the proposed approach. Section 5 applies the model to three distinct empirical applications. Section 6 concludes.
In the rest of this section, we review the related literature.

Related literature

The literature on estimating network effects in economics is extensive, but the vast majority of the research has two limitations: models are usually static and social networks are taken as exogenous. For the former limitation, a few recent papers have tried to tackle the problem of myopic agents (see Ozgur, Bisin and Bramoullé (2018), Arduini et al. (2019)). Ozgur, Bisin and Bramoullé (2018) demonstrate various theoretical results for social interactions models with linear dynamic economies. With a more general network topology setting, Arduini et al. (2019) propose a model with forward-looking agents to jointly study social network effects and smoking addiction. We contribute by incorporating both the agent’s behavior and choice of social links endogenously. Our approach entirely avoids having to rely on observing an exogenous network to recover social interaction effects in a dynamic setting. Essentially, the only thing we need to make inferences is a vector of observable outcomes.

Assuming social networks are exogenous has two core issues. First, as described before, there is an issue of endogeneity between the behavioral decisions that determine the outcome we are interested in and the social network. There has been a growing literature in recent years trying to address this issue of endogeneity when estimating network effects (see, e.g., Auerbach (2022), Battaglini, Patacchini and Rainone (2022), Battaglini, Sciabolazza and Patacchini (2020), Canen, Jackson and Trebbi (2022), de Paula, Rasul and Souza (2018), Goldsmith-Pinkham and Imbens (2013), Hsieh, König and Liu (2022), Rose (2019), Johnsson and Moon (2021)). Second, oftentimes social links are unobservable. The usual approach has been to rely on proxies for the social links, e.g., cosponsorship network for legislators, friendship nominations for adolescents, and law clerk movements for federal judges.
Battaglini et al. (2020), Battaglini, Patacchini and Rainone (2022), De Paula, Richards-Shubik and Tamer (2018), and Rose (2019), among others, have proposed alternatives. For instance, Battaglini et al. (2020), De Paula, Richards-Shubik and Tamer (2018), and Rose (2019) propose high-dimensional estimation techniques to estimate social networks. They first rely on assuming an exogenous linear model of behavior, which assumes away the endogeneity of the network. They also require that the network is sufficiently sparse and fixed over many repeated observations, which in our context is too limiting since we estimate our dynamic model over two periods. Therefore, to avoid those limitations, this paper extends the work of Battaglini, Patacchini and Rainone (2022) to allow for non-myopic agents.

In this paper, we apply our method to three different empirical settings. Our first example relates to the recent literature studying how social networks influence legislative behavior (see, e.g., Fowler (2006), Kirkland (2011), Battaglini and Patacchini (2018), Battaglini, Sciabolazza and Patacchini (2020), Battaglini, Patacchini and Rainone (2022), Canen, Jackson and Trebbi (2022)). Fowler (2006) and Kirkland (2011), using cosponsorship as the proxy for the social ties between legislators, showed that centrality correlates with effectiveness. Battaglini, Sciabolazza and Patacchini (2020) follow up on that work by providing a two-step method to control for endogeneity between cosponsorship and legislative effectiveness by using a Heckman-type correction where the instrument for the cosponsorship network is the alumni network. This approach has the shortfall of ignoring key structural network properties of the underlying social network. Battaglini, Patacchini and Rainone (2022) improve on this aspect with the proposed model of endogenous network formation.¹
We further expand on this by introducing dynamic effects in this paper and show that they are relevant and improve the fit of the model.

Our second empirical example relates to the literature exploring network effects from R&D collaborations (see, e.g., König, Liu and Zenou (2019), Konig, Liu and Hsieh (2021), Hsieh, König and Liu (2022), Wang and Yang (2022), Caminati (2021), Zacchia (2020), Dawid and Hellmann (2020), Arqué-Castells and Spulber (2022)). Closest to our approach is Hsieh, König and Liu (2022), who derive a structural model for the coevolution of networks and behavior and apply it in the context of R&D spending and joint venture decisions in the chemical and pharmaceutical industry. Our paper differs in two main ways. First, in our setting, the true network G is unobservable. Second, firms are forward-looking players making decisions over two periods.

Our final empirical setting is based on the vast literature on estimating peer effects on adolescent behaviors. Among others, Hanushek et al. (2003), Angrist (2004), Kang (2007), Boucher et al. (2014), Calvó-Armengol, Patacchini and Zenou (2009), and Patacchini, Liu and Rainone (2013) use networks based on school membership, classroom membership or self-reported friendship nominations to evaluate the peer effects on behaviors ranging from smoking to school performance. In our context, we are closest to the setting of Calvó-Armengol, Patacchini and Zenou (2009), who look at school performance using friendship nominations as the social connections network.

¹ Canen, Jackson and Trebbi (2022) also propose and estimate a model of endogenous network formation based on the model proposed by Cabrales, Calvó-Armengol and Zenou (2011). The main difference is that in their approach social efforts are not targeted, i.e., a legislator chooses one particular level of effort to connect with any legislator regardless of, for instance, party identity.
Using proxies for the underlying social network has issues. First, these networks may be omitting important links. This is particularly salient when defining the network using classroom or school membership because it restricts friendship to only one setting. But, even with the friendship nominations, we might be missing key links if the self-report does not disclose every friendship. Moreover, it is almost impossible to report the strength of a connection. The second issue is that using only one network as our proxy for social connections is limiting. Researchers might want to combine different sources of information to infer social connections, e.g., classroom membership, friendship nominations, gender, and neighbors, but it is hard to know the optimal combination. Our paper, apart from having a model that allows for endogeneity and forward-looking agents, also provides a way to include information from multiple adjacency matrices to endogenously choose the importance of particular data characteristics in the formation of the social networks.

Finally, for our estimation method, it is important to mention the work of König (2016) and Boucher (2020), who also rely on ABC methods to estimate models of network formation. Unlike this paper, König (2016) and Boucher (2020) estimate probability distributions over networks with the underlying assumption that a particular network realization is observed. To achieve this, König (2016) uses the ABC method with a particular set of summary statistics and estimates spillover effects using patent and coauthorship data in physics and economics, while Boucher (2020) takes a similar approach but applies it to network effects to explain homophily among high school friends.

2.2 Model

2.2.1 Setup

Consider a set of $n$ agents, where $N = \{1, \dots, n\}$ is the set of agents. Agents live for $T$ periods, and in each period each agent cares about a particular outcome denoted as $Y_{i,t}$ for $t = 1, \dots, T$.
The goal of each agent is to maximize the utility generated from that outcome. The type of outcome the agents care about can vary depending on the setting. For example, in the U.S. Congress, legislators care about the number of bills they pass. Another setting would be a researcher trying to maximize the number of papers they produce every year. To maximize its utility, the agent chooses an amount of effort to exert to produce $Y_{i,t}$.

We assume that $Y_{i,t}$ is an increasing function of this effort and the $Y_{j,t}$ of all the agents $j$ with whom $i$ is socially connected at $t$. Specifically, we assume the following “production function” for $Y_{i,t}$:

\[
Y_{i,t} = \rho \cdot \left( s_{i,t} \right)^{\alpha} \left( l_{i,t} \right)^{1-\alpha} + \varepsilon_{i,t} \tag{2.1}
\]

The Cobb-Douglas form in (2.1) captures the effects of agent $i$’s effort $l_{i,t}$ and level of “social connectedness” $s_{i,t}$. We assume that $i$’s social connectedness is

\[
s_{i,t} = \sum_{j \in N} g_{i,j,t} Y_{j,t}, \tag{2.2}
\]

where $g_{i,j,t}$ is a measurement of the social link between $i$ and $j$. The implication of (2.2) is that the level of effort exerted by $j$ affects $i$ through the degree of social connection of $i$ to $j$ at $t$. The second term, $\varepsilon_{i,t}$, is some individual idiosyncratic factor that contributes to $Y_{i,t}$ independently of any social connections or effort choice. Players observe $\varepsilon_{i,t}$, but we do not.

In the analysis below, we assume $g_{i,i,t} = 0$, $g_{i,j,t} \in [0, \bar{g}]$ with $\bar{g} > 0$, $\varepsilon_{i,t} \in [\underline{\varepsilon}, \bar{\varepsilon}]$ with $\underline{\varepsilon} > 0$, $\bar{\varepsilon} \in (0, 1)$, and $l_{i,t} \in [0, \bar{l}]$ with $\bar{l} > 0$. Additionally, we will maintain the following assumption that bounds $Y_{i,t}$ between 0 and 1:

Assumption 1. $\rho \cdot \bar{g}^{\alpha} \cdot \bar{l}^{1-\alpha} + \bar{\varepsilon} < 1$.

These assumptions on the parameters and functional form are only made for convenience. Other functional forms can be considered.

In this model, the agents’ effort levels $l = \{ l_{1,t}, \dots, l_{n,t} \}$, outcomes $Y = \{ Y_{1,t}, \dots, Y_{n,t} \}$ and the social adjacency matrices $G_t = ( g_{i,j,t} )_{i,j \in N}$ are endogenous variables.
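Because each $Y_{i,t}$ in (2.1) depends on the other agents’ outcomes through (2.2), the outcome vector for a given period is defined only implicitly. As a purely illustrative sketch, and not the solution or estimation method of this paper, the following Python code treats the links and effort levels as fixed numbers and searches for a vector of outcomes satisfying (2.1) and (2.2) by repeated substitution. The function name, the starting point, the stopping rule, and the parameter values are all assumptions; convergence is not guaranteed for arbitrary parameters.

```python
def solve_outcomes(G, effort, eps, rho, alpha, tol=1e-12, max_iter=10_000):
    """Find Y with Y_i = rho * s_i**alpha * l_i**(1 - alpha) + eps_i,
    where s_i = sum_j G[i][j] * Y_j, by simple fixed-point iteration.

    G: n x n list of lists of link intensities (zero diagonal).
    effort, eps: lists of length n (effort levels and idiosyncratic terms).
    """
    n = len(effort)
    Y = list(eps)  # start from the idiosyncratic component (an assumption)
    for _ in range(max_iter):
        s = [sum(G[i][j] * Y[j] for j in range(n)) for i in range(n)]
        Y_new = [rho * s[i] ** alpha * effort[i] ** (1 - alpha) + eps[i]
                 for i in range(n)]
        if max(abs(a - b) for a, b in zip(Y_new, Y)) < tol:
            return Y_new
        Y = Y_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical two-agent example with symmetric links and efforts.
G_example = [[0.0, 0.5],
             [0.5, 0.0]]
Y_example = solve_outcomes(G_example, effort=[0.5, 0.5], eps=[0.1, 0.1],
                           rho=0.5, alpha=0.5)
```

In this toy example both agents end up with the same outcome, and setting every link to zero returns $Y_{i,t} = \varepsilon_{i,t}$, which matches the role of the idiosyncratic term in (2.1).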
In each period $t$, an agent $i$ is forward looking and selects $l_{i,t}$ and $g^i_t = (g_{i,1,t}, \dots, g_{i,n,t})$ to maximize the expected discounted outcomes over the $T$ periods. At $t = T$, agent $i$ selects $l_{i,T}, g^i_T$ to maximize $Y_{i,T}(l_{i,T}, g^i_T)$, i.e., the outcome in period $T$ only. At $t < T$, the agent selects $l_{i,t}, g^i_t$ to maximize:

$$u_{i,t}(l_{i,t}, g^i_t) = Y_{i,t}(l_{i,t}, g^i_t) + E\left[\sum_{\tau=t+1}^{T} \beta_d^{\tau-1}\, Y_{i,\tau}\big(l_{i,\tau}(G_{\tau-1}), g^i_{\tau}(G_{\tau-1})\big)\right] \tag{2.3}$$

where $l_{i,\tau}(G_{\tau-1}), g^i_{\tau}(G_{\tau-1})$ are the equilibrium values of effort and network connections at $\tau$ given the network formed in the previous period, $G_{\tau-1}$. The key feature of our model is that players are not myopic, i.e., players recognize that decisions at time $t$ affect future periods.

In every period, $l_{i,t}$, $g^i_t$ and $Y_{i,t}$ for all $i$ are determined in two stages. At $t.2$, the agents choose their efforts $l_{i,t}$, taking the social connections $G_t$ as given. The cost of effort is assumed to be represented by a linear function $L_i(l_{i,t}) = c \cdot l_{i,t}$, where $c$ is a cost parameter. At $t.1$, agents link with other agents to increase the social component of their production function for the outcome of interest. At this stage, the agents simultaneously choose the social links $g_{i,j,t}$.

We assume that at $t.1$, agent $i$ decides with which other agents $j \in N \setminus \{i\}$ he or she wishes to establish a link $g_{i,j,t}$. A link with $j$ at time $t$ depends exclusively on $i$'s effort. The cost of establishing this link with intensity $g_{i,j,t}$ is given by the following:

$$C(g_{i,j,t}, g_{i,j,t-1}, \theta_{i,j,t}) = \frac{\lambda}{1+\lambda} \left( \frac{g_{i,j,t}}{\zeta(g_{i,j,t-1}, \theta_{i,j,t-1}) + \theta_{i,j,t}} \right)^{1+\frac{1}{\lambda}}, \tag{2.4}$$

where $\zeta(g_{i,j,t-1}, \theta_{i,j,t-1})$ is a function increasing in $g_{i,j,t-1}$ and $\theta_{i,j,t-1}$, so that links with higher values at $t-1$ decrease the cost at $t$; and $\theta_{i,j,t}$ is a variable that captures the degree to which the types of $i$ and $j$ are socially "compatible" (the more $i$ and $j$ are socially compatible, the lower the cost for $i$ to establish a link with intensity $g_{i,j,t}$ with $j$).
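As a small illustration of (2.4), the sketch below evaluates the link cost with and without a pre-existing connection. The function name, the handling of a zero denominator, and the numbers are assumptions made for the example only.

```python
def link_cost(g, zeta_prev, theta, lam):
    """Cost (2.4): lam / (1 + lam) * (g / (zeta_prev + theta)) ** (1 + 1 / lam).

    zeta_prev stands for zeta(g_{i,j,t-1}, theta_{i,j,t-1}).
    Assumption for this sketch: if zeta_prev + theta == 0 (no history and no
    compatibility), any positive link intensity is treated as infinitely costly.
    """
    denom = zeta_prev + theta
    if denom <= 0.0:
        return 0.0 if g == 0.0 else float("inf")
    return lam / (1.0 + lam) * (g / denom) ** (1.0 + 1.0 / lam)

# Same intensity and compatibility, with and without a link inherited from t - 1.
cost_new = link_cost(g=0.5, zeta_prev=0.0, theta=1.0, lam=1.0)
cost_kept = link_cost(g=0.5, zeta_prev=1.0, theta=1.0, lam=1.0)
```

With these values the inherited link costs a quarter of what a new link of the same intensity costs, which is the sense in which a higher value of the previous-period term lowers the current cost.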
This cost may be interpreted as, for example, the cost of the time spent socializing with $j$, the number of meetings between companies to establish a joint venture, or the time that legislator $i$'s staff needs to spend with legislator $j$'s staff to coordinate actions. We assume here (but we do not need to) that $\zeta(g_{i,j,0}) = \zeta_0$, i.e., the cost in the first stage, when we have no preceding network, is constant; and $\zeta(g_{i,j,t}, \theta_{i,j,t}) = \zeta_1 \cdot (g_{i,j,t} + \theta_{i,j,t})$ for $t \geq 1$. If $\zeta(\cdot) = 0$, then we are back to the static version of this model.

A key feature of the evolution of social connectedness and the observed outcome over periods in this model is that they are interconnected, thus making the network formation model dynamic. Specifically, we assume that it is cheaper to maintain a social connection than to form a new one. The cost, moreover, may be heterogeneous. The variable $\theta_{i,j}$ is taken as exogenous in the theoretical analysis, and it may comprise several factors affecting the likelihood of observing a link, for example, similarity along various characteristics (gender, social, or educational background). We assume that the matrix $\Theta_t = (\theta_{i,j,t})_{i,j}$ is symmetric and that for each agent $i$ there is a set $M_{i,t}$ of other agents such that $\theta_{i,j,t} > 0$ for $j \in M_{i,t}$ and zero otherwise. This implies that agent $i$ is compatible with at most a subset $M_{i,t}$, with cardinality $m_{i,t} = |M_{i,t}|$, of other agents. We denote by $\bar{m} = \max_{i,t} m_{i,t}$ the maximal cardinality of the subsets of connections over time.

The following assumption guarantees that we will not have a corner solution in which an agent chooses $l_{i,t} = \bar{l}$ for some $i \in N$ and $t$.²

Assumption 2. $\bar{l} > \left((1-\alpha)\rho / c\right)^{1/\alpha}$

Note that if the social spillover parameter $\rho$ is sufficiently small, Assumption 1 and Assumption 2 are automatically satisfied.

The type $\omega_{i,t}$ at time $t$ of an agent $i$ is defined by all the variables describing his/her preferences and social connections, so $\omega_{i,t} = (\varepsilon_{i,t}, (\theta_{i,k,t})_{k \in N}, M_{i,t})$. We denote with $\Omega$ the space of types, with typical element $\omega_t \in \Omega$. A pure strategy for an agent is described by a socialization strategy $g : \Omega \to [0, \bar{g}]^{n-1}$, mapping the agent's type to a vector of intensities $g^i_t = \{g_{i,j,t}\}_{j \neq i}$ for each of the $n - 1$ other agents, and an effort strategy $l : \Omega \times \mathcal{G} \to [0, \bar{l}]$, mapping the social network and $i$'s type to an effort level.

² A formal proof of this fact is provided in the proof of Proposition 1.

2.2.2 Equilibrium analysis

2.2.3 Network competitive equilibrium

In every period $t$, the previous section describes a relatively simple two-stage structure that can be solved by backward induction. At $t.2$, the agents choose effort levels taking the social network $G_t$ as given; at $t.1$, agents choose their social links. Solving for those levels of effort and links becomes complicated because any action taken by $i$ has not only a direct effect but also an indirect effect on their state. For example, consider the choice at $t.1$, when agent $i$ chooses the link to $j$, $g_{i,j,t}$: a change in $g_{i,j,t}$ has a direct effect on $Y_{i,t}$ described by (2.1), but it may also have a complex set of indirect effects: the change in $Y_{i,t}$ given $G_t$ changes the $Y_{j,t}$ of all agents $j$ who are connected to $i$ at $t$. The change in $g_{i,j,t}$, moreover, has dynamic effects: the choice of connections by $i$ at $t$ affects the distribution of the cost of connecting at $t + 1$, and so on. This in turn affects the networks and their effectiveness in the following periods. With a large set of agents, these indirect effects add a lot of complexity to the analysis.

To address these complications, we apply and extend the concept of Network Competitive Equilibrium introduced by Battaglini, Patacchini and Rainone (2022) to our dynamic environment. The key idea is to assume that, as in a competitive equilibrium for prices, players are "price takers" with respect to the levels of the outcome $Y$ of the other players.
We can therefore introduce a Dynamic Network Competitive Equilibrium (henceforth, DNCE) as follows:

Definition 1. Agents' effort levels $l = \{l_{1,t}, \dots, l_{n,t}\}_{t=1}^{T}$, outcomes $Y = \{Y_{1,t}, \dots, Y_{n,t}\}_{t=1}^{T}$ and the social matrices $G_t = (g_{i,j,t})_{i,j \in N}$ for $t = 1, \dots, T$ constitute a Dynamic Network Competitive Equilibrium (DNCE) if:
• network connections $g^i_t = (g_{i,1,t}, \dots, g_{i,n,t})$ are optimal for $i$ at $t$ given $Y_t$ and the expected $Y_{t+\tau}$ for $\tau = 1, \dots, T - t$ in equilibrium;
• effort levels $l_{i,t}$ are optimal for agent $i$ at time $t$ given $Y_t$ and $G_t = (g^i_t)_{i \in N}$ for $t = 1, \dots, T$;
• the vector of outcome levels $Y$ satisfies the production function (2.1) given $l$ and $G_t$.

The first two conditions simply say that agents are optimizing given others' level of outcome, $Y_t$, without internalizing how a change in $Y_{i,t}$ could ripple down indirectly. Essentially, players act as "price-takers", where prices in this scenario are denoted by $Y_t$. The last condition is akin to a market clearing condition: the levels of effort and social connections need to produce the level of outcome $Y_t$ we observe.

In the following subsections, we apply this DNCE concept to find a unique equilibrium prediction for our setting. For simplicity and tractability, we assume $T = 2$. Conceptually, the approach extends to any $T$, but in practice only small values of $T$ are tractable. In the same vein, we assume that $\theta_{i,j,t}$ can either be 0 or 1. This boils down to having a dummy indicator of the underlying exogenous compatibility of $i$ and $j$ at time $t$.

The choice of effort at t = 2

At stage 2.2, i.e., the second stage of the second period, the agents select effort given the network $G_2$. Substituting the optimal effort levels into (2.1), we obtain that the equilibrium level of $Y$ for a type $i \in N$ is given by:

$$Y_{i,2} = \delta \cdot \sum_{j=1}^{n} g_{i,j,2} Y_{j,2} + \varepsilon_{i,2}, \tag{2.5}$$

where $\delta = \rho \left((1-\alpha)\rho / c\right)^{\frac{1-\alpha}{\alpha}}$. These equations can be expressed in matrix form as:

$$[I - \delta \cdot G_2] \cdot Y_2 = \varepsilon_2 \tag{2.6}$$

where $\varepsilon_2$ is the vector $(\varepsilon_{i,2})_{i \in N}$.
If $[I - \delta \cdot G_2]$ is invertible, then we can solve for $Y_2$ as a function of the idiosyncratic $\varepsilon_2$.

The formation of the network at t = 2

At stage 2.1, the agents choose their social links to maximize the expected utility net of the cost of establishing the links. The expected continuation utility at 2.1 of an agent $i$ is easily determined by substituting the optimal effort levels and $Y_{i,2}(G, \varepsilon)$ in (2.3):

$$U^i(G_2, \varepsilon) = \alpha\delta \sum_{j=1}^{n} g_{i,j,2} Y_{j,2}(G_2, \varepsilon_2) + \varepsilon_{i,2} \tag{2.7}$$

Agent $i$ will choose the links $g^i_2 = (g_{i,1,2}, \dots, g_{i,n,2})$ that maximize (2.7) net of the linking cost, with the additional constraint that $g_{i,j,2} \in [0, \bar{g}]$, i.e., agent $i$ chooses his/her links by solving:

$$\max_{g^i_2 \in [0, \bar{g}]^n} \sum_{j=1}^{n} \left[ \alpha\delta \cdot g_{i,j,2} Y_{j,2}(G_2, \varepsilon_2) - \frac{\lambda}{1+\lambda} \left( \frac{g_{i,j,2}}{\zeta(g_{i,j,1}, \theta_{i,j,1}) + \theta_{i,j,2}} \right)^{1+\frac{1}{\lambda}} \right] \tag{2.8}$$

taking $G_1 = \{g_{i,j,1}\}_{i,j \in N}$ and $Y_{j,2}(G_2, \varepsilon_2)$ as given. Because of the "price taking" behavior imposed by our equilibrium concept, player $i$ takes $Y_{j,2}(G_2, \varepsilon_2)$ as constant, not as a function of $G_2$.

Combining the solution of (2.8) with (2.6), we have that, in a DNCE, the vector $Y_2(\theta_2)$ and the matrix $G_2(\theta_2)$ need to solve the system:

$$Y_{i,2} = \delta \cdot \sum_{l \in N} \left( g_{i,l,2} \cdot Y_{l,2} \right) + \varepsilon_{i,2} \tag{2.9}$$

and

$$g_{i,j,2} \leq \left( \zeta(g_{i,j,1}, \theta_{i,j,1}) + \theta_{i,j,2} \right)^{1+\lambda} \left( \alpha\delta Y_{j,2} \right)^{\lambda} \quad (\text{with equality if } g_{i,j,2} < \bar{g}) \tag{2.10}$$

for any $i, j \in N$.

The choice of effort at t = 1

Again, in the second stage of a given period, the agents select their optimal level of effort given $G_1$, and substituting those into (2.1), we get that $Y_{i,1}$ follows:

$$Y_{i,1} = \delta \cdot \sum_{j=1}^{n} g_{i,j,1} Y_{j,1} + \varepsilon_{i,1}, \tag{2.11}$$

where $\delta = \rho \left((1-\alpha)\rho / c\right)^{\frac{1-\alpha}{\alpha}}$. These equations can be expressed in matrix form as:

$$[I - \delta \cdot G_1] \cdot Y_1 = \varepsilon_1 \tag{2.12}$$

where $\varepsilon_1$ is the vector of idiosyncratic base levels for $Y_{i,1}$.
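For a given network, the matrix forms (2.6) and (2.12) are linear systems in the outcome vector. A minimal Python sketch of this step follows, assuming a small illustrative network and parameter values; the function name and the spectral-radius check (a sufficient, not necessary, condition for invertibility) are additions made for the example.

```python
import numpy as np

def solve_outcomes(G, eps, delta):
    """Solve [I - delta * G] Y = eps for Y, as in (2.6) and (2.12)."""
    G = np.asarray(G, dtype=float)
    eps = np.asarray(eps, dtype=float)
    n = G.shape[0]
    # Sufficient check for invertibility: delta * spectral_radius(G) < 1.
    radius = np.max(np.abs(np.linalg.eigvals(G)))
    if delta * radius >= 1.0:
        raise ValueError("delta * spectral radius of G is >= 1; system may be singular")
    return np.linalg.solve(np.eye(n) - delta * G, eps)

# Illustrative (assumed) three-agent line network: 1 - 2 - 3.
G = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
eps = np.array([0.1, 0.2, 0.1])
delta = 0.2

Y = solve_outcomes(G, eps, delta)
```

Using a linear solver rather than forming the inverse explicitly is a numerical design choice; both give the same $Y$ whenever the matrix is invertible.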
The network formation at t = 1

Assuming that the constraints for $G_2$ are not binding and that we have an interior solution for (2.10), then given $Y_{i,2}$ and $Y_{i,1}$ as functions of $G_1$ and the idiosyncratic $\varepsilon$, we can solve for the links $G_1$. The expected continuation utility at 1.1 of a type $i$ follows:

$$U^i(G_1, \varepsilon) = \alpha\delta \sum_{j=1}^{n} g_{i,j,1} Y_{j,1} + \beta_d \cdot \alpha^{\lambda} \delta^{1+\lambda} \sum_{l \in N} \Big[ \big( \zeta(g_{i,l,1}, \theta_{i,l,1}) \cdot Y_{l,2} \big)^{1+\lambda} (1 - p_{i,l,2}) + \big( (\zeta(g_{i,l,1}, \theta_{i,l,1}) + 1) \cdot Y_{l,2} \big)^{1+\lambda} p_{i,l,2} \Big] + \varepsilon_{i,1} + \beta_d \varepsilon_{i,2} \tag{2.13}$$

where $\beta_d < 1$ is the "discount factor" of future $Y$ levels, and $p_{i,l,2} = P(\theta_{i,l,2} = 1 \mid \epsilon)$.

Again, agent $i$ chooses his/her links to maximize utility:

$$\max_{g^i_1} \sum_{j=1}^{n} \bigg\{ \alpha\delta\, g_{i,j,1} Y_{j,1} + \alpha^{\lambda} \delta^{1+\lambda} \beta_d \Big[ \big( \zeta(g_{i,j,1}, \theta_{i,j,1}) \cdot Y_{j,2} \big)^{1+\lambda} (1 - p_{i,j,2}) + \big( (\zeta(g_{i,j,1}, \theta_{i,j,1}) + 1) \cdot Y_{j,2} \big)^{1+\lambda} p_{i,j,2} \Big] - \frac{\lambda}{1+\lambda} \left( \frac{g_{i,j,1}}{\zeta_0 + \theta_{i,j,1}} \right)^{1+\frac{1}{\lambda}} \bigg\} \tag{2.14}$$

Assuming that $\zeta(g_{i,j,0}) = \zeta_0$ and $\zeta(g_{i,j,1}, \theta_{i,j,1}) = \zeta_1 \cdot (g_{i,j,1} + \theta_{i,j,1})$, the FOCs require:

$$\frac{g_{i,j,1}}{(\zeta_0 + \theta_{i,j,1})^{1+\lambda}} = \bigg[ \alpha\delta Y_{j,1} + \alpha^{\lambda} \delta^{1+\lambda} (1+\lambda) \beta_d \Big[ \big( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) Y_{j,2} \big)^{\lambda} (1 - p_{i,j,2}) + \big( (\zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + 1) Y_{j,2} \big)^{\lambda} p_{i,j,2} \Big] \zeta_1 Y_{j,2} \bigg]^{\lambda} \tag{2.15}$$

where $p_{i,j,2} = P(\theta_{i,j,2} = 1 \mid \epsilon)$.

To ensure that $g_{i,j,t}$ is an interior solution, we make the following assumption.

Assumption 3. $\bar{g} > (\zeta_0 + 1)^{1+\lambda} (\alpha\delta)^{\lambda}$

Under Assumption 3, $g_{i,j,t} < \bar{g}$ and we have the following proposition.

Proposition 1.
A Dynamic Network Competitive Equilibrium (DNCE) exists, and it is characterized by a vector $Y$ and matrices $G_1$ and $G_2$ that solve

$$Y_{i,1} = \delta \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) + \varepsilon_{i,1} \tag{2.16}$$

$$Y_{i,2} = \alpha^{\lambda} \delta^{1+\lambda} \sum_{l \in N} \Big( \big( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \big) \cdot Y_{l,2} \Big)^{1+\lambda} + \varepsilon_{i,2} \tag{2.17}$$

$$\frac{g_{i,j,1}}{(\zeta_0 + \theta_{i,j,1})^{1+\lambda}} = \bigg[ \alpha\delta Y_{j,1} + \alpha^{\lambda} \delta^{1+\lambda} (1+\lambda) \beta_d \Big[ \big( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) Y_{j,2} \big)^{\lambda} (1 - p_{i,j,2}) + \big( (\zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + 1) Y_{j,2} \big)^{\lambda} p_{i,j,2} \Big] \zeta_1 Y_{j,2} \bigg]^{\lambda} \tag{2.18}$$

$$g_{i,j,2} = \big( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + \theta_{i,j,2} \big)^{1+\lambda} \left( \alpha\delta Y_{j,2} \right)^{\lambda} \tag{2.19}$$

for any $i, j \in N$.

Therefore, the competitive equilibrium collapses to a system of $2N \times N$ equations and $2N \times N$ variables given by (2.19), (2.18), (2.16), and (2.17).

Under Assumptions 1-3, for the static model, i.e., $\zeta(\cdot, \cdot) = 0$, a sufficient (but not necessary) condition for the existence of a unique equilibrium is that $\delta$ is sufficiently small, i.e., $\delta < \left[ \frac{1}{(1+\lambda)\alpha^{\lambda}\bar{m}} \right]^{\frac{1}{1+\lambda}}$ (see Battaglini, Patacchini and Rainone (2022) for a proof). We maintain that condition in our setting.

Special cases

For estimating our model, solving a system of $2N \times N$ nonlinear equations is impractical. We, therefore, look at two special cases: $\lambda \to 0$ and $\lambda = 1$.

When $\lambda \to 0$, the system of equations given by (2.18) boils down to having $g_{i,j,1} \to (\zeta_0 + \theta_{i,j,1})$. This means in turn that $g_{i,j,2} \to (\zeta_0 \cdot \zeta_1 + \zeta_1 \theta_{i,j,1} + \theta_{i,j,2})$. In essence, the social connection matrices depend entirely on the exogenous $\theta_{i,j,t}$. This means that social connections are no longer endogenously determined. Furthermore, if we let $\zeta_0 = \zeta_1 = 0$, then we are back to the simple static model. In this case, if $\theta_{i,j,t}$ is such that $[I - \delta \cdot \Theta_t]$ is invertible, then $Y_t = [I - \delta \cdot \Theta_t]^{-1} \varepsilon_t$, i.e., the outcome $Y$ is determined by the standard weighted Bonacich centrality, where the weights are given by $\varepsilon_{i,t}$.

Now, let $\lambda = 1$, meaning that we are back to having $G$ be endogenously determined.
In this scenario, (2.18) reduces to the following simpler set of equations:

$$g_{i,j,1} - (\zeta_0 + \theta_{i,j,1})^2 \Big[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d \big( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + p_{i,j,2} \big) \zeta_1 Y_{j,2}^2 \Big] = 0$$

where again we let $p_{i,j,2} = P(\theta_{i,j,2} = 1 \mid \epsilon)$. This means that $g_{i,j,1}$ is given by:

$$g_{i,j,1} = (\zeta_0 + \theta_{i,j,1})^2 \Big[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d \big( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + p_{i,j,2} \big) \zeta_1 Y_{j,2}^2 \Big] \tag{2.20}$$

$$\Rightarrow \; g_{i,j,1} = \frac{(\zeta_0 + \theta_{i,j,1})^2 \Big[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d\, \theta_{i,j,1} \zeta_1^2 Y_{j,2}^2 + 2\alpha\delta^2 \beta_d\, p_{i,j,2}\, \zeta_1 Y_{j,2}^2 \Big]}{1 - 2\alpha\delta^2 \beta_d (\zeta_0 + \theta_{i,j,1})^2 \zeta_1^2 Y_{j,2}^2} \tag{2.21}$$

This closed-form solution for $g_{i,j,1}$ avoids the computational problem of solving for $g_{i,j,1}$ through a nonlinear system of equations. Again, in the DNCE, the level of $Y_{i,t}$ solves the following:

$$Y_{i,1} - \delta \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) - \varepsilon_{i,1} = 0 \tag{2.22}$$

$$Y_{i,2} - \alpha\delta^2 \sum_{l \in N} \Big( \big( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \big) \cdot Y_{l,2} \Big)^2 - \varepsilon_{i,2} = 0 \tag{2.23}$$

Combining (2.21) with (2.23) and (2.22) yields a system of $2N$ equations and $2N$ variables. This is one of the main advantages of setting $\lambda = 1$, i.e., we avoid the curse of dimensionality of dealing with the full system of $2N \times N$ equations. Therefore, for the rest of this paper, we assume that $\lambda = 1$ to avoid the added computational strain of recovering $\lambda$ empirically.

2.3 Estimation Method

2.3.1 Model Specification

We assume that we observe two periods of data ($t \in \{1, 2\}$), with $n$ agents $i$, and their level of outcome $Y_{i,t}$. Additionally, we assume that we can observe some set of characteristics for each agent, $\{X_{i,1,t}, \dots, X_{i,K,t}\}$. While we observe neither the underlying endogenous network $G_t$ nor the "social compatibility" measure $\theta_{i,j,t}$, we assume that $\theta_{i,j,t}$ follows some given functional form based on the agents' characteristics and, in some cases, an observable exogenous network, $H_{i,j,t}$, that might be relevant to $\theta_{i,j,t}$.

To bring our model to data, we let $\varepsilon_{i,t} = X_{i,t}\beta + \epsilon_{i,t}$, where $\epsilon_{i,t}$ is some unobservable error term and $\beta$ is a $K \times 1$ dimensional vector.
Then, the solution to our model assuming $\lambda = 1$ is given by:

$$Y_{i,1} = \delta \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) + X_{i,1}\beta + \epsilon_{i,1} \tag{2.24}$$

$$Y_{i,2} = \alpha\delta^2 \sum_{l \in N} \Big( \big( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \big) Y_{l,2} \Big)^2 + X_{i,2}\beta + \epsilon_{i,2} \tag{2.25}$$

$$g_{i,j,1} = (\zeta_0 + \theta_{i,j,1})^2 \Big[ \alpha\delta Y_{j,1} + 2\alpha\delta^2 \beta_d \big( \zeta_1 (g_{i,j,1} + \theta_{i,j,1}) + p_{i,j,2} \big) \zeta_1 Y_{j,2}^2 \Big] \tag{2.26}$$

where $\delta = \rho \left((1-\alpha)\rho / c\right)^{\frac{1-\alpha}{\alpha}}$. Note again that setting $\zeta_1 = 0$ is equivalent to having the non-dynamic version of the model, while setting $\rho = 0$ is equivalent to the simple linear model with no network effect. Therefore, in our context, we are essentially interested in seeing whether $\zeta_1$ and $\rho$ are different from 0.

Finally, we model $\theta_{i,j,t}$ as a random realization of the following logistic function:

$$\Pr(\theta_{i,j,t} = 1 \mid Z, H) = \frac{\exp\left( \iota + \gamma H_{i,j,t} + \sum_{l} m(z^l_{i,t}, z^l_{j,t}) \psi_l \right)}{1 + \exp\left( \iota + \gamma H_{i,j,t} + \sum_{l} m(z^l_{i,t}, z^l_{j,t}) \psi_l \right)} \tag{2.27}$$

where $z^l_{i,t}$ is some relevant observable characteristic of $i$ and $m(\cdot, \cdot)$ is some distance function. Throughout this paper, we assume that $m(z^l_i, z^l_j) = |z^l_i - z^l_j|$.

2.3.2 Approximate Bayesian Computation (ABC)

To estimate our model, we apply a variation of the Approximate Bayesian Computation (ABC) method. The goal, as with any Bayesian approach, is to recover the distribution of some parameters, $\omega$, based on the observed data, i.e., $P(\omega \mid Y)$, commonly referred to as the posterior. To achieve this, we rely on two core components: the likelihood function of the data given $\omega$, $P(Y \mid \omega)$, and our prior belief on the distribution of $\omega$, $\pi(\omega)$. In our context, a closed form for the likelihood function $P(Y \mid \omega)$ is unavailable. The ABC approach avoids this limitation by relying on simulating the data instead. The original ABC method proposed by Marjoram et al. (2003) offered a variation on the original Metropolis-Hastings algorithm to recover the posterior distribution $P(\omega \mid Y)$:

A1. Draw $\omega'$ based on some transition kernel $q(\omega \to \omega')$.
A2. Draw $Y'$ by simulating the model with parameters $\omega'$.
A3. Compute some distance, $d(Y', Y)$, between $Y'$ and $Y$.
If $d(Y', Y) < \nu$, where $\nu$ is some tolerance, move to the next step; else repeat.
A4. Compute $h = \min\left( 1, \frac{\pi(\omega')\, q(\omega' \to \omega)}{\pi(\omega)\, q(\omega \to \omega')} \right)$.
A5. Move to $\omega'$ with probability $h$, and stay at $\omega$ with probability $1 - h$; go back to the first step.

The result of this algorithm is a Markov Chain with a stationary distribution equal to $P(\omega \mid d(Y', Y) < \nu)$. In the limit, as $\nu$ goes to 0, $P(\omega \mid d(Y', Y) < \nu)$ should converge to $P(\omega \mid Y)$ under some regularity conditions. The choice of the distance function $d(Y', Y)$ depends on the context of the problem. If a set of sufficient statistics is available to the researcher, then reducing the distance between those statistics is the natural choice. Usually, a low-dimensional sufficient statistic is not available. Instead, researchers have to pick a set of summary statistics that minimizes the loss of information. Blum et al. (2013) provide a review of the literature on the methods available to choose the set of summary statistics.

While this method is technically available to estimate our model, in our context, simulating $Y'$ is particularly challenging. In fact, it would involve solving our set of nonlinear equations coming from (2.21), (2.23), and (2.22) at every iteration (equations (2.18), (2.16), and (2.17) when $\lambda \neq 1$). Instead, we rely on the following set of equations:

$$Y_{i,1} - \delta \cdot \sum_{l \in N} \left( g_{i,l,1} \cdot Y_{l,1} \right) - \varepsilon_{i,1} = H_i(\omega; Y) \tag{2.28}$$

$$Y_{i,2} - \alpha\delta^2 \cdot \sum_{l \in N} \Big( \big( \zeta_1 (g_{i,l,1} + \theta_{i,l,1}) + \theta_{i,l,2} \big) \cdot Y_{l,2} \Big)^2 - \varepsilon_{i,2} = L_i(\omega; Y) \tag{2.29}$$

where $g_{i,l,1}$ is given by (2.20). If $\omega$ is the true set of parameters, and $Y_{i,1}$ and $Y_{i,2}$ are the empirical values, then $H_i(\omega; Y) = 0$ and $L_i(\omega; Y) = 0$ for all $i$. If another set of parameters $\omega'$ would generate a vector $Y'$ such that $Y = Y'$, then $H_i(\omega'; Y) = 0$ and $L_i(\omega'; Y) = 0$ for all $i$. This means that we can circumvent simulating a new vector $Y'$ at every iteration by simply evaluating how far $H_i(\omega'; Y)$ and $L_i(\omega'; Y)$ are from 0. Let $\lambda(\omega, Y)$ be the vector resulting from stacking $L_i(\omega; Y)$ and $H_i(\omega; Y)$ for all $i$.
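The stacked residual vector just defined can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the estimation code used in this chapter: it takes $\lambda = 1$, uses the closed form (2.21) for the period-1 links, writes the period-2 term as in (2.25), assumes a zero diagonal for all link matrices, and treats $\theta$ and $p_{i,j,2}$ as given inputs. All function and variable names are hypothetical.

```python
import numpy as np

def links_period1(Y1, Y2, theta1, p2, alpha, delta, beta_d, zeta0, zeta1):
    """Closed-form g_{i,j,1} from (2.21) with lambda = 1; entry [i, j] uses Y_j."""
    k = 2.0 * alpha * delta**2 * beta_d
    base = (zeta0 + theta1) ** 2
    Y1_j = Y1[np.newaxis, :]            # broadcast Y_{j,1} along rows
    Y2_sq = (Y2 ** 2)[np.newaxis, :]    # broadcast Y_{j,2}^2 along rows
    num = base * (alpha * delta * Y1_j
                  + k * theta1 * zeta1**2 * Y2_sq
                  + k * p2 * zeta1 * Y2_sq)
    den = 1.0 - k * base * zeta1**2 * Y2_sq
    G1 = num / den
    np.fill_diagonal(G1, 0.0)           # g_{i,i,1} = 0
    return G1

def stacked_residuals(Y1, Y2, eps1, eps2, theta1, theta2, p2,
                      alpha, delta, beta_d, zeta0, zeta1):
    """Stack H_i (period 1) and L_i (period 2) residuals into one vector."""
    G1 = links_period1(Y1, Y2, theta1, p2, alpha, delta, beta_d, zeta0, zeta1)
    H = Y1 - delta * (G1 @ Y1) - eps1
    W = zeta1 * (G1 + theta1) + theta2  # period-2 link weight, as in (2.25)
    np.fill_diagonal(W, 0.0)
    L = Y2 - alpha * delta**2 * ((W ** 2) @ (Y2 ** 2)) - eps2
    return np.concatenate([H, L])

# Illustrative (assumed) data for five agents.
rng = np.random.default_rng(0)
n = 5
upper1 = np.triu(rng.integers(0, 2, size=(n, n)), 1).astype(float)
upper2 = np.triu(rng.integers(0, 2, size=(n, n)), 1).astype(float)
theta1, theta2 = upper1 + upper1.T, upper2 + upper2.T   # symmetric, zero diagonal
p2 = np.full((n, n), 0.5)
Y1, Y2 = rng.uniform(0.1, 0.5, n), rng.uniform(0.1, 0.5, n)
true = dict(alpha=0.5, delta=0.2, beta_d=0.9, zeta0=0.1, zeta1=0.3)

# Back out the idiosyncratic terms implied by the "true" parameters ...
implied = stacked_residuals(Y1, Y2, 0.0, 0.0, theta1, theta2, p2, **true)
eps1, eps2 = implied[:n], implied[n:]

# ... so the residual norm is zero at the truth and positive elsewhere.
r_true = stacked_residuals(Y1, Y2, eps1, eps2, theta1, theta2, p2, **true)
r_wrong = stacked_residuals(Y1, Y2, eps1, eps2, theta1, theta2, p2,
                            **{**true, "delta": 0.4})
```

The norm of the returned vector is what would be compared with the tolerance $\nu$; no equilibrium has to be solved to evaluate it.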
Then, we can modify the previous algorithm in the following way:

B1. Draw $\omega'$ based on some transition kernel $q(\omega \to \omega')$.
B2. Compute the norm of $\lambda(\omega', Y)$, $\|\lambda(\omega', Y)\|$. If $\|\lambda(\omega', Y)\| < \nu$, where $\nu$ is some tolerance, move to the next step; else repeat.
B3. Compute $h = \min\left( 1, \frac{\pi(\omega')\, q(\omega' \to \omega)}{\pi(\omega)\, q(\omega \to \omega')} \right)$.
B4. Move to $\omega'$ with probability $h$, and stay at $\omega$ with probability $1 - h$; go back to the first step.

The result of this new algorithm is a Markov Chain with a stationary distribution equal to $P(\omega \mid \|\lambda(\omega', Y)\| < \nu)$ under some regularity conditions (see Battaglini, Patacchini and Rainone (2022) for a detailed proof). In the limit, as $\nu$ goes to 0, $P(\omega \mid \|\lambda(\omega', Y)\| < \nu)$ converges to $P(\omega \mid Y)$. The logic behind this result is that $\|\lambda(\omega', Y)\| = 0$ implies that $L_i(\omega'; Y) = 0$ and $H_i(\omega'; Y) = 0$ for all $i$. In turn, by definition, for $\omega'$, the equilibrium outcome $Y'$ is characterized by $L_i(\omega'; Y') = 0$ and $H_i(\omega'; Y') = 0$ for all $i$. This means that requiring $Y' = Y$ is equivalent to having $\|\lambda(\omega', Y)\| = 0$, implying that $P(\omega \mid Y) = P(\omega \mid \|\lambda(\omega', Y)\| = 0)$.

2.4 Simulations

We use Monte Carlo simulations to assess how well our model performs. The goal is to show that the estimation method described in the previous section can recover the structural parameters of the model.

For simplicity, in the remainder of this paper, we set $\zeta(g_{i,j,0}) = 0$ (this is the cost in the first stage, when we have no preceding network, i.e., $g_{i,j,0}$); and $\zeta(g_{i,j,1}, \theta_{i,j,1}) = \zeta_1 \cdot (g_{i,j,1} + \theta_{i,j,1})$ for $t \geq 1$. We also set $c = 1$, $\alpha = \frac{1}{2}$, $\beta_d = 0.9$, and $\lambda = 1$. This allows us to focus on the two core parameters of interest of our model: the network effect, $\rho$, and the dynamic effect, $\zeta_1$.

Now, to evaluate our method, we need to generate data to test it on. To do so, we assume that we have two periods ($t \in \{1, 2\}$), with $n = 400$ agents $i$. Additionally, we assume that we can observe some set of characteristics for each agent, $\{X_{i,1,t}, \dots, X_{i,K,t}\}$, and an observable exogenous network, $H_{i,j,t}$, that is relevant to $\theta_{i,j,t}$.
In this section, we generate $H_{i,j,t}$ by assuming it follows the Erdős-Rényi model, i.e., $H_{i,j,t} = 1$ with some probability $p$, and $H_{i,j,t} = 0$ otherwise.

Finally, we can set our structural parameters $\omega = (\rho, \zeta_1, \sigma_\epsilon, \iota, \gamma, \psi, \beta)$ to some given value, generate a level of outcome $Y$ by solving for the equilibrium outcome given $\omega$, $X$ and $H$, and then estimate the parameters given that data. Appendix B.2 provides the full details on how the data is generated.

The only thing missing to run our estimation is a prior for $\omega = (\rho, \zeta_1, \sigma_\epsilon, \iota, \gamma, \psi, \beta)$. While we can use uninformative priors for all parameters, i.e., uniform distributions over all possible values, it is a good idea to restrict some of them to something more sensible. For instance, we let the priors for $\beta$ be normal distributions with mean and variance equal to the estimated mean and variance from running an OLS regression on our outcome variable without any network effects.

Figure 2.1 shows the results of this simulation. The line represents the posterior distribution of our parameters, the dashed line represents the priors, the dotted line gives the median of the posterior distributions, and the dark line shows the true values of $\omega$.

Figure 2.1: Estimated Posterior Distributions for the Simulated Example - Key Variables - The two parameters of interest are $\rho$ (network effect) and $\zeta_1$ (dynamic effect).

Clearly, from Figure 2.1, we are updating our priors for most of our parameters, especially $\rho$. Not only that, but our posterior distributions for both $\rho$ and $\zeta_1$ converge to their respective true values. This evidence supports that our method can recover the structural parameters of our model accurately.

Sensitivity Analysis

To get a better understanding of our method, we explore its performance by varying the number of agents $n$. To do so, we simulate our model following the procedure described in Appendix B.2, but setting $n$ to different values, i.e., $n = 100, 200, 300, 400$.
Figure 2.2 shows the difference between the true and the estimated parameters.

Figure 2.2: Estimated Bias by Number of Agents n

It is interesting to note three things from Figure 2.2. First, as $n$ increases, the dispersion around the true value decreases, which suggests that our model works better in a setting with a larger set of players, or at least that we can recover the true parameters with more precision. Second, we can recover $\rho$ well at any level of $n$. This is in line with Battaglini, Patacchini and Rainone (2022), who find that having $n = 150$ in their particular setting is sufficient to recover $\rho$ with precision. Finally, for $\zeta_1$, Figure 2.2 shows that having a large $n$ is crucial. At $n = 100$, while the box plot contains 0, the spread is considerable. This points to the fact that our methodology might be limited in cases where $n$ is too small.

2.5 Empirical Evidence

In this section, we present three empirical settings that showcase the value of our approach.

2.5.1 Legislative Effectiveness in the U.S. Congress

The first application studies the importance of social connections for U.S. legislators' productivity, allowing for dynamic network formation.

Data

Following the approach of Battaglini, Patacchini and Rainone (2022), we measure the productivity of members of Congress using the Legislative Effectiveness Scores (LESs) developed by Volden and Wiseman (2014). The score is based on how many bills a legislator introduces and how far these bills get on the floor (referred to committee, receive action on the floor, passed, and so on). The Legislative Effectiveness Project (http://www.thelawmakers.org) provides this data for the 93rd-110th Congress directly on their website. In this paper, we focus on two election cycles: the 111th Congress (election cycle 2008) and the 112th Congress (election cycle 2010).
One advantage of choosing the 111th Congress and the 112th Congress for our context is that they are only separated by a midterm election during Obama's first term as President of the United States of America. This results in a large overlap between the House representatives present in both Congresses, i.e., those reelected in 2010.

As for the set of controls $X_{i,t}$, we include the party, gender, race, the number of years spent in Congress and its squared term, D-W ideology, margin of victory and its squared term, age, state, majority and minority party leadership, and previous legislative experience. Additionally, we include the main area of policy interest, following the approach of Battaglini, Patacchini and Rainone (2022).³

To model the underlying cost $\theta_{i,j,t}$ for the network formation, we include a subset of our controls $X_{i,t}$: the number of years spent in Congress, age, state, most recurrent policy subtopic, majority or minority party leadership, gender, race, and party. In addition, we include the network of alumni connections between legislators, $H_t$. We construct this network using information on the educational background of legislators from the Biographical Directory of the United States Congress.⁴ We set $H_{i,j} = 1$ if $i$ and $j$ graduated from the same institution within four years of each other, and 0 otherwise. Battaglini and Patacchini (2018) provide more details on how to construct this network.

To calibrate our model, we also make use of the co-sponsorship network.⁵ Note that the co-sponsorship network, unlike $H_{i,j}$, cannot be used directly as a component of $\theta_{i,j,t}$, since sponsorship decisions are endogenous to the legislative activity. In fact, the LES is even built using co-sponsorship data.

Table B.1 provides summary statistics for the different characteristics over the 111th Congress and the 112th Congress.

³ For each Congress member, Battaglini, Patacchini and Rainone (2022) identify the main policy interest in the following way.
Using the data provided by the Congressional Bills Project (http://congressionalbills.org), which categorizes the bills using the policy topic coding system provided by the Policy Agendas Project (PAP) (www.comparativeagendas.net/us), for each Congress member $i$, they count the bills where the Congress member $i$ was an original sponsor or cosponsor in each policy subtopic, and identify her/his most recurrent policy subtopic.

⁴ (http://bioguide.Congress.gov/biosearch/biosearch.asp)

⁵ The co-sponsorship network, $C_{i,j,t}$, is built by setting $C_{i,j,t} = 1$ if more than 2% of the bills sponsored by legislator $i$ were also co-sponsored by $j$, and $C_{i,j,t} = 0$ otherwise.

Empirical results

To estimate the importance of the dynamic network, we use the methodology described in Section 2.3. One caveat is that for the Bayesian estimation, we need to specify the priors of our parameters. While we use uninformative priors for most parameters, we calibrate the priors of the parameters of the controls in the data generating process of the LES, $\beta$, and in the network formation of $\theta_{i,j,t}$, i.e., $\iota$, $\gamma$, and $\psi$. For the former, we simply let the priors of $\beta$ follow normal distributions, where we calibrate the mean and variance by first running an OLS regression of $Y_{i,t}$ on the set of controls $X_{i,t}$. For the latter, we essentially do the same thing, but instead run a logit regression with the co-sponsorship network as our dependent variable.⁶

Table 2.1 and Table 2.2 report the median values of the estimated posterior distributions. In brackets, we report the p-value of rejecting 0. Table 2.1 concerns only the network formation parameters, $\iota$, $\gamma$, and $\psi$. Table 2.2 reports the main parameters of interest, $\rho$ and $\zeta$, and the control parameters, $\beta$.
⁶ The priors are set to the following values:
$\rho \sim U(0, 1)$
$\beta_d = 0.9$
$\zeta_1 \sim U(0, 1)$
$\sigma_\epsilon \sim U(0, 1)$
$\beta \sim N(\hat{\beta}_{ols}, \Sigma_{ols})$
$\iota, \gamma, \psi \sim N((\hat{\iota}_{logit}, \hat{\gamma}_{logit}, \hat{\psi}_{logit}), \Sigma_{logit})$
where $\hat{\beta}_{ols}, \Sigma_{ols}$ are obtained by running an OLS regression of $Y$ on the legislators' characteristics, and $(\hat{\iota}_{logit}, \hat{\gamma}_{logit}, \hat{\psi}_{logit}), \Sigma_{logit}$ are obtained by running a logit regression with the co-sponsorship network as the dependent variable.

Table 2.1: Network Formation for the Legislature Example

Probability that θ_{i,j,t} = 1

Variable                                         Estimate     p-value
Link in alumni network                           0.063        (0.236)
Seniority [1 = same quartile]                    0.130***     (0.000)
Seniority i                                      0.046***     (0.000)
Seniority j                                      0.232***     (0.000)
Same state [1 = yes]                             0.041***     (0.000)
Same topic [1 = yes]                             0.122        (0.904)
Leader [1 = both leaders]                        −0.019***    (0.000)
Same gender [1 = yes]                            0.088***     (0.000)
Same race [1 = both white or both non white]     0.088***     (0.000)
Same party [1 = yes]                             0.119        (0.476)
Age [1 = same quartile]                          −0.013***    (0.000)
(Intercept)                                      0.077***     (0.000)
Observations                                     777,924

Note: Estimates of the parameters in equation (2.27). The medians of the posterior distributions estimated with the ABC algorithm are reported. The empirical p-values for the null hypothesis of zero are reported in brackets. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the empirical p-value.
Table 2.2: Estimation Results for the Legislature Example

Dependent variable: LES

Variable                                                             Estimate      p-value
ρ                                                                    0.242***      (0.000)
ζ                                                                    0.398***      (0.000)
Party                                                                −0.103***     (0.000)
Gender                                                               0.051***      (0.000)
Non white                                                            −0.090***     (0.000)
Seniority                                                            −0.042***     (0.000)
Seniority²                                                           0.004***      (0.000)
DW ideology                                                          −0.080***     (0.000)
Margin                                                               0.004***      (0.000)
Margin²                                                              0.00002**     (0.012)
Committee chair                                                      3.787***      (0.000)
Delegation size                                                      −0.119***     (0.000)
Leader                                                               0.222***      (0.000)
State legislative experience                                         −0.070        (0.352)
State legislative experience * State legislative professionalism     0.474***      (0.000)
Age                                                                  −0.002***     (0.008)
(Intercept)                                                          1.705***      (0.000)
State fixed effects                                                  Yes
Congress fixed effects                                               No
Major topic fixed effects                                            Yes
Observations                                                         882

Note: Estimates of the parameters in equations (2.24), (2.25), and (2.26). The medians of the posterior distributions estimated with the ABC algorithm are reported. The empirical p-values for the null hypothesis of zero are reported in brackets. *, **, and *** indicate statistical significance at the 10, 5 and 1 percent levels, based on the empirical p-value.

In the appendix, Figure B.2 plots the posterior distributions of the model for the two parameters of interest, $\rho$ (network effect) and $\zeta$ (dynamic effect). Figure B.2 shows the posterior distributions for $\beta$, and Figure B.3 those for the network formation parameters $\iota$, $\gamma$, and $\psi$.

In Table 2.3, we compare the model proposed in this paper to simple OLS results ($\rho = 0$ and $\zeta = 0$) and to the model without dynamic effects ($\zeta = 0$).
Table 2.3: Estimation Results for the Legislature Example - Comparison of Nested Models -

Dependent variable: LES (p-values in brackets)

Variable                                             (1)                  (2)                    (3)
ρ                                                                         0.202*** (0.000)       0.242*** (0.000)
ζ                                                                                                0.398*** (0.000)
Party                                                −0.113 (0.716)       −0.122*** (0.000)      −0.103*** (0.000)
Gender                                               0.047 (0.704)        0.034*** (0.000)       0.051*** (0.000)
Non white                                            −0.075 (0.656)       −0.083*** (0.000)      −0.090*** (0.000)
Seniority                                            −0.039 (0.174)       −0.038*** (0.000)      −0.042*** (0.000)
Seniority²                                           0.004*** (0.006)     0.004*** (0.000)       0.004*** (0.000)
DW ideology                                          −0.070 (0.822)       −0.099*** (0.000)      −0.080*** (0.000)
Margin                                               0.004 (0.908)        0.008*** (0.000)       0.004*** (0.000)
Margin²                                              0.00001 (0.960)      −0.00003** (0.012)     0.00002** (0.012)
Committee chair                                      3.813*** (0.000)     3.820*** (0.000)       3.787*** (0.000)
Delegation size                                      −0.120 (0.313)       −0.121*** (0.000)      −0.119*** (0.000)
Leader                                               0.230 (0.233)        0.241*** (0.000)       0.222*** (0.000)
State legislative experience                         −0.037 (0.832)       −0.023 (0.352)         −0.070 (0.352)
State legislative experience *
  State legislative professionalism                  0.402 (0.414)        0.294*** (0.000)       0.474*** (0.000)
Age                                                  −0.001 (0.793)       −0.001*** (0.008)      −0.002*** (0.008)
(Intercept)                                          1.785 (0.219)        1.510*** (0.000)       1.705*** (0.000)
State fixed effects                                  Yes                  Yes                    Yes
Congress fixed effects                               No                   No                     No
Major topic fixed effects                            Yes                  Yes                    Yes
Partial F test                                                            671.72                 4.99
p-value                                                                   0.000                  0.026
MSE                                                  2.201                1.188                  1.18