UTILIZING GENOMICS TOOLS TO INFORM TOMATO DISEASE MANAGEMENT

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Martha Amali Sudermann
August 2021

© 2021 Martha Amali Sudermann

UTILIZING GENOMICS TOOLS TO INFORM TOMATO DISEASE MANAGEMENT
Martha Amali Sudermann, Ph.D.
Cornell University 2021

Tomatoes are a high-value crop, but many diseases threaten optimal production and yield. In this dissertation, three fungal and oomycete pathogens of tomato were examined using different genomic tools. In Chapter 1, the population structure and genetic diversity of the tomato leaf mold pathogen Passalora fulva were examined. High tunnel production of tomatoes is an important method of season extension, and studies on P. fulva are necessary because the disease is appearing more frequently. Races 0 and 2 were found to be the predominant races among isolates collected in the Northeast and Minnesota. A genotyping-by-sequencing (GBS) analysis found regional differentiation between the isolates collected in the Northeast and those collected in Minnesota. In Chapter 2, fungi from the genus Cladosporium were characterized. In early attempts to isolate P. fulva from leaves showing tomato leaf mold symptoms, more rapidly growing Cladosporium fungi were identified, yet they were not found to be pathogens of tomato in two high tunnel experiments. Phylogenetic, morphological, and GBS analyses offer additional characterization of the fungi. In Chapter 3, effector diversity within the US-23 clonal lineage of Phytophthora infestans, a devastating pathogen of tomato and potato that causes late blight, was examined. Previous work has demonstrated that some diversity exists within the US-23 clonal lineage. The PenSeq target enrichment sequencing tool enabled an examination of genes that are under selection.
In the study of 12 isolates from the US-23 clonal lineage, variation in effector complement between isolates was identified. This is the first study focused on understanding differences in effectors within the US-23 clonal lineage, and it may lead to a better understanding of the patterns of adaptive evolution that may exist within effectors of isolates belonging to a single clonal lineage. Finally, in Chapter 4, early blight quantitative resistance in tomato was examined. Eleven tomato accessions with early blight resistance were sequenced to define cryptic introgressions underlying disease resistance in modern tomato (Solanum lycopersicum). Two quantitative trait loci (QTL) were fine-mapped on chromosomes 5 (EB-5) and 9 (EB-9), and predictions were made about genes within the QTL bounds. The 11 sequenced tomato accessions were combined with 764 previously sequenced accessions to predict EB-9 resistance in several heirloom tomatoes, as well as in accessions of S. lycopersicum var. cerasiforme and S. pimpinellifolium. Results were experimentally validated with mist chamber experiments.

BIOGRAPHICAL SKETCH

Martha Amali Sudermann grew up at the intersection of prairie, oak savanna, and farmland in Northfield, Minnesota, on the home of the Wahpekute. Martha graduated from Northfield High School. After a year in rural Sweden as a Rotary Exchange Student, Martha attended St. Olaf College, double majoring in biology and religion, and minoring in mathematical biology. Martha spent a year in Lutheran Volunteer Corps working at the Urban Ecology Center in Milwaukee, Wisconsin, before pursuing graduate studies in vegetable pathology at Cornell University in the laboratory of Dr. Christine Smart.

To my family: Mary, David, Hannelore, Rachel, and Amirhossein

ACKNOWLEDGMENTS

I want to thank colleagues, mentors, family, and friends for helping me to succeed.
Thank you to the teachers and professors who gave me confidence and created spaces where I felt a sense of belonging. Thanks especially to my middle school math teacher Mr. Meierbachtol, who made me realize that I could work through challenging problems; my AP Biology teacher Jody Saxton-West, whose excellent teaching and excitement were contagious; and many college professors, including my mathematical biology professor Rebecca Sanft, Lisa Bowers in microbiology, and many wonderful faculty mentors in the religion department, including my advisor, John Barbour. I feel deep gratitude for the education I received, and for the interdisciplinary thinking that was expected and encouraged at St. Olaf College. Thank you to Dr. Lindsey du Toit at Washington State University for my first vegetable pathology experience during the summer of 2012. I knew little about plant pathology, but Lindsey took time to teach me. I owe a debt of gratitude to my advisor Dr. Chris Smart. She first took me on as a summer scholar in 2014, and this opportunity opened many doors for me. I am grateful for Chris’ mentorship. I received feedback and advice when I needed it, and I was also given the creative freedom to learn new skills and to improve my teaching. The pandemic threw a few curveballs, and I am grateful to have felt supported. I am also grateful for the mentorship of Smart Lab alumni Dr. Zach Hansen and Rachel Kreis, who taught me a diverse set of skills as an undergraduate researcher and in the early days of my graduate work. Thanks also to colleagues outside the lab, including Dr. Sandeep Sharma for showing me how to transform filamentous fungi. I also owe a debt of gratitude to Dr. Taylor Anderson for inviting me to collaborate on the comparative genomics early blight project. I learned a great deal about plant breeding, genomics, and genetics through this collaboration. During the throes of the pandemic, having a collaborative project kept me going on the days that felt most isolating.
I am grateful for the expertise and insights of my committee members Dr. Stephen Reiners and Dr. Gillian Turgeon, as well as the expertise of Dr. Magdalen Lindeberg. Committee meetings were something I looked forward to every year, as I reflected on my research projects and future goals. Thanks to my lab: Holly Lange, Colin Day, Gregory Vogel, Chris Peritore, Hirut Bethaw, Chase Crowell, Ali Cala, Juan Luis Girón, Patrick McMullen, and Libby Indermaur. One of the highlights of graduate school was getting to know lab mates during chats across the lab bench. Thanks to each of them for the interesting conversations and for the new research ideas that arose from these chats. Thanks also to my summer scholars Melissa Regnier and Alejandra Rodríguez-Jaramillo for their dedication and excitement in the early years of the tomato leaf mold projects. Even when research was challenging, the excitement of outreach in schools gave me a lot of hope. Thanks to Anna Wallis, Katrin Ayer, and Lori Koenick for participating in all the Microbe Mania GRASSHOPR classes. This was truly a highlight! I am grateful for the friends who helped me get through graduate school. Many thanks to Lori Koenick and Jenny Wilson for their friendship throughout graduate school. Thanks to Jack Satterlee for bridge and game nights, and a shared connection to Minnesota. Thanks to Najva Akbari and Eric Schuppe for their hospitality, meals, Persian treats, and many adventures. Their friendship helped me to maintain my sanity before and during the pandemic. Thanks to Zoe Dubrow for being an excellent roommate. I cherish the late evening chats and taste testing marvelous baked goods. Thanks to Katrin Ayer for much-needed walks, and for being a positive presence. It was a comfort to have a friend who had a similar graduate school timeline, including A-exams on the same day. Thanks also to Gregory Vogel for research and data analysis advice, and for being a great lab bench neighbor.
Thanks to my neighbors on Second Street in Ithaca for providing a sense of community when I needed it. I think fondly of all the porch visits and art that I received, the meals that were shared, and the chorus of laughter and play that was the soundtrack of my daily life. I thank my parents for their support, and for putting up with my calls when I needed some perspective. I hope we will not have to spend so much time away from one another again, and that we can make up for lost time. Thank you to my sisters for their encouragement. Thanks to my sister Lori and brother-in-law Stephen for hosting me in Washington during my first plant pathology experience. And finally, thanks to Amirhossein Tajdini for being my steady companion. I am thankful for four years of Persian dinners, encouragement, and the embarrassingly large number of times we frequented Sweet Melissa’s Ice Cream Shop.

This work was supported in part by US Department of Agriculture National Institute of Food and Agriculture Grant 2011-68004-3015, USDA SARE graduate student award GNE19-223, New York State Specialty Crops Block Grant number 15-015, and the Schmittau-Novak Small Grants program through Cornell’s School of Integrative Plant Science. Support was also provided by extension and outreach assistantships from Cornell University’s College of Agriculture and Life Sciences, and travel awards were received from the Cornell Graduate School and the American Phytopathological Society Foundation.
TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Introduction
Chapter 1: Towards a greater understanding of the population diversity of Passalora fulva in US high tunnels
Chapter 2: Is there an association between the Cladosporium species complex and the tomato leaf mold pathogen Passalora fulva?
Chapter 3: Utilizing target enrichment sequencing to understand effector diversity within the US-23 clonal lineage of Phytophthora infestans
Chapter 4: Whole-genome introgression detection and haplotype analysis reveals early blight resistance in modern tomato breeding lines trace to ‘Devon Surprise’ and ‘Hawaii 7998’
Conclusion
Appendix

INTRODUCTION

Tomato production in the United States

My dissertation focuses on the ways that genomics tools can be used to better understand pathogen populations, pathogen diversity, and how pathogen populations adapt to selection pressures, including those imposed by deployed plant resistance proteins. In addition, comparative genomics tools were used to examine quantitative resistance in tomato. For my dissertation research, I focused on the tomato leaf mold pathogen Passalora fulva, the late blight pathogen Phytophthora infestans, and the early blight pathogen Alternaria linariae. I sought to understand the genetic diversity and race structure of the fungal pathogen Passalora fulva (syn. Cladosporium fulvum). I also examined effector diversity within the US-23 clonal lineage of P. infestans. Finally, I utilized comparative genomics to better understand early blight quantitative resistance.

Tomatoes (Solanum lycopersicum L.) are a high-value crop, and management of diseases is a pressing concern. According to the 2019 Vegetables Summary from the USDA, 2.38 million acres of vegetable and melon crops were grown. California plants and harvests the largest area of tomatoes. Within the Northeast, only New Jersey, which ranked 5th, was among the top 10 states for area of tomatoes planted and harvested. Based on the 2017 Census of Agriculture, New York ranked 12th, Pennsylvania ranked 18th, and New Jersey ranked 16th for market value of agricultural products sold for vegetables, melons, potatoes, and sweet potatoes. Tomatoes, onions, and sweet corn were at the top of the list for total production.
In 2019, tomatoes were valued at over 1.60 billion dollars, with 282,000 acres planted (2017 Census of Agriculture). Tomatoes are grown in fields, greenhouses, and high tunnels; the protected environments offer season extension in colder climates. According to the 2017 Census of Agriculture, 7,974 farms in the United States grew tomatoes in protected environments. Compared to the 2007 Census of Agriculture, the number of farms growing tomatoes in protected environments has nearly tripled in the United States, with a farm gate value of over 418 million dollars (2017 Census of Agriculture 2019). Demand remains high for both fresh market and processing tomatoes.

Tomatoes are a high-value crop, but bacterial, viral, fungal, oomycete, and nematode diseases all hinder production (Adhikari et al. 2017). More than 200 diseases threaten tomatoes, so disease management is crucial (Foolad et al. 2008). The small-fruited wild progenitor of tomato is S. pimpinellifolium. After domestication took place in South America, cherry tomatoes emerged (S. lycopersicum var. cerasiforme), and later big-fruited tomatoes (S. lycopersicum var. lycopersicum) (Wang et al. 2020). Throughout the domestication process, as particular traits were selected for and inbreeding occurred, there was an overall loss of genetic diversity. Therefore, breeders must introgress genes or quantitative trait loci (QTL) that are associated with disease resistance into tomatoes with preferred horticultural characteristics (Adhikari et al. 2017). Traditional breeding methods require significant investments in time and resources. Genomics tools, paired with marker-assisted breeding and genetic engineering, can increase the speed at which disease-resistant cultivars are introduced.

What tomato pathogens must growers and gardeners in the Northeast contend with?
Considering the tomato pathogens that I worked with alongside other tomato pathogens helps to put my research and experience into context. Management of a particular disease cannot be understood in isolation. Many tomato diseases continue to challenge growers in the Northeast. Fungal diseases of great concern for their potential economic impact include Septoria leaf spot caused by Septoria lycopersici, early blight caused by A. linariae, and Botrytis gray mold caused by Botrytis cinerea (Jones et al. 2014). Tomato leaf mold caused by P. fulva is also a concern to growers. Gray mold and tomato leaf mold pose greater threats in greenhouses and high tunnels, where humidity remains high. Powdery mildew can also be a concern in both field and controlled environments. Diseases of concern caused by oomycetes include late blight caused by P. infestans. Care must also be taken to manage bacterial speck caused by Pseudomonas syringae pv. tomato, bacterial spot caused by several species within the genus Xanthomonas, and bacterial canker caused by Clavibacter michiganensis. Many of the pathogens will infect both the foliage and fruit, depending on disease progression. In contrast, P. fulva and S. lycopersici primarily infect the foliage (Jones et al. 2014).

Environmental preferences of tomato pathogens: Many tomato pathogens require wet conditions and adequate moisture to enable infection and large-scale spread. Water splash, including from overhead irrigation, can greatly accelerate the spread of bacterial pathogens, as well as many of the fungal pathogens. Exceptions include P. fulva, which requires high humidity but not wet conditions, and B. cinerea, for which moisture is not a requirement. Temperature also plays a role. Pathogens such as P. infestans, several of the bacterial pathogens, and B. cinerea may cause disease more readily at slightly lower temperatures, whereas Xanthomonas, A. linariae, S.
lycopersici, and P. fulva thrive in warmer temperatures (Jones et al. 2014; Fry et al. 2013; Elmer and Ferrandino 1995). Monitoring temperature, precipitation, and other weather conditions can help improve management, including more targeted chemical applications guided by decision support systems (Small et al. 2015). As growers expand high tunnel and greenhouse tomato production, diseases such as tomato leaf mold will continue to increase in prevalence unless growers plant resistant tomato cultivars. Diseases such as early blight and late blight are most often greater concerns in the field than in enclosed environments. This is due, in part, to the fact that the pathogens require greater leaf wetness for infection.

A comparison of pathogen reproductive strategies: Understanding how pathogen populations reproduce and spread also greatly assists in management. The fungal pathogens S. lycopersici, P. fulva, and A. linariae reproduce asexually and, when conditions are optimal, release conidia as the primary inoculum. Botrytis cinerea produces melanized sclerotia as survival structures, and these structures produce conidiophores, which release conidia as well. On rarer occasions, sexual reproduction is possible, resulting in the formation of apothecia and the release of ascospores (Agrios 2005; Williamson et al. 2007). The conidia produced by many of the fungi can overwinter on plant debris in fields or greenhouses that are not thoroughly sanitized. For most fungal pathogens of tomato, the conidia can be wind-borne and spread by water splash, and B. cinerea and S. lycopersici can also be seed-borne. The oomycete P. infestans produces asexual sporangia, but in locations such as Europe and Mexico, the pathogen can also produce oospores, which have a greater ability to overwinter. The pathogen is wind-borne and can also be spread by water splash.
The bacterial pathogens all require wet conditions; the bacteria often enter through wounds and can also be transmitted through seed. While overwintering is possible, it is not common in colder climates (Jones et al. 2014) as long as proper sanitation occurs.

With a diversity of different diseases of tomato, what are the best management strategies?

Remove plant debris and neighboring host plants: Many tomato pathogens can survive on tomato debris and solanaceous weed hosts. For example, B. cinerea makes sclerotia that act as survival structures, and its mycelia can survive in seed and debris. In addition, P. infestans can survive in tubers over the winter. Therefore, growers should remove crop debris and weedy hosts near fields. If cull piles are in the vicinity of fields, these should be removed as well. As a precaution, plant debris and volunteer plants should be plowed under after harvest, in preparation for the next season (Jones et al. 2014; Zitter 1987).

Maintain Hygienic Greenhouse Environment and Clean Machinery and Tools: Maintaining hygienic greenhouse environments and keeping machinery and tools clean are important precautions. For bacterial pathogens in particular, seedbed and potting materials should be pasteurized or sufficiently cleaned to prevent the spread of the pathogens (Jones et al. 2014). Ammonium compounds should be used to disinfect greenhouse surfaces, shoes, tools, and other greenhouse materials (Zitter 1985).

Obtain Clean and Certified Seed: Since infected seed can aid in long-distance movement of pathogens, growers should purchase clean, certified seed that is not harboring pathogens (Zitter 1984). Obtaining clean and certified seed is especially important for bacterial pathogens but can also help in the management of some fungal pathogens (Agrios 2005).
Obtain Resistant Cultivars: It may not be possible to choose varieties that are completely resistant to all the pathogens, but growers should use resistant varieties when they can.

Management Strategies During the Season: In both field and greenhouse conditions, tomatoes should be spaced to provide proper ventilation. Additional pruning may also be necessary. Wounds are a point of entry for bacteria, so care should also be taken not to wound plants (Zitter 1986). Irrigation should be adjusted so that plants do not become overly wet, and below-ground irrigation is recommended to avoid conditions favorable for disease (Jones et al. 2014). During the growing season, plastic or organic mulches will aid in managing disease by helping absorb excess water and keeping possible weed hosts controlled. Throughout the growing season, scouting should be an integral part of disease management. If disease of any kind is suspected, plants should be removed and disposed of. Furthermore, healthy plants that were in proximity to diseased plants should also be removed. Based on New York State Integrated Pest Management guidelines (https://nysipm.cornell.edu/agriculture/vegetables/vegetable-ipm-practices/), other important cultural practices include subsoiling to enhance drainage, growing on raised beds to keep roots out of standing water, reducing weed pressure, and irrigating with drip tape. If necessary, fungicide application will also help prevent devastating losses, but fungicide application alone should not be considered the solution. In addition to the environmental and monetary costs, pathogen populations can become insensitive to fungicides over time, rendering them ineffective. Use of fungicides should be part of a broader management plan that encompasses cultural practices and preventative methods for disease. In management of P.
infestans, as for the other fungal and oomycete pathogens, protectant fungicides should be used. Forecasting systems and scouting help to dictate spray intervals. If plants have been exposed to sporangia, a switch to systemic fungicides must occur (Fry 1998). Copper is also an important tool to manage tomato diseases, in both organic and conventional systems. For organic systems, copper offers some means to manage late blight, in addition to bacterial diseases, but alternatives should also be considered (Johnson et al. 2015). Forecasting systems can help growers to apply fungicides at optimal times and limit their overuse. The Cornell Decision Support System helps growers make informed decisions about fungicide application for late blight management based on weather predictions and patterns. The system uses Blitecast reports to assist in scheduling the initial fungicide application. Simcast then provides reports to help schedule later fungicide applications (Small et al. 2015). As the season ends in late summer or early fall, growers should remove plants and debris in preparation for the following season. This reduces the potential of the pathogens to overwinter. Plans for rotation should also be in place. Many tomato pathogens can survive in debris and on weedy hosts, so rotating to a non-host after the growing season will help prevent inoculum from infecting tomatoes in later seasons. A three-year crop rotation away from solanaceous crops is the minimum suggested rotation (Jones et al. 2014).

How Can Tomato Diseases be Managed in the Era of Genomics and other ‘Omics Technologies?

Disease-resistant crops are necessary for more sustainable agricultural systems.
Chemical control of tomato diseases is an important management strategy, but overreliance on fungicides and antibiotics can be detrimental, not only for the environment but also because many pathogen populations eventually become resistant, so that more or different chemical inputs are needed, such as fungicides with multi-site modes of action. Classical breeding has proved effective in many regards. However, sources of resistance often come from wild species, and introgressing genes or QTL into economically viable tomatoes is a lengthy process. For many pathogens, there are still no known resistance genes. Due to the controversial and highly politicized nature of genetic engineering, few breeding efforts involve introduction of transgenic R genes (Dangl and Jones 2001). In pathosystems that more closely follow gene-for-gene interactions, care must also be taken not to deploy just a single R gene, because resistance can easily be overcome in many cases. Instead, multiple R genes should be deployed in a gene-pyramiding strategy. A multiline strategy could also be attempted, where lines of a variety, each containing one of a few different resistance genes, are grown together (Piquerez et al. 2014). Scientists must continue to track whether resistance genes are being overcome. Understanding plant resistance, as well as the mechanisms by which pathogens overcome resistance, are important themes in my dissertation. The choice to grow resistant tomato varieties may be easier said than done, due to customer and industry preferences. Yet with the rise of more affordable genomics tools, such as next-generation sequencing, there are greater capabilities in comparative genomics and more reproducible data analysis pipelines. Breeders can more rapidly develop new cultivars based on how plants are responding to changing pathogen populations. As fundamental research demonstrates, gene-for-gene interactions involve a whole complex of interacting proteins.
An investment in fundamental research on plant-microbe interactions is also necessary to aid efforts to breed tomatoes that have lasting (or stable) resistance to pathogen populations (Piquerez et al. 2014). Furthermore, additional research must advance our understanding of quantitative resistance.

What can be learned from research on multiple diseases of tomatoes?

In my dissertation research, I had the opportunity to work with a biotroph (P. fulva), a hemibiotroph (P. infestans), and a necrotroph (A. linariae). Learning to culture, isolate, and inoculate plants with each of the microbes presented unique challenges. The type of resistance also varied between the fungi or oomycete and the tomato host. I could not become too comfortable with culture techniques, laboratory techniques, or simplification of concepts like gene-for-gene interactions.

What is known about disease resistance in each of the pathosystems that I study?

As context for each of my chapters, I provide a short overview of what is known about resistance within each of the three pathosystems.

Passalora fulva-Tomato Pathosystem: Tomato leaf mold resistant tomatoes are not well described in the United States. In the 20th century, observations were made that the small-fruited wild tomato S. pimpinellifolium showed resistance to tomato leaf mold, and this was the basis of some of the resistance breeding efforts (Alexander 1934; Rivas and Thomas 2005). It is challenging to transfer the information shared in publications of the 20th century to modern breeding efforts because little information is freely shared, and resistance breeding efforts are based on earlier naming conventions used to describe pathogen races. Many of the Cf genes formerly described in detail, such as Cf-1 and Cf-3, have not been experimentally validated as having direct gene-for-gene interactions with corresponding effectors (Boukema and Garretsen 1975; Leski 1977; Li et al. 2015).
In contrast, there has been experimental validation demonstrating that several Cf genes, including Cf-2, Cf-4, Cf-4E, Cf-5, and Cf-9, interact with the corresponding effectors Avr2, Avr4, Avr4E, Avr5, and Avr9 (Stergiopoulos and de Wit 2009; Mesarich et al. 2014). The interactions are still being experimentally validated in some cases, and intermediary proteins may also be important. For example, in the interaction between Cf-2 and Avr2, the co-receptor Rcr3, which is an extracellular immune protease, binds to Avr2; when this complex forms, Cf-2 recognizes the complex, and a hypersensitive response is triggered (Paulus et al. 2020).

Phytophthora infestans-Tomato Pathosystem: A great deal of work has been done to understand resistance mechanisms in potato, including extensive work on potato resistance genes and the corresponding effectors in P. infestans (Nowicki et al. 2012). Less is known about late blight resistance in tomato. Target enrichment sequencing approaches have been used in potato, including RenSeq to characterize resistance genes with NB-LRR motifs (Jupe et al. 2013; Witek et al. 2016) and PenSeq to characterize pathogen effectors with RXLR motifs (Thilliez et al. 2019), but additional work is needed to understand interactions between resistance genes and effectors within the tomato-P. infestans pathosystem. In potato, as well as tomato, introducing resistance genes is effective in the short term, but for greater durability, pyramiding of resistance genes is important, along with identification of QTL that might contribute to late blight resistance (Foolad et al. 2008). In potato, over 20 resistance genes have been identified and cloned, including R1, R2, R3a, R3b, and others (Yang et al. 2017). The corresponding effectors Avr1, Avr2, Avr3a, Avr3b, and others have also been identified and studied extensively.
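The gene-for-gene logic and the rationale for pyramiding described above can be illustrated with a minimal sketch. It assumes a deliberately simplified, qualitative model in which resistance is triggered whenever at least one R gene in the host has its matching effector present in the isolate. The R gene and effector names are the potato examples named in the text; the function names, the isolate, and the host genotypes are hypothetical and serve only as an illustration.

```python
# Simplified gene-for-gene model (an assumption for illustration only).
# Pairings are the potato R gene / P. infestans effector examples named in the text.
R_TO_AVR = {"R1": "Avr1", "R2": "Avr2", "R3a": "Avr3a", "R3b": "Avr3b"}


def recognized_effectors(host_r_genes, isolate_effectors):
    """Return the effectors in the isolate that are matched by a host R gene."""
    return {
        R_TO_AVR[r_gene]
        for r_gene in host_r_genes
        if R_TO_AVR[r_gene] in isolate_effectors
    }


def is_resistant(host_r_genes, isolate_effectors):
    """Under this model, recognizing any one effector triggers resistance."""
    return bool(recognized_effectors(host_r_genes, isolate_effectors))


# Hypothetical isolate that no longer carries a recognized Avr1.
isolate = {"Avr2", "Avr3a"}

single_gene_host = {"R1"}        # one deployed R gene
pyramided_host = {"R1", "R3a"}   # two R genes stacked in one cultivar

print(is_resistant(single_gene_host, isolate))  # False: R1 alone is overcome
print(is_resistant(pyramided_host, isolate))    # True: R3a still recognizes Avr3a
```

In this sketch the isolate must lose every matched effector before a pyramided host becomes susceptible, which is the intuition behind pyramiding. Real interactions are more complex, since intermediary proteins and quantitative effects also play a role.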
Understanding how rapidly pathogen populations can overcome resistance aids in developing more resistant varieties. Target enrichment re-sequencing of both effectors and resistance genes is one means by which genomic technologies can aid in breeding efforts and in management of a devastating disease. In addition, these tools help to more rapidly identify resistance genes that originate in wild species (Yang et al. 2017). In tomato, fewer details exist about the interactions between tomato resistance genes and pathogen effectors. The best characterized resistance genes, which originated from S. pimpinellifolium, include Ph-1 (chromosome 7), Ph-2 (chromosome 10), and Ph-3 (chromosome 9). There are also additional QTL that confer resistance to late blight in tomato (Panthee et al. 2017). In New York State, after 39 varieties of tomatoes were tested for late blight resistance, tomatoes with both Ph-2 and Ph-3, or tomatoes that were homozygous for Ph-3, were less susceptible to tomato late blight (Hansen et al. 2014). Unfortunately, P. infestans changes rapidly and has been able to overcome Ph-3 resistance (Foolad et al. 2008). In a study mapping QTL for resistance to tomato late blight, the major QTL on chromosomes 9 and 10 were likely attributable to Ph-3 and Ph-2, respectively, but an additional minor QTL on chromosome 12 was also identified (Panthee et al. 2017). Additional studies, such as a RenSeq approach on tomato, would likely enable further understanding of interactions between plant resistance proteins and pathogen effectors.

Alternaria linariae-Tomato Pathosystem: Unlike resistance in the other pathosystems I work on, early blight resistance is quantitative. Prior to the work of my collaborator Taylor Anderson, QTL conferring early blight resistance were identified, but the individual effects were quite low. Since the 1940s, breeders have incorporated genetic resistance to early blight (Anderson et al. 2021). Much of this resistance was from wild species such as S. pimpinellifolium and S.
habrochaites (Ashrafi and Foolad 2015). Breeding for resistance is made harder because early blight resistance in wild species is often associated with undesirable horticultural traits, including delayed maturity, decreased yield, and indeterminate growth patterns (Adhikari et al. 2017). Recently, several large-effect QTL on chromosomes 5, 9, and 1 were identified, and new fresh market Cornell breeding lines were developed (Anderson et al. 2021). Still, the QTL intervals remained quite large, and additional fine mapping was needed to better define the ancestral introgressions, which was the impetus for the early blight project that is discussed in Chapter 4.

What does the future of disease management look like?

Tomato disease management is an interdisciplinary endeavor. Tools such as integrated pest management are crucial. Microbial populations change rapidly in response to environmental conditions, to resistance genes that have been deployed, and to chemical management strategies. Breeders, entomologists, soil scientists, horticulturists, and plant pathologists will need to continue to work closely together. Disease management must continue to change as the effects of climate change become ever more apparent. For example, technologies for earlier detection of disease, like hyperspectral imaging, will become more affordable and easier to access and will allow for more precise and targeted management of diseases (Gold et al. 2020), as well as better observation of biotic and abiotic influences on plants throughout a growing season. Understanding disease resistance through a genomics lens has proven to be informative, and genomics capabilities have enabled far more rapid disease resistance breeding efforts, yet we must continue to integrate data from other ‘omics technologies, such as transcriptomics and proteomics, into the analysis.
We must also think more openly about tools like gene editing, which could offer a precise way to introduce, delete, or modify certain genes or genetic regions, and could also assist in developing more resistant varieties more rapidly. Public perception of genetic engineering is a bottleneck, and scientists must focus on how they communicate to the public and policymakers about new technologies that may sound frightening to consumers. Care must be taken to perform the proper experiments to reassure the public of the safety of these technologies. In plant disease management endeavors, we must also continue to prioritize sustainability, in terms of making decisions that do less harm to the environment. We need to ask how management strategies impact soil health, and consider strategies that will be economically sustainable for growers. As we demand larger yields of crops on less farmland, we are also facing climate change, which will result in more unpredictable weather patterns, drought, and flooding. Disease management will become ever more complex. We will need the expertise of scientists from diverse backgrounds and lived experiences in order to solve increasingly challenging problems. Returning to management of tomato diseases, we need to grow tomatoes that are better adapted to specific regions, and we must grow tomatoes and other crops with greater genetic diversity. This, in turn, may allow growers to grow tomatoes with fewer inputs and fewer pest problems. With greater genetic diversity, diseases become more manageable since multiple cultivars of tomatoes are grown. There is also a demand for flavorful and interesting tomatoes and other vegetables, and this is an opportunity for small-scale growers from diverse backgrounds to help meet the demand for nutritious and tasty tomatoes. Much of our food is grown by mostly white farmers.
This is due, in part, to the fact that land ownership is often confined to a primarily white demographic, as systemic racism has disadvantaged other groups and made it harder to build wealth, get loans, and own land. African American farmers were gradually pushed off the land after Reconstruction, in the late 19th and early 20th centuries, due to systemic racism; for example, loan offices would not provide the loans necessary to purchase farms or keep them running (Penniman and Washington 2018). Agriculture will only be truly sustainable when farmers are also more diverse, farmworkers receive fair and equitable wages, and diverse groups of people are able to contribute their unique skills and experience to the agricultural system. We need to look at our changing world as a call to action, as we reconsider how food is grown and how we manage diseases. Agriculture should connect us to the land and to one another, rather than just be a means of production.

REFERENCES

2017 Census of Agriculture. 2019. USDA National Agricultural Statistics Service. Available at: www.nass.usda.gov/AgCensus
Adhikari, P., Oh, Y., and Panthee, D. R. 2017. Current status of early blight resistance in tomato: an update. Int. J. Mol. Sci. 18:2019.
Agrios, G. N. 2005. Plant Pathology. 5th ed. Boston: Elsevier.
Anderson, T. A., Zitter, S. M., De Jong, D. M., Francis, D. M., and Mutschler, M. A. 2021. Cryptic introgressions contribute to transgressive segregation for early blight resistance in tomato. Theor. Appl. Genet. https://doi.org/10.1007/s00122-021-03842-x
Ashrafi, H., and Foolad, M. R. 2015. Characterization of early blight resistance in a recombinant inbred line population of tomato: II. Identification of QTLs and their co-localization with candidate resistance genes. Adv. Stud. Biol. 7:149–168.
Boukema, I. W., and Garretsen, F. 1975. Uniform resistance to Cladosporium fulvum Cooke in tomato (Lycopersicon esculentum Mill.). 2.
Investigations on F2’s and F3’s from diallel crosses. Euphytica. 24:105–116.
Dangl, J. L., and Jones, J. D. G. 2001. Plant pathogens and integrated defense responses to infection. Nature 411:826–33.
Elmer, W., and Ferrandino, F. 1995. Influence of spore density, leaf age, temperature, and dew periods on septoria leaf spot of tomato. Plant Dis. 79:287–290.
Foolad, M. R., Merk, H. L., and Ashrafi, H. 2008. Genetics, genomics and breeding of late blight and early blight resistance in tomato. Crit. Rev. Plant Sci. 27:75–107.
Fry, W.E., McGrath, M.T., Seaman, A., Zitter, T.A., McLeod, A., Danies, G., Small, I.M., Myers, K., Everts, K., Gevens, A.J., Gugino, B.K., Johnson, S.B., Judelson, H., Ristaino, J., Roberts, P., Secor, G., Seebold, K., Snover-Clift, K., Wyenandt, A., Grünwald, N.J., Smart, C.D. 2012. The 2009 late blight pandemic in the Eastern United States – causes and results. Plant Dis. 97:296–306.
Gold, K. M., Townsend, P. A., Herrmann, I., and Gevens, A. J. 2020. Investigating potato late blight physiological differences across potato cultivars with spectroscopy and machine learning. Plant Sci. 295:110316.
Hansen, Z. R., Small, I. M., Mutschler, M., Fry, W. E., and Smart, C. D. 2014. Differential susceptibility of 39 tomato varieties to Phytophthora infestans clonal lineage US-23. Plant Dis. 98:1666–1670.
Jones, J. B., Zitter, T. A., Momol, T. M., and Miller, S. A., eds. 2014. Compendium of Tomato Diseases and Pests. 2nd ed. St. Paul, Minnesota: American Phytopathological Society Press.
Jupe, F., Witek, K., Verweij, W., Śliwka, J., Pritchard, L., Etherington, G.J., Maclean, D., Cock, P.J., Leggett, R.M., Bryan, G.J., Cardle, L., Hein, I., Jones, J.D.G. 2013. Resistance gene enrichment sequencing (RenSeq) enables reannotation of the NB-LRR gene family from sequenced plant genomes and rapid mapping of resistance loci in segregating populations. Plant J. 76:530–544.
Leski, B. 1977.
Identification of Cladosporium fulvum Cooke races from group B and C on tomatoes in Poland. Acta Agrobot. 30:181–194.
Li, S., Zhao, T., Li, H., Xu, X., and Li, J. 2015. First report of races 2.5 and 2.4.5 of Cladosporium fulvum (syn. Passalora fulva), causal fungus of tomato leaf mold disease in China. J. Gen. Plant Pathol. 81:162–165.
Mesarich, C.H., Griffiths, S.A., van der Burgt, A., Okmen, B., Beenen, H.G., Etalo, D.W., Joosten, M.H.A.J., de Wit, P.J.G.M. 2014. Transcriptome sequencing uncovers the Avr5 avirulence gene of the tomato leaf mold pathogen Cladosporium fulvum. Mol. Plant-Microbe Interact. 27:846–857.
Nowicki, M., Foolad, M. R., Nowakowska, M., and Kozik, E. U. 2012. Potato and tomato late blight caused by Phytophthora infestans: an overview of pathology and resistance breeding. Plant Dis. 96:4–17.
Panthee, D. R., Piotrowski, A., and Ibrahem, R. 2017. Mapping quantitative trait loci (QTL) for resistance to late blight in tomato. Int. J. Mol. Sci. 18:1589.
Paulus, J.K., Kourelis, J., Ramasubramanian, S., Homma, F., Godson, A., Hoerger, A.C., Hong, T.N., Krahn, D., Carballo, L.O., Wang, S., Win, J., Smoker, M., Kamoun, S., Dong, S., van der Hoorn, R.A.L. 2020. Extracellular proteolytic cascade in tomato activates immune protease Rcr3. Proc. Natl. Acad. Sci. U. S. A. 117:17409–17417.
Penniman, L., and Washington, K. 2018. Farming while Black: Soul Fire Farm’s practical guide to liberation on the land. White River Junction, Vermont: Chelsea Green Publishing.
Piquerez, S. J. M., Harvey, S. E., Beynon, J. L., and Ntoukakis, V. 2014. Improving crop disease resistance: lessons from research on Arabidopsis and tomato. Front. Plant Sci. 5:671.
Small, I. M., Joseph, L., and Fry, W. E. 2015. Development and implementation of the BlightPro decision support system for potato and tomato late blight management. Comput. Electron. Agric. 115:57–65.
Stergiopoulos, I., and de Wit, P. J. G. M. 2009. Fungal effector proteins. Annu. Rev. Phytopathol. 47:233–263.
Wang, X., Gao, L., Jiao, C., Stravoravdis, S., Hosmani, P.S., Saha, S., Zhang, J., Mainiero, S., Strickler, S.R., Catala, C., Martin, G.B., Mueller, L.A., Vrebalov, J., Giovannoni, J.J., Wu, S., Fei, Z. 2020. Genome of Solanum pimpinellifolium provides insights into structural variants during tomato breeding. Nat. Commun. 11:5817.
Williamson, B., Tudzynski, B., Tudzynski, P., and Kan, J. A. L. V. 2007. Botrytis cinerea: the cause of grey mould disease. Mol. Plant Path. 8:561–580.
Witek, K., Jupe, F., Witek, A. I., Baker, D., Clark, M. D., and Jones, J. D. G. 2016. Accelerated cloning of a potato late blight–resistance gene using RenSeq and SMRT sequencing. Nat. Biotech. 34:656–660.
Yang, L., Wang, D., Xu, Y., Zhao, H., Wang, L., Cao, X., Chen, Y., Chen, Q. 2017. A new resistance gene against potato late blight originating from Solanum pinnatisectum located on potato chromosome 7. Front. Plant Sci. 8.
Zitter, T. A. 1984. Vegetable crops: potato early blight. Vegetable MD Online. http://vegetablemdonline.ppath.cornell.edu/factsheets/Potato_EarlyBlt.htm
Zitter, T. A. 1985. Bacterial diseases of tomato. Vegetable MD Online. http://vegetablemdonline.ppath.cornell.edu/factsheets/tomato_bacterial.htm
Zitter, T. A. 1986. Vegetable crops: botrytis gray mold of greenhouse and field tomato. Vegetable MD Online. http://vegetablemdonline.ppath.cornell.edu/factsheets/Tomato_Botrytis.htm
Zitter, T. A. 1987. Vegetable crops: septoria leaf spot of tomato. Vegetable MD Online. http://vegetablemdonline.ppath.cornell.edu/factsheets/Tomato_Septoria.htm

CHAPTER 1

TOWARDS A GREATER UNDERSTANDING OF THE POPULATION DIVERSITY OF PASSALORA FULVA IN US HIGH TUNNELS*

Abstract

High tunnels extend the growing season of high value crops, such as tomatoes, but the environmental conditions within high tunnels favor the spread of the tomato leaf mold pathogen Passalora fulva (syn. Cladosporium fulvum). Tomato leaf mold leads to defoliation and, if severe, to losses in yield. Despite research on P.
fulva in molecular contexts, little is known about the genetic structure and diversity of pathogen populations associated with high tunnel tomato production in the United States. From 2016 to 2019, tomato leaf samples were collected from high tunnels in the Northeast and Minnesota, and 50 P. fulva isolates were obtained and characterized. Other Cladosporium species were also isolated from the leaf surfaces. Koch’s postulates were completed to confirm that P. fulva was the cause of the disease symptoms observed. Race determination experiments revealed that the isolates belonged to either race 0 (six isolates) or race 2 (44 isolates). Polymorphisms were identified within four previously characterized effector genes: Avr2, Avr4, Avr4e, and Avr9. The largest number of polymorphisms was observed for Avr2. Both mating type genes, MAT1-1-1 and MAT1-2-1, were present in the isolate collection. For greater insight into diversity, the 50 isolates were genotyped at 7,514 single-nucleotide polymorphism loci using genotyping-by-sequencing, and differentiation by region but not by year was observed. Information about P. fulva population diversity will enable better management recommendations for growers, as high tunnel production of tomatoes expands.

* Sudermann, M., McGilp, L., Vogel, G., Regnier, M., Rodriguez-Jaramillo, A., and Smart, C. D. 2021. Toward a greater understanding of the population diversity of Passalora fulva in US high tunnels. Phytopathology. In revision.

Introduction

Tomatoes (Solanum lycopersicum L.) are a high value crop with a farm gate value of over 418 million dollars. Between 2007 and 2017, the number of farms growing tomatoes in protected environments nearly tripled in the United States, reaching 7,974 farms in 2017 (2017 Census of Agriculture 2019; 2007 Census of Agriculture 2009). With the increasing use of high tunnels and greenhouses for season extension in the Northeast and Midwest, tomato leaf mold is now seen yearly.
This disease has become a management concern, and can cause loss of yield, defoliation, and plant death (Thomma et al. 2005). Tomato leaf mold is caused by the biotrophic fungus Passalora fulva (syn. Cladosporium fulvum). The fungus belongs to the Dothideomycetes class and the family Mycosphaerellaceae. It is now placed in the anamorph genus Passalora (Braun et al. 2003). Tomato leaf mold was first described by the British mycologist and botanist Mordecai Cubitt Cooke, in 1883, as a foliar disease of tomato (Cooke 1883). The infected leaf specimen was said to have originated in South Carolina (Alexander 1934). Detailed research from the 1930s (Bond 1938) describes infection occurring after conidia germinate on the abaxial leaf surface. After hyphae enter and clog the stomata, plants cannot respire, resulting in eventual defoliation. Disease symptoms first appear as light green spots on leaves. Subsequently, olive green conidia emerge and eventually, the leaves show signs of necrosis (Figure 1.1). It is most challenging to manage the disease in greenhouses or high tunnels, where conditions are humid (Thomma et al. 2005). The existence of a sexual stage has been hypothesized, but only the asexual stage has been observed (Stergiopoulos et al. 2007a, 2007b).

Figure 1.1. A) The abaxial side and B) adaxial side of a symptomatic tomato leaf infected by the tomato leaf mold pathogen Passalora fulva.

The increased use of high tunnels in North America, combined with the ability of the pathogen population to rapidly overcome resistance, has resulted in a problematic re-occurrence of tomato leaf mold. Beginning in the late 1800s, greenhouse production of tomatoes arose around the world. Breeding for resistance became a priority, as fungicides were costly and not as effective as choosing resistant cultivars (Alexander 1934). By the 1930s, North American breeders introgressed resistance genes (known as Cf genes for C.
fulvum) from resistant wild species, after it was observed that the red currant tomato (Solanum pimpinellifolium) showed resistance to isolates of P. fulva (Alexander 1934; Langford 1937; Rivas and Thomas 2005). For several decades, outbreaks encouraged continued breeding efforts (Bailey 1950; Kerr and Bailey 1966; Rivas and Thomas 2005), as resistance is often overcome because of strong selection pressure, following the introduction of tomato Cf genes (Stergiopoulos et al. 2007b; Iida et al. 2015). In North America, the breeding records diminished after the 1970s, as attention shifted away from greenhouse production of tomatoes. Only in recent decades, as greenhouse and high tunnel production has increased once again, have concerns over management of tomato leaf mold resurged. The well-characterized effector proteins Avr2, Avr4, Avr4E, Avr5, and Avr9 are secreted by P. fulva during infection, and are recognized by the products of the single dominant resistance genes Cf-2, Cf-4, Cf-4E, Cf-5, and Cf-9, respectively (Stergiopoulos and de Wit 2009; Mesarich et al. 2014). Although the study of pathogen race was common beginning in the 1930s (Langford 1937), recent work has once again aimed to characterize pathogen populations, using differential sets of tomatoes for the determination of isolate race (Iida et al. 2010; Rollan et al. 2013; Iida et al. 2015; Li et al. 2015; Medina et al. 2015; Lucentini et al. 2021; Yoshida et al. 2021). In Japan, 13 races have been identified (Iida et al. 2015; Yoshida et al. 2021), and six races have been described in China (Li et al. 2015). Substantially fewer races were identified in Argentina, with all isolates belonging to either race 0 or 2 (Rollan et al. 2013; Medina et al. 2015).
Earlier studies included tomatoes with Cf-1 and Cf-3 as part of their differential sets for race determination; however, recent studies do not include them, as they do not follow typical gene-for-gene interactions with effector complements (Boukema and Garretsen 1975; Leski 1977; Li et al. 2015). In addition to race determination experiments, the examination of polymorphisms within well-characterized effectors has been important to the understanding of the coevolution of Cf resistance genes and effectors. Through alterations to effector genes, such as point mutations, deletions, and insertions of transposon-like elements, the pathogen can overcome Cf-mediated resistance, leading to the emergence of new races (Stergiopoulos et al. 2007b). Sequencing the previously characterized effector genes has enhanced our understanding of effector evolution and the ability of the pathogen to evade host resistance genes (Stergiopoulos et al. 2007a; Medina et al. 2015; Iida et al. 2015). Since no teleomorph has been observed, P. fulva is hypothesized to reproduce asexually. However, amplified fragment length polymorphism (AFLP) analysis on a global collection of isolates showed greater genotypic diversity than would be expected for a primarily asexual fungus where only a few clonal groups exist (Stergiopoulos et al. 2007). Mating type is another means to assess the possibility of sexual reproduction. Heterothallic fungi require opposite mating types to reproduce sexually, and for filamentous ascomycetes, mating type is controlled by the MAT locus, containing the idiomorphs MAT1-1 and MAT1-2. Based on similarity to homologous genes in other ascomycetous fungi, MAT1-1-1 and MAT1-2-1 have been cloned and characterized in P. fulva (Stergiopoulos et al. 2007).
In studies characterizing 86 isolates from around the world and 133 isolates from Japan, both mating types were present in the populations, but departures from a 1:1 ratio of each mating type were observed, which did not provide additional support for the possibility of random mating and sexual reproduction (Stergiopoulos et al. 2007; Iida et al. 2015; Milgroom 2017). Uncertainty remains about the role sexual reproduction may play in P. fulva. Questions remain, not only about reproductive mode, but also about the genotypic diversity and population structure of P. fulva populations in the United States. The first objective of this study was to determine the race structure of isolates collected in the Northeastern United States and Minnesota. The second objective was to determine the genetic diversity among these isolates. Through analysis of these data, we hope to better understand the regional differentiation in the populations, as well as the genetic diversity of P. fulva isolates collected in the US, to offer more precise recommendations for the management of tomato leaf mold.

Materials and Methods

Fungal isolation—Samples of leaves with tomato leaf mold were collected in the summers of 2015 through 2019. To isolate P. fulva, rather than other fast-growing fungi that were on the surface of leaves, five-millimeter leaf disks were surface sterilized in 70% ethanol for 1 minute before being rinsed in sterile water, and then placed in a 10% bleach solution for 1 minute. After a thorough rinse with sterile water, leaf disks were placed on sterile filter paper to air dry for two minutes and then placed on potato dextrose agar (PDA), amended with 1 ml of 90% lactic acid per liter of media. To obtain single-conidial isolates, a small agar plug from a sporulating culture was placed in 1 ml of deionized water in a 1.5 ml tube. After vortexing tubes for five seconds, 1 µl of the conidial suspension was added to a new tube with 999 µl of deionized water.
One hundred µl of the 1:1000 diluted solution was then pipetted onto PDA and dispersed using L-shaped cell spreaders. After 48 hours, single germinating conidia, visible by stereo microscope, were transferred to new PDA plates using a scalpel. All cultures were stored at 20–22 °C in the laboratory. Small plugs from the resulting cultures were then stored in 30% glycerol stocks and placed at -80 °C.

Identification of the causal fungus—To confirm the identity of P. fulva, the internal transcribed spacer (ITS) region was amplified using PCR primers ITS4 and ITS5 (White et al. 1990). To perform DNA extractions, mycelia growing on PDA were either collected directly from plates, or mycelia were grown in sterile potato dextrose broth, vacuum filtered, and collected for extractions. Twenty mg of mycelia was placed in 2 ml round-bottom tubes with two 3-mm stainless steel beads (Qiagen, Valencia, CA, USA). Tubes were placed in a 2x24 Qiagen TissueLyser adapter set (Qiagen, Valencia, CA, USA) and stored at -80 °C for at least two hours, to aid in pulverizing the mycelia. Plates were then placed in the Retsch Mixer Mill MM 400 (Retsch Inc., Newton, PA) and mycelia were ground at 30 Hz for 1 minute. DNA was then extracted using the DNeasy Plant Mini Kit, according to the manufacturer’s instructions. For PCR amplification of the ITS region, 12.5 µl Emerald Amp GT 2X Mastermix (Takara Bio Inc, Shiga, Japan) was combined with 9 µl deionized water and 0.2 µM each of the forward and reverse primers ITS5 and ITS4 (White et al. 1990), and 50 ng of template DNA was added for a final reaction volume of 25 µl. Reactions occurred in the Bio-Rad C1000 Touch Thermal Cycler (Bio-Rad, Hercules, CA, USA). PCR conditions were a denaturation step at 94 °C for 4 minutes, followed by 33 cycles of 94 °C for 45 seconds, an annealing step at 57 °C for 45 seconds, and an extension step at 72 °C for 1 minute. The final step was 72 °C for 5 min.
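As a quick consistency check on the ITS reaction described above, the cycling program and the reaction volumes can be written down as data. The following Python sketch is only an illustration and is not part of the original protocol; the names `Step`, `programmed_minutes`, and `remaining_volume_ul` are hypothetical. It sums the programmed hold times (ramp times, which depend on the instrument, are ignored) and computes how much of the 25 µl reaction is left for primers and template once the master mix and water are accounted for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in a thermal cycler program (hypothetical helper type)."""
    name: str
    temp_c: float
    seconds: int


# ITS program as stated in the text.
INITIAL = Step("initial denaturation", 94, 4 * 60)
CYCLE = [
    Step("denaturation", 94, 45),
    Step("annealing", 57, 45),
    Step("extension", 72, 60),
]
N_CYCLES = 33
FINAL = Step("final extension", 72, 5 * 60)


def programmed_minutes(initial, cycle, n_cycles, final):
    """Total programmed hold time in minutes, ignoring ramp times."""
    cycle_seconds = sum(step.seconds for step in cycle)
    return (initial.seconds + n_cycles * cycle_seconds + final.seconds) / 60


def remaining_volume_ul(final_volume_ul, *component_volumes_ul):
    """Volume left for the components whose volumes are not stated."""
    return final_volume_ul - sum(component_volumes_ul)


total_minutes = programmed_minutes(INITIAL, CYCLE, N_CYCLES, FINAL)
# 25 µl reaction minus 12.5 µl master mix and 9 µl water.
primer_and_template_ul = remaining_volume_ul(25.0, 12.5, 9.0)
print(total_minutes, primer_and_template_ul)
```

Under these assumptions the programmed holds total 91.5 minutes, and 3.5 µl remain for the two primers and the template DNA. The same data structure could hold the effector-gene program by changing the times, temperatures, and cycle number.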
Amplicons were Sanger sequenced at the Cornell Institute of Biotechnology and queried against the NCBI nucleotide database using the Basic Local Alignment Search Tool (BLAST) (White et al. 1990; Altschul et al. 1990). Two isolates, 17036 and 17038, collected from growers’ high tunnels in 2017 and identified using the described methods, were used to complete Koch’s postulates (Table 1.S1). Koch’s postulates were completed using 4-week-old tomato plants of the susceptible cultivars ‘Moneymaker’ and ‘BHN 589’. Conidia from two-week-old cultures were scraped from a Petri dish and prepared into a slurry with deionized water. The mixture of conidia and water was filtered through cheesecloth, diluted to a concentration of approximately 2.5 × 10^5 conidia/ml, and placed in Nalgene aerosol spray bottles (Thermo Fisher Scientific, Waltham, MA). Plants were placed into a research high tunnel at Cornell AgriTech in Geneva, NY, and all leaf surfaces were sprayed to runoff with the conidial suspension. After two weeks, symptoms and signs appeared on the foliage, and the fungus was re-isolated on PDA. A similar protocol was followed to try to complete Koch’s postulates with other fungi that were isolated from symptomatic leaf surfaces.

Race determination assays—Seed for a differential set containing tomato lines with resistance genes Cf-2, Cf-4, Cf-5, Cf-6, Cf-9, and Cf-4 and Cf-11 in combination, as well as a tomato line with no known resistance genes, was obtained from either the Charles M. Rick Tomato Genetics Center at the University of California Davis, the Centre for Genetic Resources at Wageningen University & Research, or commercially (Table 1.1). To obtain enough seed for race determination, four tomato seedlings of each genotype were placed individually in 2-gallon pots containing LM-3 All Purpose Mix (Lambert Peat Moss Inc., Rivière-Ouelle, Québec Canada). Tomatoes were staked and pruned until they produced fruit.
Tomatoes were fertilized once a week with All Purpose MiracleGro (24-8-16) fertilizer (Scotts Miracle-Gro Products Inc., Marysville, OH), based on the manufacturer’s recommendations, as well as calcium nitrate (15.5-0-0) and magnesium sulfate (Epsom salts). Tomato plants were grown to maturity in a climate-controlled greenhouse, with 16 hours of light and 8 hours of dark. Fruit was harvested at maturity. Seed was collected from the fruit, cleaned in a 50% HCl solution, rinsed with trisodium phosphate cleaner, dried, and stored in a cold room until race assays began.

Table 1.1. Results of the race determination experiments, and information on each of the tomato accessions used in the experiments.1

Cultivar or accession number | Background genotype | Cf resistance gene present | Number of isolates causing disease
Seed obtained commercially | Moneymaker | No Cf gene | 50/50
LA3043 | Moneymaker background | Cf-2 | 44/50
LA3045 | Moneymaker background | Cf-4 | 0/50
LA3046 | Moneymaker background | Cf-5 | 0/50
LA2448 | Ontario 7818 | Cf-6 | 0/50
LA3047 | Moneymaker background | Cf-9 | 0/50
CGN18402 | Ontario 7716 | Cf-4, Cf-11 | 0/50

1 Accessions with the ‘LA’ prefix were obtained from the Charles M. Rick Tomato Genetics Center, and CGN18402 was obtained from the Centre for Genetic Resources in the Netherlands.

Race assays were conducted during the summers of 2019 and 2020 at the Cornell AgriTech research high tunnel. Tomatoes were sown in 50-cell flats containing LM-1 Germination Mix (Lambert Peat Moss Inc., Rivière-Ouelle, Québec Canada) and grown in a greenhouse for 4 weeks. In preparation for transplanting, tomatoes were fertilized once a week, after the first true leaves expanded, using MiracleGro All Purpose Plant Food (24-8-16). Tomatoes were transplanted into 1-gal pots and transported to the high tunnel. Each differential set of tomatoes (Table 1.1) was placed 1 meter apart from the other sets, across three rows that were spaced 1.25 meters apart.
Including control tomatoes, 22 sets of tomatoes could fit into the high tunnel. Experiments were set up at three-week intervals throughout the summer. Each isolate was randomly assigned to two different sets, and the position of genotypes within blocks was also randomly assigned. Thus, there were two replicates of each isolate and genotype combination. To prepare inoculum, cultures were started from isolates in long-term storage, to avoid the effects of serial microbial transfers. Due to the slow-growing nature of the fungus, a suspension of conidia and water was spread on plates and grown for four weeks at room temperature (20–22 °C). Conidia were then washed from plates with sterile water and their concentrations were measured using a hemocytometer. A total of 300 ml of inoculum diluted to a final concentration of 2.5 × 10^5 conidia per ml was prepared for each isolate. Plants were inoculated to runoff using a separate handheld misting bottle for each isolate. After inoculation of each set of tomatoes, plants were regularly watered until disease symptoms appeared after about 14 days. Inoculation equipment and other materials were thoroughly disinfected in a bleach solution between experiments. Plants were rated for the presence or absence of tomato leaf mold symptoms about 18 days post inoculation, to ensure ample disease development prior to the rating.

Sequencing of effector genes and mating type idiomorphs—The primers used for PCR amplification and sequencing were previously described and are listed in Table 1.S2 (Medina et al. 2015; Iida et al. 2015). To amplify and sequence the effector genes, the same DNA extraction and PCR preparation protocols that were described for the ITS region were followed. For amplification of Avr2, Avr4, Avr4e, and Avr9, the PCR conditions were a denaturation step at 94 °C for 5 minutes, followed by 35 cycles of 94 °C for 30 seconds, an annealing step at 60 °C for 30 seconds, and an extension step of 72 °C for 1 minute.
The final step was 72 °C for 7 min. The reaction mixtures for the amplification of mating type idiomorphs were identical to those described above. The PCR program was described previously (Stergiopoulos et al. 2007). Sequence data were analyzed in Geneious Prime 2020.2.2, with alignments of sequences to previously published reference sequences (Table 1.S3) performed using MUSCLE 3.8.425 (Kearse et al. 2012; Edgar 2004). We identified polymorphisms within the coding regions of genes based on the position of the start codon of the reference genes. The main reference sequences were from the race 0 isolate ‘ELH’ from Argentina (Medina et al. 2015). We also compared sequence data to additional reference sequences belonging to race 4 or race 5 isolates (Table 1.S3) (Luderer et al. 2002; Joosten et al. 1997; Westerink et al. 2004; Ackerveken et al. 1992). Sequenced genes were then translated using Geneious and the results were compared to protein predictions from reference sequences, obtained from the NCBI Protein database.

Genotyping-by-Sequencing—Genotyping-by-sequencing (GBS) libraries were prepared and sequenced at the University of Wisconsin-Madison Biotechnology Center DNA Sequencing Core Facility. Libraries were digested with ApeKI and then sequenced on an Illumina NovaSeq6000, generating 150 base-pair paired-end reads. Several replicate samples were included in the plate (Table 1.S1). Genotypes were called with the TASSEL 5 GBS v2 Pipeline (Glaubitz et al. 2014), using the forward reads as input. Sequence tags were aligned to the C. fulvum v1.0 genome, obtained from the Joint Genome Institute (de Wit et al. 2012; Ohm et al. 2012). We used vcfR 1.11.0 (Knaus and Grünwald 2017) and VCFtools 0.1.16 (Danecek et al. 2011) to retain only high-quality SNPs. Using vcfR, variants were filtered when read depth (DP) was less than 5 or greater than 100 (Knaus and Grünwald 2017). Samples were then omitted if they had more than 55% missing data.
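The depth and sample-missingness filters described above can be sketched in a few lines. The original filtering was done in R with vcfR and VCFtools; the Python below is only a language-neutral illustration on toy data, not the author's script. The function names, the toy matrices, and the reading of the depth filter as masking individual genotype calls (rather than dropping whole variants) are assumptions made for the example.

```python
def mask_by_depth(genotypes, depths, min_dp=5, max_dp=100):
    """Set a call to None (missing) when its read depth is < min_dp or > max_dp.

    Both inputs are lists of rows, one row per variant and one column per sample.
    """
    return [
        [g if (d is not None and min_dp <= d <= max_dp) else None
         for g, d in zip(g_row, d_row)]
        for g_row, d_row in zip(genotypes, depths)
    ]


def sample_missingness(genotypes):
    """Fraction of missing calls for each sample (column)."""
    n_sites = len(genotypes)
    n_samples = len(genotypes[0])
    return [
        sum(row[j] is None for row in genotypes) / n_sites
        for j in range(n_samples)
    ]


def drop_samples(genotypes, names, max_missing=0.55):
    """Omit samples whose missing fraction exceeds max_missing."""
    missing = sample_missingness(genotypes)
    keep = [j for j, m in enumerate(missing) if m <= max_missing]
    return [[row[j] for j in keep] for row in genotypes], [names[j] for j in keep]


# Toy data: 4 variants (rows) x 3 haploid samples (columns); values are invented.
NAMES = ["A", "B", "C"]
GENOTYPES = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
DEPTHS = [[10, 3, 20], [8, 150, 30], [12, 2, 40], [6, 50, 7]]

masked = mask_by_depth(GENOTYPES, DEPTHS)
kept, kept_names = drop_samples(masked, NAMES)
print(sample_missingness(masked), kept_names)
```

In this toy example, sample B loses three of its four calls to the depth mask and is therefore omitted at the 55% threshold, while samples A and C are retained.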
Variants with more than 20% missing data were also omitted. Only biallelic and polymorphic sites were retained, and given that the fungus is haploid, heterozygous genotype calls were also censored. SNPs were retained only when the minor allele frequency was greater than 0.05. When technical replicates were present, the replicate with the least missing data was retained. Analysis of GBS data was conducted in R version 4.0.2 (R Core Team 2020) using code adapted from the Population genetics and genomics in R primer (https://grunwaldlab.github.io/Population_Genetics_in_R/index.html). The R packages vcfR (Knaus and Grünwald 2017) and ggplot2 (Wickham 2016) were used in the processing and visualization of data. The R package ape (Paradis et al. 2004) was used in the construction of the neighbor-joining (NJ) tree, and the RColorBrewer color palette package (Neuwirth 2014) was also used. The map illustrating the counties where sampling took place was constructed using the maps R package (https://www.rdocumentation.org/packages/maps/versions/3.3.0). Clonal groups were identified using pairwise identity-by-state (IBS), as referenced in previous publications (Vogel et al. 2020; Carlson et al. 2017). For all pairwise combinations of isolates, the proportion of alleles that were shared at the non-missing sites was calculated. To differentiate potential clones, a threshold of 99% was set based on visualization of a histogram of the IBS matrix data. Isolates with pairwise IBS above the 99% threshold were compiled into clonal groups. The R package adegenet (Jombart 2008; Jombart and Ahmed 2011) was used to construct the PCA plots using the clone-corrected dataset. Using the clone-corrected data, pairwise Fst (Weir and Cockerham 1984) was calculated using the R package hierfstat (Goudet 2005). The analog to Fst, Gst (Nei 1972, 1973; Hedrick 2005; Knaus and Grünwald 2017), was also calculated with the clone-corrected dataset.
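The identity-by-state step lends itself to a short worked example. The sketch below is a Python illustration on invented haploid genotypes, not the R code used in this study. It computes pairwise IBS over sites that are non-missing in both isolates and then groups isolates whose IBS exceeds 0.99. Grouping by connected components (an isolate joins a group if it exceeds the threshold with any member) is an assumption, since the text does not specify how groups were compiled, and the names `pairwise_ibs` and `clonal_groups` are hypothetical.

```python
from itertools import combinations


def pairwise_ibs(a, b):
    """Proportion of identical alleles at sites called in both haploid isolates."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None  # no site in common, IBS is undefined
    return sum(x == y for x, y in shared) / len(shared)


def clonal_groups(samples, threshold=0.99):
    """Group isolates linked by IBS > threshold (connected components)."""
    names = list(samples)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        ibs = pairwise_ibs(samples[a], samples[b])
        if ibs is not None and ibs > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy genotypes at 200 sites; None marks a missing call. Values are invented.
ISOLATES = {
    "iso1": [0] * 200,
    "iso2": [1] + [0] * 198 + [None],  # one mismatch and one missing site
    "iso3": [1] * 100 + [0] * 100,     # differs from iso1 at half of the sites
}

print(clonal_groups(ISOLATES))
```

Here iso1 and iso2 share 198 of 199 jointly called sites (about 99.5%) and fall into one clonal group, whereas iso3 shares only half of its alleles with iso1 and remains separate. Clone correction would then keep one representative per group; how that representative is chosen is not specified in the text.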
Linkage disequilibrium (LD) decay of the clone-corrected data was visualized using PopLDdecay version 3.41 (Zhang et al. 2019). Pairwise LD was measured by r². The bin2 parameter was set to 10,000 bp.

Data Availability—Raw demultiplexed FASTQ files have been deposited into the National Center for Biotechnology Information Sequence Read Archive (BioProject accession number PRJNA734954). Scripts are available on GitHub (https://github.com/mas835/tomatoleafmold).

Results

Fungal isolation and identification—Between 2016 and 2019, we isolated P. fulva from 35 symptomatic tomato leaf samples collected in the Northeast and 15 samples from Minnesota (Figure 1.2, Table 1.S1). For leaf samples that were not surface sterilized with both ethanol and bleach, fungi on the surface of leaves were readily isolated. The fungi that could not be identified as P. fulva were morphologically distinct, quickly covered the surface of the PDA, and were darker gray in color. In contrast, P. fulva remained confined to a small section of the plate (Figure 1.3). BLAST searches and phylogenetic analysis of the ITS results revealed that the isolated fungi that were not P. fulva had the greatest similarity to several Cladosporium spp., including C. cladosporioides and C. pseudocladosporioides. In contrast, P. fulva, as confirmed by ITS sequencing, was only consistently isolated from samples that were thoroughly surface sterilized. Colonies of P. fulva were also lighter in color than those of the other isolated fungi (Figure 1.3).

Figure 1.2. Geographical origin of Passalora fulva isolates obtained from tomato leaf samples collected in high tunnels from Minnesota (orange), New York (pink), Vermont (green), New Hampshire (purple), and Massachusetts (turquoise). The points are placed on the coordinates of the county where symptomatic leaf samples were collected. The size of the points is proportional to the number of isolates collected from that county.

Figure 1.3.
Morphological differences between A) Passalora fulva, B) Cladosporium cladosporioides, and C) Cladosporium pseudocladosporioides on potato dextrose agar.

The pathogenicity of P. fulva was confirmed by successful completion of Koch’s postulates using the two isolates 17036 and 17038 collected in 2017 (Table 1.S1). Disease symptoms and signs appeared 14 days post inoculation. The pathogen was then reisolated on acidified PDA. In contrast to isolation from the samples we received from growers’ high tunnels, P. fulva was more easily reisolated from the inoculated plants, even from leaves that received minimal surface sterilization. We could not complete Koch’s postulates with isolates that were classified as other Cladosporium spp., as no disease symptoms were produced following inoculation.

Race determination experiments—Race determination experiments were completed over two growing seasons in a high tunnel environment. All 50 P. fulva isolates caused disease on the ‘Moneymaker’ cultivar, which has no known resistance genes. None of the 50 isolates caused any disease symptoms on tomato accessions containing Cf-4, Cf-5, Cf-6, Cf-9, or Cf-4/11 in combination. Forty-four isolates caused disease symptoms on tomatoes with the resistance gene Cf-2, whereas five isolates from New York (17052, 17053, 17057, 18013, and 19008) and one isolate (19006) from Massachusetts caused disease symptoms only on the susceptible ‘Moneymaker’ cultivar (Table 1.1). Therefore, based on previous race nomenclature conventions, 44 isolates collected across the Northeast and Minnesota belonged to race 2, while six isolates belonged to race 0.

Analysis of allelic variation of four effectors—Four effector genes, Avr2, Avr4, Avr4e, and Avr9, were sequenced for each of the 50 isolates to identify polymorphisms relative to a race 0 reference sequence (Table 1.2). For Avr2, six polymorphisms were identified within the coding region, and one polymorphism was identified in the non-coding region, downstream of the coding region of the gene.
These deletions and insertions result in frameshift mutations. Three other isolates had a large deletion spanning position 37 to the end of the coding region, resulting in a truncated protein after L12. The six race 0 isolates had a deletion of adenine and guanine downstream of the coding region (Table 1.2 and Table 1.3). Finally, seven isolates belonging to race 2 (17035, 17046, 17049, 18009, 19001, 19013, 19017) did not produce a PCR amplicon with the Avr2 primers used (Medina et al. 2015). No isolate had more than one polymorphism relative to the race 0 Avr2 reference sequence. There were 12 isolates from Minnesota and two isolates from Ontario County, New York that belonged to race 2 yet did not show any polymorphisms compared to the race 0 reference sequence (Table 1.2 and Table 1.3). For Avr4, only one SNP (g.128G>A; Table 1.3), located upstream of the coding region, was identified, but 18 isolates from the Northeastern states contained the non-reference allele. For Avr4e, two SNPs were observed, and the same 15 isolates carried both non-reference alleles, resulting in two predicted nonsynonymous substitutions. Finally, for Avr9, only one SNP was identified within the coding region, and 16 isolates contained the non-reference allele. This corresponded to a nonsynonymous substitution (Table 1.2, Table 1.3).

Table 1.2. Mating type, race, and effector gene polymorphisms in the coding region, in reference to race 0 isolate ‘ELH’ (Medina et al. 2015) for each isolate.
Clonal group | Sample name | State | MAT-type | Race | Avr2 | Avr4 | Avr4e | Avr9
1 | Pf76, Pf77, Pf78, Pf79, Pf81, Pf90, Pf91, Pf93 | MN | MAT1-2-1 | Race 2 | No changes | No changes | c.244T>C, c.248T>C | c.23T>C
2 | 17035, 17046, 17049, 18009, 19001, 19013, 19017 | NY (19013: NH) | MAT1-1-1 | Race 2 | Not amplified | g.128G>A | No changes | No changes
3 | 17036, 17037, 17038, 19016, 19018 | NY | MAT1-2-1 | Race 2 | c.64delA | No changes | No changes | No changes
4 | Pf85, Pf88, Pf89, Pf92 | MN | MAT1-2-1 | Race 2 | No changes | No changes | No changes | c.23T>C
5 | 19010, 19011, 19012, 19014 | NH | MAT1-2-1 | Race 2 | c.64delA | g.128G>A | No changes | No changes
6 | 17052, 17053, 19008 | NY | MAT1-2-1 | Race 0 | g.347_348delAG | No changes | No changes | No changes
7 | 17040, 18019, 18024 | NY | MAT1-2-1 | Race 2 | c.64delA | No changes | No changes | No changes
8 | 19002, 19003, 19004 | VT | MAT1-2-1 | Race 2 | Large deletion | g.128G>A | No changes | No changes
9 | 17043, 19009 | NY | MAT1-2-1 | Race 2 | c.64delA | g.128G>A | No changes | No changes
10 | 16139, 18025 | NY | MAT1-1-1 | Race 2 | No changes | No changes | c.244T>C, c.248T>C | c.23T>C
11 | Pf80, Pf82 | MN | MAT1-1-1 | Race 2 | c.121_122delTA | No changes | c.244T>C, c.248T>C | c.23T>C
12 | 19006 | MA | MAT1-1-1 | Race 0 | g.347_348delAG | No changes | No changes | No changes
13 | 18010 | NY | MAT1-1-1 | Race 2 | c.209insA | g.128G>A | No changes | No changes
14 | 18012 | NH | MAT1-1-1 | Race 2 | c.2T>C | No changes | c.244T>C, c.248T>C | No changes
15 | 18013 | NY | MAT1-1-1 | Race 0 | g.347_348delAG | No changes | No changes | No changes
16 | 17057 | NY | MAT1-2-1 | Race 0 | No changes | g.128G>A | No changes | No changes
17 | 18005 | MA | MAT1-2-1 | Race 2 | c.46delG | No changes | c.244T>C, c.248T>C | No changes
18 | Pf84 | MN | MAT1-2-1 | Race 2 | c.2T>C | No changes | c.244T>C, c.248T>C | No changes

Table 1.3. The polymorphisms and predicted effects on proteins for four effector genes in reference to the race 0 isolate ‘ELH’ from Medina et al. 2015.
Gene | Length of reference sequence | Number of polymorphisms, coding region | Number of polymorphisms, non-coding region | Number of isolates with polymorphism | Polymorphism | Type of polymorphism | Predicted effect on protein
Avr2 | 444 | 6 | 1 | 2 | c.2T>C | SNP | M1T, nonsynonymous substitution
 | | | | 1 | c.46delG | Indel | V17fs, frameshift mutation
 | | | | 14 | c.64delA | Indel | K23fs, frameshift mutation
 | | | | 2 | c.121_122delTA | Indel | Y41fs, frameshift mutation
 | | | | 1 | c.209insA | Indel | H70fs, frameshift mutation
 | | | | 6 | g.347_348delAG | Indel | No effect
 | | | | 3 | Large deletion starting at c.37 | Indel | Truncated after T12
Avr4 | 823 | 0 | 1 | 18 | g.128G>A | SNP | No effect
Avr4e | 618 | 2 | 0 | 15 | c.244T>C | SNP | F82L, nonsynonymous substitution
 | | | | 15 | c.278T>C | SNP | M93T, nonsynonymous substitution
Avr9 | 552 | 1 | 0 | 16 | c.23T>C | SNP | V8A, nonsynonymous substitution

Mating Type Characterization—For all 50 isolates, either MAT1-1-1 or MAT1-2-1 was amplified using primers designed to amplify genes encoding protein products homologous to mating-type proteins from other members of the Mycosphaerellaceae family. For all the isolates, we could amplify one, but not both, MAT genes (Table 1.4). Overall, 15 of the 50 isolates (30% of the population) contained MAT1-1-1 and the other 35 isolates (70%) contained MAT1-2-1. A Chi-square goodness of fit test rejected the null hypothesis of an equal ratio of mating type alleles (χ² = 8; P = 0.0047). Sample sizes were insufficient to perform statistical analyses by state.

Table 1.4. Mating type characterization of isolates organized by state.

MAT-type | NY | MN | VT | NH | MA | Number of isolates
MAT1-1-1 | 10 (41.7%) | 2 (13.3%) | 0 (0%) | 2 (33.3%) | 1 (50%) | 15 (30%)
MAT1-2-1 | 14 (58.3%) | 13 (86.7%) | 3 (100%) | 4 (66.7%) | 1 (50%) | 35 (70%)
Sample size | 24 | 15 | 3 | 6 | 2 | 50

Genotyping-by-Sequencing—Of the 34,044 variants discovered in the raw dataset, 7,514 SNPs remained in the filtered dataset. Mean read depth per individual was 23.56 and the mean depth per site averaged across individuals was 23.13.
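The goodness-of-fit test on the mating-type counts in Table 1.4 can be checked with a few lines of Python. This is a minimal sketch using only the standard library, not the software used in the study; the function names are hypothetical, and the p-value formula shown applies only to one degree of freedom (two categories).

```python
import math

def chi_square_equal_ratio(counts):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((obs - expected) ** 2 / expected for obs in counts)

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For df = 1, P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2))

mat_counts = [15, 35]   # MAT1-1-1 and MAT1-2-1 isolates (Table 1.4)
stat = chi_square_equal_ratio(mat_counts)
p_value = p_value_df1(stat)
print(f"chi-square = {stat:.1f}, P = {p_value:.4f}")
```

With 15 and 35 isolates, the expected count under a 1:1 ratio is 25 per mating type, so the statistic is (15 − 25)²/25 + (35 − 25)²/25 = 8, in agreement with the reported χ² = 8 and P = 0.0047.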
To better understand the population structure of isolates in this study, a neighbor-joining tree was constructed with 1000 bootstrap replicates (Figure 1.4). In many instances, isolates clustered together by state, including into several monophyletic groups. There were examples where this was not the case, including the clade containing Pf84 and 18005 (from MN and MA, respectively), and the clade with isolates 19006 and 18013 (from MA and NY, respectively). The isolates from Vermont clustered together by state and year, although the sample size is small, with just three isolates that were all collected in 2019. Similarly, all but one isolate from New Hampshire clustered together (Figure 1.4). Some isolates from the same county, but collected in different years, also belonged to the same clade, which is consistent with the pathogen overwintering.

Figure 1.4. A) Neighbor-Joining tree showing the genetic distance between 50 P. fulva isolates. B) Principal components analysis plot of the clone-corrected data. The size of the points is proportional to the size of the clonal group. The color of the points indicates where the isolates from a particular clonal group are from, with the exception that for clonal group 2, one isolate was from New Hampshire.

An IBS similarity matrix revealed that the mean pairwise IBS between isolates was 0.715. The mean IBS between technical replicates was 0.99994, corresponding to an error rate of 0.006%, and the lowest IBS between a pair of technical replicates (Pf79 and Pf79-2) was 0.99968, corresponding to an error rate of 0.032%. A histogram of the pairwise IBS matrix showed that IBS between most isolates ranged from 0.6 to 0.8, with a separate, strong peak at around 0.99.
Given that this peak was distinct from the distribution of the majority of pairwise IBS values, and aligned with the IBS identified between technical replicates, we considered any group of isolates featuring pairwise IBS > 0.99 to represent the same unique genotype and comprise a clonal group. In total, 18 unique genotypes were identified (Table 1.2). Each of the 11 clonal groups with more than one isolate comprised isolates from either Minnesota or the Northeast, and no clonal group included isolates from both geographic regions. There were seven unique genotypes that were each represented by a single isolate: 17057, 18005, 18010, 18012, 18013, 19006, and the Minnesota isolate Pf84. The largest clonal group comprised eight isolates from Minnesota, collected in both 2016 and 2017. The second largest clonal group comprised six isolates from New York and one isolate from New Hampshire, collected between 2017 and 2019 (Table 1.2).

To further examine the population diversity, we conducted principal component analysis (PCA) using a clone-corrected dataset that included one isolate representing each unique genotype. Twenty-two percent of the variance in the SNP genotype data was explained by principal component 1, which largely differentiated Minnesota isolates from the isolates collected in New York and the other Northeastern states. Principal component 2 explained 15% of the variance and differentiated New Hampshire isolates from the rest of the isolates (Figure 1.4).

To quantify differentiation between isolates from different locations, Fst and Gst were also calculated for the clone-corrected data. Between New York and Minnesota, Fst was 0.14 and Gst was 0.132. The results suggest moderate differentiation between the Minnesota and New York isolates. Given the small sample size of the Vermont, New Hampshire, and Massachusetts isolates, only the comparison between the New York and Minnesota populations was examined.
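To make the Gst statistic reported above more concrete, the following Python sketch computes Nei's Gst as (Ht − Hs)/Ht from per-population allele frequencies at biallelic loci. It is a simplified illustration under stated assumptions: populations are weighted equally, no sample-size correction is applied, and diversity is summed across loci before taking the ratio. It is not the implementation used by the R packages cited in this chapter, and the allele frequencies shown are hypothetical.

```python
def gene_diversity(p):
    """Expected gene diversity at a biallelic locus with allele frequency p."""
    return 2 * p * (1 - p)

def nei_gst(pop_freqs):
    """Nei's Gst from per-population allele frequencies (uncorrected sketch).

    pop_freqs: one list of alternate-allele frequencies per population,
    with all lists covering the same loci in the same order.
    """
    n_pops = len(pop_freqs)
    n_loci = len(pop_freqs[0])
    hs_sum = ht_sum = 0.0
    for locus in range(n_loci):
        freqs = [pop[locus] for pop in pop_freqs]
        # Hs: mean gene diversity within populations.
        hs_sum += sum(gene_diversity(p) for p in freqs) / n_pops
        # Ht: gene diversity of the pooled (mean) allele frequency.
        ht_sum += gene_diversity(sum(freqs) / n_pops)
    return (ht_sum - hs_sum) / ht_sum

# Hypothetical allele frequencies at three SNPs in two populations.
pop_a = [0.2, 0.5, 0.9]
pop_b = [0.6, 0.5, 0.7]
gst = nei_gst([pop_a, pop_b])
print(f"Gst = {gst:.3f}")
```

Loci at which the populations share the same allele frequency contribute equally to Hs and Ht and therefore pull Gst toward zero, whereas loci with divergent frequencies increase it. The sketch assumes at least one polymorphic locus, since Ht would otherwise be zero.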
Linkage disequilibrium (LD), using the clone-corrected dataset, was examined in a plot showing pairwise r² between SNPs as a function of distance, averaged in 10 kb bins. Linkage disequilibrium was observed to decay with greater distance between sites (Figure 1.5). Whereas the mean r² between sites within 50 kb of each other was 0.286, sites 250-300 kb apart had a mean r² of 0.162.

Figure 1.5. Plot of LD decay: Pairwise r² (a measure of pairwise LD) between SNPs, averaged in bins of 10 kb, vs. physical distance (kb).

Discussion

Fungal isolation and identification—The tomato leaf mold pathogen P. fulva was isolated from symptomatic tomato plants, as confirmed by sequencing the ITS region and through completion of Koch’s postulates. Other Cladosporium spp. belonging to the C. cladosporioides complex (Bensch et al. 2010; Zalar et al. 2007; Schubert et al. 2007; Crous et al. 2007; Crous 2009; Crous et al. 2009) were also isolated from symptomatic tomato plants. These fungi appeared to be secondary colonizers, as they did not cause symptoms on tomato (cultivar ‘Moneymaker’). When we reisolated P. fulva from symptomatic leaves during completion of Koch’s postulates, the isolation process was more rapid. For samples collected from growers’ high tunnels, extensive surface sterilization was the only way to successfully isolate P. fulva. In Argentina, two Cladosporium species, C. sphaerospermum and C. cladosporioides, were also isolated from symptomatic tomatoes (Medina et al. 2015). As with our isolation experience, the isolates from Argentina were found to be morphologically different from P. fulva, and faster growing (Medina et al. 2015). The Cladosporium genus comprises a diverse set of fungi, yet few reports describe pathogenicity on tomatoes. Some Cladosporium species such as C. oxysporum and C. cladosporioides have been reported to cause a tomato leaf spot (Lamboy and Dillard 1997; Huang et al. 2013; Robles-Yerena et al. 2019).
Other research has suggested that C. cladosporioides can cause black mold on post-harvest tomato fruit (Ma et al. 2020). The results from our evaluation of Koch’s postulates suggest that P. fulva is the causative agent of tomato leaf mold, while the other Cladosporium species we found are secondary colonizers.

Race determination experiments—The race determination assays demonstrated that most of the 50 isolates tested belong to race 2, while a small subset of six isolates belong to race 0. Our results are similar to the findings from the race typing done in Argentina (Medina et al. 2015). This contrasts with the race structure in other locations, including Japan, China, and Europe, where isolates that overcome Cf-2, Cf-4, Cf-5, and Cf-9 have been identified. Isolates have also overcome Cf-11 in Japan and Europe (Iida et al. 2015; Yoshida et al. 2021; Li et al. 2015; Lindhout et al. 1989). These defeated R genes were gradually introduced into tomato lines in Japan beginning in the 1960s (Iida et al. 2015; Yoshida et al. 2021). As was concluded from a study of Argentinian isolates, where tomato leaf mold was a relatively new disease in the mid-2010s, it is possible that, because tomato leaf mold is a re-emerging disease in the US, the tomato cultivars commonly grown in high tunnel and greenhouse environments contain fewer Cf resistance genes than the cultivars grown in Europe, Japan, and China, leading to a less variable race structure (Lindhout et al. 1989; Iida et al. 2015; Li et al. 2015). A large body of literature from North America from the 1930s through the 1970s, describing research performed in locations such as Ontario, Canada, details tomato leaf mold resistance breeding efforts and corresponding race determination studies (Langford 1937; Bailey 1950).
While the race nomenclature gradually shifted and the understanding of gene-for-gene interactions between the pathogen effectors and tomato Cf genes became more complete, the studies demonstrated that several predominant races emerged, directly corresponding to the introgressed resistance genes (Kerr and Bailey 1966). Those early studies from locations in Canada identified more races than the two, races 0 and 2, that we identified in this study (Langford 1937; Bailey 1950; Kerr and Bailey 1966). This loss of pathogen diversity over the past 90 years could be due to changes in host genotypes grown for tomato production in the USA and Canada.

Despite the small number of races identified, most of the isolates from Minnesota and the Northeast overcame Cf-2 mediated resistance. Cf-2, on chromosome 6, came from the wild species Solanum pimpinellifolium (syn. Lycopersicon pimpinellifolium) (Rivas and Thomas 2005). In Japan, Cf-2 was incorporated into commercial lines in the 1960s, and within a decade, race 2 isolates were identified (Iida et al. 2015). Given that 44 out of 50 isolates in our study overcame resistance to Cf-2, growers are advised to avoid relying on tomatoes with this resistance gene.

Analysis of allelic variation of four effector genes—There was little allelic variation within the four effector genes Avr2, Avr4, Avr4e, and Avr9, compared to previous studies that looked at allelic variation of effector genes within P. fulva (Stergiopoulos et al. 2007b; Iida et al. 2015). Each of the 18 unique genotypes had at least one polymorphism compared to the reference race 0 sequence within one or more of the effector genes (Table 1.2). If a clonal group did have a polymorphism, most often isolates within the group only had one polymorphism for a given gene, relative to the race 0 reference isolate. Clonal groups shared polymorphisms and patterns of allelic variation, and polymorphisms were more often shared among isolates from the same geographic location than among isolates from the same year of collection.
This suggests that location plays an important role in the variation of effector genes within the pathogen, potentially due to on-farm overwintering of the pathogen.

Avr2 had the largest number of polymorphisms in the coding region, and most involved frameshift mutations. As we would expect, only the race 2 isolates contained any polymorphisms within the coding region, suggesting that these polymorphisms result in protein modifications or deletions, ultimately resulting in a compatible interaction with the corresponding resistance protein Cf-2, leading to disease. For example, the four indels are predicted to result in frameshift mutations that prevent the full translation of the Avr2 protein (Table 1.3). In addition, a substantial deletion was found in Vermont isolates 19002, 19003, and 19004. This deletion spanned both the coding and non-coding regions and was predicted to result in a truncated Avr2 protein, with only the first 12 of the 78 amino acids present. These three isolates were collected from cultivar ‘Berkeley tie dye’ on the same farm in Addison County, Vermont in July of 2019 and belonged to the same clonal group (Table 1.2). Most of the polymorphisms within Avr2 were indels, while SNPs were observed more frequently within the other effector genes. A similar pattern was also described previously, though the indels themselves were often different (Stergiopoulos et al. 2007a). There were seven race 2 isolates, belonging to clonal group 2, from New York and New Hampshire, for which an amplicon was not produced following PCR amplification using Avr2 primers. It is possible that the Avr2 gene, or a portion of the gene, was deleted in these isolates, preventing PCR amplification. Further investigation is needed with regard to the 12 isolates from Minnesota and two isolates from Ontario County, New York that belonged to race 2 yet did not show any polymorphisms compared to the race 0 reference sequence (Table 1.2).
Additional analysis of these isolates, such as examination of gene expression, is warranted beyond simply comparing gene and predicted protein sequences.

For Avr4e and Avr9, SNPs resulting in nonsynonymous substitutions, rather than deletions causing frameshifts, were observed. In Avr4e, the predicted substitutions were p.F82L and p.M93T relative to the race 0 reference sequence. Based on previous research, these two nonsynonymous substitutions allow P. fulva to overcome Hcr9-4e (also known as Cf-4e) mediated resistance (Westerink et al. 2004). We did not include Hcr9-4e in our differential set, so this result was not experimentally confirmed. The same nonsynonymous substitutions were also noted in other studies (Stergiopoulos et al. 2007b; Iida et al. 2015). With regard to Avr9, the only SNP we identified, resulting in the nonsynonymous substitution p.V8A, was also seen in previous studies of allelic variation in a world-wide collection of isolates and in Japan (Stergiopoulos et al. 2007b; Iida et al. 2015). The effect of this nonsynonymous mutation is unknown. Functional experiments are needed to understand the effect that this polymorphism might have on protein structure and function. Finally, there were no polymorphisms in the coding region of Avr4, which might suggest that there is less selective pressure on the pathogen to overcome Cf-4 mediated resistance.

Mating type determination—Passalora fulva is thought to be an asexual fungus, as only the anamorph has been observed. Therefore, it is assumed that the pathogen population is clonal, and any genetic variation is due to mutation, in the absence of recombination, as demonstrated by the change in race structure in response to the deployment of tomato resistance genes (Joosten and de Wit 1999; Westerink et al. 2004; Stergiopoulos et al. 2007b). The amplification of mating type idiomorphs showed that 30% of our isolates had the MAT1-1-1 gene, while 70% had MAT1-2-1.
As expected, isolates that belonged to the same clonal group had the same mating type (Table 1.2). By clonal grouping (unique genotypes), 7 of 18 (39%) groups had the MAT1-1-1 gene and 11 of 18 (61%) groups had the MAT1-2-1 gene. Among the six race 0 isolates, two (33.3%) had MAT1-1-1 and four (66.7%) had MAT1-2-1. Among the 44 race 2 isolates, 13 (30%) had MAT1-1-1 and 31 (70%) had MAT1-2-1. In both instances, the mating type distribution was similar to the distribution of the full set of 50 isolates. Previously, MAT1-1-1 was observed at a higher frequency in isolate collections from Europe (64%) (Stergiopoulos et al. 2007b), whereas in Japan, both mating types were present, with a bias toward MAT1-2-1 (73%) (Iida et al. 2015). Given that our collection comprised only 50 isolates, a larger collection, with many individuals collected per field per year, would be helpful to better understand the ratio of mating types and the potential for sexual reproduction (Table 1.4).

Genotypic Diversity and Population Structure—Our clone-corrected dataset showed greater genotypic diversity than we hypothesized. In assessing clonality, 18 unique genotypes were identified, 11 of which were represented by more than one isolate. With one exception, isolates within the same clonal group were also from the same geographic location (Table 1.2). The exception was New Hampshire isolate 19013, which had the same unique genotype as several New York isolates in clonal group 2 (Table 1.2). Previous AFLP analysis also pointed to higher-than-expected genotypic diversity (Stergiopoulos et al. 2007b). Isolates belonging to the same regions, regardless of year of isolation, often grouped together in monophyletic groups (Figure 1.4). Interestingly, all the Vermont isolates grouped together into a single monophyletic group, while the isolates from all the other states belonged to multiple groups (Figure 1.4).
Even though all the isolates belonged to just two races, 0 or 2, isolates from both races could have either mating type idiomorph. Other studies also describe isolates from the same race having different mating types (Stergiopoulos et al. 2007b; Iida et al. 2015).

As observed in the neighbor-joining tree, PCA plot, and Fst and Gst analyses, isolates collected in Minnesota were genetically differentiated from isolates collected in the Northeast. Additional sampling and analysis could offer confirmation of the regional differentiation. The analysis of linkage disequilibrium showed a gradual decay as sites became farther apart, which suggests that genetic recombination takes place in P. fulva populations. The existence of genetic recombination is also consistent with the large number of clonal groups. A larger sample size is necessary to better understand the extent to which genetic recombination occurs in P. fulva. Evidence of genetic recombination does not necessarily indicate the presence of a sexual cycle in P. fulva but could result instead from parasexuality leading to mitotic crossing over. Nevertheless, the identification of both mating type alleles, along with evidence of genetic recombination, raises the possibility of sexual reproduction occurring in P. fulva.

Conclusion

The objectives of this study were to determine the race structure and genetic diversity of isolates collected in the Northeastern United States and Minnesota. We observed that isolates from these regions belong primarily to race 2, with a small number belonging to race 0. Moderate regional differentiation exists between isolates collected in Minnesota and the isolates collected in New York and neighboring states. Both mating type genes MAT1-1-1 and MAT1-2-1 were present in the isolate collection.
While only two races were present, 18 different clonal groups were identified, suggesting greater genetic diversity between isolates than we might expect from populations that exclusively reproduce asexually.

Acknowledgements

We thank Holly Lange, Rachel Kreis, and Andrew Aldcroft for assistance isolating fungi from leaf samples, and for assistance fulfilling Koch’s postulates. We thank Cornell Cooperative Extension educators including Amy Ivy and Judson Reid for assistance in collecting tomato leaf samples. We also thank extension specialists in New Hampshire, Vermont, and Massachusetts for mailing us leaf mold samples. We thank Colin Day and Garrett Giles for their assistance with the preparation, planting, and clean-up of the high tunnel race typing experiments.

REFERENCES

2007 Census of Agriculture. 2009. USDA National Agricultural Statistics Service. Available at: www.nass.usda.gov/AgCensus
2017 Census of Agriculture. 2019. USDA National Agricultural Statistics Service. Available at: www.nass.usda.gov/AgCensus
Alexander, L. J. 1934. Leaf mold resistance in the tomato. Ohio Agricultural Experiment Station. Bulletin 539. Available at: https://kb.osu.edu/handle/1811/60988
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410. https://doi.org/10.1016/S0022-2836(05)80360-2.
Bailey, D. L. 1950. Studies in racial trends and constancy in Cladosporium fulvum Cooke. Can. J. Res. 28c:535–565.
Bensch, K., Groenewald, J.Z., Dijksterhuis, J., Starink-Willemse, M., Andersen, B., Summerell, B.A., Shin, H.-D., Dugan, F.M., Schroers, H.-J., Braun, U., Crous, P.W. 2010. Species and ecological diversity within the Cladosporium cladosporioides complex (Davidiellaceae, Capnodiales). Stud. Mycol. 67:1–94.
Bond, T. E. T. 1938. Infection experiments with Cladosporium fulvum Cooke and related species. Ann. Appl. Biol. 25:277–307.
Boukema, I. W., and Garretsen, F. 1975.
Uniform resistance to Cladosporium fulvum Cooke in tomato (Lycopersicon esculentum Mill.). 2. Investigations on F2’s and F3’s from diallel crosses. Euphytica. 24:105–116.
Braun, U., Crous, P. W., Dugan, F., Groenewald, J. Z., and Sybren De Hoog, G. 2003. Phylogeny and taxonomy of Cladosporium-like hyphomycetes, including Davidiella gen. nov., the teleomorph of Cladosporium s. str. Mycol. Prog. 2:3–18.
Carlson, M. O., Gazave, E., Gore, M. A., and Smart, C. D. 2017. Temporal genetic dynamics of an experimental, biparental field population of Phytophthora capsici. Front. Genet. 8:26. https://doi.org/10.3389/fgene.2017.00026
Cooke, M. C. 1883. Grevillea. New American Fungi. 12:22–33.
Crous, P. 2009. Taxonomy and phylogeny of the genus Mycosphaerella and its anamorphs. Fungal Divers. 38:1–24.
Crous, P. W., Braun, U., Schubert, K., and Groenewald, J. Z. 2007. Delimiting Cladosporium from morphologically similar genera. Stud. Mycol. 58:33–56.
Crous, P.W., Schoch, C.L., Hyde, K.D., Wood, A.R., Gueidan, C., de Hoog, G.S., Groenewald, J.Z. 2009. Phylogenetic lineages in the Capnodiales. Stud. Mycol. 64:17–47.
Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., McVean, G., Durbin, R. 2011. The variant call format and VCFtools. Bioinformatics. 27:2156–2158.
de Wit, P.J.G.M., van der Burgt, A., Ökmen, B., Stergiopoulos, I., Abd-Elsalam, K.A., Aerts, A.L., Bahkali, A.H., Beenen, H.G., Chettri, P., Cox, M.P., Datema, E., de Vries, R.P., Dhillon, B., Ganley, A.R., Griffiths, S.A., Guo, Y., Hamelin, R.C., Henrissat, B., Kabir, M.S., Jashni, M.K., Kema, G., Klaubauf, S., Lapidus, A., Levasseur, A., Lindquist, E., Mehrabi, R., Ohm, R.A., Owen, T.J., Salamov, A., Schwelm, A., Schijlen, E., Sun, H., van den Burg, H.A., van Ham, R.C.H.J., Zhang, S., Goodwin, S.B., Grigoriev, I.V., Collemare, J., Bradshaw, R.E. 2012.
The genomes of the fungal plant pathogens Cladosporium fulvum and Dothistroma septosporum reveal adaptation to different hosts and lifestyles but also signatures of common ancestry. PLoS Genet.
Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
Glaubitz, J.C., Casstevens, T.M., Lu, F., Harriman, J., Elshire, R.J., Sun, Q., Buckler, E.S. 2014. TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline. PLoS ONE. 9:e90346.
Goudet, J. 2005. Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes. 5:184–186.
Hedrick, P. W. 2005. A standardized genetic differentiation measure. Evolution. 59:1633–1638.
Huang, X.-Y., Liu, Z.-H., Li, J., and Ji, P. 2013. First report of a leaf spot on greenhouse tomato caused by Cladosporium oxysporum in China. Plant Dis. 97:845–845.
Iida, Y., Hof, P. van ‘t, Beenen, H., Mesarich, C., Kubota, M., Stergiopoulos, I., Mehrabi, R., Notsu, A., Fujiwara, K., Bahkali, A., Abd-Elsalam, K., Collemare, J., de Wit, P.J.G.M. 2015. Novel mutations detected in avirulence genes overcoming tomato Cf resistance genes in isolates of a Japanese population of Cladosporium fulvum. PLoS ONE. 10:e0123271.
Iida, Y., Iwadate, Y., Kubota, M., and Terami, F. 2010. Occurrence of a new race 2.9 of leaf mold of tomato in Japan. J. Gen. Plant Pathol. 76:84–86.
Jombart, T. 2008. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics. 24:1403–1405.
Jombart, T., and Ahmed, I. 2011. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. 27:3070–3071.
Joosten, M. H., Vogelsang, R., Cozijnsen, T. J., Verberne, M. C., and de Wit, P. J. 1997. The biotrophic fungus Cladosporium fulvum circumvents Cf-4-mediated resistance by producing unstable AVR4 elicitors. Plant Cell. 9:367–379.
Joosten, M., and De Wit, P. 1999.
The tomato-Cladosporium fulvum interaction: a versatile experimental system to study plant-pathogen interactions. Annu. Rev. Phytopathol. 37:335–367.
Kerr, E. A., and Bailey, D. L. 1966. Breeding for resistance to Cladosporium fulvum Cke. in tomato. Acta Hortic. 4:145–148.
Knaus, B. J., and Grünwald, N. J. 2017. VCFR: a package to manipulate and visualize variant call format data in R. Mol. Ecol. Resour. 17:44–53.
Lamboy, J. S., and Dillard, H. R. 1997. First report of a leaf spot caused by Cladosporium oxysporum on greenhouse tomato. Plant Dis. 81:228–228.
Langford, A. N. 1937. The parasitism of Cladosporium fulvum Cooke and the genetics of resistance to it. Can. J. Res. 15c:108–128.
Leski, B. 1977. Identification of Cladosporium fulvum Cooke races from group B and C on tomatoes in Poland. Acta Agrobot. 30:181–194.
Li, S., Zhao, T., Li, H., Xu, X., and Li, J. 2015. First report of races 2.5 and 2.4.5 of Cladosporium fulvum (syn. Passalora fulva), causal fungus of tomato leaf mold disease in China. J. Gen. Plant. Pathol. 81:162–165.
Lindhout, P., Korta, W., Cislik, M., Vos, I., and Gerlagh, T. 1989. Further identification of races of Cladosporium fulvum (Fulvia fulva) on tomato originating from the Netherlands France and Poland. Netherlands J. Plant Pathol. 95:143–148.
Lucentini, C. G., Medina, R., Franco, M. E. E., Saparrat, M. C. N., and Balatti, P. A. 2021. Fulvia fulva [syn. Cladosporium fulvum, Passalora fulva] races in Argentina are evolving through genetic changes and carry polymorphic avr and ecp gene sequences. Eur J Plant Pathol. 159:525–542.
Luderer, R., Takken, F. L. W., De Wit, P. J. G. M., and Joosten, M. H. A. J. 2002. Cladosporium fulvum overcomes Cf-2-mediated resistance by producing truncated AVR2 elicitor proteins. Mol. Microbiol. 45:875–884.
Ma, M., de Silva, D. D., and Taylor, P. W. J. 2020. Black mould of post-harvest tomato (Solanum lycopersicum) caused by Cladosporium cladosporioides in Australia. Australas. Plant Dis. Notes. 15:25.
Medina, R., López, S.M.Y., Franco, M.E.E., Rollan, C., Ronco, B.L., Saparrat, M.C.N., De Wit, P.J.G.M., Balatti, P.A. 2015. A survey on occurrence of Cladosporium fulvum identifies race 0 and race 2 in tomato-growing areas of Argentina. Plant Dis. 99:1732–1737.
Mesarich, C.H., Griffiths, S.A., van der Burgt, A., Okmen, B., Beenen, H.G., Etalo, D.W., Joosten, M.H.A.J., de Wit, P.J.G.M. 2014. Transcriptome sequencing uncovers the Avr5 avirulence gene of the tomato leaf mold pathogen Cladosporium fulvum. Mol. Plant Microbe Interact. 27:846–857.
Milgroom, M. G. 2017. Population biology of plant pathogens: genetics, ecology, and evolution. St. Paul, MN: The American Phytopathological Society.
Nei, M. 1972. Genetic distance between populations. Am. Nat. 106:283–292.
Nei, M. 1973. Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. U.S.A. 70:3321–3323.
Neuwirth, E. 2014. RColorBrewer: ColorBrewer Palettes. Available at: https://CRAN.R-project.org/package=RColorBrewer
Ohm, R.A., Feau, N., Henrissat, B., Schoch, C.L., Horwitz, B.A., Barry, K.W., Condon, B.J., Copeland, A.C., Dhillon, B., Glaser, F., Hesse, C.N., Kosti, I., LaButti, K., Lindquist, E.A., Lucas, S., Salamov, A.A., Bradshaw, R.E., Ciuffetti, L., Hamelin, R.C., Kema, G.H.J., Lawrence, C., Scott, J.A., Spatafora, J.W., Turgeon, B.G., de Wit, P.J.G.M., Zhong, S., Goodwin, S.B., Grigoriev, I.V. 2012. Diverse lifestyles and strategies of plant pathogenesis encoded in the genomes of eighteen Dothideomycetes fungi. PLOS Pathog. 8:e1003037.
Paradis, E., Claude, J., and Strimmer, K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics. 20:289–290.
R Core Team. 2020. A language and environment for statistical computing. R foundation for statistical computing, Vienna, Austria. Available at: https://www.R-project.org
Rivas, S., and Thomas, C. M. 2005. Molecular interactions between tomato and the leaf mold pathogen Cladosporium fulvum. Annu. Rev. Phytopathol. 43:395–436.
Robles-Yerena, L., Ayala-Escobar, V., Leyva-Mir, S. G., Lima, N. B., Camacho-Tapia, M., and Tovar-Pedraza, J. M. 2019. First report of Cladosporium cladosporioides causing leaf spot on tomato in Mexico. J. Plant Pathol. 101:759–759.
Rollan, C., Protto, V., Medina, R., Lopez, S., Vera Bahima, J., Ronco, L., Saparrat, M., Balatti, P. 2013. Identification of races 0 and 2 of Cladosporium fulvum (syn Passalora fulva) on Tomato in the Cinturón Hortícola de La Plata, Argentina. Plant Dis. 97:992–992.
Schubert, K., Groenewald, J.Z., Braun, U., Dijksterhuis, J., Starink, M., Hill, C.F., Zalar, P., de Hoog, G.S., Crous, P.W. 2007. Biodiversity in the Cladosporium herbarum complex (Davidiellaceae, Capnodiales), with standardisation of methods for Cladosporium taxonomy and diagnostics. Stud. Mycol. 58:105–156.
Stergiopoulos, I., De Kock, M. J. D., Lindhout, P., and De Wit, P. J. G. M. 2007a. Allelic variation in the effector genes of the tomato pathogen Cladosporium fulvum reveals different modes of adaptive evolution. Mol. Plant Microbe Interact. 20:1271–1283.
Stergiopoulos, I., Groenewald, M., Staats, M., Lindhout, P., Crous, P. W., and De Wit, P. J. G. M. 2007b. Mating-type genes and the genetic structure of a world-wide collection of the tomato pathogen Cladosporium fulvum. Fungal Genet. Biol. 44:415–429.
Stergiopoulos, I., and De Wit, P. J. G. M. 2009. Fungal effector proteins. Annu. Rev. Phytopathol. 47:233–263.
Thomma, B. P. H. J., Van Esse, H. P., Crous, P. W., and De Wit, P. J. G. M. 2005. Cladosporium fulvum (syn. Passalora fulva), a highly specialized plant pathogen as a model for functional studies on plant pathogenic Mycosphaerellaceae. Mol. Plant Pathol. 6:379–393.
Vogel, G., Gore, M. A., and Smart, C. D. 2020. Genome-wide association study in New York Phytophthora capsici isolates reveals loci involved in mating type and mefenoxam sensitivity. Phytopathology. 111:204–216.
Weir, B. S., and Cockerham, C. C. 1984.
Estimating F-statistics for the analysis of population structure. Evolution. 38:1358–1370.
Westerink, N., Brandwagt, B. F., De Wit, P. J. G. M., and Joosten, M. H. A. J. 2004. Cladosporium fulvum circumvents the second functional resistance gene homologue at the Cf-4 locus (Hcr9-4E) by secretion of a stable avr4E isoform. Mol. Microbiol. 54:533–545.
White, T., Bruns, T., Lee, S., Taylor, J., Innis, M., Gelfand, D., Sninsky, J. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications-A Laboratory Manual, p. 315–322.
Wickham, H. 2016. ggplot2: Elegant Graphics for Data Analysis. 2nd ed. Springer International Publishing. Available at: https://www.springer.com/gp/book/9783319242750
Van den Ackerveken, G. F. J. M., Kan, J. A. L. V., and Wit, P. J. G. M. D. 1992. Molecular analysis of the avirulence gene avr9 of the fungal tomato pathogen Cladosporium fulvum fully supports the gene-for-gene hypothesis. Plant J. 2:359–366.
Yoshida, K., Asano, S., Sushida, H., and Iida, Y. 2021. Occurrence of tomato leaf mold caused by novel race 2.4.9 of Cladosporium fulvum in Japan. J. Gen. Plant Pathol. 87:35–38.
Zalar, P., de Hoog, G. S., Schroers, H.-J., Crous, P. W., Groenewald, J. Z., and Gunde-Cimerman, N. 2007. Phylogeny and ecology of the ubiquitous saprobe Cladosporium sphaerospermum, with descriptions of seven new species from hypersaline environments. Stud. Mycol. 58:157–183.
Zhang, C., Dong, S.-S., Xu, J.-Y., He, W.-M., and Yang, T.-L. 2019. PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics. 35:1786–1788.

SUPPLEMENTAL MATERIALS

Table 1.S1. Metadata on isolates submitted for genotyping-by-sequencing. Technical replicates are designated ‘-2’ at the end of the isolate name.

Sample name  Date collected  County        State          Cultivar             Species
15002        7/1/15          Saratoga      New York       Pony Express         Cladosporium sp.
15005        8/20/15         Essex         New York       Sun Sugar            Cladosporium sp.
15006        8/20/15         Essex         New York       Sun Gold             Cladosporium sp.
15010        8/20/15         Essex         New York       Sun Gold             Cladosporium sp.
15011        8/20/15         Essex         New York       Sun Gold             Cladosporium sp.
15011-2      8/20/15         Essex         New York       Sun Gold             Cladosporium sp.
15013        8/20/15         Clinton       New York       Marbonne             Cladosporium sp.
15016        9/24/15         Orleans       New York       Unknown              Cladosporium sp.
15017        9/24/15         Columbia      New York       Unknown              Cladosporium sp.
15017-2      9/24/15         Columbia      New York       Unknown              Cladosporium sp.
15022        8/15/15         Columbia      New York       Unknown              Cladosporium sp.
15022-2      8/15/15         Columbia      New York       Unknown              Cladosporium sp.
15023        8/8/15          Onondaga      New York       Unknown              Cladosporium sp.
15025        10/30/15        Onondaga      New York       Cherokee purple      Cladosporium sp.
15026        10/30/15        Onondaga      New York       Striped German       Cladosporium sp.
15028        11/17/15        Onondaga      New York       Mt Fresh +           Cladosporium sp.
16077        6/16/16         Ulster        New York       Unknown              Cladosporium sp.
16097        6/21/16         Essex         New York       Unknown              Cladosporium sp.
16139        7/19/16         Ontario       New York       Unknown              Passalora fulva
16141        7/21/16         Orleans       New York       Unknown              Cladosporium sp.
16163        7/25/16         Clinton       New York       Unknown              Cladosporium sp.
16184        8/2/16          Ontario       New York       Unknown              Cladosporium sp.
16185        8/4/16          Essex         New York       Unknown              Cladosporium sp.
16186        8/4/16          St. Lawrence  New York       Unknown              Cladosporium sp.
16187        8/2/16          Essex         New York       Unknown              Cladosporium sp.
16195        8/13/16         Clinton       New York       Unknown              Cladosporium sp.
16206        8/18/16         Allegany      New York       Unknown              Cladosporium sp.
16219        9/1/16          Ulster        New York       Unknown              Cladosporium sp.
16224        9/7/16          Sullivan      New York       Unknown              Cladosporium sp.
16225        9/7/16          Sullivan      New York       Unknown              Cladosporium sp.
16226        9/7/16          Sullivan      New York       Unknown              Cladosporium sp.
16227        9/7/16          Sullivan      New York       Unknown              Cladosporium sp.
16239        10/3/16         Oneida        New York       Unknown              Cladosporium sp.
16243        10/18/16        Schuyler      New York       Unknown              Cladosporium sp.
16271        10/4/16         Ontario       New York       Unknown              Cladosporium sp.
17035        7/10/17         Monroe        New York       Unknown              Passalora fulva
17035-2      7/10/17         Monroe        New York       Unknown              Passalora fulva
17036        7/12/17         Orange        New York       Unknown              Passalora fulva
17037        7/12/17         Orange        New York       Unknown              Passalora fulva
17038        7/12/17         Orange        New York       Unknown              Passalora fulva
17040        7/14/17         Essex         New York       Unknown              Passalora fulva
17043        7/20/17         Ontario       New York       BHN 589              Passalora fulva
17046        7/21/17         Genesee       New York       Unknown              Passalora fulva
17049        8/15/17         Clinton       New York       Unknown              Passalora fulva
17052        8/21/17         Suffolk       New York       Jasper               Passalora fulva
17052-2      8/21/17         Suffolk       New York       Jasper               Passalora fulva
17053        8/21/17         Suffolk       New York       Red cherry/grape     Passalora fulva
17057        8/21/17         Dutchess      New York       Unknown              Passalora fulva
18005        8/5/18          Worcester     Massachusetts  Jet star ultrasonic  Passalora fulva
18009        8/20/18         Clinton       New York       Unknown              Passalora fulva
18010        8/28/18         Allegany      New York       Unknown              Passalora fulva
18012        8/29/18         Merrimack     New Hampshire  Unknown              Passalora fulva
18013        8/30/18         Montgomery    New York       Unknown              Passalora fulva
18019        9/17/18         Essex         New York       Unknown              Passalora fulva
18024        9/22/18         Saratoga      New York       Unknown              Passalora fulva
18025        9/28/18         Ontario       New York       Unknown              Passalora fulva
19001        6/30/19         Allegany      New York       Not known            Passalora fulva
19001-2      6/30/19         Allegany      New York       Not known            Passalora fulva
19002        7/12/19         Addison       Vermont        Berkeley tie dye     Passalora fulva
19003        7/12/19         Addison       Vermont        Berkeley tie dye     Passalora fulva
19004        7/12/19         Addison       Vermont        Berkeley tie dye     Passalora fulva
19006        7/25/19         Hampshire     Massachusetts  Unknown              Passalora fulva
19006-2      7/25/19         Hampshire     Massachusetts  Unknown              Passalora fulva
19008        8/2/19          Fulton        New York       Mountain Fresh       Passalora fulva
19009        8/8/19          Essex         New York       Heirloom mix         Passalora fulva
19010        8/9/19          Strafford     New Hampshire  Rose                 Passalora fulva
19011        8/9/19          Strafford     New Hampshire  Big beef             Passalora fulva
19012        8/9/19          Strafford     New Hampshire  Valencia             Passalora fulva
19013        8/9/19          Strafford     New Hampshire  Estiva               Passalora fulva
19014        8/9/19          Strafford     New Hampshire  Sungold              Passalora fulva
19016        8/15/19         Orange        New York       Unknown              Passalora fulva
19016-2      8/15/19         Orange        New York       Unknown              Passalora fulva
19017        8/16/19         Chautauqua    New York       Unknown              Passalora fulva
19018        8/25/19         Tompkins      New York       Unknown              Passalora fulva
19018-2      8/25/19         Tompkins      New York       Unknown              Passalora fulva
Pf76         7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf77         7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf77-2       7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf78         7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf78-2       7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf79         7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf79-2       7/2016          Anoka         Minnesota      BHN 589              Passalora fulva
Pf80         7/2016          St. Louis     Minnesota      Estonia              Passalora fulva
Pf81         7/2016          Mower         Minnesota      Unknown              Passalora fulva
Pf81-2       7/2016          Mower         Minnesota      Unknown              Passalora fulva
Pf82         7/2016          St. Louis     Minnesota      Sweet Cherry         Passalora fulva
Pf84         7/2016          Benton        Minnesota      BHN 589              Passalora fulva
Pf85         7/2016          Benton        Minnesota      BHN 589              Passalora fulva
Pf88         10/2016         McLeod        Minnesota      Unknown              Passalora fulva
Pf89         10/2016         McLeod        Minnesota      Unknown              Passalora fulva
Pf90         7/2017          Douglas       Minnesota      Big Beef             Passalora fulva
Pf91         8/2017          Anoka         Minnesota      Unknown              Passalora fulva
Pf92         8/2017          McLeod        Minnesota      Unknown              Passalora fulva
Pf92-2       8/2017          McLeod        Minnesota      Unknown              Passalora fulva
Pf93         8/2017          Douglas       Minnesota      Big Beef             Passalora fulva

Table 1.S2. Primers used in the study.

Primer       Source of primers          Primer sequence 5'-3'
Avr2f        Medina et al. 2015         CATCAGCATATCCTCTTCCATCC
Avr2r        Medina et al. 2015         CAGTACGTTCAAAAGCAGATAAGG
Avr4f        Iida et al. 2015           ACGCAGGTCCAAAATAGCTC
Avr4r        Iida et al. 2015           TCGCAGTTATTTCACCTTGCT
Avr4ef       Iida et al. 2015           CCGCAGCGAAGTAAATTTTG
Avr4er       Iida et al. 2015           GTCAGTCCAGTCCGGAACC
Avr9f        Iida et al. 2015           AGTAGATCCGGCCGAGAGAG
Avr9r        Iida et al. 2015           AAAGCCTTCAATATGAACGAAT
ITS4         White et al. 1990          TCCTCCGCTTATTGATATGC
ITS5         White et al. 1990          GGAAGTAAAAGTCGTAACAAGG
MAT1_1_P1F   Stergiopoulos et al. 2007  CTTCACCACACCCAAAC
MAT1_1_P4R   Stergiopoulos et al. 2007  TGTTCGGTGTCGTGATG
MAT1_2_P1F   Stergiopoulos et al. 2007  CTGCCAGTTCTGCTTTG
MAT1_2_P4R   Stergiopoulos et al. 2007  TCCACGTCGAAGTAGAG

Table 1.S3. Effector gene reference sequence information.

Effector gene  Race  Reference publication           GenBank accession
Avr2           0     Medina et al. 2015              KC132845
Avr2           4     Luderer et al. 2002             AJ421628
Avr4           0     Medina et al. 2015              KC132831
Avr4           5     Joosten et al. 1997             Y08356
Avr4e          0     Medina et al. 2015              KC132834
Avr4e          4     Westerink et al. 2004           AY546101
Avr9           0     Medina et al. 2015              KC132839
Avr9           5     Van den Ackerveken et al. 1992  X60284

CHAPTER 2
IS THERE AN ASSOCIATION BETWEEN THE CLADOSPORIUM SPECIES COMPLEX AND THE TOMATO LEAF MOLD PATHOGEN PASSALORA FULVA?*

Abstract
The genus Cladosporium consists of many different species. Many are ubiquitous in the air, in soil, and on plant surfaces; some are pathogens, and many are also present indoors. Between 2015 and 2016, tomato leaf mold samples were gathered from New York high tunnels, and isolation of the causal fungus Passalora fulva (syn. Cladosporium fulvum) was attempted by surface sterilization of leaf pieces in 10% bleach. A total of 48 Cladosporium spp. isolates were collected. In this study, the isolated Cladosporium spp. were characterized using morphological, phylogenetic, and genotyping-by-sequencing techniques. High tunnel experiments were conducted to determine whether Cladosporium spp. caused disease symptoms on tomato leaves. Over two summers, Koch’s postulates could not be completed in mist chamber experiments, as no disease symptoms were seen following inoculations. We could only confirm Koch’s postulates when we inoculated tomatoes with P. fulva. Questions remain about potential associations between the Cladosporium spp. and P. fulva.

* In preparation for submission to Plant Disease. Other authors include Lillian McGilp, Melissa Regnier, and Christine D. Smart.

Introduction
The genus Cladosporium comprises more than 218 currently identified species.
Three major species complexes exist: Cladosporium herbarum, Cladosporium sphaerospermum, and Cladosporium cladosporioides (Bensch et al. 2015, 2018). Fungi from the genus are prolific: they are found both outdoors and indoors; in clinical samples, air, and soil; as saprophytes or endophytes on plants; and as biocontrol agents. Some species are plant pathogens (Bensch et al. 2015). Species from the genus Cladosporium are morphologically distinct, with conidia that contain convex domes with periclinal rims (Bensch et al. 2012).
Passalora fulva (syn. Cladosporium fulvum) was previously placed in the genus Cladosporium, but morphological distinctions such as different conidiogenous scars, together with revised phylogenetic analyses, placed the fungus in the genus Passalora (Crous and Braun 2003; Braun et al. 2003). Research on P. fulva has increased in the past decade in many locations, including Minnesota and New York, because it causes tomato leaf mold, a disease that is common in high tunnel tomato production (Sudermann et al. 2021). Passalora fulva is an ascomycete that causes distinctive light green spots on the foliage. Over time, the fungus sporulates on the abaxial surfaces of the leaves, and in severe cases defoliation can occur (Thomma et al. 2005).
We noted that when isolating P. fulva, if sporulating lesions were scraped and placed on potato dextrose agar (PDA), or leaves were surface sterilized in 10% bleach, rinsed, and then placed on PDA, rapidly growing fungi that belonged to the genus Cladosporium were initially isolated. Morphological, molecular, and phylogenetic characterization demonstrated that these isolates were not P. fulva, but rather Cladosporium spp. Inoculating tomatoes with Cladosporium spp. did not produce disease symptoms. In contrast, during the summer months in the high tunnel environment, we consistently inoculated susceptible tomatoes with P. fulva isolates and saw distinct tomato leaf mold symptoms and signs.
We could then reisolate the causal fungus (Sudermann et al. 2021).
There has been little discussion in the literature of possible associations between Cladosporium spp. and tomato, or between Cladosporium spp. and tomato pathogens including P. fulva. In Argentina, two Cladosporium species, C. sphaerospermum and C. cladosporioides, were isolated from tomato during a survey of tomato leaf mold isolates (Medina et al. 2015). Cladosporium limoniforme has also been recorded from tomato in tabulated host information (Bensch et al. 2015). Early papers report several Cladosporium species on tomato, including C. herbarum associated with decaying tomatoes (Guba and Rackemann 1938). In more recent work, C. oxysporum and C. cladosporioides have been reported to cause a tomato leaf spot (Lamboy and Dillard 1997; Huang et al. 2013; Robles-Yerena et al. 2019). C. cladosporioides has also been reported to cause black mold on post-harvest tomato fruit (Ma et al. 2020). Several Cladosporium spp. show antagonistic behavior towards rust pathogens. For example, C. cladosporioides and C. pseudocladosporioides are potential fungal antagonists of Puccinia horiana, the fungus responsible for chrysanthemum white rust (Torres et al. 2017). Cladosporium spp. have also been shown to play a role in fruit rots, including in association with the vinegar fly Drosophila suzukii (Swett et al. 2019).
Questions remain about why other Cladosporium spp. were readily isolated from symptomatic tomato leaf surfaces. The objectives of the study were to characterize the Cladosporium isolates obtained from leaf surfaces and to further examine the close association between these fungal species and P. fulva. It is uncertain whether a relationship exists between Cladosporium spp. and P. fulva, whether any such relationship is antagonistic, or whether the Cladosporium spp. exist on leaf surfaces as saprophytes after initial infection by P. fulva.
Acknowledgement and discussion of these other fungal species, which are readily present on tomato leaf surfaces with tomato leaf mold symptoms, are important, as their presence can affect management of tomato leaf mold and potentially other foliar diseases of tomatoes grown in high tunnels.
Fungal isolation—Fungal isolates were collected from high tunnels in New York during the 2015 and 2016 growing seasons, using protocols described previously (Sudermann et al. 2021). In short, leaf samples with tomato leaf mold symptoms were surface sterilized in 10% bleach for 1 minute and rinsed in sterile water before being placed on potato dextrose agar (PDA). In a second isolation method, conidia were scraped directly from a sporulating tomato leaf mold lesion onto PDA. Within 24-48 hours, fungi were growing on the PDA and were transferred to obtain pure cultures. Single conidial isolation was then performed on each isolate to ensure a single genotype in each culture, following the methods previously described (Sudermann et al. 2021).
Identification of fungi from the genus Cladosporium—To characterize fungal isolates, slides of conidia were prepared and examined under a compound microscope. Conidia were measured and photographed using a calibrated M7025X Dino-Eye Edge Eyepiece Camera attached to the compound microscope. Additionally, two DNA regions were sequenced: the internal transcribed spacer (ITS) region, using PCR primers ITS4 and ITS5 (White et al. 1990), and a portion of the actin gene, using primers Actin 512F and Actin 738R (Carbone and Kohn 1999) (Table 2.S1). DNA extraction from the Cladosporium spp., PCR reactions, and sequencing followed the procedures described previously (Sudermann et al. 2021). Cladosporium reference sequences compiled from NCBI GenBank are listed in Table 2.S2.
Neighbor-joining trees were constructed in Geneious with the Tamura-Nei genetic distance model (Tamura and Nei 1993).
Genotyping-By-Sequencing—Following DNA extractions of each Cladosporium spp. isolate, the highlighted isolates in Table 2.S1 were sequenced using genotyping-by-sequencing (Elshire et al. 2011). Libraries were digested with ApeKI and sequenced on the Illumina NovaSeq6000 platform at the University of Wisconsin-Madison Biotechnology Center DNA Sequencing Core Facility. Reads were 150 base-pair paired-end reads. A replicate sample was included on the plate (Table 2.1). Genotypes were called with the Tassel 5 GBS v2 Pipeline (Glaubitz et al. 2014). Sequence tags were aligned to a C. cladosporioides reference sequence (NCBI BioProject accession PRJNA396076). Data were filtered and analyzed as previously described (Sudermann et al. 2021). To identify clonal groups, pairwise identity-by-state (IBS) was used. Based on visualization of the pairwise IBS matrix, a cut-off of 95% was set for declaring whether isolates had unique genotypes or not. The mean error rate between the technical replicates was 0%.
Data Availability—The genotyping-by-sequencing data, in the form of raw demultiplexed FASTQ files, will be deposited into the National Center for Biotechnology Information Sequence Read Archive.
High tunnel pathogenicity experiments—High tunnel experiments occurred in 2020 and 2021. In 2020, two isolates, 15008 and 16-226 (Table 2.1), were each inoculated on tomatoes belonging to a differential set containing the Cf resistance genes 2, 4, 5, 6, 9, and 4 and 11 combined, and the ‘Moneymaker’ cultivar with no resistance genes, according to race determination assays conducted previously on P. fulva isolates (Sudermann et al. 2021). Each isolate was inoculated onto two differential sets. In 2021, 4 additional Cladosporium spp. isolates were used as inoculum, as well as isolate 16-226, which was used in both years. The Cladosporium spp.
isolates were 16-224, 16-243, 16-226, 16-184, and 16-239 (Table 2.2). Four-week-old ‘Moneymaker’ tomatoes (no Cf resistance genes) were potted into 1-gallon pots and placed in a high tunnel. Each isolate was used to inoculate five tomato plants. As positive controls, the P. fulva isolate 19002 was also used to inoculate five susceptible ‘Moneymaker’ tomatoes. As negative controls, tomatoes that were not inoculated were also placed in the high tunnel. Inoculations were performed as previously described (Sudermann et al. 2021), except that spore suspensions were applied with a paint brush to the abaxial side of the leaves to the point of runoff, and the inoculum concentration was 2.5 × 10^7 conidia/ml.

Results
Fungal isolation—Initial attempts to isolate P. fulva resulted in a collection of 47 Cladosporium spp. isolates from across New York (Table 2.1). Morphologically, Cladosporium isolates had quite similar sizes and shapes of conidia, except for C. sphaerospermum, which had more circular conidia. The conidia were all 3-7 µm in length and 2-4 µm in width. In contrast, P. fulva conidia were more oblong and substantially larger, ranging from 10-20 µm in length (Figure 2.1). The Cladosporium isolates were all a dark grey green in color and grew rapidly, extending to the edge of the PDA plates, except for C. sphaerospermum, which had less mycelial growth and fewer aerial hyphae (Figure 2.1). Such rapid growth was not observed when growing P. fulva in culture (Figure 2.2). The P. fulva isolates grew slowly, and there was little spread beyond where the initial transfer of an agar plug took place. When a liquid suspension of spores was spread on plates, small colonies grew.

Table 2.1. Cladosporium spp. isolates collected from high tunnels in New York during the 2015 and 2016 growing seasons.¹

Actin BLAST result: Cladosporium cladosporioides; ITS BLAST result: Cladosporium spp.; species complex: Cladosporioides
Sample ID  Date collected  County        Cultivar               Clonal group
15001      7/1/15          Clinton       Native Bites           Not examined
15002      7/1/15          Saratoga      Pony Express           4
15005      8/20/15         Essex         Sun Sugar              4
15018      9/25/15         Washington    Cherry in high tunnel  Not examined
15025      10/30/15        Onondaga      Cherokee purple        Unassigned
15026      10/30/15        Onondaga      Striped German         Unassigned
16-141     7/21/16         Orleans       Unknown                3
16-187     8/2/16          Essex         Unknown                3
16-206     8/18/16         Allegany      Unknown                6
16-238     9/30/16         Orange        Unknown                Not examined
16-243     10/18/16        Schuyler      Unknown                3
16-272     10/17/16        Ontario       Unknown                Not examined

Actin BLAST result: Cladosporium limoniforme; ITS BLAST result: Cladosporium spp.; species complex: Herbarum
Sample ID  Date collected  County        Cultivar               Clonal group
15011      8/20/15         Essex         Sun Gold               Unassigned

Actin BLAST result: Cladosporium pseudocladosporioides; ITS BLAST result: Cladosporium spp.; species complex: Cladosporioides
Sample ID  Date collected  County        Cultivar               Clonal group
15003      6/9/15          Erie          Yellow Brandywine      Not examined
15006      8/20/15         Essex         Sun Gold               1
15007      8/20/15         Essex         Unknown Large Fruit    Not examined
15008      8/20/15         Essex         Sun Sugar              Not examined
15010      8/20/15         Essex         Sun Gold               1
15013      8/20/15         Clinton       Tomato Marbonne        Unassigned
15014      8/20/15         Clinton       Red Mountain           Not examined
15016      9/24/15         Orleans       Unknown                1
15017      9/24/15         Orleans       Unknown                1
15017-2    9/24/15         Orleans       Unknown                1
15019      9/30/15         Suffolk       Unknown                Not examined
15020      10/7/15         Schuyler      Unknown                Not examined
15021      10/7/15         Schuyler      Unknown                Not examined
15022      8/15/15         Columbia      Unknown                1
15024      8/15/15         Saratoga      BHN-589                Not examined
15027      10/30/15        Onondaga      Striped German         Not examined
15028      11/17/15        Unknown       Mt Fresh +             5
16-077     6/16/16         Ulster        Unknown                1
16-097     6/21/16         Essex         Unknown                1
16-163     7/25/16         Clinton       Unknown                1
16-176     7/24/16         Vermont       Unknown                Not examined
16-184     8/2/16          Ontario       Unknown                1
16-185     8/4/16          Essex         Unknown                2
16-186     8/4/16          St. Lawrence  Unknown                1
16-191     8/5/16          Essex         Unknown                Not examined
16-195     8/13/16         Clinton       Unknown                2
16-218     8/31/16         Essex         Unknown                Not examined
16-219     9/1/16          Ulster        Unknown                1
16-224     9/7/16          Sullivan      Unknown                2
16-227     9/7/16          Sullivan      Unknown                1
16-271     10/4/16         Ontario       Unknown                2

Actin BLAST result: Cladosporium sphaerospermum; ITS BLAST result: Cladosporium spp.; species complex: Sphaerospermum
Sample ID  Date collected  County        Cultivar               Clonal group
16-225     9/7/16          Sullivan      Unknown                Unassigned
16-226     9/7/16          Sullivan      Unknown                Unassigned

Actin BLAST result: Cladosporium subuliforme; ITS BLAST result: Cladosporium spp.; species complex: Cladosporioides
Sample ID  Date collected  County        Cultivar               Clonal group
15023      8/8/15          Columbia      Unknown                Unassigned

Actin BLAST result: Cladosporium tenuissimum; ITS BLAST result: Cladosporium spp.; species complex: Cladosporioides
Sample ID  Date collected  County        Cultivar               Clonal group
16-239     10/3/16         Oneida        Unknown                Unassigned

¹ The highlighted isolates were included on a genotyping-by-sequencing plate. Even if isolates were included on the GBS plate, not all aligned well to the reference genome, and those therefore remained unassigned.

Figure 2.1. Several different Cladosporium species isolated from leaf surfaces and the isolates used in the mist chamber experiment. A) 16-224, C. pseudocladosporioides; B) 16-243, C. cladosporioides; C) 16-226, C. sphaerospermum; D) 16-184, C. pseudocladosporioides; E) 16-239, C. tenuissimum.

Figure 2.2. A) The morphology of Passalora fulva on PDA. B) A tomato leaf with signs and symptoms. C) The conidia under the microscope, at the scale of 20 µm. The morphology of the fungus in culture and of the conidia differs from that of the Cladosporium spp.

Identification of fungi from the genus Cladosporium—Based on BLAST search results and phylogenetic analysis of the actin gene, most isolates were identified as C. pseudocladosporioides (n = 30) or C. cladosporioides (n = 12). Other isolates were identified as C. tenuissimum (16-239), C. subuliforme (15023), C. limoniforme (15011), and C. sphaerospermum (16-225, 16-226) (Table 2.1, Figure 2.3). The ITS sequences did not offer sufficient resolution to characterize the Cladosporium species beyond the genus level.

Figure 2.3. Neighbor-Joining tree of the partial actin gene produced using 1000 bootstrap replicates.

Genotyping-By-Sequencing—Of the 33 isolates submitted for genotyping-by-sequencing, only 23 aligned well to the C. cladosporioides reference genome and remained in the final dataset after filtering steps. There were 11,790 variants present, and 2,281 SNPs remained in the final filtered dataset. The mean depth per isolate across all sites was 25.32. To visualize genetic distance, a neighbor-joining tree was constructed with the 23 isolates that aligned to the reference genome. Several monophyletic groups emerged. The isolates that were characterized as C.
cladosporioides based on the actin sequence results grouped together, as did the isolates that were characterized as C. pseudocladosporioides (Figure 2.4A).

Figure 2.4. A. A neighbor-joining tree of select isolates that underwent genotyping-by-sequencing and aligned to a C. cladosporioides reference genome. B. PCA plot of clone-corrected data. Point sizes are proportional to the number of isolates within each of the 6 clonal lineages.

Clone-correction was also performed on the 23 isolates, and 6 distinct clonal groups emerged, based on the IBS pairwise comparisons. The IBS between the technical replicates 15017 and 15017-2 was 100%. The median IBS across all pairwise comparisons was 61%. Based on visualization of a histogram of the IBS matrix, there was a peak around 95%. Any pairwise comparisons with an IBS above 95% were considered clones. In the clone-corrected dataset, most of the isolates belonged to just a few clonal groups. One clonal group had 13 isolates, and these isolates were identified as C. pseudocladosporioides. In a PCA plot of the clone-corrected data, there were three main clusters of isolates. The variance explained by PC1 was 62%, and PC1 differentiated clonal groups 1, 2, and 5 from clonal groups 3, 4, and 6. The variance explained by PC2 was 21%, and PC2 separated clonal group 2 from the rest of the clonal groups (Figure 2.4B). Clonal groups 1, 2, and 5 were identified as C. pseudocladosporioides based on the sequencing of the partial actin gene. Clonal groups 3, 4, and 6 were identified as C. cladosporioides based on the sequencing of the partial actin gene. Clonal groups that contained more than one isolate were made up of isolates from multiple counties, so there did not seem to be a connection between geographic location and specific genotypes.
High tunnel pathogenicity experiments—In 2020, the high tunnel tomatoes of the susceptible cultivar ‘Moneymaker’ that were inoculated with the Cladosporium spp.
isolates 15008 and 16-226, as well as tomatoes with the Cf resistance genes Cf-2, Cf-4, Cf-5, Cf-9, and Cf-4,11, showed no disease symptoms. In contrast, tomatoes inoculated with P. fulva isolates showed symptoms on the susceptible ‘Moneymaker’ tomato 14 days post-inoculation. Many of the P. fulva isolates also caused disease symptoms on tomatoes containing Cf-2 (Sudermann et al. 2021). In a larger experiment in 2021, five Cladosporium isolates, representing four of the species collected, were each inoculated onto five ‘Moneymaker’ tomatoes (no Cf genes). All plants remained healthy, with no symptoms or signs on any plants. In contrast, positive control plants, inoculated with the P. fulva isolate 19002, showed disease symptoms on each of the inoculated tomatoes after 14 days. The causal fungus P. fulva was then re-isolated from symptomatic leaves.

Discussion
Fungal isolation—Besides Medina et al. (2015), there has not been a description of the isolation of Cladosporium spp. from tomato leaves infected with tomato leaf mold. It is important to recognize that other fungi are present on the surface of tomato leaves, which not only adds complexity to research on the tomato-P. fulva pathosystem but could also play a role in management of high tunnel diseases of tomatoes. Simply scraping conidia from leaf surfaces, or surface sterilizing leaves with leaf mold symptoms in only a 10% bleach solution, may result in isolating the faster growing Cladosporium spp. instead of the tomato leaf mold causal fungus P. fulva.
Identification of fungi from the genus Cladosporium—Most of the isolates were characterized as just a few Cladosporium spp. (Table 2.1). Of the six species identified, C. sphaerospermum and C. cladosporioides were previously isolated from tomatoes with leaf mold symptoms in Argentina (Medina et al. 2015).
Cladosporium limoniforme was also previously isolated on tomato, but no additional details were provided about how it was isolated or potential connections with P. fulva (Bensch et al. 2015). Finally, reports of C. oxysporum causing a leaf spot (Lamboy and Dillard 1997; Huang et al. 2013) and a recent article in which researchers suggest that C. cladosporioides causes a leaf spot (Robles-Yerena et al. 2019) have been published. I did not isolate or inoculate tomatoes with C. oxysporum, but these reports raise questions and warrant further study comparing C. oxysporum with other Cladosporium spp. In addition, the articles discuss possible associations with P. fulva. Most of the New York isolates were within the C. cladosporioides species complex, but a few of the isolates were within the C. sphaerospermum or the C. herbarum complexes (Bensch et al. 2018). Phylogenetic analysis largely agreed with the results, but it should be noted that some of the reference sequences available in the NCBI GenBank database were characterized only to the genus level, especially when comparing the ITS sequences of our isolates to previously deposited sequence data. During two high tunnel experiments, we could not complete Koch’s postulates with any of the Cladosporium isolates. Alone, none of the Cladosporium spp. that were isolated from tomato leaves infected with P. fulva appear to be pathogenic to tomato. The role these Cladosporium spp. play in disease development when combined with P. fulva remains unknown.

Genotyping-By-Sequencing—GBS analysis was performed on a limited number of isolates. Given that several different species appear to be present in our collection, the best available reference genome was that of C. cladosporioides (NCBI BioProject accession number PRJNA396076).
Given the limited number of available Cladosporium genomes, some of the isolates that we submitted for GBS did not align well to the reference genome, and they could not be part of the downstream analysis. The 23 isolates with reads that aligned well to the reference genome belonged to six distinct genotypes. With a larger number of samples and more reference genome options, more robust analysis would help to better understand the relationships between the isolates.

Conclusion—Cladosporium spp. were isolated from the surfaces of tomato leaves with symptoms of tomato leaf mold. The objective of this study was to further characterize these isolates by comparison of DNA sequence, phylogenetic analysis, and GBS analysis. Isolates were primarily characterized as C. cladosporioides and C. pseudocladosporioides. These species are considered secondary colonizers, as we could not confirm Koch’s postulates with four different Cladosporium spp., while we consistently confirmed Koch’s postulates when we inoculated high tunnel tomatoes with P. fulva. The presence of Cladosporium spp. associated with P. fulva lesions on tomato leaves is intriguing, and future studies will determine if differences in disease severity occur when Cladosporium spp. are present in lesions caused by P. fulva.

Acknowledgements

We thank Holly Lange, Rachel Kreis, Gregory Vogel, and Andrew Aldcroft for assistance isolating fungi from leaf samples, and early characterization of the isolates. We thank Cornell Cooperative Extension educators for assistance in collecting tomato leaf samples.

REFERENCES

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
Bensch, K., Braun, U., Groenewald, J. Z., and Crous, P. W. 2012. The genus Cladosporium. Stud. Mycol. 72:1–401.
Bensch, K., Groenewald, J. Z., Braun, U., Dijksterhuis, J., de Jesús Yáñez-Morales, M., and Crous, P. W. 2015.
Common but different: the expanding realm of Cladosporium. Stud. Mycol. 82:23–74.
Bensch, K., Groenewald, J. Z., Meijer, M., Dijksterhuis, J., Jurjević, Ž., Andersen, B., Houbraken, J., Crous, P. W., and Samson, R. A. 2018. Cladosporium species in indoor environments. Stud. Mycol. 89:177–301.
Braun, U., Crous, P. W., Dugan, F., Groenewald, J. Z., and de Hoog, G. S. 2003. Phylogeny and taxonomy of Cladosporium-like hyphomycetes, including Davidiella gen. nov., the teleomorph of Cladosporium s. str. Mycol. Progress. 2:3–18.
Carbone, I., and Kohn, L. M. 1999. A method for designing primer sets for speciation studies in filamentous ascomycetes. Mycologia. 91:553–556.
Crous, P. W., and Braun, U. 2003. Mycosphaerella and its anamorphs: 1. Names published in Cercospora and Passalora. Centraalbureau voor Schimmelcultures (CBS).
Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K., Buckler, E. S., and Mitchell, S. E. 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLOS ONE. 6:e19379.
Glaubitz, J. C., Casstevens, T. M., Lu, F., Harriman, J., Elshire, R. J., Sun, Q., and Buckler, E. S. 2014. TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline. PLOS ONE. 9:e90346.
Guba, E. F., and Rackemann, F. M. 1938. Species of Cladosporium on tomato and the allergic response in man as an aid to their identification. Mycologia. 30:625–634.
Huang, X.-Y., Liu, Z.-H., Li, J., and Ji, P. 2013. First report of a leaf spot on greenhouse tomato caused by Cladosporium oxysporum in China. Plant Dis. 97:845–845.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., and Drummond, A.
2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 28:1647–1649.
Lamboy, J. S., and Dillard, H. R. 1997. First report of a leaf spot caused by Cladosporium oxysporum on greenhouse tomato. Plant Dis. 81:228–228.
Ma, M., de Silva, D. D., and Taylor, P. W. J. 2020. Black mould of post-harvest tomato (Solanum lycopersicum) caused by Cladosporium cladosporioides in Australia. Australasian Plant Dis. Notes. 15:25.
Medina, R., López, S. M. Y., Franco, M. E. E., Rollan, C., Ronco, B. L., Saparrat, M. C. N., De Wit, P. J. G. M., and Balatti, P. A. 2015. A survey on occurrence of Cladosporium fulvum identifies race 0 and race 2 in tomato-growing areas of Argentina. Plant Dis. 99:1732–1737.
Robles-Yerena, L., Ayala-Escobar, V., Leyva-Mir, S. G., Lima, N. B., Camacho-Tapia, M., and Tovar-Pedraza, J. M. 2019. First report of Cladosporium cladosporioides causing leaf spot on tomato in Mexico. J. Plant Pathol. 101:759–759.
Sudermann, M., McGilp, L., Vogel, G., Regnier, M., Rodriguez-Jaramillo, A., and Smart, C. D. 2021. Toward a greater understanding of the population diversity of Passalora fulva in US high tunnels. Phytopathology. In revision.
Swett, C. L., Hamby, K. A., Hellman, E. M., Carignan, C., Bourret, T. B., and Koivunen, E. E. 2019. Characterizing members of the Cladosporium cladosporioides species complex as fruit rot pathogens of red raspberries in the mid-Atlantic and co-occurrence with Drosophila suzukii (spotted wing drosophila). Phytoparasitica. 47:415–428.
Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512–526.
Thomma, B. P. H. J., Van Esse, H. P., Crous, P. W., and De Wit, P. J. G. M. 2005. Cladosporium fulvum (syn. Passalora fulva), a highly specialized plant pathogen as a model for functional studies on plant pathogenic Mycosphaerellaceae. Mol. Plant Pathol.
6:379–393.
Torres, D. E., Rojas-Martínez, R. I., Zavaleta-Mejía, E., Guevara-Fefer, P., Márquez-Guzmán, G. J., and Pérez-Martínez, C. 2017. Cladosporium cladosporioides and Cladosporium pseudocladosporioides as potential new fungal antagonists of Puccinia horiana Henn., the causal agent of chrysanthemum white rust. PLOS ONE. 12:e0170782.
White, T., Bruns, T., Lee, S., Taylor, J., Innis, M., Gelfand, D., and Sninsky, J. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications-A Laboratory Manual, p. 315–32.

SUPPLEMENTAL MATERIALS

Table 2.S1. Table of primers used in the study.
Primer | Source of primers | Primer sequence 5'-3'
ITS4 | White et al. 1990 | TCCTCCGCTTATTGATATGC
ITS5 | White et al. 1990 | GGAAGTAAAAGTCGTAACAAGG
Actin 512F | Carbone and Kohn 1999 | ATGTGCAAGGCCGGTTTCGC
Actin 738R | Carbone and Kohn 1999 | TACGAGTCCTTCTGGCCCAT

Table 2.S2. Accession numbers of the fungal isolates used as reference sequences in the phylogenies.
Fungal Species Reference | Partial Actin Gene Reference Sequences
Cercospora beticola | FJ473436
Cladosporium acalyphae | HM148481
Cladosporium basiinflatum | HM148487
Cladosporium chalastosporoides | HM148488
Cladosporium cladosporioides-1 | HM148506
Cladosporium cladosporioides-2 | MN013164
Cladosporium colocasiae | HM148556
Cladosporium cucumerinum | HM148567
Cladosporium delicatulum | KP702086
Cladosporium exasperatum | HM148579
Cladosporium exile | HM148580
Cladosporium flabelliforme | LN834545
Cladosporium funiculosum | MF473831
Cladosporium globisporum | HM148585
Cladosporium hillianum | HM148587
Cladosporium inversicolor | KP702031
Cladosporium iranicum | HM148599
Cladosporium licheniphilum | HM148600
Cladosporium limoniforme | MW455498
Cladosporium myrtacearum | HM148606
Cladosporium oxysporum | JQ966536
Cladosporium perangustum | KC484667
Cladosporium phyllactiniicola | HM148642
Cladosporium phyllophilum | MW251551
Cladosporium pseudocladosporioides-1 | MH047342
Cladosporium pseudocladosporioides-2 | MN865116
Cladosporium rectoides | MN984222
Cladosporium scabrellum | HM148685
Cladosporium sphaerospermum | EU570274
Cladosporium subtilissimum | EF679551
Cladosporium subuliforme | KT600650
Cladosporium tenuissimum | LN834587
Cladosporium xylophilum | MT671938
Passalora fulva-1 | AJ300327
Passalora fulva-2 | LC121216
Passalora sp. | MW556002

CHAPTER 3

UTILIZING TARGET ENRICHMENT SEQUENCING TO UNDERSTAND EFFECTOR DIVERSITY WITHIN THE US-23 CLONAL LINEAGE OF PHYTOPHTHORA INFESTANS*

Abstract

The oomycete Phytophthora infestans is a devastating pathogen that is responsible for late blight of potatoes and tomatoes. In the United States, the US-23 clonal lineage has been the predominant lineage for more than a decade. Previous research suggests that some genetic diversity exists within the clonal lineage, but the implications of that diversity are unknown. Effector diversity within the lineage was examined to better understand changes in the US-23 lineage over time. Known P.
infestans effectors can change in response to resistance genes that are deployed by potato and tomato breeders. In this study, effector diversity was compared across 12 US-23 isolates using target enrichment sequencing, looking at both presence/absence polymorphisms and sequence polymorphisms within effectors. In total, across the 12 isolates that underwent target enrichment re-sequencing, 29 effectors showed differences in presence/absence, and 14 effectors were absent from all the isolates. The number of sequence polymorphisms varied widely among genes. Finally, predictions were made to examine the effect that the polymorphisms had on protein function, and most were predicted to be missense mutations that arose from point mutations. A smaller number were predicted to result in synonymous substitutions.

* The target journal will be Molecular Plant Pathology. Other authors will include Tze-Yin Lim, Miles Armstrong, Ingo Hein, and Christine D. Smart (corresponding author).

Introduction

Phytophthora infestans is an oomycete pathogen that causes late blight of potatoes and tomatoes and has been responsible for pandemics and significant losses to growers for decades (Fry et al. 2015). In the United States, potato is the largest vegetable crop in terms of acres harvested (1.1 million acres), and tomatoes grown in the open ranked fourth (335,348 harvested acres) (2017 Census of Agriculture 2019). Globally, in 2019, 370,436,581 tons of potatoes were produced on 42.9 million harvested acres, and 180,766,329 tons of tomatoes on 12.4 million harvested acres (FAOSTAT-Crops 2019). Late blight was first described in the 1840s in the United States on potatoes, and the outbreak in Ireland that contributed to the Irish Potato Famine occurred in 1845 (Fry et al. 2013). It took several more decades before the pathogen was classified into the genus Phytophthora by Anton de Bary in 1861 (de Bary 1861).
In 2009, a late blight pandemic spread across much of the US Northeast after infected tomato seedlings were distributed to retailers. Although large scale destruction caused by P. infestans does not occur each year, when conditions are ideal, growers experience devastating losses (Fry et al. 2013). Although P. infestans is heterothallic, the pathogen most often reproduces asexually in the United States. Infections occur on foliage and stems of potato and tomato, as well as tomato fruits and potato tubers. At the time of the Irish Potato Famine, the predominant clonal lineage was Herb-1. A shift occurred from Herb-1 to US-1 in the early to mid-1900s. From then until the 1990s, US-1 was the predominant lineage in the United States and globally, except in Mexico, where the population was very diverse (Goodwin et al. 1994). Since 2011, US-23 has been the predominant clonal lineage, and subclonal variation has been observed within the lineage. From 2011 to 2016, the number of multi-locus genotypes (MLGs) increased (Hansen et al. 2016; Saville and Ristaino 2019). These previous studies have shown that the US-23 clonal lineage has changed with time; however, the effects of these changes are unknown. Target enrichment sequencing allows us to analyze P. infestans populations through evolutionary and population genomic lenses (Thilliez et al. 2019). Many P. infestans population studies use simple sequence repeat (SSR) markers (Li et al. 2013), and while the markers facilitate rapid genotyping efforts, they have little utility when it comes to the study of adaptive evolution. Target enrichment sequencing offers higher read depth of genes and can better identify sequence polymorphisms. The pathogen enrichment sequencing (PenSeq) tool incorporates capture sequences for more than 500 P. infestans effectors into the library bait design for P. infestans (Thilliez et al. 2019).
The baited effectors have a conserved RXLR (Arginine, any amino acid, Leucine, Arginine) motif (Whisson et al. 2007; Haas et al. 2009). The C-terminal end of the proteins contains domains that are diverse and rapidly evolving, and they can contribute to virulence as suppressors of host cell death (Bos 2006; Win et al. 2007; Jiang et al. 2008; Dou et al. 2008). The effectors are in regions that are hypothesized to experience higher rates of loss and gain of genes than the core genome regions (Haas et al. 2009). Since many genes and SNPs are targeted with the PenSeq approach, there is less concern regarding bias when analyzing population genetics summary statistics, and the PenSeq tool can better identify genetic variation under positive or balancing selection (Thilliez et al. 2019). Utilization of the PenSeq tool allows for reductions in genome complexity because many effectors are encoded by only a small part of the genome (Thilliez et al. 2019). The method is useful in organisms like P. infestans, which has a substantially larger genome than many other oomycetes, due to many repetitive regions (Haas et al. 2009). In P. infestans, only a small number of the predicted effectors have been studied extensively. For example, the protein AVR3a interacts with R3a and is involved in INF1-mediated cell death (Armstrong et al. 2005; Bos et al. 2010; Gilroy et al. 2011). PITG_21388 (Avrblb1) is a representative gene within the multigenic Avrblb1 family. It encodes AVRblb1 and is also known as IPIO. The Avrblb2 family is also diverse and multigenic (Thilliez et al. 2019). AVRblb1 and AVRblb2 are induced during infection, and they interact with potato host proteins RPI-blb1 and RPI-blb2, respectively. AVRblb1 has been shown to disrupt cell wall-plasma membrane adhesion, while AVRblb2 blocks plant protease secretion and is required for full virulence (Vleeshouwers et al. 2008; Bozkurt et al. 2011; Oh et al. 2009).
Other effectors that have been studied include AVR1, which is a virulence factor that suppresses callose deposition and encourages colonization (Du et al. 2015), AVR2 (Gilroy et al. 2011), AVR3b, AVRSmira1 (Rietman et al. 2012), AVRvnt1 (Gao et al. 2020), and PexRd24, a putative non-host determinant in pepper (Lee et al. 2014; Thilliez et al. 2019). To date, studies on P. infestans effectors have primarily focused on the interaction with potato, and less is known about effectors that interact with known resistance proteins in tomato. The objective of this study was to utilize PenSeq tools to understand effector diversity within the single clonal lineage US-23.

Materials and Methods

Isolation, extraction of DNA, and target enrichment sequencing—P. infestans isolates had previously been isolated from potato and tomato samples during yearly outbreaks of late blight. The isolates were collected from symptomatic potatoes and tomatoes from 2009 to 2015, as the US-23 clonal lineage was emerging (Table 3.1). Cultures were grown on pea agar and stored in a cold room on rye B agar slants (Hansen et al. 2016). New cultures were grown from the vials in long-term storage. After cultures grew on pea agar, pea broth was prepared, and several agar plugs were placed in pea broth for one week. The mycelia were then vacuum filtered on sterile filter paper, placed in 1.5 mL tubes, and stored at -80°C for later extractions. DNA was extracted from mycelia using the Qiagen Plant DNeasy kit (Qiagen, Valencia, CA, USA) (Hansen et al. 2016).

Table 3.1. Isolates from the US-23 clonal lineage that were used in the study, including alternative names and information on each isolate.
Isolate Name | USABlight ID | Alternative Name | Tool | State | County | Host | Sample Year | Collection Date | Number of variant sites | Mean read depth across variant sites
FL2009P4* | Unknown | Unknown | WGS | KY | Unknown | Unknown | 2009 | N/A | 300556 | 20.9
BL2009P4* | N/A | 9231 | WGS | PA | Blair | Potato | 2009 | N/A | 430592 | 33.6
702 | 120608261S1 | US120020 | PenSeq | NC | Chowan | Tomato | 2012 | 6/6/12 | 2783 | 109.1
778 | 120830242S1 | US120131 | PenSeq | NY | Onondaga | Tomato | 2012 | 8/31/12 | 2680 | 105
813 | 120822205S1 | US120113 | PenSeq | NY | Washington | Tomato | 2012 | 8/23/12 | 1998 | 80.9
1627 | 140804089S1 | US140056 | PenSeq | NY | Yates | Tomato | 2014 | 8/4/14 | 2707 | 116.7
1671 | 140815134S1 | US140081 | PenSeq | PA | Indiana | Tomato | 2014 | 8/15/14 | 2421 | 102.8
1714 | 140902236S1 | US140120 | PenSeq | NY | St. Lawrence | Potato | 2014 | 9/2/14 | 2325 | 93.4
1734 | 140729078S1 | US140049 | PenSeq | MA | Franklin | Potato | 2014 | 7/29/14 | 2103 | 89.1
9231 | N/A | BL2009P4 | PenSeq | PA | Blair | Potato | 2009 | N/A | 2619 | 100.7
10232 | N/A | US100016 | PenSeq | WI | Waukesha | Tomato | 2010 | 8/4/10 | 2891 | 109
122344 | 120814179S1 | US120100 | PenSeq | NY | Erie | Tomato | 2012 | 10/12/12 | 2821 | 130.6
150002 | 150420010S1 | N/A | PenSeq | FL | Hendry | Tomato | 2015 | 11/22/16 | 2219 | 91
150084 | 150824115S1 | N/A | PenSeq | NY | Wayne | Potato | 2015 | 8/26/15 | 2567 | 106.5
*Isolates that underwent whole-genome sequencing. The remaining 12 isolates underwent PenSeq target enrichment sequencing. Isolate 9231 and BL2009P4 are the same isolate, referred to by different names.

The 12 isolates that underwent target enrichment sequencing were sent to the James Hutton Institute, and PenSeq target enrichment sequencing was performed using 3,729 baits to target effector genes and other genes of interest (Table 3.S1). A total of 579 genes were targeted, including 438 RXLR effectors and 141 non-RXLR genes. The enrichment, post-capture amplification, and paired-end sequencing were performed as previously described (Thilliez et al. 2019). In addition to the 12 isolates that underwent target enrichment re-sequencing, two US-23 strains, BL2009P4 (Martin et al. 2013) and FL2009P4 (Knaus et al.
2016) were previously whole-genome sequenced and utilized in the analyses. The BL2009P4 reads were 100 bp Illumina paired-end sequence reads, whereas the FL2009P4 sequences were 51 bp paired-end reads. Coincidentally, BL2009P4 (otherwise known as 9231) was both whole-genome sequenced and underwent target enrichment re-sequencing. Here, 9231 refers to the PenSeq data for this isolate, whereas BL2009P4 refers to the whole-genome sequence data.

Data processing—Following sequencing, adapters were trimmed using Cutadapt (Martin 2011). Reads were mapped to the T30-4 reference genome (Haas et al. 2009) using Bowtie2 2.3.4.3 (Langmead and Salzberg 2012), at a 1% mismatch rate and with the --very-sensitive setting. BAM file outputs were then sorted and indexed. After alignment, the coverageBed utility from BedTools 2.27.1 (Quinlan and Hall 2010) was used to count per-base coverage of annotated transcripts, and results were visualized with ggplot2 in RStudio 1.2.1335 (Wickham 2016). For the two isolates, FL2009P4 and BL2009P4, that underwent whole-genome sequencing rather than PenSeq, data were processed in a similar manner, except that reads were mapped to the T30-4 reference genome using BWA-MEM 0.7.17 (Li 2013) before proceeding with the subsequent steps.

Analysis of Presence/Absence polymorphisms—The mean read depth was calculated for each of the 579 baited genes (Table 3.S1) for each of the 12 isolates that underwent PenSeq target enrichment sequencing, using the coverageBed output of per-base coverage. Genes were considered absent if the mean read depth was less than 5. The per-base coverage of the baited effector genes was also visualized using the Integrative Genomics Viewer (IGV) (Thorvaldsdóttir et al. 2013), and the coverage of each gene was visualized by generating plots of read depth by position (bp) using ggplot2. The analysis was repeated for the two isolates that were previously whole-genome sequenced in order to compare the sequencing methods.
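
The presence/absence rule described above can be illustrated with a minimal sketch. It assumes the per-base depths for each gene have already been parsed from the coverageBed output into plain Python lists; the function names, the dictionary layout, and the toy gene names and depths are hypothetical and are not taken from the study. The threshold of 5 and the log10(read depth + 1) transformation used in the coverage plots come from the text.

```python
from math import log10
from statistics import mean

# Mean read depth below this value is called "absent" (threshold from the text).
ABSENCE_THRESHOLD = 5


def call_presence(per_base_depth, threshold=ABSENCE_THRESHOLD):
    """Return {gene: (mean_depth, call)} for one isolate.

    per_base_depth is a hypothetical mapping of gene ID -> list of per-base
    read depths (one value per position in the gene).
    """
    calls = {}
    for gene, depths in per_base_depth.items():
        gene_mean = mean(depths) if depths else 0.0
        call = "absent" if gene_mean < threshold else "present"
        calls[gene] = (gene_mean, call)
    return calls


def log_depth(depths):
    """Transform depths to log10(depth + 1), as plotted on the y-axis."""
    return [log10(d + 1) for d in depths]


# Toy data for illustration only (not real coverage values).
toy_isolate = {
    "gene_A": [0, 0, 12, 0],         # a few reads, but mean depth is 3
    "gene_B": [90, 110, 100, 100],   # well covered, mean depth is 100
}
calls = call_presence(toy_isolate)
```

Calling on the gene-wide mean rather than on any single position means that a gene with a few stray reads is still called absent, which matches the way low, non-zero coverage lines are interpreted in the coverage plots.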
PCR validation of presence/absence polymorphism data and sequencing of effector gene amplicons—Of the 438 annotated effector genes included in this study, nine were used to validate the target enrichment resequencing results described above with each of the 12 US-23 isolates. These included three genes that were present (PITG_14371 (Avr3a), PITG_18215 (Avr3b), PITG_22727), three genes that were absent (PITG_16663 (Avr1), PITG_21107, PITG_15424), and three genes with presence/absence polymorphisms (PITG_07634, PITG_19800, PITG_10232). Primer sequences were either compiled from earlier publications or designed using NCBI Primer BLAST (Table 3.S2). For PITG_19800, PITG_14371, and PITG_22727, the primers amplified the entire gene. For PITG_07634, PITG_10232, and PITG_18215, primers were designed to amplify part of the genes, to try to avoid nonspecific binding of primers. For PCR amplification of each gene, 12.5 µl Emerald Amp GT 2X Mastermix (Takara Bio Inc, Shiga, Japan) was combined with 9 µl deionized water, 0.2 µM each of the forward and reverse primers, and about 50 ng of template DNA, for a final reaction volume of 25 µl. The PCR reactions were performed on the Bio-Rad C1000 Touch Thermal Cycler (Bio-Rad, Hercules, CA, USA) using a denaturation step at 95 °C for 2 minutes, followed by 35 cycles of 95 °C for 30 seconds, an annealing step at 60 °C for 30 seconds, and an extension step at 72 °C for 1 minute. The final extension step was 72 °C for 7 minutes. The PCR conditions were the same for amplification of all the effector genes, except for Avr1, which had an annealing temperature of 62 °C. Amplicons were then Sanger sequenced and aligned to T30-4 reference sequences in Geneious Prime 2020.2.2 using MUSCLE 3.8.425 (Kearse et al. 2012; Edgar 2004). The NCBI Basic Local Alignment Search Tool (BLAST) was used to compare sequenced genes with other P. infestans genes, to ensure that the target effector gene was amplified.
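
As a worked check of the reaction setup and cycling program described above, the short sketch below adds up the stated volumes and hold times. The variable names are hypothetical, and the time total counts programmed hold times only; ramping between temperatures on the thermal cycler is ignored.

```python
# Volumes from the text (µl).
FINAL_VOLUME_UL = 25.0
MASTERMIX_UL = 12.5
WATER_UL = 9.0

# Volume left for the two primers and the template DNA combined.
remaining_ul = FINAL_VOLUME_UL - MASTERMIX_UL - WATER_UL

# Hold times from the text (seconds).
INITIAL_DENATURATION_S = 2 * 60
CYCLES = 35
CYCLE_STEPS_S = {"denaturation": 30, "annealing": 30, "extension": 60}
FINAL_EXTENSION_S = 7 * 60

# One cycle is the sum of its three steps; the program is the initial
# denaturation, 35 cycles, and the final extension.
cycle_s = sum(CYCLE_STEPS_S.values())
total_s = INITIAL_DENATURATION_S + CYCLES * cycle_s + FINAL_EXTENSION_S
total_min = total_s / 60

print(f"Volume left for primers and template: {remaining_ul} µl")
print(f"Programmed hold time: {total_s} s ({total_min} min)")
```

Under these assumptions, 3.5 µl of the 25 µl reaction remains for primers and template, and the programmed hold time is 4740 seconds (79 minutes).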
Variant Calling Analysis—Variants were called with FreeBayes (Garrison and Marth 2012) for all 12 isolates that underwent target enrichment resequencing. Variants were filtered using Biopet Vcffilter with the following parameters: DP > 10, MQM > 20, SAF > 1, SAR > 1, and QUAL / AO > 2 (Thilliez et al. 2019). In order to validate the variant calling analysis, the effector gene PITG_14371 was PCR amplified and sequenced according to the protocols described above.

Results

Analysis of presence/absence polymorphisms—The mean mapped read depths averaged across all the baited genes for each isolate ranged from 118 to 259. Of the 579 baited genes, 16 genes were absent from all 12 isolates, including 14 of the 438 candidate RXLR effector genes and two of the 141 non-RXLR protein coding genes (Table 3.2; Figure 3.1). Interestingly, none of the isolates had PITG_16663, which encodes the effector AVR1. The two non-RXLR genes that were absent from all 12 isolates were PITG_23119T0 and PITG_23195T0, which encode Epi1-like and Epi9-like protease inhibitors, respectively (Table 3.S1).

Table 3.2. A description of each of the genes that was absent from all 12 US-23 isolates that were part of the PenSeq experiment. The results were compared to the two isolates that were whole genome sequenced.
Genes that are absent from all 12 US-23 isolates (or very reduced coverage), with WGS comparison
Gene ID | PenSeq annotation | T30-4 annotation | Average mapped read depth for the 12 PenSeq isolates | BL2009P4 average mapped read depth | FL2009P4 average mapped read depth
PITG_05072 | RXLR | Secreted RXLR effector peptide, putative | < 4 for all isolates | 2.4 | 0.75
PITG_09632 | RXLR | PexRD45 - Avrblb2 family - Secreted RXLR effector peptide, putative | 0 | No coverage | 0.92
PITG_09754 | RXLR | Secreted RXLR effector peptide, putative | 0 | 0.70 | 0.18
PITG_12706 | RXLR | Secreted RXLR effector peptide, putative | < 2 for all isolates | 4.3 | 5.3
PITG_14673 | RXLR | Secreted RXLR effector peptide, putative | 0 | 0 | 0
PITG_15424 | RXLR | Secreted RXLR effector peptide, putative | 0 | 0 | 0
PITG_16180 | RXLR | Secreted RXLR effector peptide, putative | < 1 for all isolates | 0.12 | 0.31
PITG_16663 | RXLR | Avr1 - Secreted RXLR effector peptide, putative | 0 | 0.03 | 0.01
PITG_20336 | RXLR | Secreted RXLR effector peptide, putative | < 5 for all isolates | 1.11 | 1.00
PITG_21107 | RXLR | Secreted RXLR effector peptide, putative | 0 | 0 | 0
PITG_21303 | RXLR | Secreted RXLR effector peptide, putative | < 2 for all isolates | 1.29 | 0
PITG_22256 | RXLR | Secreted RXLR effector peptide, putative | 0 | 0 | 0
PITG_22929 | RXLR | Secreted RXLR effector peptide, putative | < 1 for all isolates | 40.27 | 18.11
PITG_22972 | RXLR | Secreted RXLR effector peptide, putative | 0 | 32.09 | 16.95
PITG_23119 | Non-RXLR | Epi1-like protease inhibitor | 0 | 0 | 0
PITG_23195 | Non-RXLR | Epi9-like protease inhibitor | 0 | 0 | 0

[Figure 3.1 panels: PITG_05072, PITG_09632 (PexRD45-AvrBlb2 family), PITG_09754, PITG_12706, PITG_14673, PITG_15424, PITG_16180, PITG_16663 (Avr1), PITG_20336, PITG_21107, PITG_21303, PITG_22256, PITG_22929, PITG_22972, PITG_23119, PITG_23195. x-axis: Position (bp); y-axis: log10(Read depth + 1); legend: isolates 702, 778, 813, 1627, 1671, 1714, 1734, 9231, 10232, 122344, 150002, 150084.]

Figure 3.1. Plots showing the coverage of effector genes that were absent from all 12 isolates. y-axis = log of read depth, x-axis = specific position across the corresponding gene. A one on the y-axis indicates a read depth of 10, and a two indicates a read depth of 100. A mean read depth of 5 or below was the threshold for declaring that the gene was absent from an isolate. Lines that are not at 0 indicate that, for that isolate, the read depth was greater than 0 at that position but the mean read depth across the gene was less than 5.

In a repeat analysis with the two isolates that were whole genome sequenced, the mean read depth was lower, but many of the 16 genes that were absent from the 12 isolates that underwent target enrichment sequencing were also absent from FL2009P4 and BL2009P4 (Table 3.2). The exceptions were PITG_22929 and PITG_22972, which encode RXLR effectors. They had mean read depths between 16 and 40 (Table 3.2). There was diversity in the presence/absence of some genes, as 32 of the 579 baited genes showed differences in presence and absence among isolates. Of these 32 genes, 29 encoded RXLR effectors (Table 3.3, Figure 3.2). The three genes encoding non-RXLR proteins encode the protease inhibitor Epi3 and two putative carbonic anhydrases (Table 3.3, Table 3.S1). Among the 32 genes with differences in presence and absence, the isolates with the largest number of absent genes were 10232 (20 absent genes), followed by 9231 and 150084, which both had 13 absent genes. These counts do not include the 16 genes that were absent from all isolates. All three of these isolates were collected from the Northeast or Midwest United States in 2009 (9231), 2010 (10232), and 2015 (150084).
In contrast, isolates 702 (isolated in 2012 from North Carolina) and 1627 (isolated in 2014 from New York) had the fewest absent genes, with only two each, not counting the 16 genes that were absent from all the isolates (Table 3.3). Because of the lower mean read depth across each gene, the whole-genome sequence data could not be used to accurately examine presence/absence polymorphisms, as mean read depth cut-off values were harder to deduce.

Table 3.3. A description of each of the genes for which there were presence/absence polymorphisms for each of the 12 PenSeq isolates.
Gene ID | PenSeq annotation | T30-4 annotation | Isolates with mean read depth of 5 or less
PITG_01904 | RXLR | Secreted RXLR effector peptide, putative | 10232
PITG_04279 | RXLR | Secreted RXLR effector peptide, putative | 10232
PITG_07634 | RXLR | Secreted RXLR effector peptide, putative | 150084
PITG_09771 | RXLR | Secreted RXLR effector peptide, putative | 813, 1714
PITG_10232 | RXLR | Secreted RXLR effector peptide, putative | 9231
PITG_10639 | RXLR | Secreted RXLR effector peptide, putative | 122344
PITG_12721 | RXLR | Secreted RXLR effector peptide, putative | 702, 813, 1714, 150084
PITG_13538 | RXLR | Secreted RXLR effector peptide, putative | 10232
PITG_14046 | RXLR | Secreted RXLR effector peptide, putative | 1671, 1714, 150084
PITG_15712 | RXLR | Secreted RXLR effector peptide, putative | 813, 10232, 150002
PITG_15718 | RXLR | Secreted RXLR effector peptide, putative | 813, 10232, 150002
PITG_15972 | RXLR | Avr2 family - Secreted RXLR effector peptide, putative | 10232
PITG_16827 | Non-RXLR | Protease inhibitor Epi3 | 778, 10232, 150084
PITG_17842 | Non-RXLR | Carbonic anhydrase, putative | 778
PITG_17846 | Non-RXLR | Carbonic anhydrase, putative | 813, 1671, 1714, 122344,
PITG_18670 | RXLR | Secreted RXLR effector peptide, putative | 9231, 10232, 150084
PITG_18675 | RXLR | Secreted RXLR effector peptide, putative | 9231, 10232, 150084
PITG_18683 | RXLR | Avrblb2 family - Secreted RXLR effector peptide, putative | 9231, 10232, 150084
PITG_19232 | RXLR | Secreted RXLR effector peptide, putative | 9231
PITG_19617 | RXLR | Avr2 family - Secreted RXLR effector peptide, putative | 10232
PITG_19800 | RXLR | Secreted RXLR effector peptide, putative | 778
PITG_20301 | RXLR | Avrblb2 family - Secreted RXLR effector peptide, putative | 9231, 10232, 150084
PITG_20303 | RXLR | Avrblb2 family - Secreted RXLR effector peptide, putative | 9231, 10232
PITG_20857 | RXLR | Secreted RXLR effector peptide, putative | 9231, 10232, 150084
PITG_20934 | RXLR | Secreted RXLR effector peptide, putative | 9231
PITG_21778 | RXLR | Secreted RXLR effector peptide, putative | 10232
PITG_22926 | RXLR | Secreted RXLR effector peptide, putative | 702, 813, 1627, 1734, 122344, 150002, 150084
PITG_23026 | RXLR | Secreted RXLR effector peptide, putative | 1734, 813, 9231, 10232, 1627, 1671, 122344
PITG_23126 | RXLR | Secreted RXLR effector peptide, putative | 10232
PITG_23137 | RXLR | Secreted RXLR effector peptide, putative | 9231
PITG_23185 | RXLR | Secreted RXLR effector peptide, putative | 9231, 10232, 150084
PITG_23193 | RXLR | Secreted RXLR effector peptide, putative | 9231, 10232, 150084

[Figure 3.2, top panel: PITG_01904, PITG_04279, PITG_07634, PITG_09771, PITG_10232, PITG_10639, PITG_12721, PITG_13538, PITG_14046, PITG_15712, PITG_15718, PITG_15972 (Avr2 family), PITG_18670, PITG_18675, PITG_18683 (Avrblb2 family). Bottom panel: PITG_19232, PITG_19617 (Avr2 family), PITG_19800, PITG_20301 (Avrblb2 family), PITG_20303 (Avrblb2 family), PITG_20857, PITG_20934, PITG_21778, PITG_22926, PITG_23026, PITG_23126, PITG_23137, PITG_23185, PITG_23193. x-axis: Position (bp); y-axis: log10(Read depth + 1); legend: isolates 702, 778, 813, 1627, 1671, 1714, 1734, 9231, 10232, 122344, 150002, 150084.]

Figure 3.2. Top and bottom panels showing coverage (read depth) of effector genes for which presence/absence polymorphisms were observed. y-axis = log of read depth, x-axis = specific position across the corresponding gene. A one on the y-axis indicates a read depth of 10, and a two indicates a read depth of 100.

PCR validation and sequencing of effector gene amplicons—PCR validation was used to confirm presence/absence polymorphisms in the 12 isolates. The validation set included three genes that were absent from all the isolates (PITG_16663 (Avr1), PITG_21107, PITG_15424), three genes that were present in all the isolates (PITG_14371 (Avr3a), PITG_18215 (Avr3b), PITG_22727), and three genes (PITG_07634, PITG_19800, PITG_10232) for which there were presence/absence polymorphisms (Figure 3.3). For genes that were absent from the isolates, no amplification occurred, and gels had no bands. For genes that were present in all the isolates or in some of the isolates, amplification occurred, bands were seen on gels, and sequencing of the amplicons confirmed that the proper genes of interest were amplified.

[Figure 3.3 panels: A. PITG_07634; B. PITG_10232; C. PITG_19800. Ladder bands marked at 500, 400, 300, 200, and 75 bp.]

Figure 3.3. Gel Red stained agarose gel of the PCR validation of three genes (A-C) showing presence/absence polymorphisms. The order of wells is the same for each row: 1) O’GeneRuler 1 kb Plus Ladder, 2) 702, 3) 778, 4) 813, 5) 1627, 6) 1671, 7) 1714, 8) 1734, 9) 9231, 10) 10232, 11) 122344, 12) 150002, 13) 150084, (-) control.

All but one amplification offered validation and confirmed the results of the PenSeq (Figure 3.3).
PITG_10232 was absent not only when isolate 9231 DNA was used as template, as expected (Table 3.3), but also when isolate 150002 DNA was used as template, even though the PenSeq data did not show the gene to be absent from isolate 150002. This result was observed even after multiple primer sets were designed to amplify PITG_10232, and after many PCR attempts. Following the PCR validation, amplicons were Sanger sequenced. Alignments with the T30-4 reference sequence and BLAST searches showed high percent identity between our isolate amplicons and the corresponding reference sequences, indicating that our primers were correctly amplifying each of the nine target effectors. For PITG_10232, our sequences had 98% identity not only with PITG_10232 but also with PITG_14443. All the isolates contained PITG_14443.

Variant calling analysis and determination of polymorphisms within genes—The average read depth across the variant sites ranged from 80.9 (isolate 813) to 130.6 (isolate 122344) for the 12 isolates (Table 3.1). For the two isolates that were whole genome sequenced, the read depth was 20.9 for FL2009P4 (300556 sites) and 33.6 for BL2009P4 (430592 sites) (Table 3.1). For the 12 isolates that underwent target enrichment re-sequencing, 68/141 of the baited genes encoding non-RXLR proteins contained at least one sequence polymorphism, most often resulting in an amino acid substitution. The gene with the most polymorphisms was PITG_03855, which putatively encodes a DNA-directed RNA polymerase I subunit RPA1. For the two isolates that were whole genome sequenced, 98/141 of the genes that did not encode RXLRs had at least one polymorphism. In the whole-genome sequence data, PITG_17846 had 83 polymorphisms, whereas only eight were collectively present in PITG_17846 in the 12 isolates that underwent target enrichment re-sequencing (Table 3.S1; Table 3.S3).
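The presence/absence calls summarized in Table 3.3 rest on a mean read depth cut-off: a gene is listed for an isolate when its mean read depth is 5 or less. The following is a minimal Python sketch of that rule only. The function names (`mean_depth`, `call_absent`) and the depth values are hypothetical and chosen for illustration; they are not the scripts or data used in this study, and the per-base depths are assumed to have been extracted beforehand from the read alignments.

```python
from statistics import mean

ABSENCE_CUTOFF = 5  # mean read depth of 5 or less, as in Table 3.3


def mean_depth(per_base_depths):
    """Mean read depth across one gene for one isolate."""
    return mean(per_base_depths) if per_base_depths else 0.0


def call_absent(depths_by_isolate, cutoff=ABSENCE_CUTOFF):
    """Return the sorted isolates whose mean depth over the gene is <= cutoff."""
    return sorted(
        isolate
        for isolate, depths in depths_by_isolate.items()
        if mean_depth(depths) <= cutoff
    )


# Hypothetical per-base depths for a single gene (not data from this study).
example_gene = {
    "702": [110, 95, 102, 120],   # well covered, called present
    "9231": [0, 1, 0, 2],         # mean 0.75, called absent
    "10232": [3, 8, 6, 4],        # mean 5.25, just above the cut-off
}

print(call_absent(example_gene))
```

The third isolate in the example shows why a fixed cut-off becomes harder to apply as overall coverage drops: a lower-coverage data set places more genuinely present genes close to the threshold.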
For the genes encoding RXLR effectors, 233/438 genes contained polymorphisms across the 12 isolates that underwent target enrichment sequencing. For the two isolates that underwent whole-genome sequencing, 206/438 genes contained polymorphisms. For both sets of data, the gene with the most polymorphisms was PITG_17871, with 36 polymorphisms within the isolates that were whole-genome sequenced and 29 polymorphisms for the isolates that underwent PenSeq (Table 3.S1). There were a total of 2014 polymorphisms within 579 genes when examining the PenSeq data. Isolate 1627 contained 883 polymorphisms, followed by 9231 with 877. BL2009P4 contained 1475 polymorphisms, whereas FL2009P4 contained 1242 polymorphisms. SnpEff was used to predict the potential effects that the polymorphisms might have on protein function. While many polymorphisms were found to be synonymous changes, others were found to be missense mutations (Table 3.4, Table 3.S3). Polymorphism patterns were examined for six better characterized effectors (Table 3.4). PITG_07550 (Avrsmira1) had seven polymorphisms based on the PenSeq data, but when the whole-genome sequence data were examined, a total of 11 polymorphisms existed between the two strains FL2009P4 and BL2009P4 (Table 3.S1). Both results demonstrate that this gene has considerable sequence variation between the isolates within the US-23 clonal lineage.

Table 3.4. Summary of the variant calling analyses on a selection of characterized effectors. Few effectors have been well-studied in P. infestans, but the effectors below have been better described.

Gene Amino acid substitution Annotation Reference Alt.
0/1 1/1 S19C missense variant A T A/T T 9231 None E80K missense variant G A G/A A 122344, 150002, 1627, 1671, 1714, None 1734, 813, 9231 M103I missense variant G T G/T T 122344, 150002, 1671, None 1734, 9231 L121L synonymous variant T C T/C C PITG_14371- Avr3a 10232, 122344, 150002, 150084, 1627, 1671, 1714, None 1734, 702, 778, 813, 9231 R124G missense variant C G C/G G 10232, 122344, 150002, 150084, 1627, 1671, 1714, None 1734, 702, 778, 813, 9231 R124K missense variant C T C/T T 122344, 150002, 150084, 1627, PITG_18215- Avr3b 1671, 1714, 10232 1734, 702, 778, 813, 9231 G85R missense variant C G C/G G 107 122344, 150002, 150084, 1627, 1671, 1714, 10232 1734, 702, 778, 813, 9231 R41L missense variant C A C/A A 122344, 150002, 150084, 1627, 1671, 1714, 10232 1734, 702, 778, 813, 9231 P107P synonymous variant A G A/G G PITG_16294- 10232, Avrvnt1 122344, 150084, 150002 1627, 1671, 1734, 9231 702, 778, 813 R2R synonymous variant T G T/G G PITG_21388- Avrblb1 122344, 1627, 10232, 1671, 1714, 150002, 1734, 702, 150084, 778, 9231 813 Substitution occurs at p.46. synonymous TCCT CCCA TCCT/CCCA CCCA No other prediction variant 122344, 150084, 1671, 1734 PITG_07550- Avrsmira1 N123N synonymous variant C T C/T T 1714 Q128R missense variant A G A/G G 108 10232, 122344, 150002, 150084, 1627, None 1671, 1714, 1734, 702, 778, 813, 9231 K131R missense variant A G A/G G 10232, 122344, 150002, 150084, None 1627, 1671, 1714, 1734, 702, 778, 813 L162L synonymous variant G A G/A A 10232, 122344, None 150084, 1671, 702, 778, 813 R170Q missense variant G A G/A A 10232, 122344, 150084, 1671, 702, 778, 813 N221D missense variant A G A/G G None 778 N18D missense variant A G A/G G PITG_04314- 150002, PexRd24 150084, 1627, 1734, 702, None 813, 9231 To further verify that the target enrichment sequencing methodology can be used to better understand diversity within the US-23 clonal lineage, PITG_14371 (Avr3a) was also sequenced for each isolate using Sanger sequencing. 
Based on the PenSeq results, there were five heterozygous nucleotide substitutions in Avr3a: c.55A>T (S19C), c.238G>A (E80K), c.309G>T (M103I), c.363T>C (L121L), and c.370C>G (R124G) (Table 3.4). The peak calls had to be visualized because the nucleotide substitutions are heterozygous. The c.55A>T substitution, which was present only in isolate 9231 in the PenSeq data, was also seen only in isolate 9231 in the alignment of the 12 isolates against the T30-4 reference sequence. For c.238G>A (E80K) there was also a mixed signal of G and A. The heterozygous substitution was predicted to be present in 8/12 isolates (Table 3.4), but based on two sequencing runs, the mixed signal made it difficult to differentiate between isolates with and without the predicted substitution. Similarly, for c.309G>T (M103I), which was predicted to be present in five isolates, two sequencing runs did not clearly resolve the five isolates predicted to contain the nucleotide substitution from the remaining seven isolates. For c.363T>C (L121L) and c.370C>G (R124G), all the isolates were predicted to contain both mutations. While the signals were again mixed for c.363 and c.370, the peaks were stronger, and the calls for c.363 indicate that all 12 isolates appear to have the heterozygous substitution T>C. A similar result was seen for c.370C>G: all 12 isolates contained the heterozygous nucleotide substitution C>G, which fit with the PenSeq results (Table 3.4). When comparing the variant calls of the two isolates that were whole genome sequenced, only c.55A>T (S19C), c.363T>C (L121L), and c.370C>G (R124G) were included as variants within PITG_14371, in contrast to the five variants present in the PenSeq dataset (Table 3.4). Within the PenSeq results, for 9231 (syn. BL2009P4), the point mutations c.238G>A (E80K) and c.309G>T (M103I) were present, yet neither BL2009P4 nor FL2009P4 had either mutation, even though BL2009P4 and 9231 are technically the same isolate that was re-cultured and shared among researchers (Table 3.4).

Discussion

Single clonal lineages of the late blight pathogen, such as US-23, continue to devastate tomato and potato fields across the US; however, diversity within a clonal lineage is not fully understood. PenSeq target enrichment sequencing proves to be a powerful tool to ascertain presence/absence polymorphisms among different strains of a rapidly evolving pathogen such as P. infestans, and to examine polymorphisms within genes.

Analysis of presence/absence polymorphisms—Part of what makes PenSeq a powerful tool is the ability to rapidly detect presence/absence polymorphisms in several different strains at once. The high read depth provides greater confidence that differences in the presence and absence of genes among isolates are not due to error. Within the US-23 clonal lineage, there were differences in the presence and absence of genes, providing the first data on effector differences within this lineage. Avr1 (PITG_16663) and PexRD45 (PITG_09632), which belongs to the Avrblb2 family, are absent from the 12 isolates (Table 3.2). Because those effector genes are not present, the corresponding R genes would be ineffective against these isolates. Thilliez et al. (2019) examined a small number of reference isolates from the US, Europe, and elsewhere alongside the T30-4 reference strain. Only T30-4 had Avr1 (Thilliez et al. 2019). R1 is an important potato resistance gene (Ballvora et al. 2002), yet little is known about the role Avr1 plays in US-23 isolates. In terms of the effectors for which there were presence/absence polymorphisms, many have unknown functions, but three belong to the Avrblb2 family (PITG_18683, PITG_20301, and PITG_20303).
Two other effector genes, PITG_15972 and PITG_19617, belong to the Avr2 family (Table 3.3). Finally, in a comparison with 13_A2 (a lineage that emerged and spread rapidly in the UK) and other reference isolates, only the US-23 isolates and the T30-4 strain had PITG_18215 (Avr3b) (Thilliez et al. 2019), suggesting that the potato resistance gene R3b would be effective against isolates in the US-23 clonal lineage.

PCR validation—Validation of results was straightforward when genes were all absent or all present but was more challenging for effector genes that are present in some isolates within US-23 but absent in others. Primers were designed for many of the genes exhibiting presence/absence polymorphisms, and only some of them showed the expected presence/absence patterns across the 12 isolates. Trying to amplify full genes sometimes made it more challenging to amplify the effector of interest because other effectors with very similar sequences would be cross-amplified. Thus, we modified our approach to amplify partial regions of certain effector genes. Primers were designed with the default T30-4 reference genome, but using an assembled US-23 genome could offer greater specificity in primer design.

Variant Calling Analysis—Many of the baited genes had at least one polymorphism when the results were compiled from all 12 isolates. The well-characterized effectors that contained polymorphisms within the genes included Avr3a, Avr3b, Avrvnt1, Avrblb1, Avrsmira1, and PexRd24 (Table 3.4). The variants observed were often identical to those previously described, including within the UK lineage 13_A2, as well as the T30-4 reference isolate. We did not observe any polymorphisms within Avr4, which was noted previously (Thilliez et al. 2019).
While many effectors included in this study that contain within-gene polymorphisms or differences in presence or absence have not been previously characterized, some genes such as PITG_14371 (Avr3a) have been well-characterized in relation to the potato resistance protein R3a (Armstrong et al. 2005; Bos et al. 2006). All 12 isolates contained PITG_14371 (Avr3a), and five polymorphisms were identified within the gene. Four of the five polymorphisms resulted in nonsynonymous amino acid substitutions (Table 3.4). Previously, it was shown that the amino acid substitutions E80K and M103I determine avirulence and virulence in plants that have the corresponding resistance protein R3a. Both amino acid substitutions are necessary to activate the potato resistance protein. Specifically, AVR3aK80I103 (AVR3aKI) but not AVR3aE80M103 (AVR3aEM) activates the resistance protein to trigger ETI, even though the difference is only the two amino acids (Armstrong et al. 2005; Bos et al. 2010). Eight isolates were heterozygous for the point mutation that results in the E80K substitution, and five isolates were heterozygous for the point mutation that results in the M103I substitution (Table 3.4). Therefore, the isolates with both amino acid substitutions that result in the AVR3aKI form (122344, 150002, 1671, 1734, and 9231) may activate R3a and trigger ETI, whereas 150084, 702, 778, and 10232 have only the AVR3aEM form, which is able to evade recognition by R3a, because neither amino acid substitution was predicted in these isolates. Unfortunately, Sanger sequencing proved to be somewhat ambiguous for validating the results because the examined point mutations were heterozygous. While the two isolates that were whole genome sequenced had three nucleotide substitutions in PITG_14371, neither isolate had E80K or M103I.
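The grouping of isolates by their Avr3a calls can be reproduced directly from the isolate lists given for E80K and M103I in Table 3.4. The sketch below is illustrative only: the set names and the `classify` helper are hypothetical, and the labels describe which substitutions were called in an isolate, not a confirmed allele. Because the calls are heterozygous and reported as unphased genotypes (0/1), they do not by themselves show whether K80 and I103 sit on the same allele.

```python
# Isolates called heterozygous (0/1) for each Avr3a substitution in Table 3.4.
ALL_ISOLATES = {"702", "778", "813", "1627", "1671", "1714", "1734",
                "9231", "10232", "122344", "150002", "150084"}
E80K = {"122344", "150002", "1627", "1671", "1714", "1734", "813", "9231"}
M103I = {"122344", "150002", "1671", "1734", "9231"}


def classify(isolate):
    """Label an isolate by which of the two key substitutions were called."""
    has_k80, has_i103 = isolate in E80K, isolate in M103I
    if has_k80 and has_i103:
        return "E80K + M103I"
    if has_k80:
        return "E80K only"
    if has_i103:
        return "M103I only"
    return "neither"


# Group isolates by label, visiting them in numeric order.
groups = {}
for isolate in sorted(ALL_ISOLATES, key=int):
    groups.setdefault(classify(isolate), []).append(isolate)

for label, members in groups.items():
    print(f"{label}: {', '.join(members)}")
```

Run on the Table 3.4 lists, this places 1671, 1734, 9231, 122344, and 150002 in the group with both substitutions and 702, 778, 10232, and 150084 in the group with neither, matching the lists above. The remaining three isolates (813, 1627, and 1714) carry the E80K call without M103I.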
We would expect BL2009P4 to have the substitutions, given that it was supposedly the same isolate as 9231, but instead BL2009P4 had only the other three substitutions (S19C, L121L, and R124G) out of the five that 9231 had (Table 3.1; Table 3.4; Table 3.S3). The availability of two US-23 isolates that were previously whole-genome sequenced, including BL2009P4 (syn. 9231), which also underwent PenSeq target enrichment sequencing as part of this study, provided a useful opportunity to compare results. Given the lower overall read depth across the genome, comparing presence/absence results proved challenging. Determination of polymorphisms within genes was more successful. The results were similar but not identical to the PenSeq results (Table 3.S1). This could potentially be attributed to lower read depth and differences in alignment to the reference genome. Much of the analysis centers on the use of the T30-4 reference genome, which was released in 2009 and has many contigs. A strength of the PenSeq tool is that it has facilitated the detection and confirmation of additional forms of the Avrblb1 and Avrblb2 gene families. For example, for Avrblb1, there were variants that were not observed in the T30-4 genome. Based on the T30-4 genome, PITG_21388, which is the representative gene of Avrblb1 (also called IPIO), was once thought to be a single gene. It was later realized that it is in fact a gene family with allelic variants (Thilliez et al. 2019). It would be worthwhile to obtain a whole genome reference sequence from a long-read platform. This would aid in validation experiments. Examination of US-23 effector diversity using the PenSeq tools paired with a longer read sequencing technology would also offer valuable insights and greater accuracy, since the P. infestans genome is repeat rich.
Furthermore, the PenSeq results do not indicate which of the genes are being expressed, and understanding expression levels of the effectors is a pivotal step. Recently the PenSeq technique was combined with single-molecule real-time sequencing as well as cDNA PenSeq to better differentiate between individual effector alleles and related paralogs, and to better deduce which effectors were being expressed (Lin et al. 2020). Overall, the PenSeq tool provides new insights on effector diversity within the US-23 clonal lineage. The information obtained from the PenSeq analysis helps to provide foundational knowledge regarding diversity of effectors within a clonal lineage.

Acknowledgements

We thank Kevin Myers for assistance obtaining and maintaining the P. infestans isolates, and for his expertise and guidance working with P. infestans.

REFERENCES

2017 Census of Agriculture. 2019. USDA National Agricultural Statistics Service. Available at: www.nass.usda.gov/AgCensus
Armstrong, M.R., Whisson, S.C., Pritchard, L., Bos, J.I.B., Venter, E., Avrova, A.O., Rehmany, A.P., Böhme, U., Brooks, K., Cherevach, I., Hamlin, N., White, B., Fraser, A., Lord, A., Quail, M.A., Churcher, C., Hall, N., Berriman, M., Huang, S., Kamoun, S., Beynon, J.L., Birch, P.R.J. 2005. An ancestral oomycete locus contains late blight avirulence gene Avr3a, encoding a protein that is recognized in the host cytoplasm. Proc. Natl. Acad. Sci. U. S. A. 102:7766–7771.
Ballvora, A., Ercolano, M.R., Weiss, J., Meksem, K., Bormann, C.A., Oberhagemann, P., Salamini, F., Gebhardt, C. 2002. The R1 gene for potato resistance to late blight (Phytophthora infestans) belongs to the leucine zipper/NBS/LRR class of plant resistance genes. Plant J. 30:361–371.
de Bary, A. 1861. Die Gegenwärtig Herrschende Kartoffelkrankheit: Ihre Ursache und Ihre Verhütung. Eine Pflanzenphysiologische Untersuchung in Allgemein Verständlicher Form Dargestellt. Förstner’sche Buchhandlung.
Bos, J.I.B., Armstrong, M.R., Gilroy, E.M., Boevink, P.C., Hein, I., Taylor, R.M., Zhendong, T., Engelhardt, S., Vetukuri, R.R., Harrower, B., Dixelius, C., Bryan, G., Sadanandom, A., Whisson, S.C., Kamoun, S., Birch, P.R.J. 2010. Phytophthora infestans effector AVR3a is essential for virulence and manipulates plant immunity by stabilizing host E3 ligase CMPG1. Proc. Natl. Acad. Sci. U. S. A. 107:9909–9914.
Bos, J.I.B., Kanneganti, T.-D., Young, C., Cakir, C., Huitema, E., Win, J., Armstrong, M.R., Birch, P.R.J., Kamoun, S. 2006. The C-terminal half of Phytophthora infestans RXLR effector AVR3a is sufficient to trigger R3a-mediated hypersensitivity and suppress INF1-induced cell death in Nicotiana benthamiana. Plant J. 48:165–176.
Bozkurt, T.O., Schornack, S., Win, J., Shindo, T., Ilyas, M., Oliva, R., Cano, L.M., Jones, A.M.E., Huitema, E., van der Hoorn, R.A.L., Kamoun, S. 2011. Phytophthora infestans effector AVRblb2 prevents secretion of a plant immune protease at the haustorial interface. Proc. Natl. Acad. Sci. U. S. A. 108:20832–20837.
Dou, D., Kale, S.D., Wang, Xinle, Chen, Y., Wang, Q., Wang, Xia, Jiang, R.H.Y., Arredondo, F.D., Anderson, R.G., Thakur, P.B., McDowell, J.M., Wang, Y., Tyler, B.M. 2008. Conserved C-Terminal motifs required for avirulence and suppression of cell death by Phytophthora sojae effector Avr1b. The Plant Cell. 20:1118–1133.
Du, Y., Berg, J., Govers, F., and Bouwmeester, K. 2015. Immune activation mediated by the late blight resistance protein R1 requires nuclear localization of R1 and the effector AVR1. New Phytol. 207:735–747.
Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
FAOSTAT-Crops. 2019. Food and Agriculture Organization of the United Nations. Available at: http://www.fao.org/faostat/en/#data/QC [Accessed September 27, 2018].
Fry, W.E., Birch, P.R.J., Judelson, H.S., Grünwald, N.J., Danies, G., Everts, K.L., Gevens, A.J., Gugino, B.K., Johnson, D.A., Johnson, S.B., McGrath, M.T., Myers, K.L., Ristaino, J.B., Roberts, P.D., Secor, G., Smart, C.D. 2015. Five Reasons to Consider Phytophthora infestans a reemerging pathogen. Phytopathology. 105:966–981.
Fry, W.E., McGrath, M.T., Seaman, A., Zitter, T.A., McLeod, A., Danies, G., Small, I.M., Myers, K., Everts, K., Gevens, A.J., Gugino, B.K., Johnson, S.B., Judelson, H., Ristaino, J., Roberts, P., Secor, G., Seebold, K., Snover-Clift, K., Wyenandt, A., Grünwald, N.J., Smart, C.D. 2013. The 2009 late blight pandemic in the Eastern United States – causes and results. Plant Dis. 97:296–306.
Gao, C., Xu, H., Huang, J., Sun, B., Zhang, F., Savage, Z., Duggan, C., Yan, T., Wu, C., Wang, Y., Vleeshouwers, V.G.A.A., Kamoun, S., Bozkurt, T.O., Dong, S. 2020. Pathogen manipulation of chloroplast function triggers a light-dependent immune recognition. Proc. Natl. Acad. Sci. U. S. A. 117:9613–9620.
Garrison, E., and Marth, G. 2012. Haplotype-based variant detection from short-read sequencing. arXiv:1207.3907 [q-bio]. Available at: http://arxiv.org/abs/1207.3907 [Accessed June 29, 2021].
Gilroy, E.M., Breen, S., Whisson, S.C., Squires, J., Hein, I., Kaczmarek, M., Turnbull, D., Boevink, P.C., Lokossou, A., Cano, L.M., Morales, J., Avrova, A.O., Pritchard, L., Randall, E., Lees, A., Govers, F., West, P. van, Kamoun, S., Vleeshouwers, V.G.A.A., Cooke, D.E.L., Birch, P.R.J. 2011. Presence/absence, differential expression, and sequence polymorphisms between PiAVR2 and PiAVR2-like in Phytophthora infestans determine virulence on R2 plants. New Phytologist. 191:763–776.
Goodwin, S. B., Cohen, B. A., and Fry, W. E. 1994. Panglobal distribution of a single clonal lineage of the Irish potato famine fungus. Proc. Natl. Acad. Sci. U. S. A. 91:11591–11595.
Haas, B.J., Kamoun, S., Zody, M.C., Jiang, R.H.Y., Handsaker, R.E., Cano, L.M., Grabherr, M., Kodira, C.D., Raffaele, S., Torto-Alalibo, T., Bozkurt, T.O., Ah-Fong, A.M.V., Alvarado, L., Anderson, V.L., Armstrong, M.R., Avrova, A., Baxter, L., Beynon, J., Boevink, P.C., Bollmann, S.R., Bos, J.I.B., Bulone, V., Cai, G., Cakir, C., Carrington, J.C., Chawner, M., Conti, L., Costanzo, S., Ewan, R., Fahlgren, N., Fischbach, M.A., Fugelstad, J., Gilroy, E.M., Gnerre, S., Green, P.J., Grenville-Briggs, L.J., Griffith, J., Grünwald, N.J., Horn, K., Horner, N.R., Hu, C.-H., Huitema, E., Jeong, D.-H., Jones, A.M.E., Jones, J.D.G., Jones, R.W., Karlsson, E.K., Kunjeti, S.G., Lamour, K., Liu, Z., Ma, L., MacLean, D., Chibucos, M.C., McDonald, H., McWalters, J., Meijer, H.J.G., Morgan, W., Morris, P.F., Munro, C.A., O’Neill, K., Ospina-Giraldo, M., Pinzón, A., Pritchard, L., Ramsahoye, B., Ren, Q., Restrepo, S., Roy, S., Sadanandom, A., Savidor, A., Schornack, S., Schwartz, D.C., Schumann, U.D., Schwessinger, B., Seyer, L., Sharpe, T., Silvar, C., Song, J., Studholme, D.J., Sykes, S., Thines, M., van de Vondervoort, P.J.I., Phuntumart, V., Wawra, S., Weide, R., Win, J., Young, C., Zhou, S., Fry, W., Meyers, B.C., van West, P., Ristaino, J., Govers, F., Birch, P.R.J., Whisson, S.C., Judelson, H.S., Nusbaum, C. 2009. Genome sequence and analysis of the Irish potato famine pathogen Phytophthora infestans. Nature. 461:393–398.
Hansen, Z.R., Everts, K.L., Fry, W.E., Gevens, A.J., Grünwald, N.J., Gugino, B.K., Johnson, D.A., Johnson, S.B., Judelson, H.S., Knaus, B.J., McGrath, M.T., Myers, K.L., Ristaino, J.B., Roberts, P.D., Secor, G.A., Smart, C.D. 2016. Genetic variation within clonal lineages of Phytophthora infestans revealed through genotyping-by-sequencing, and implications for late blight epidemiology. PLOS ONE. 11:e0165690.
Jiang, R. H. Y., Tripathy, S., Govers, F., and Tyler, B. M. 2008. RXLR effector reservoir in two Phytophthora species is dominated by a single rapidly evolving superfamily with more than 700 members. Proc. Natl. Acad. Sci. U. S. A. 105:4874–4879.
Kamoun, S., Furzer, O., Jones, J.D.G., Judelson, H.S., Ali, G.S., Dalio, R.J.D., Roy, S.G., Schena, L., Zambounis, A., Panabières, F., Cahill, D., Ruocco, M., Figueiredo, A., Chen, X.-R., Hulvey, J., Stam, R., Lamour, K., Gijzen, M., Tyler, B.M., Grünwald, N.J., Mukhtar, M.S., Tomé, D.F.A., Tör, M., Van Den Ackerveken, G., McDowell, J., Daayf, F., Fry, W.E., Lindqvist-Kreuze, H., Meijer, H.J.G., Petre, B., Ristaino, J., Yoshida, K., Birch, P.R.J., Govers, F. 2015. The top 10 oomycete pathogens in molecular plant pathology. Mol. Plant Pathol. 16:413–434.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., Drummond, A. 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 28:1647–1649.
Knaus, B. J., and Grünwald, N. J. 2017. vcfr: a package to manipulate and visualize variant call format data in R. Molecular Ecology Resources. 17:44–53.
Knaus, B. J., Tabima, J. F., Davis, C. E., Judelson, H. S., and Grünwald, N. J. 2016. Genomic analyses of dominant US clonal lineages of Phytophthora infestans reveals a shared common ancestry for clonal lineages US11 and US18 and a lack of recently shared ancestry among all other US lineages. Phytopathology. 106:1393–1403.
Langmead, B., and Salzberg, S. L. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods. 9:357–359.
Lee, H.-A., Kim, S.-Y., Oh, S.-K., Yeom, S.-I., Kim, S.-B., Kim, M.-S., Kamoun, S., Choi, D. 2014. Multiple recognition of RXLR effectors is associated with nonhost resistance of pepper against Phytophthora infestans. New Phytol. 203:926–938.
Li, H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio]. Available at: http://arxiv.org/abs/1303.3997
Li, Y., Cooke, D. E. L., Jacobsen, E., and van der Lee, T. 2013. Efficient multiplex simple sequence repeat genotyping of the oomycete plant pathogen Phytophthora infestans. J. Microbiol. Methods. 92:316–322.
Lin, X., Armstrong, M., Baker, K., Wouters, D., Visser, R.G.F., Wolters, P.J., Hein, I., Vleeshouwers, V.G.A.A. 2020. Identification of Avramr1 from Phytophthora infestans using long read and cDNA pathogen-enrichment sequencing (PenSeq). Mol Plant Pathol. 21:1502–1512.
Martin, M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 17:10–12.
Martin, M.D., Cappellini, E., Samaniego, J.A., Zepeda, M.L., Campos, P.F., Seguin-Orlando, A., Wales, N., Orlando, L., Ho, S.Y.W., Dietrich, F.S., Mieczkowski, P.A., Heitman, J., Willerslev, E., Krogh, A., Ristaino, J.B., Gilbert, M.T.P. 2013. Reconstructing genome evolution in historic samples of the Irish potato famine pathogen. Nat Commun. 4:2172.
Oh, S.-K., Young, C., Lee, M., Oliva, R., Bozkurt, T.O., Cano, L.M., Win, J., Bos, J.I.B., Liu, H.-Y., Damme, M. van, Morgan, W., Choi, D., Vossen, E.A.G.V. der, Vleeshouwers, V.G.A.A., Kamoun, S. 2009. In planta expression screens of Phytophthora infestans RXLR effectors reveal diverse phenotypes, including activation of the Solanum bulbocastanum disease resistance protein Rpi-blb2. The Plant Cell. 21:2928–2947.
Quinlan, A. R., and Hall, I. M. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26:841–842.
Rietman, H., Bijsterbosch, G., Cano, L.M., Lee, H.-R., Vossen, J.H., Jacobsen, E., Visser, R.G.F., Kamoun, S., Vleeshouwers, V.G.A.A. 2012. Qualitative and quantitative late blight resistance in the potato cultivar Sarpo Mira is determined by the perception of five distinct RXLR effectors. Mol Plant Microbe Interact. 25:910–919.
Saville, A., and Ristaino, J. B. 2019. Genetic structure and subclonal variation of extant and recent US lineages of Phytophthora infestans. Phytopathology. 109:1614–1627.
Thilliez, G.J.A., Armstrong, M.R., Lim, T.-Y., Baker, K., Jouet, A., Ward, B., van Oosterhout, C., Jones, J.D.G., Huitema, E., Birch, P.R.J., Hein, I. 2019. Pathogen enrichment sequencing (PenSeq) enables population genomic studies in oomycetes. New Phytol. 221:1634–1648.
Thorvaldsdóttir, H., Robinson, J. T., and Mesirov, J. P. 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 14:178–192.
Vleeshouwers, V.G.A.A., Rietman, H., Krenek, P., Champouret, N., Young, C., Oh, S.-K., Wang, M., Bouwmeester, K., Vosman, B., Visser, R.G.F., Jacobsen, E., Govers, F., Kamoun, S., der Vossen, E.A.G.V. 2008. Effector genomics accelerates discovery and functional profiling of potato disease resistance and Phytophthora infestans avirulence genes. PLoS One. 3:e2875.
Whisson, S.C., Boevink, P.C., Moleleki, L., Avrova, A.O., Morales, J.G., Gilroy, E.M., Armstrong, M.R., Grouffaud, S., van West, P., Chapman, S., Hein, I., Toth, I.K., Pritchard, L., Birch, P.R.J. 2007. A translocation signal for delivery of oomycete effector proteins into host plant cells. Nature. 450:115–118.
Wickham, H. 2016. ggplot2: Elegant Graphics for Data Analysis. 2nd ed. Springer International Publishing. Available at: http://www.springer.com/gp/book/9783319242750
Win, J., Morgan, W., Bos, J., Krasileva, K.V., Cano, L.M., Chaparro-Garcia, A., Ammar, R., Staskawicz, B.J., Kamoun, S. 2007. Adaptive evolution has targeted the C-Terminal domain of the RXLR effectors of plant pathogenic oomycetes. Plant Cell. 19:2349–2369.
Zheng, X., McLellan, H., Fraiture, M., Liu, X., Boevink, P.C., Gilroy, E.M., Chen, Y., Kandel, K., Sessa, G., Birch, P.R.J., Brunner, F. 2014.
Functionally redundant RXLR effectors from Phytophthora infestans act at different steps to suppress early flg22-triggered immunity. PLOS Pathogens. 10:e1004057.

SUPPLEMENTAL MATERIALS

Table 3.S1. Table of the 579 baited effector genes and data on presence/absence polymorphisms and variants within each gene. The non-RXLR effectors are at the top, followed by the RXLR effectors. P. infestans PenSeq metadata table, adapted from supplemental Table S1 in Thilliez et al. 2019.

Gene ID | PenSeq annotation | T30-4 annotation | Gene length in base pairs | Presence/Absence polymorphisms | Number of SNPs located within genes (PenSeq data) | Number of SNPs located within genes (WGS data)
PITG_00004T0 | NON-RXLR | Protein kinase, putative | 1845 | All present | None | None
PITG_00058 | NON-RXLR | Protease inhibitor EpiC4 | 519 | All present | 2 | None
PITG_00082T0 | NON-RXLR | Monovalent Cation:Proton Antiporter-2 (CPA2) family | 2088 | All present | None | 2
PITG_00156 | NON-RXLR | Beta-tubulin | 1341 | All present | 2 | 2
PITG_00329 | NON-RXLR | Major Facilitator Superfamily (MFS) | 1530 | All present | 3 | 4
PITG_00375 | NON-RXLR | Haustorium-specific membrane protein, putative | 3072 | All present | 1 | 1
PITG_00494T0 | NON-RXLR | Drug/Metabolite Transporter (DMT) Superfamily | 1281 | All present | None | 4
PITG_00957 | NON-RXLR | NPP1-like protein | 897 | All present | 2 | 2
PITG_01112 | NON-RXLR | UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase | 609 | All present | 4 | 5
PITG_01195 | NON-RXLR | Aspartate aminotransferase, putative | 2001 | All present | 2 | 2
PITG_01369T0 | NON-RXLR | Protease inhibitor Epi2 | 453 | All present | None | 6
PITG_01695 | NON-RXLR | Conserved hypothetical protein | 1173 | All present | 2 | 2
PITG_02174T0 | NON-RXLR | Ammonium Transporter (Amt) Family | 1539 | All present | None | 2
PITG_02175T0 | NON-RXLR | Ammonium Transporter (Amt) Family | 1506 | All present | None | None
PITG_02277T0 | NON-RXLR | Cyclopropane-fatty-acyl-phospholipid synthase | 1233 | All present | None | 7
PITG_02364 | NON-RXLR | P-type ATPase (P-ATPase) Superfamily | 2292 | All present | None | None
PITG_02420 | NON-RXLR | DNA-directed RNA polymerase I subunit RPA2, putative | 3540 | All present | 14 | 12
PITG_02561T0 | NON-RXLR | Ubiquitin-specific protease, putative | 7869 | All present | None | 43
PITG_02706T0 | NON-RXLR | NUK6 | 2166 | All present | None | 14
PITG_03167T0 | NON-RXLR | Major Facilitator Superfamily (MFS) | 1320 | All present | None | 1
PITG_03205T0 | NON-RXLR | Pro-apoptotic serine protease, putative | 2901 | All present | None | None
PITG_03264T0 | NON-RXLR | Monovalent Cation:Proton Antiporter-1 (CPA1) Family | 1608 | All present | None | 5
PITG_03265T0 | NON-RXLR | Monovalent Cation:Proton Antiporter-1 (CPA1) Family | 1755 | All present | None | 5
PITG_03540 | NON-RXLR | Major Facilitator Superfamily (MFS) | 1503 | All present | 4 | 2
PITG_03855 | NON-RXLR | DNA-directed RNA polymerase I subunit RPA1, putative | 5436 | All present | 34 | 23
PITG_03919 | NON-RXLR | Ribokinase, putative | 975 | All present | 8 | 8
PITG_03961 | NON-RXLR | Conserved hypothetical protein | 1032 | All present | 3 | 2
PITG_03966T0 | NON-RXLR | NhaC Na+:H+ Antiporter (NhaC) Family | 2055 | All present | None | 3
PITG_04011 | NON-RXLR | Tetraacyldisaccharide 4'-kinase, putative | 1083 | All present | 1 | 1
PITG_04071 | NON-RXLR | Putative extracellular dioxygenase | 1131 | All present | None | None
PITG_04213 | NON-RXLR | Conserved hypothetical protein | 300 | All present | None | None
PITG_04447 | NON-RXLR | Conserved hypothetical protein | 2076 | All present | 11 | 11
PITG_04475 | NON-RXLR | Hypothetical protein | 333 | All present | None | None
PITG_04844T0 | NON-RXLR | Folate-Biopterin Transporter (FBT) family | 1452 | All present | None | 5
PITG_04949 | NON-RXLR | Conserved hypothetical protein | 1014 | All present | 1 | 1
PITG_05412T0 | NON-RXLR | Ca2+:Cation Antiporter (CaCA) Family | 2040 | All present | None | 8
PITG_05430T0 | NON-RXLR | Kazal-type serine protease inhibitor, putative | 900 | All present | None | 15
PITG_05437T0 | NON-RXLR | Epi6-like protease inhibitor | 966 | All present | None | 2
PITG_05440T0 | NON-RXLR | Protease inhibitor Epi6 | 969 | All present | None | 4
PITG_05498 | NON-RXLR | Heat shock protein 90, putative | 2472 | All present | 14 | 11
PITG_05854 | NON-RXLR | Conserved hypothetical protein | 324 | All present | 2 | 1
PITG_05902 | NON-RXLR | Major Facilitator | 1533 | All present | 9
10 Superfamily (MFS) PITG_06212 NON-RXLR Conserved hypothetical protein 699 All present 6 5 PITG_06600 NON-RXLR Conserved hypothetical protein 4464 All present 30 37 PITG_06706 NON-RXLR Conserved hypothetical 378 All present 1 5 protein Membrane-bound PITG_06885T0 NON-RXLR transcription factor site-1 2898 All present None 7 protease, putative PITG_07096T0 NON-RXLR Protease inhibitor Epi11 1335 All present None None Acyl-[acyl-carrier- PITG_07121 NON-RXLR protein]-UDP-N-acetylglucosamine O- 1083 All present 6 7 acyltransferase PITG_07143 NON-RXLR Catalase-peroxidase, putative 2073 All present 9 2 3-deoxy-manno- PITG_07240 NON-RXLR octulosonate cytidylyltransferase, 768 All present 3 3 putative PITG_07452T0 NON-RXLR Protease inhibitor Epi12 246 All present None 1 Major Facilitator PITG_07721T0 NON-RXLR Superfamily (MFS) 873 All present None 8 PITG_09169 NON-RXLR Protease inhibitor EpiC1 381 All present 1 1 121 PITG_09173 NON-RXLR Protease inhibitor EpiC2B 378 All present None None PITG_09175 NON-RXLR Protease inhibitor EpiC2A 378 All present 1 None PITG_09408T0 NON-RXLR UDP-N-acetylglucosamine 816 All present None None transporter, putative PITG_09409T0 NON-RXLR Transmembrane protein, putative 1524 All present None 5 DNA-directed RNA PITG_09712 NON-RXLR polymerase I, II, and III 429 All present 1 2 subunit rpabc3, putative PITG_09963 NON-RXLR Cellulose synthase catalytic subunit, putative 3246 All present 2 1 PITG_09964 NON-RXLR Cellulose synthase 2 3081 All present None None Drug/Metabolite PITG_10012T0 NON-RXLR Transporter (DMT) 1083 All present None 6 Superfamily PITG_10226T0 NON-RXLR Ammonium Transporter (Amt) Family 1557 All present None None DNA-directed RNA PITG_10445 NON-RXLR polymerase I, II, and III 621 All present 2 2 subunit RPABC1, putative PITG_10462 NON-RXLR Conserved hypothetical protein 3087 All present 4 9 PITG_10544 NON-RXLR Putative GPI-anchored 492 All present 4 5 acidic protein PITG_10891T0 NON-RXLR Inositol transporter, putative 1467 All 
present None 7 Mitochondrial inner PITG_10994T0 NON-RXLR membrane protease 495 All present None 2 subunit, putative PITG_11116 NON-RXLR NAD-specific glutamate dehydrogenase, putative 3165 All present 11 8 PITG_11365 NON-RXLR Conserved hypothetical 1290 All present None None protein PITG_11454 NON-RXLR Dynamin 2202 All present 9 9 PITG_11489T0 NON-RXLR Fatty-acid-CoA ligase, putative 3396 All present None 3 PITG_11753 NON-RXLR Conserved hypothetical 624 All present 2 None protein PITG_11891 NON-RXLR Conserved hypothetical protein 1008 All present None None UDP-3-O-[3- PITG_11976 NON-RXLR hydroxymyristoyl] N-acetylglucosamine 942 All present 4 4 deacetylase, putative PITG_12129T0 NON-RXLR Protease inhibitor Epi10 687 All present None 1 PITG_12131T0 NON-RXLR Protease inhibitor Epi4 957 All present None 3 PITG_12138T0 NON-RXLR Kazal-type serine protease 1263 All present None 5 inhibitor, putative PITG_12462 NON-RXLR Aconitate hydratase, putative 2103 All present 6 5 PITG_12808 NON-RXLR Amino Acid/Auxin Permease (AAAP) Family 1485 All present 5 14 DNA-directed RNA PITG_12877 NON-RXLR polymerase I, II, and III 411 All present 3 6 subunit RPABC2, putative PITG_13063 NON-RXLR Purine-cytosine permease, 1374 All present 8 18 putative PITG_13292T0 NON-RXLR Protease inhibitor Epi9 243 All present None None PITG_13473 NON-RXLR Major Facilitator Superfamily (MFS) 1407 All present 7 6 PITG_13567 NON-RXLR Endo-1,3(4)-beta-glucanase, putative 2700 All present 7 None 122 PITG_13661 NON-RXLR Conserved hypothetical protein 1053 All present 4 4 PITG_13849T0 NON-RXLR GDP-mannose transporter, putative 1020 All present None None PITG_13917 NON-RXLR Lipid-A-disaccharide 1143 All present 4 9 synthase, putative PITG_14412 NON-RXLR Carbonic anhydrase, putative 819 All present 5 17 PITG_14583 NON-RXLR Conserved hypothetical protein 1317 All present 9 8 PITG_14613 NON-RXLR Conserved hypothetical 696 All present 2 6 protein PcF and SCR74-like cys- PITG_14645 NON-RXLR rich Secreted peptide, 444 
All present None None putative PITG_14710 NON-RXLR Transmembrane protein, putative 1116 All present None None PITG_14712 NON-RXLR Cell division protease ftsH 1308 All present 9 8 PITG_14720 NON-RXLR Aldose 1-epimerase, 897 All present 1 1 putative PITG_14833 NON-RXLR Conserved hypothetical protein 1803 All present None None PITG_14891 NON-RXLR Protease inhibitor EpiC3 396 All present 2 2 PITG_14993 NON-RXLR Protein-L-isoaspartate O- 720 All present 4 4 methyltransferase, putative RNA polymerase I- PITG_15566 NON-RXLR specific transcription 2364 All present 6 6 initiation factor rrn3, putative PITG_16349 NON-RXLR Conserved hypothetical 942 All present 1 1 protein DNA-directed RNA PITG_16658 NON-RXLR polymerases I and III 40 915 All present None None kDa polypeptide DNA-directed RNA PITG_16659 NON-RXLR polymerase I and III 1020 All present None None subunit RPAC1, putative PITG_16827T0 NON-RXLR Protease inhibitor Epi3 270 Absent-778, 10232, 150084 None None PITG_16865 NON-RXLR Aldose 1-epimerase, putative 993 All present 6 5 PITG_16984 NON-RXLR Cellulose synthase 4 3060 All present 9 7 Cellulose synthase PITG_17007 NON-RXLR catalytic subunit [UDP- 3429 All present 1 1 forming], putative PITG_17141T0 NON-RXLR Major Facilitator Superfamily (MFS) 1326 All present None 8 PITG_17142T0 NON-RXLR Major Facilitator Superfamily (MFS) 1380 All present None 8 2-dehydro-3- PITG_17172 NON-RXLR deoxyphosphooctonate 837 All present 4 4 aldolase PITG_17501 NON-RXLR Glucosylceramidase, 1620 All present None None putative PITG_17837 NON-RXLR Carbonic anhydrase, putative 735 All present 2 1 Carbonic anhydrase, PITG_17842 NON-RXLR putative 810 Absent-778 4 21 PITG_17846 NON-RXLR Carbonic anhydrase, putative 1377 Absent-813, 1671, 1714, 122344, 8 83 PITG_17848 NON-RXLR Carbonic anhydrase, putative 810 All present 8 None PITG_18284 NON-RXLR Carbonic anhydrase, putative 810 All present None None Phospholipid PITG_18316 NON-RXLR hydroperoxide glutathione 3312 All present 2 2 123 peroxidase, 
putative 3-isopropylmalate PITG_18538 NON-RXLR dehydratase large subunit, 2136 All present 6 1 putative DNA-directed RNA PITG_18777 NON-RXLR polymerase I and III 1032 All present None None subunit RPAC1, putative PITG_19005 NON-RXLR Conserved hypothetical protein 2118 All present 14 16 PITG_19286T0 NON-RXLR Major Facilitator 1302 All present None 1 Superfamily (MFS) Major Facilitator PITG_19866T0 NON-RXLR Superfamily (MFS)-like 357 All present None 1 protein PITG_20031T0 NON-RXLR UDP-sugar transporter, putative 1050 All present None 6 PITG_20786 NON-RXLR Lipase, putative 1425 All present 6 5 Metal Ion (Mn2 -iron) PITG_21098T0 NON-RXLR Transporter (Nramp) 915 All present None None Family PITG_21328 NON-RXLR Hypothetical protein 699 All present 2 None PITG_21644T0 NON-RXLR UDP-N-acetylglucosamine transporter, putative 1149 All present None 3 PITG_21980 NON-RXLR Carbonic anhydrase, putative 570 All present 3 1 PITG_22681T0 NON-RXLR Protease inhibitor Epi1 450 All present None None PITG_22692T0 NON-RXLR Epi2-like protease inhibitor 270 All present None 3 PITG_22739T0 NON-RXLR Epi5-like protease inhibitor 246 All present None None PITG_22881 NON-RXLR Protease inhibitor EpiC 327 All present None None PITG_22920T0 NON-RXLR Epi12-like protease inhibitor 243 All present None None PITG_22936T0 NON-RXLR Epi2-like protease inhibitor 516 All present None None PITG_22950T0 NON-RXLR Protease inhibitor Epi7 423 All present None 4 PITG_22995T0 NON-RXLR Protease inhibitor Epi5 267 All present None None PITG_23012T0 NON-RXLR Kazal-like protease inhibitor 276 All present None None PITG_23032T0 NON-RXLR Protease inhibitor Epi8 465 All present None 1 PITG_23065T0 NON-RXLR General transcription factor IIH subunit, putative 495 All present None 2 PITG_23119T0 NON-RXLR Epi1-like protease inhibitor 441 All absent None None PITG_23123 NON-RXLR Small cysteine rich protein 153 All present 1 None SCR50 PITG_23147T0 NON-RXLR Kazal-like protease inhibitor 276 All present None None PITG_23195T0 
NON-RXLR Epi9-like protease inhibitor 282 All absent None None PITG_00366 RXLR Secreted RXLR effector 390 All present 4 5 peptide, putative PITG_00582 RXLR Secreted RXLR effector peptide, putative 648 All present None None PITG_00619 RXLR Secreted RXLR effector peptide, putative 411 All present 5 5 PITG_00774 RXLR Secreted RXLR effector 486 All present 3 3 peptide, putative PITG_00821 RXLR Secreted RXLR effector peptide, putative 294 All present 1 None Secreted RXLR effector PITG_01724 RXLR peptide, putative 504 All present 3 1 124 PITG_01904 RXLR Secreted RXLR effector peptide, putative 357 Absent-10232 2 9 PITG_01905 RXLR Secreted RXLR effector peptide, putative 309 All present 1 2 PITG_01907 RXLR Secreted RXLR effector 1011 All present 5 6 peptide, putative PITG_01934 RXLR Secreted RXLR effector peptide, putative 399 All present 2 1 PITG_02387 RXLR Secreted RXLR effector peptide, putative 327 All present None None PITG_02830 RXLR Secreted RXLR effector 357 All present None None peptide, putative PITG_02843 RXLR Secreted RXLR effector peptide, putative 381 All present 2 2 PITG_02897 RXLR Secreted RXLR effector peptide, putative 402 All present None 1 PITG_02900 RXLR Secreted RXLR effector 477 All present None None peptide, putative PITG_03155 RXLR Secreted RXLR effector peptide, putative 750 All present 1 1 PITG_03192 RXLR Secreted RXLR effector peptide, putative 435 All present 1 None PITG_04050 RXLR Secreted RXLR effector 306 All present None 3 peptide, putative PITG_04055 RXLR Secreted RXLR effector peptide, putative 1098 All present 1 None PITG_04081 RXLR Secreted RXLR effector peptide, putative 318 All present None None Avrblb2 family Secreted PITG_04085 RXLR RXLR effector peptide, 303 All present None None putative PITG_04089 RXLR Secreted RXLR effector 309 All present 1 None peptide, putative Avrblb2 family Secreted PITG_04090 RXLR RXLR effector peptide, 303 All present None None putative PITG_04097 RXLR Secreted RXLR effector peptide, putative 324 All 
present None None PITG_04099 RXLR Secreted RXLR effector peptide, putative 609 All present None None PITG_04139 RXLR Secreted RXLR effector 435 All present None None peptide, putative PITG_04145 RXLR Secreted RXLR effector peptide, putative 387 All present 1 1 PITG_04148 RXLR Secreted RXLR effector peptide, putative 435 All present None None PITG_04153 RXLR Secreted RXLR effector 270 All present None None peptide, putative PITG_04164 RXLR Secreted RXLR effector peptide, putative 360 All present None None PITG_04167 RXLR Secreted RXLR effector peptide, putative 492 All present 1 None PITG_04169 RXLR Secreted RXLR effector peptide, putative 363 All present None None PITG_04178 RXLR Secreted RXLR effector peptide, putative 270 All present None 2 Secreted RXLR effector PITG_04182 RXLR peptide, putative 357 All present None 1 PITG_04194 RXLR Secreted RXLR effector peptide, putative 324 All present None None PITG_04196 RXLR Secreted RXLR effector peptide, putative 423 All present 1 1 PITG_04279 RXLR Secreted RXLR effector Absent-10232 lower peptide, putative- 288 read depth 2 1 Secreted RXLR effector PITG_04290 RXLR peptide, putative 462 All present None None PITG_04300 RXLR Secreted RXLR effector peptide, putative 678 All present 1 8 125 PITG_04314 RXLR PexRd24-Secreted RXLR effector peptide, putative 465 All present 1 1 PITG_04339 RXLR Secreted RXLR effector peptide, putative 723 All present None None PITG_04351 RXLR Secreted RXLR effector 318 All present 1 5 peptide, putative PITG_04354 RXLR Secreted RXLR effector peptide, putative 441 All present None 11 PITG_04355 RXLR Secreted RXLR effector peptide, putative 441 All present 2 7 PITG_04373 RXLR Secreted RXLR effector 426 All present None None peptide, putative PITG_04388 RXLR Secreted RXLR effector peptide, putative 918 All present 1 None PITG_05014 RXLR Secreted RXLR effector peptide, putative 2052 All present 5 4 PITG_05072 RXLR Secreted RXLR effector 408 Absent-Read depth None None peptide, putative, 3' partial 
less than 10 PITG_05074 RXLR Secreted RXLR effector peptide, putative, 3' partial 693 All present None None PITG_05118 RXLR Secreted RXLR effector peptide, putative 315 All present 1 None Avr2 family Secreted PITG_05121 RXLR RXLR effector peptide, 342 All present 4 4 putative PITG_05146 RXLR Secreted RXLR effector 414 All present 2 2 peptide, putative PITG_05750 RXLR Secreted RXLR effector peptide, putative 495 All present None None PITG_05751 RXLR Secreted RXLR effector peptide, putative 384 All present 4 4 PITG_05771 RXLR Secreted RXLR effector 855 All present 3 3 peptide, putative PITG_05841 RXLR Secreted RXLR effector peptide, putative 663 All present 5 7 PITG_05846 RXLR Secreted RXLR effector peptide, putative 789 All present 7 7 PITG_05911 RXLR Secreted RXLR effector 408 All present 3 None peptide (Avh9.1), putative PITG_05912 RXLR Secreted RXLR effector All present, but peptide (Avh9.1), putative 408 reduced read depth in 1 None the middle PITG_05918 RXLR Secreted RXLR effector peptide, putative 408 All present 5 2 PITG_05978 RXLR Secreted RXLR effector peptide, putative 396 All present 1 1 PITG_05980 RXLR Secreted RXLR effector 486 All present None None peptide, putative PITG_05981 RXLR Secreted RXLR effector peptide, putative 285 All present None None PITG_05983 RXLR Secreted RXLR effector peptide, putative 456 All present None None PITG_06030 RXLR Secreted RXLR effector peptide, putative 627 All present 3 3 PITG_06059 RXLR Secreted RXLR effector peptide, putative 507 All present None None Secreted RXLR effector PITG_06071 RXLR peptide, putative 348 All present None None PITG_06074 RXLR Secreted RXLR effector peptide, putative 507 All present 1 1 PITG_06076 RXLR Secreted RXLR effector peptide, putative 510 All present 6 4 Avr2 family Secreted PITG_06077 RXLR RXLR effector peptide, 357 All present 1 None putative Secreted RXLR effector PITG_06083 RXLR peptide, putative 309 All present None None PITG_06087 RXLR Secreted RXLR effector 384 All present None 
None 126 peptide, putative PITG_06092 RXLR Secreted RXLR effector peptide, putative 288 All present None None PITG_06094 RXLR Secreted RXLR effector 489 All present None None peptide, putative PITG_06099 RXLR Secreted RXLR effector peptide, putative 489 All present None None PITG_06246 RXLR Secreted RXLR effector All present-all reads peptide, putative 1578 in the middle 6 21 PITG_06290 RXLR Secreted RXLR effector 483 All present None None peptide, putative PITG_06305 RXLR Secreted RXLR effector peptide, putative 276 All present None None PITG_06308 RXLR Secreted RXLR effector peptide, putative 765 All present None None PITG_06375 RXLR Secreted RXLR effector 1407 All present 4 2 peptide, putative PITG_06413 RXLR Secreted RXLR effector peptide, putative 339 All present None None PITG_06419 RXLR Secreted RXLR effector peptide, putative 699 All present 2 None PITG_06432 RXLR Secreted RXLR effector 513 All present 2 2 peptide, putative PITG_06478 RXLR Secreted RXLR effector peptide, putative 1455 All present 11 15 PITG_06485 RXLR Secreted RXLR effector peptide, putative 324 All present 1 5 PITG_07203 RXLR Secreted RXLR effector 285 All present 1 1 peptide, putative PITG_07387 RXLR Avr4 Secreted RXLR effector peptide 864 All present None None PITG_07414 RXLR Secreted RXLR effector peptide, putative 474 All present 3 3 PITG_07435 RXLR Secreted RXLR effector 501 All present 5 8 peptide, putative PITG_07451 RXLR Secreted RXLR effector peptide, putative 384 All present None None PITG_07482 RXLR Secreted RXLR effector peptide, putative 255 All present None None Avr2 family Secreted PITG_07499 RXLR RXLR effector peptide, 357 All present None None putative Avr2 family Secreted PITG_07500 RXLR RXLR effector peptide, 357 All present None None putative PITG_07533 RXLR Secreted RXLR effector 780 All present None 2 peptide, putative Secreted RXLR effector PITG_07550 RXLR peptide, putative- 717 All present 6 15 Avrsmira1 PITG_07555 RXLR Secreted RXLR effector peptide, putative 513 
All present 7 11 Secreted RXLR effector PITG_07558 RXLR peptide, putative- 735 All present None None Avrsmira2-Avr8 Secreted RXLR effector PITG_07566 RXLR peptide, putative 654 All present 6 8 PITG_07569 RXLR Secreted RXLR effector peptide, putative 864 All present 6 10 PITG_07587 RXLR Secreted RXLR effector peptide, putative 447 All present 5 10 Secreted RXLR effector PITG_07594 RXLR peptide, putative 450 All present None None PITG_07597 RXLR Secreted RXLR effector peptide, putative 450 All present 1 10 PITG_07630 RXLR Secreted RXLR effector peptide, putative 960 All present 3 2 127 PITG_07634 RXLR Secreted RXLR effector peptide, putative 480 Absent-150084 None None PITG_07689 RXLR Secreted RXLR effector peptide, putative 315 All present 1 1 PITG_07736 RXLR Secreted RXLR effector 363 All present 1 6 peptide, putative PITG_07741 RXLR Secreted RXLR effector peptide, putative 348 All present None 10 PITG_07766 RXLR Secreted RXLR effector peptide, putative 348 All present 2 11 PITG_07947 RXLR Secreted RXLR effector 462 All present 3 3 peptide, putative PITG_07954 RXLR Secreted RXLR effector peptide, putative 801 All present 1 2 PITG_08074 RXLR Secreted RXLR effector peptide, putative 732 All present 1 1 PITG_08074 RXLR Secreted RXLR effector 516 All present 1 1 peptide, putative PITG_08174 RXLR Secreted RXLR effector peptide, putative 594 All present None None Avr2 family Secreted PITG_08278 RXLR RXLR effector peptide, 357 All present None None putative PITG_08399 RXLR Secreted RXLR effector peptide, putative 402 All present None None PITG_08500 RXLR Secreted RXLR effector 465 All present 5 4 peptide, putative PITG_08624 RXLR Secreted RXLR effector peptide, putative 1014 All present 1 None PITG_08903 RXLR Secreted RXLR effector peptide, putative 504 All present 3 1 Avr2 family Secreted PITG_08943 RXLR RXLR effector peptide, 351 All present None None putative PITG_08949 RXLR Secreted RXLR effector 300 All present None None peptide, putative PITG_09109 RXLR Secreted 
RXLR effector peptide, putative 705 All present 2 4 PITG_09111 RXLR Secreted RXLR effector 351 All present None None peptide, putative PITG_09160 RXLR Secreted RXLR effector peptide, putative 507 All present 1 None PITG_09216 RXLR Secreted RXLR effector 528 All present None 4 peptide, putative PITG_09218 RXLR Secreted RXLR effector peptide, putative 498 All present None 8 PITG_09316 RXLR Secreted RXLR effector peptide, putative 1146 All present 1 1 PITG_09496 RXLR Secreted RXLR effector 318 All present 2 None peptide, putative PITG_09497 RXLR Secreted RXLR effector peptide, putative 216 All present None None PITG_09498 RXLR Secreted RXLR effector peptide, putative 318 All present 1 None PITG_09499 RXLR Secreted RXLR effector peptide, putative 318 All present None None PITG_09503 RXLR Secreted RXLR effector peptide, putative 303 All present None None Secreted RXLR effector PITG_09510 RXLR peptide, putative 315 All present None None PITG_09585 RXLR Secreted RXLR effector peptide, putative 1020 All present 2 5 PITG_09586 RXLR Secreted RXLR effector peptide, putative 777 All present 6 6 Secreted RXLR effector PITG_09622 RXLR peptide, putative 834 All present None 5 PITG_09632 RXLR Secreted RXLR effector peptide, putative 321 All absent None None 128 PITG_09647 RXLR Secreted RXLR effector peptide, putative 834 All present 3 19 PITG_09685 RXLR Secreted RXLR effector peptide, putative 273 All present 1 7 PITG_09689 RXLR Secreted RXLR effector 390 All present 2 3 peptide, putative PITG_09732 RXLR Secreted RXLR effector peptide, putative 1428 All present 9 12 PITG_09739 RXLR Secreted RXLR effector peptide, putative 381 All present None None PITG_09741 RXLR Secreted RXLR effector 414 All present None None peptide, putative PITG_09754 RXLR Secreted RXLR effector peptide, putative 285 All absent None None PITG_09758 RXLR Secreted RXLR effector peptide, putative 414 All present 3 4 PITG_09771 RXLR Secreted RXLR effector 348 Absent-1714, 813 None None peptide, putative 
PITG_09773 RXLR Secreted RXLR effector peptide, putative 387 All present None None PITG_09861 RXLR Secreted RXLR effector peptide, putative 702 All present None 4 PITG_09915 RXLR Secreted RXLR effector 396 All present None None peptide, putative PITG_09935 RXLR Secreted RXLR effector peptide, putative 396 All present None None PITG_10116 RXLR Secreted RXLR effector peptide, putative 957 All present 2 5 PITG_10227 RXLR Secreted RXLR effector 414 All present None None peptide, putative PITG_10232 RXLR Secreted RXLR effector peptide, putative 522 Absent-9231 2 None PITG_10244 RXLR Secreted RXLR effector peptide, putative 405 All present 1 1 PITG_10248 RXLR Secreted RXLR effector 720 All present 3 10 peptide, putative PITG_10339 RXLR Secreted RXLR effector peptide, putative 591 All present 1 3 PITG_10341 RXLR Secreted RXLR effector peptide, putative 1275 All present 5 8 PITG_10347 RXLR Secreted RXLR effector 1329 All present None None peptide, putative PITG_10396 RXLR Secreted RXLR effector peptide, putative 432 All present 5 1 PITG_10540 RXLR Secreted RXLR effector peptide, putative 1203 All present 1 None PITG_10639 RXLR Secreted RXLR effector 456 Absent-122344 None None peptide, putative PITG_10640 RXLR Secreted RXLR effector peptide, putative 480 All present None None PITG_10654 RXLR Secreted RXLR effector peptide, putative 462 All present 3 7 PITG_10672 RXLR Secreted RXLR effector peptide, putative 420 All present 1 11 PITG_10673 RXLR Secreted RXLR effector peptide, putative 243 All present None 5 Secreted RXLR effector PITG_10808 RXLR peptide, putative 663 All present 3 1 PITG_11344 RXLR Secreted RXLR effector peptide, putative 354 All present None None PexRD2 family Secreted PITG_11350 RXLR RXLR effector peptide, 366 All present 2 2 putative PexRD2 family Secreted PITG_11383 RXLR RXLR effector peptide, 366 All present None None putative PITG_11384 RXLR PexRD2 family Secreted RXLR effector peptide, 366 All present 4 None 129 putative PITG_11429 RXLR Secreted RXLR 
effector peptide, putative 462 All present 1 1 PITG_11484 RXLR Secreted RXLR effector 408 All present None None peptide, putative-Avr10 PITG_11507 RXLR Secreted RXLR effector peptide, putative 291 All present None None PITG_11839 RXLR Secreted RXLR effector peptide, putative 468 All present 1 6 PITG_11947 RXLR Secreted RXLR effector 1154 All present 3 8 peptide, putative PITG_11952 RXLR Secreted RXLR effector peptide, putative 810 All present 2 14 PITG_11953 RXLR Secreted RXLR effector peptide, putative 621 All present None 17 PITG_12010 RXLR Secreted RXLR effector 252 All present None None peptide, putative PITG_12046 RXLR Secreted RXLR effector peptide, putative 378 All present 2 2 PITG_12276 RXLR Secreted RXLR effector peptide, putative 423 All present None None PITG_12402 RXLR Secreted RXLR effector 417 All present None None peptide, putative PITG_12458 RXLR Secreted RXLR effector peptide, putative 969 All present 1 1 PITG_12706 RXLR Secreted RXLR effector peptide, putative 357 All Absent-most coverage after gene None None PITG_12710 RXLR Secreted RXLR effector 303 All present 2 3 peptide, putative PITG_12719 RXLR Secreted RXLR effector peptide, putative 288 All present None None PITG_12721 RXLR Secreted RXLR effector Absent-peptide, putative 330 150084,702,813,1714 None None PITG_12722 RXLR Secreted RXLR effector 330 All present 3 3 peptide, putative PITG_12731 RXLR Secreted RXLR effector peptide, putative 1116 All present 2 4 PITG_12737 RXLR Secreted RXLR effector peptide, putative 507 All present None None PITG_12761 RXLR Secreted RXLR effector 849 All present 1 None peptide, putative PITG_12791 RXLR Secreted RXLR effector peptide, putative 1452 All present 3 3 PITG_12816 RXLR Secreted RXLR effector peptide, putative 246 All present 1 1 PITG_12851 RXLR Secreted RXLR effector 336 All present None None peptide, putative PITG_12952 RXLR Secreted RXLR effector peptide, putative 453 All present 1 7 PITG_13018 RXLR Secreted RXLR effector peptide, putative 636 All 
present 8 9 PITG_13093 RXLR Secreted RXLR effector 465 All present 2 2 peptide, putative PITG_13119 RXLR Secreted RXLR effector peptide, putative 903 All present 6 9 Secreted RXLR effector PITG_13125 RXLR peptide, putative 660 All present 4 3 PITG_13306 RXLR Secreted RXLR effector peptide, putative 300 All present 2 2 PITG_13452 RXLR Secreted RXLR effector peptide, putative 294 All present 3 2 Secreted RXLR effector PITG_13481 RXLR peptide, putative 423 All present None None PITG_13503 RXLR Secreted RXLR effector peptide, putative 753 All present 11 18 PITG_13507 RXLR Secreted RXLR effector peptide, putative 861 All present 10 10 130 PITG_13509 RXLR Secreted RXLR effector peptide, putative 660 All present 5 9 PITG_13538 RXLR Secreted RXLR effector peptide, putative 480 Absent-10232 None 16 PITG_13543 RXLR Secreted RXLR effector 522 All present None None peptide, putative PITG_13550 RXLR Secreted RXLR effector peptide, putative 369 All present 1 1 PITG_13612 RXLR Secreted RXLR effector peptide, putative 420 All present None None PITG_13628 RXLR Secreted RXLR effector 726 All present 1 1 peptide, putative Avr2 family Secreted PITG_13930 RXLR RXLR effector peptide, 345 All present None None putative Avr2 family Secreted PITG_13936 RXLR RXLR effector peptide, 291 All present None None putative Avr2 family Secreted PITG_13940 RXLR RXLR effector peptide, 345 All present 1 None putative Avr2 family Secreted PITG_13956 RXLR RXLR effector peptide, 345 All present 1 None putative PITG_13959 RXLR Secreted RXLR effector peptide, putative 390 All present None None PITG_14046 RXLR Secreted RXLR effector 348 Absent-1671, 1714, peptide, putative 150084 1 1 PITG_14086 RXLR Secreted RXLR effector 366 All present None 5 peptide, putative PITG_14093 RXLR Secreted RXLR effector peptide, putative 396 All present None None PITG_14360 RXLR Secreted RXLR effector peptide, putative 483 All present None 6 Avr3a family Secreted PITG_14368 RXLR RXLR effector peptide, 447 All present 1 None 
putative PITG_14371 RXLR Secreted RXLR effector 444 All present 5 3 peptide, Avr3a Avr3a family Secreted PITG_14374 RXLR RXLR effector peptide, 444 All present None 3 putative PITG_14432 RXLR Secreted RXLR effector peptide, putative 372 All present 4 3 PITG_14434 RXLR Secreted RXLR effector peptide, putative 372 All present None None PITG_14443 RXLR Secreted RXLR effector 522 All present 2 None peptide, putative PITG_14662 RXLR Secreted RXLR effector peptide, putative 405 All present 2 1 PITG_14673 RXLR Secreted RXLR effector peptide, putative 825 All absent None None PITG_14685 RXLR Secreted RXLR effector peptide, putative 1410 All present 15 20 PexRD8 family Secreted PITG_14732 RXLR RXLR effector peptide, 420 All present 1 1 putative PexRD8 family Secreted PITG_14736 RXLR RXLR effector peptide, 429 All present None None putative PexRD8 family Secreted PITG_14737 RXLR RXLR effector peptide, 429 All present None None putative PexRD8 family Secreted PITG_14738 RXLR RXLR effector peptide, 429 All present None None putative PITG_14783 RXLR Secreted RXLR effector peptide, putative 393 All present None None 131 PITG_14787 RXLR Secreted RXLR effector peptide, putative 393 All present None None PITG_14788 RXLR Secreted RXLR effector peptide, putative 717 All present 3 2 PITG_14797 RXLR Secreted RXLR effector 369 All present None None peptide, putative PITG_14884 RXLR Secreted RXLR effector peptide, putative 1812 All present None 2 PITG_14932 RXLR Secreted RXLR effector peptide, putative 270 All present None None PITG_14954 RXLR Secreted RXLR effector 498 All present None None peptide, putative PITG_14955 RXLR Secreted RXLR effector peptide, putative 498 All present 1 None PITG_14960 RXLR Secreted RXLR effector peptide, putative 498 All present None None PITG_14965 RXLR Secreted RXLR effector 270 All present None None peptide, putative PITG_14983 RXLR Secreted RXLR effector peptide, putative 399 All present None None PITG_14984 RXLR Secreted RXLR effector peptide, putative 
399 All present 2 2
PITG_14986 | RXLR | Secreted RXLR effector peptide, putative | 399 | All present | 6 | 6
PITG_15032 | RXLR | Secreted RXLR effector peptide, putative | 1458 | All present | 8 | 7
PITG_15037 | RXLR | Secreted RXLR effector peptide, putative | 345 | All present | 4 | 3
PITG_15038 | RXLR | Secreted RXLR effector peptide, putative | 1488 | All present | None | None
PITG_15105 | RXLR | Secreted RXLR effector peptide, putative | 2082 | All present | 2 | 1
PITG_15109 | RXLR | Secreted RXLR effector peptide, putative | 342 | All present | 1 | None
PITG_15110 | RXLR | Secreted RXLR effector peptide, putative | 2130 | All present | 3 | 2
PITG_15114 | RXLR | Secreted RXLR effector peptide, putative | 1566 | All present | 2 | 2
PITG_15123 | RXLR | Secreted RXLR effector peptide, putative | 1548 | All present | 6 | 2
PITG_15125 | RXLR | Secreted RXLR effector peptide, putative | 1548 | All present | None | None
PITG_15127 | RXLR | Secreted RXLR effector peptide, putative | 1548 | All present | 4 | 3
PITG_15142 | RXLR | Secreted RXLR effector peptide, putative | 1473 | All present | None | None
PITG_15152 | RXLR | Secreted RXLR effector peptide, putative | 2283 | All present | 1 | None
PITG_15162 | RXLR | Secreted RXLR effector peptide, putative | 510 | All present | None | None
PITG_15166 | RXLR | Secreted RXLR effector peptide, putative | 519 | All present | None | None
PITG_15177 | RXLR | Secreted RXLR effector peptide, putative | 507 | All present | 2 | 1
PITG_15225 | RXLR | Secreted RXLR effector peptide, putative | 666 | All present | 4 | 4
PITG_15226 | RXLR | Secreted RXLR effector peptide, putative | 675 | All present | 8 | 8
PITG_15255 | RXLR | Secreted RXLR effector peptide, putative | 384 | All present | 1 | 1
PITG_15277 | RXLR | Secreted RXLR effector peptide, putative | 486 | All present | 4 | 1
PITG_15278 | RXLR | Secreted RXLR effector peptide, putative | 1560 | All present | 9 | 11
PITG_15287 | RXLR | PexRD1 Secreted RXLR effector peptide, putative | 642 | All present | None | 3
PITG_15297 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present | None | None
PITG_15303 | RXLR | Secreted RXLR effector peptide, putative | 387 | All present | 1 | 1
PITG_15304 | RXLR | Secreted RXLR effector peptide, putative | 552 | All present | None | None
PITG_15315 | RXLR | Secreted RXLR effector peptide, putative | 405 | All present | 1 | 7
PITG_15318 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present | None | None
PITG_15337 | RXLR | Secreted RXLR effector peptide, putative | 501 | All present | 3 | 4
PITG_15341 | RXLR | Secreted RXLR effector peptide, putative | 354 | All present | 3 | 3
PITG_15424 | RXLR | Secreted RXLR effector peptide, putative | 687 | All absent | None | None
PITG_15556 | RXLR | Secreted RXLR effector peptide, putative | 369 | All present | None | 5
PITG_15679 | RXLR | Secreted RXLR effector peptide, putative | 816 | All present | 1 | 1
PITG_15712 | RXLR | Secreted RXLR effector peptide, putative | 414 | Absent-15002, 813, 10232 | 2 | 1
PITG_15718 | RXLR | Secreted RXLR effector peptide, putative | 465 | Absent-15002, 813, 10232 | None | None
PITG_15728 | RXLR | Secreted RXLR effector peptide, putative | 561 | All present | 2 | 4
PITG_15732 | RXLR | Secreted RXLR effector peptide, putative | 984 | All present | 1 | 1
PITG_15763 | RXLR | Secreted RXLR effector peptide, putative | 1002 | All present | 6 | 15
PITG_15764 | RXLR | Secreted RXLR effector peptide, putative | 1056 | All present | 4 | 15
PITG_15972 | RXLR | Avr2 family Secreted RXLR effector peptide, putative | 285 | Absent-10232 | None | None
PITG_16180 | RXLR | Secreted RXLR effector peptide, putative | 330 | All absent | None | None
PITG_16188 | RXLR | Secreted RXLR effector peptide, putative | 375 | All present | 3 | 1
PITG_16193 | RXLR | Secreted RXLR effector peptide, putative | 420 | All present | None | None
PITG_16195 | RXLR | Secreted RXLR effector peptide, putative | 1530 | All present | 5 | 4
PITG_16233 | RXLR | Secreted RXLR effector peptide, putative | 372 | All present | None | None
PITG_16235 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | 3 | None
PITG_16242 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | 1 | None
PITG_16248 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present | None | None
PITG_16282 | RXLR | Secreted RXLR effector peptide, putative | 288 | All present | None | None
PITG_16283 | RXLR | Secreted RXLR effector peptide, putative | 609 | All present | 1 | 1
PITG_16285 | RXLR | Secreted RXLR effector peptide, putative | 414 | All present | None | None
PITG_16294 | RXLR | Avrvnt1-Secreted RXLR effector peptide, putative | 462 | All present | 1 | None
PITG_16402 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present | None | None
PITG_16409 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | None | None
PITG_16424 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | 2 | None
PITG_16427 | RXLR | Secreted RXLR effector peptide, putative | 372 | All present | 1 | None
PITG_16428 | RXLR | Secreted RXLR effector peptide, putative | 429 | All present | 3 | 4
PITG_16515 | RXLR | Secreted RXLR effector peptide, putative | 468 | All present | 1 | None
PITG_16529 | RXLR | Secreted RXLR effector peptide, putative | 468 | All present | None | None
PITG_16541 | RXLR | Secreted RXLR effector peptide, putative | 285 | All present | 2 | 1
PITG_16663 | RXLR | Avr1 Secreted RXLR effector peptide, putative | 627 | All absent | None | None
PITG_16705 | RXLR | Secreted RXLR effector peptide, putative | 2037 | All present | None | None
PITG_16726 | RXLR | Secreted RXLR effector peptide, putative | 1527 | All present | None | None
PITG_16737 | RXLR | Secreted RXLR effector peptide, putative | 729 | All present | None | None
PITG_16738 | RXLR | Secreted RXLR effector peptide, putative | 729 | All present | 3 | 2
PITG_16836 | RXLR | Secreted RXLR effector peptide, putative | 570 | All present | 3 | 3
PITG_16844 | RXLR | Secreted RXLR effector peptide, putative | 930 | All present | None | None
PITG_16845 | RXLR | Secreted RXLR effector peptide, putative | 930 | All present | 1 | 1
PITG_17063 | RXLR | Secreted RXLR effector peptide, putative | 483 | All present | None | None
PITG_17309 | RXLR | Secreted RXLR effector peptide, putative | 1050 | All present | None | None
PITG_17316 | RXLR | Secreted RXLR effector peptide, putative | 1050 | All present | None | None
PITG_17670 | RXLR | Secreted RXLR effector peptide, putative | 855 | All present | 1 | 9
PITG_17838 | RXLR | PexRD8 family Secreted RXLR effector peptide, putative | 429 | All present | 2 | None
PITG_17871 | RXLR | Secreted RXLR effector, putative | 1539 | All present | 20 | 36
PITG_18147 | RXLR | Secreted RXLR effector peptide, putative | 504 | All present | None | None
PITG_18156 | RXLR | Secreted RXLR effector peptide, putative | 633 | All present | 6 | 12
PITG_18215 | RXLR | Avr3b-Secreted RXLR effector peptide, putative | 447 | All present | 3 | 4
PITG_18221 | RXLR | Secreted RXLR effector peptide, putative | 318 | All present | 1 | 1
PITG_18318 | RXLR | Secreted RXLR effector peptide, putative | 378 | All present | None | None
PITG_18325 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | 1 | 1
PITG_18405 | RXLR | Secreted RXLR effector peptide, putative | 480 | All present | 2 | 1
PITG_18510 | RXLR | Secreted RXLR effector peptide, putative | 540 | All present | 2 | 2
PITG_18609 | RXLR | Secreted RXLR effector peptide, putative | 450 | All present | None | None
PITG_18670 | RXLR | Secreted RXLR effector peptide, putative | 318 | Absent-150084, 10232, 9231 | None | None
PITG_18675 | RXLR | Secreted RXLR effector peptide, putative | 324 | Absent-150084, 10232, 9231 | None | None
PITG_18683 | RXLR | Avrblb2 family Secreted RXLR effector peptide, putative | 303 | Absent-150084, 10232, 9231 | 1 | None
PITG_18685 | RXLR | Secreted RXLR effector peptide, putative | 318 | All present | 1 | None
PITG_18880 | RXLR | Secreted RXLR effector peptide, putative | 465 | All present | 1 | 2
PITG_18908 | RXLR | Secreted RXLR effector peptide, putative | 504 | All present | None | None
PITG_18956 | RXLR | Secreted RXLR effector peptide, putative | 330 | All present | None | None
PITG_18981 | RXLR | Secreted RXLR effector peptide, putative | 432 | All present | 1 | 1
PITG_18986 | RXLR | Secreted RXLR effector peptide, putative | 357 | All present | 2 | 2
PITG_19232 | RXLR | Secreted RXLR effector peptide, putative | 435 | Absent-9231 | None | None
PITG_19302 | RXLR | Secreted RXLR effector peptide, putative | 1524 | All present | 9 | None
PITG_19307 | RXLR | Secreted RXLR effector peptide, putative | 1221 | All present | None | None
PITG_19308 | RXLR | Secreted RXLR effector peptide, putative | 288 | All present | None | None
PITG_19309 | RXLR | Secreted RXLR effector peptide, putative | 879 | All present | None | None
PITG_19518 | RXLR | Secreted RXLR effector peptide, putative | 804 | All present | None | None
PITG_19523 | RXLR | Secreted RXLR effector peptide, putative | 759 | All present | 2 | 13
PITG_19526 | RXLR | Secreted RXLR effector peptide, putative | 378 | All present | None | None
PITG_19528 | RXLR | Secreted RXLR effector peptide, putative | 381 | All present | None | None
PITG_19617 | RXLR | Avr2 family Secreted RXLR effector peptide, putative | 357 | Absent-10232 | 1 | None
PITG_19655 | RXLR | Secreted RXLR effector peptide, putative | 1077 | All present | None | None
PITG_19800 | RXLR | Secreted RXLR effector peptide, putative | 621 | Absent-778 | 1 | 1
PITG_19831 | RXLR | Secreted RXLR effector peptide, putative | 636 | All present | None | None
PITG_19992 | RXLR | Secreted RXLR effector peptide, putative | 1077 | All present | None | None
PITG_19994 | RXLR | Secreted RXLR effector peptide, putative | 777 | All present | None | None
PITG_19996 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | 1 | 2
PITG_20052 | RXLR | Secreted RXLR effector peptide, putative | 504 | All present | 2 | 2
PITG_20144 | RXLR | Secreted RXLR effector peptide, putative | 501 | All present | 5 | 4
PITG_20300 | RXLR | Avrblb2 family Secreted RXLR effector peptide, putative | 303 | All present | None | None
PITG_20301 | RXLR | Avrblb2 family Secreted RXLR effector peptide, putative | 303 | Absent-9231, 10232, 150084 | None | None
PITG_20303 | RXLR | Avrblb2 family Secreted RXLR effector peptide, putative | 303 | Absent-9231, 10232 | None | None
PITG_20336 | RXLR | Secreted RXLR effector peptide, 3' partial | 258 | All Absent | None | None
PITG_20365 | RXLR | Secreted RXLR effector peptide, putative | 357 | All present | 3 | None
PITG_20616 | RXLR | Secreted RXLR effector peptide, putative | 1065 | All present | 2 | None
PITG_20857 | RXLR | Secreted RXLR effector peptide, putative | 396 | 9231, 10232, 150084 | 1 | None
PITG_20934 | RXLR | Secreted RXLR effector peptide, putative | 372 | Absent-9231 | None | None
PITG_20972 | RXLR | Secreted RXLR effector peptide, putative | 354 | All present | None | None
PITG_21107 | RXLR | Secreted RXLR effector peptide, putative | 438 | All absent | None | None
PITG_21152 | RXLR | Secreted RXLR effector peptide, putative | 441 | All present | None | None
PITG_21190 | RXLR | Secreted RXLR effector peptide, putative | 1020 | All present | 1 | 1
PITG_21238 | RXLR | Secreted RXLR effector peptide, putative | 384 | All present | 1 | None
PITG_21288 | RXLR | Secreted RXLR effector peptide, putative | 714 | All present | None | None
PITG_21303 | RXLR | Secreted RXLR effector peptide, putative, 3' partial | 639 | All absent | None | None
PITG_21362 | RXLR | Secreted RXLR effector peptide, putative, 3' partial | 735 | All present | 2 | 1
PITG_21388 | RXLR | Avrblb1 Secreted RXLR effector peptide, ipi01 | 459 | All present | 1 | 5
PITG_21422 | RXLR | PexRD2 family Secreted RXLR effector peptide, putative | 366 | All present | 1 | None
PITG_21739 | RXLR | Secreted RXLR effector peptide, putative | 315 | All present | None | None
PITG_21740 | RXLR | Secreted RXLR effector peptide, putative | 2130 | All present | 1 | None
PITG_21778 | RXLR | Secreted RXLR effector peptide, putative | 420 | Absent-10232 | 1 | None
PITG_21933 | RXLR | Secreted RXLR effector peptide, putative | 390 | All present | None | None
PITG_21949 | RXLR | Avr2 family Secreted RXLR effector peptide, putative | 345 | All present | None | None
PITG_22089 | RXLR | Secreted RXLR effector peptide, putative | 408 | All present | 5 | None
PITG_22118 | RXLR | Secreted RXLR effector peptide, putative | 351 | All present | None | None
PITG_22256 | RXLR | Secreted RXLR effector peptide, putative | 327 | All Absent | None | None
PITG_22375 | RXLR | Secreted RXLR effector peptide, putative | 432 | All present | None | None
PITG_22547 | RXLR | Secreted RXLR effector peptide, putative | 423 | All present | None | None
PITG_22675 | RXLR | Secreted RXLR effector peptide, putative | 363 | All present | None | None
PITG_22676 | RXLR | Secreted RXLR effector peptide, putative | 486 | All present | 1 | 1
PITG_22683 | RXLR | Secreted RXLR effector peptide, putative | 522 | All present | 2 | None
PITG_22712 | RXLR | Secreted RXLR effector peptide, putative, 3' partial | 444 | All present | None | None
PITG_22724 | RXLR | Secreted RXLR effector peptide, putative | 474 | All present | 3 | None
PITG_22725 | RXLR | Secreted RXLR effector peptide, putative | 273 | All present | None | None
PITG_22727 | RXLR | Secreted RXLR effector peptide, putative | 213 | All present | None | None
PITG_22757 | RXLR | Secreted RXLR effector peptide, putative | 486 | All present | 3 | 3
PITG_22766 | RXLR | Secreted RXLR effector peptide, putative | 246 | All present | None | None
PITG_22802 | RXLR | Secreted RXLR effector peptide, putative | 225 | All present | 1 | 2
PITG_22813 | RXLR | Secreted RXLR effector peptide, putative | 219 | All present | 2 | 2
PITG_22816 | RXLR | Secreted RXLR effector peptide, putative | 246 | All present | 2 | 3
PITG_22825 | RXLR | Secreted RXLR effector peptide, putative | 444 | All present | 3 | 7
PITG_22844 | RXLR | Secreted RXLR effector peptide, putative | 555 | All present | 1 | 1
PITG_22868 | RXLR | Secreted RXLR effector peptide, putative | 456 | All present | None | 7
PITG_22879 | RXLR | Secreted RXLR effector peptide, putative | 996 | All present | 3 | 19
PITG_22884 | RXLR | Secreted RXLR effector peptide, putative | 465 | All present | 1 | 7
PITG_22889 | RXLR | Secreted RXLR effector peptide, putative | 231 | All present | 1 | None
PITG_22891 | RXLR | Secreted RXLR effector peptide, putative | 252 | All present | 1 | 1
PITG_22894 | RXLR | Secreted RXLR effector peptide, putative | 276 | All present | 1 | 1
PITG_22896 | RXLR | Secreted RXLR effector peptide, putative | 585 | All present | None | 2
PITG_22900 | RXLR | Secreted RXLR effector peptide, putative | 237 | All present | 2 | 2
PITG_22922 | RXLR | Secreted RXLR effector peptide, putative | 1467 | All present | 10 | 17
PITG_22925 | RXLR | Secreted RXLR effector peptide, putative | 378 | All present | None | None
PITG_22926 | RXLR | Secreted RXLR effector peptide, putative | 597 | 702, 813, 1627, 1734, 122344, 150002, 150084 | None | None
PITG_22929 | RXLR | Secreted RXLR effector peptide, putative | 342 | All absent | None | 6
PITG_22932 | RXLR | Secreted RXLR effector peptide, putative | 354 | All present | None | 6
PITG_22935 | RXLR | PexRD2 family Secreted RXLR effector peptide, putative | 366 | All present | None | None
PITG_22945 | RXLR | Secreted RXLR effector peptide, putative | 507 | All present | None | 1
PITG_22972 | RXLR | Secreted RXLR effector peptide, putative | 231 | All absent | None | 10
PITG_22978 | RXLR | Secreted RXLR effector peptide, putative | 570 | All present | 3 | 4
PITG_22986 | RXLR | Secreted RXLR effector peptide, putative | 294 | All present | 1 | 2
PITG_22987 | RXLR | Secreted RXLR effector peptide, putative | 213 | All present | None | None
PITG_22990 | RXLR | Secreted RXLR effector peptide, putative | 465 | All present | None | 12
PITG_22998 | RXLR | Secreted RXLR effector peptide, putative | 306 | All present | None | None
PITG_22999 | RXLR | Secreted RXLR effector peptide, putative | 306 | All present | 2 | 3
PITG_23000 | RXLR | Secreted RXLR effector peptide, putative | 432 | All present | 1 | 3
PITG_23008 | RXLR | Avr2 family Secreted RXLR effector peptide, putative | 345 | All present | None | None
PITG_23011 | RXLR | Secreted RXLR effector peptide, putative | 225 | All present | 1 | None
PITG_23014 | RXLR | Secreted RXLR effector peptide, putative | 510 | All present | 1 | None
PITG_23016 | RXLR | Secreted RXLR effector peptide, putative | 234 | All present | 2 | 1
PITG_23024 | RXLR | Secreted RXLR effector peptide, putative | 969 | All present | 8 | 14
PITG_23026 | RXLR | Secreted RXLR effector peptide, putative, 3' partial | 219 | Absent-813, 1734, 122344, 10232, 1671, 9231 | 1 | None
PITG_23035 | RXLR | Secreted RXLR effector peptide, putative | 2805 | All present | 17 | 24
PITG_23046 | RXLR | Secreted RXLR effector peptide, putative | 336 | All present | 1 | None
PITG_23061 | RXLR | Secreted RXLR effector peptide, putative | 648 | All present | 3 | 4
PITG_23069 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present | 3 | None
PITG_23074 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present (9231 lower read depth in the middle) | 1 | None
PITG_23092 | RXLR | Secreted RXLR effector peptide, putative | 516 | All present | 3 | 4
PITG_23117 | RXLR | Secreted RXLR effector peptide, putative | 375 | All present | 1 | 1
PITG_23126 | RXLR | Secreted RXLR effector peptide, putative | 270 | Absent-10232 | None | None
PITG_23129 | RXLR | Secreted RXLR effector peptide, putative | 333 | All present | None | None
PITG_23131 | RXLR | Secreted RXLR effector peptide, putative | 360 | All present | 2 | 1
PITG_23132 | RXLR | PexRD36 Secreted RXLR effector peptide, putative | 231 | All present | None | None
PITG_23135 | RXLR | Secreted RXLR effector peptide, putative | 273 | All present | None | None
PITG_23137 | RXLR | Secreted RXLR effector peptide, putative | 474 | Absent-9231 | None | None
PITG_23154 | RXLR | Secreted RXLR effector peptide, putative | 240 | All present | 4 | 3
PITG_23185 | RXLR | Secreted RXLR effector peptide, putative | 294 | Absent-9231, 10232, 150084 | None | None
PITG_23193 | RXLR | Secreted RXLR effector peptide, putative | 271 | Absent-9231, 10232, 150084 | None | None

Table 3.S2. Primers used in PCR analyses for this study.

Primer | Source of primers | Forward primer sequence 5'-3' | Reverse primer sequence 5'-3' | Presence/absence of genes in isolates
PITG_14371 | Thilliez et al. 2018 | TAATTTCGCTCTTCACACGATAAGCTCTC | GCGTTTCAGCAGTTAGAATCGGATTTTCTG | All present
PITG_18215 | Zheng et al. 2014 | ATGCGAGCCTACTTTGTCCT | CAACACGAAGAGAGCGAGTC | All present
PITG_22727 | Thilliez et al. 2018 | GAATTTCGATTTCCTGATTCTTGATGATCTTGTTC | CCTTCTTTTAAGCGTAATCCCCCTTTTACAG | All present
PITG_19800 | Designed in this study | AACGCTCACTTCCCAATTTC | ACATCCTTCGACGGGACA | Differences in presence/absence
PITG_07634 | Designed in this study | GGAGGCTTTTCCGTGCCTAT | TTGGAATCGTCGCCGTACTT | Differences in presence/absence
PITG_10232 | Designed in this study | GCTGTCCACTGATGTCCCTC | GCGGTACGCTTGACTTTTCC | Differences in presence/absence
PITG_16663 | Du et al. 2015 | CACCATGTTCGACCACGACAAGG | TTAAAATGGTACCACAACATGTCC | All Absent
PITG_21107 | Designed in this study | CACCAAACACCTTCCCCGTA | TCGAATGTTCTTGCTGCCGA | All Absent
PITG_15424 | Designed in this study | CTTTTGGCGGTCGCTTTTGT | TGGATCCGCGCTCAAAATCT | All Absent

3.S3. Full variant call analysis with SnpEff annotations for the 579 baited genes. (.csv format)

CHAPTER 4

WHOLE-GENOME INTROGRESSION DETECTION AND HAPLOTYPE ANALYSIS REVEALS EARLY BLIGHT RESISTANCE IN MODERN TOMATO BREEDING LINES TRACE TO ‘DEVON SURPRISE’ AND ‘HAWAII 7998’*

Abstract

The genomes of eleven landmark accessions with early blight disease resistance were sequenced and reported, including the tomato (Solanum lycopersicum L.) breeding lines NC 1 CELBR and Campbell 1943, as well as Solanum habrochaites PI 126445.
Tools were developed to infer local ancestry and to define cryptic introgressions underlying disease resistance in modern tomato (Solanum lycopersicum L.) breeding lines. The source of the early blight resistance quantitative trait locus (QTL) EB-9 was traced backwards from Campbell 1943 to the Ailsa Craig-derived heirloom tomato Devon Surprise. The foliar resistance QTL EB-5 was traced back to Hawaii 7998. We failed to detect strong evidence for PI 126445 introgressions in modern breeding lines, despite it being a commonly cited resistance source. Identification of the shared ancestral haplotypes in tomatoes along the historical breeding pedigree enabled fine mapping of EB-5 and EB-9 and the identification of candidate variants that could underlie early blight disease resistance. We surveyed the re-sequenced genomes of an additional 764 accessions, predicting EB-9 resistance in several heirloom tomatoes, several accessions of S. lycopersicum var. cerasiforme, and S. pimpinellifolium PI 37009. We then acquired seed of these accessions from several sources and confirmed that we could predict early blight resistance in sequenced accessions with high accuracy. There was little evidence of foliar EB-5 resistance among sequenced accessions. We also defined introgressions underlying bacterial spot resistance at Rx-3 and QTL-11 in modern breeding lines. Our work makes it easier to breed for effective resistance to early blight disease in cultivated tomato, while also demonstrating an efficient method to leverage the ever-growing sequence resources for tomato to predict the value of publicly available genetic resources by detecting local homology and putative introgressions in genome-scale data.

*The chapter is the basis of a manuscript in preparation for submission to PNAS. Taylor Anderson and Martha Sudermann are co-first authors and contributed equally to the design and execution of experiments, the writing of the manuscript, and the revision process. Other authors include David M. Francis, Darlene M. DeJong, Christine D. Smart, and Martha A. Mutchler (corresponding author).

Introduction

Early blight is caused by the necrotrophic fungus Alternaria linariae (syn. Alternaria tomatophila). It is a widespread and damaging disease of tomato (Solanum lycopersicum L.) (Rotem 1994; Woudenberg et al. 2014). The disease can girdle the stems of young transplants, causing plant collapse in a disease phase known as collar rot (Pritchard and Porte 1921). On mature plants, lesions can form on the fruit, stems, or leaves, causing branch collapse, defoliation, and reductions in marketable fruit yields under warm, humid conditions (Barksdale 1971).

Genetic resistance to early blight disease in tomato is quantitatively inherited and is unlikely to conform to the classic gene-for-gene hypothesis (Barksdale and Stoner 1973, 1977; Martin and Hepperly 1987; Nash and Gardner 1988; Maiero et al. 1989, 1990). However, several tomato breeding lines have demonstrated a substantial degree of resistance to early blight (Adhikari et al. 2017). The resistance in these lines can be paired with modest chemical control to minimize early blight yield losses in the field (Zitter et al. 2005; Zitter and Drennan 2007, 2008). Because reliable phenotype data are difficult to obtain and the inheritance of resistance is complex, the early blight pathosystem is particularly well suited to efficient marker-based selection (Adhikari et al. 2017).

Much of the early blight resistance in cultivated tomato is thought to originate from the mid-century canning tomato breeding line Campbell 1943 and the wild Solanum habrochaites accession PI 126445.
Campbell 1943 exhibits strong resistance to early blight stem lesions and collar rot and moderate resistance to defoliation in the field (Gardner 1990; Anderson et al. 2021). While the source of resistance in Campbell 1943 is unknown, it is possible that Dr. George B. Reynard of Campbell’s tomato seed leveraged his knowledge of a simply inherited collar rot resistance in the tomato Devon Surprise, from his prior role as a USDA researcher in the early 1940s, to create Campbell 1943 (Reynard and Andrus 1944, 2016). An appendix to the Interim Report of the Committee on Varietal Pedigrees of the Tomato Genetics Cooperative suggests that Devon Surprise was itself a ‘mutant’ of the heirloom variety Ailsa Craig. Subsequent breeding efforts by Dr. Randolph Gardner at North Carolina State University (NCSU) transferred the Campbell 1943 resistance into the fresh market tomato background of NC EBR 2 in the 1980s (Barksdale 1971; Gardner 1988). Separately, Gardner introgressed foliar early blight resistance from Solanum habrochaites PI 126445 into fresh market tomato, creating NC EBR 1 (Gardner 1988). Resistance from NC EBR 1 and NC EBR 2 was then combined in NC 1 CELBR, which has been used widely and is the source of stem and foliar early blight resistance for several Cornell tomato breeding lines (Gardner and Panthee 2010).

We recently characterized quantitative trait loci (QTL) conferring partial early blight resistance in Cornell fresh market and The Ohio State University (OSU) processing tomato breeding lines (Anderson et al. 2021). These QTL represent a readily deployable resistance resource for breeding programs, but further work is needed to resolve their genomic positions and to develop reliable predictive markers. Two of these QTL, EB-5 and EB-9, conferred appreciable levels of early blight resistance in the cultivated background and were effective across all growing environments.
The EB-5 locus was associated with resistance to defoliation, while EB-9 was associated with stem lesion, collar rot, and moderate foliar resistance. We hypothesized that EB-9 traces to Campbell 1943 and possibly Devon Surprise, while EB-5 traces to the unimproved fresh market tomato breeding line Hawaii 7998. A third QTL on chromosome 1, EB-1.2, was also associated with early blight defoliation in some populations, but subsequent evaluations suggested the QTL was associated with an increase in plant size that might explain the connection with early blight resistance in the processing tomato background (Anderson T, unpublished observations). We speculated that EB-1.2 could be derived from PI 126445.

The goal of this study was to use comparative genomics to clarify the ancestry of early blight resistance in modern tomatoes and to define the ancestral introgressions underlying EB-5 and EB-9. Decades of breeding for early blight resistance should have whittled down the resistance haplotypes from the ancestral sources. Thus, we hypothesized that identification of the shared ancestral haplotypes among resistant tomatoes would enable fine-mapping of EB-5 and EB-9. To do this, we sequenced the genomes of nine landmark early blight resistant tomato breeding lines and the wild PI 126445 and analyzed these data together with 764 existing genome sequences for a variety of tomato and wild species accessions. We developed a flexible and computationally efficient protocol for identifying homologous haplotypes among the sequenced accessions and used this method to delineate the boundaries of the ancestral resistance introgressions and to predict early blight resistance in publicly available sequenced accessions. The refined QTL intervals helped to identify haplotype-specific molecular markers and putative causative resistance loci for EB-5 and EB-9 that will empower future molecular breeding efforts for early blight resistance.
Materials and Methods

Plant material—Nine tomatoes representing over 80 years of breeding for early blight resistance were sequenced (Table 4.1). The Cornell tomato breeding program provided seed of the elite tomato breeding lines CU151095-146, CU151011-146, CU191357, and CU201041. Tomatoes CU151095-146 and CU151011-146 are full-sib fresh market inbreds with a complex pedigree and phenotypically selected early blight resistance tracing to NC 1 CELBR. CU191357 has additional early blight resistance (EB-5) from OH08-7663 in the CU151095-146 background. CU201041 is a processing tomato breeding line in the background of OH08-7663 with introgressed resistance (EB-1.1|EB-1.2 + EB-9) from CU151095-146. Seed of Campbell 1943 and the fresh market lines NC EBR 1, NC EBR 2, and NC 1 CELBR was obtained from Dr. Gardner at NCSU; these lines were described previously. Seed of S. habrochaites PI 126445 and Devon Surprise was requested from the Northeast Regional PI Station USDA, ARS Plant Genetic Resources Unit in Geneva, NY. An unrelated Cornell line with known introgressions from Solanum pennellii LA0716, CU17NBL, was grown and sequenced for use in the validation of introgression detection methods. Plants were seeded into 10 cm pots under 14 h light at 27°C and 10 h dark at 22°C at the Guterman Bioclimatic Lab in Ithaca, NY.

Acquisition of whole genome sequence de novo and from existing studies—DNA was extracted from 85 g of leaf tissue from four-week-old plants using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA). Multiplexed genomic DNA libraries were prepared for sequencing using the Nextera Flex platform (Illumina Inc., San Diego, CA, USA) and sequenced on a NextSeq500 (paired-end 2 x 150 bp) by the Cornell Institute of Biotechnology Genomics Facility. Two plants of NC 1 CELBR were sequenced because we previously observed residual heterozygosity in the seed lot.
These sequences will be deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). Raw sequences for Hawaii 7998, OH08-7663, OH7536, and OH88119 were provided by Dr. Francis, OSU. Sequence for CU151011-146 was provided by Dr. Martin of the Boyce Thompson Institute for Plant Research. An additional 764 sequences for tomato lines, F1 hybrids, and wild species were obtained from the NCBI SRA deposits SRP094624, SRP045767, ERP004618, and SRP150040 (Table 4.S1, Supplement 1) (100 Tomato Genome Sequencing Consortium et al. 2014; Lin et al. 2014; Tieman et al. 2017; Gao et al. 2019).

Sequence handling—Raw FASTQ files were mapped to the SL4.0 Heinz 1706 reference tomato genome using BWA-MEM 0.7.17 (Li 2013; Fernandez-Pozo et al. 2015; Hosmani et al. 2019). Variants were called jointly for all 775 sequences with the GATK4 HaplotypeCaller 4.1.4.1 and filtered using VCFtools 0.1.13 and BCFtools (Li et al. 2009; Danecek et al. 2011; Poplin et al. 2018). Single nucleotide polymorphisms (SNPs) were retained at sites for which the proportion of missing data was less than 20% and where the minor allele frequency exceeded 0.05. A site was removed if it was multi-allelic or if greater than 50% of samples were heterozygous. We removed sequences that did not possess genotypes for at least 20% of variants. Data were phased and imputed with Beagle 5.0 (Browning and Browning 2007; Browning et al. 2018).

Whole-genome genetic variance decomposition and sample clustering—The whole-genome genotype matrix of 775 sequences was pruned for redundant SNPs within 100 bp physical windows using BCFtools (to reduce impacts of wild-species divergence), then scaled and centered using Scikit-learn version 0.23.2 for Python 3 on 64-bit Linux CentOS 7 (Li et al. 2009; Pedregosa et al. 2011). Scikit-learn was also used to calculate principal components and to hierarchically cluster sequences using the Ward linkage criterion over a range of k = 3 to 20 clusters.
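The site-level SNP filters listed under "Sequence handling" above can be illustrated with a short Python sketch. This is not the VCFtools/BCFtools workflow used in the study. The function name `keep_site`, the genotype encoding (a tuple of two allele indices per sample, `None` for a missing call), and the choice to compute heterozygosity over called samples only are all assumptions made for illustration.

```python
from collections import Counter


def keep_site(genotypes, max_missing=0.20, min_maf=0.05, max_het=0.50):
    """Return True if one SNP site passes the filters described in the text.

    genotypes: one entry per sample, either a tuple of two allele indices
    (0 = reference allele) or None for a missing call (hypothetical encoding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    # The proportion of missing data must be less than 20%.
    n_missing = len(genotypes) - len(called)
    if n_missing / len(genotypes) >= max_missing:
        return False
    counts = Counter(allele for g in called for allele in g)
    # Multi-allelic sites (more than two distinct alleles) are removed.
    if len(counts) > 2:
        return False
    # The minor allele frequency must exceed 0.05; monomorphic sites have MAF 0.
    maf = min(counts.values()) / sum(counts.values()) if len(counts) == 2 else 0.0
    if maf <= min_maf:
        return False
    # Sites where more than 50% of called samples are heterozygous are removed
    # (assumption: the denominator excludes missing calls).
    het = sum(g[0] != g[1] for g in called) / len(called)
    return het <= max_het


# Toy sites, invented for illustration only.
good = [(0, 0)] * 8 + [(1, 1)] * 2
too_missing = [(0, 0)] * 6 + [(1, 1)] + [None] * 3
mostly_het = [(0, 1)] * 6 + [(0, 0)] * 4
multi_allelic = [(0, 0)] * 8 + [(1, 2)] * 2
rare = [(0, 0)] * 39 + [(0, 1)]
results = [keep_site(s) for s in (good, too_missing, mostly_het, multi_allelic, rare)]
```

In this toy example only the first site is retained; each of the other four fails exactly one of the filters.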
A value of k=10 clusters was empirically chosen for its balance of accuracy and sensitivity when cluster membership was cross-referenced with passport data and by scree visualization. Clusters were named according to “majority rules” taxonomic information from passport data and were visualized with ggplot2 in R 3.6.0 (R Core Team 2020).

Determination of introgression boundaries by sliding window haplotype clustering analysis—Haplotype homology was detected in sliding windows using a custom Python 3 script built around Scikit-learn 0.23.2. The script windowed the genomes of all sequences, using physical distance, and used hierarchical agglomerative clustering to identify haplotype clusters. The threshold distance, d, under which clusters are merged, was determined by maximizing the silhouette score. If two or more sequences were clustered together (indicating sequence similarity), their haplotypes were considered homologous for the genomic window. We assumed homology (as opposed to sequence similarity) within the narrow genetic base of cultivated tomato. We visualized haplotypes using ggplot2 in R. The analysis was repeated with the window sizes {100 Kb, 250 Kb, 500 Kb, 1 Mb}, corresponding step sizes of {25 Kb, 100 Kb, 100 Kb, 250 Kb}, and d ranges of {2-80, 2-100, 20-120, 20-200}.

Evaluating evidence for introgressions with Patterson’s D and coalescent tree inference—We estimated relationship trees using the SVDquartets program implemented in PAUP*, with 100 bootstrap replicates (Swofford 2003; Chifman and Kubatko 2014, 2015).

Introgression boundary refinement using pairwise ancestry painting—Introgression boundaries for EB-5 and EB-9 were identified by localized pairwise ancestry painting with Ruby scripts from Dsuite. Ruby scripts identified polymorphic sites in the sequences for two contrasting accessions that were fixed in opposite directions.
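The logic of pairwise ancestry painting can be sketched in a few lines of Python. The study itself used Ruby scripts, so this is only an illustration of the idea and not a port of them. The genotype encoding and the names `fixed_opposite`, `paint`, and `paint_panel` are assumptions. The sketch keeps only sites at which the two contrasting accessions are homozygous for different alleles, then labels each panel sample by the parental allele or alleles it carries at those sites.

```python
def fixed_opposite(gt_a, gt_b):
    """True when both contrasting accessions are homozygous for different alleles."""
    return gt_a[0] == gt_a[1] and gt_b[0] == gt_b[1] and gt_a[0] != gt_b[0]


def paint(sample_gt, gt_a, gt_b):
    """Label a panel genotype by the parental allele(s) it shares at a fixed site."""
    alleles = set(sample_gt)
    shares_a = gt_a[0] in alleles
    shares_b = gt_b[0] in alleles
    if shares_a and shares_b:
        return "both"
    if shares_a:
        return "A"
    if shares_b:
        return "B"
    return "neither"


def paint_panel(sites, accession_a, accession_b, panel):
    """Paint every panel sample at every informative site.

    sites: list of dicts mapping sample name -> (allele, allele).
    """
    painted = {name: [] for name in panel}
    for site in sites:
        gt_a, gt_b = site[accession_a], site[accession_b]
        if not fixed_opposite(gt_a, gt_b):
            continue  # uninformative for this contrast
        for name in panel:
            painted[name].append(paint(site[name], gt_a, gt_b))
    return painted


# Toy data; sample names and genotypes are invented for illustration.
sites = [
    {"donor": (1, 1), "reference": (0, 0), "line1": (1, 1), "line2": (0, 1)},
    {"donor": (0, 1), "reference": (0, 0), "line1": (0, 0), "line2": (0, 0)},
    {"donor": (0, 0), "reference": (1, 1), "line1": (1, 1), "line2": (0, 0)},
]
result = paint_panel(sites, "donor", "reference", ["line1", "line2"])
```

The second toy site is skipped because the donor is heterozygous there, so each panel line receives two labels. Runs of "A" labels along a chromosome would mark a candidate introgression from accession A.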
Genotypes for a panel of samples were then “painted” based on whether they shared one or more alleles with either of the two contrasting accessions. For EB-9, three contrasts were examined: Devon Surprise vs. Heinz 1706, Devon Surprise vs. NC 84173, and Devon Surprise vs. Yellow Pear. For EB-5, the contrasts included Hawaii 7998 vs. Heinz 1706, Hawaii 7998 vs. CU151095-146, and Hawaii 7998 vs. Yellow Pear. We visualized shared alleles with ggplot2 in R.

Prediction of resistance in sequenced accessions—Hierarchical agglomerative clustering of all samples for the refined EB-5 and EB-9 genomic windows was repeated for 7 partially overlapping haplotype windows at each resistance locus (shown in Table 4.2). If a sequence was clustered with the putative resistance donor for 6-7 of these iterations, the sample to which that sequence belonged was predicted to have resistance with “high confidence”. If a sequence was clustered with the donor in 3-5 or 1-2 iterations, resistance was predicted with medium or low confidence, respectively. Fine-scale evidence for resistance donor homology was obtained by windowing the EB-5 or EB-9 interval using a 250 Kb window size and 25 Kb step size.

Mist chamber validation of resistance prediction—Forty-two tomato accessions (including resistant and susceptible controls) were assayed for stem lesion resistance in a mist chamber disease screen as described by Barksdale (1969) with modifications as detailed in Anderson et al. (2021). Seed was obtained as described previously or from the Tomato Genetics Resource Center at University of California-Davis, the Universidad Politécnica de Valencia, Spain (by way of Dr. Esther van der Knapp at University of Georgia), or commercially (Table 4.S2). Each accession was represented five times in each of two experimental replicates. Stem lesions were rated on a linear scale for percentage of diseased stem area.
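The confidence tiers used for resistance prediction above amount to a simple counting rule, sketched here in Python. The function names, the per-window label dictionaries, and the "none" tier for a sequence that never clusters with the donor are assumptions introduced for illustration; the text only defines the high, medium, and low tiers.

```python
def count_shared_windows(window_labels, sample, donor):
    """Count haplotype windows in which a sample co-clusters with the donor.

    window_labels: one dict per window mapping sample name -> cluster label
    (hypothetical data layout).
    """
    return sum(1 for labels in window_labels if labels[sample] == labels[donor])


def resistance_confidence(n_shared, n_windows=7):
    """Map the number of windows shared with the donor to a confidence tier."""
    if not 0 <= n_shared <= n_windows:
        raise ValueError("n_shared must lie between 0 and n_windows")
    if n_shared >= 6:
        return "high"    # 6-7 windows
    if n_shared >= 3:
        return "medium"  # 3-5 windows
    if n_shared >= 1:
        return "low"     # 1-2 windows
    return "none"        # assumption: no prediction of resistance


# Toy cluster labels for 7 overlapping windows; names are invented.
window_labels = [
    {"donor": 0, "x": 0 if i != 3 else 1, "y": 0 if i < 4 else 2, "z": 1}
    for i in range(7)
]
tiers = {
    name: resistance_confidence(count_shared_windows(window_labels, name, "donor"))
    for name in ("x", "y", "z")
}
```

Here sample "x" shares six windows with the donor, "y" shares four, and "z" shares none.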
Analysis of variance was done in R version 4.0.2 (R Core Team 2020), with agricolae version 1.3-3 (Mendiburu 2020) used for the Tukey honest significant difference test (α = 0.05).

Identification of gene annotations with SNP variants—Refined QTL intervals were surveyed for putative causative loci underlying EB-5 and EB-9. The Integrative Genome Viewer (IGV) was used initially to visualize SNP density and SNP patterns at sites within the introgression boundaries (Thorvaldsdóttir et al. 2013; Robinson et al. 2017). SnpEff was used to annotate variants that fell within and outside of predicted genes and to screen for functional changes in reference to the SL4.0 reference genome and the ITAG4.1 gene annotations (Cingolani et al. 2012; Hosmani et al. 2019). Variants were considered to possibly underlie resistance if the alleles were mutually exclusive in known resistant (EB-5 or EB-9) and susceptible (lacking EB-5 and/or EB-9) sets of accessions.

Identification of high-quality polymorphic sequences—SNP variants were selected from the EB-5 interval SL4.0ch05:62,350,391 – 63,450,391 and the EB-9 interval SL4.0ch09:62,452,852 – 63,002,852 from the filtered, unimputed genotype matrix. Genotypes with a read depth under 10.0 were set to “NA”, and sites were selected with a minimum sample call rate of at least 70% and a maximum heterozygosity of 0.5 across all taxa. From this matrix, we identified sites at which an allele was shared among early blight resistant accessions from the hypothesized introgression pathway for either EB-9 (Figure 2A) or EB-5 (Figure 3A), and was absent from relevant susceptible tomatoes, as described in the results. Flanking sequences for the SNPs were obtained from the SL4.0 genome sequence.

Results

Re-sequencing depth across sources—Read depths for 775 genome sequences aligned to the SL4.0 reference varied primarily according to the data source.
Mean depths across SNP variants ranged from 5.5x to 24.3x for the newly sequenced early blight resistant accessions (Table 4.1). Among these, S. habrochaites accession PI 126445 had the lowest mean depth because of its large genome size and distant relationship to the tomato reference. Depths for sequences from OSU ranged from 8.3x to 11.9x for OH88119 and OH08-7663, respectively. Sequences from ERP004618 had relatively high average depths from 12.6x to 39.0x, but average depths were lower, 0.3x to 13.5x, for sequences from SRP045767. From SRP094624, the depth ranged from 2.1x to 22.2x, and from SRP150040 the range was 0.9x to 39.9x (Table 4.1).

Table 4.1 Brief descriptions of core sequenced accessions discussed throughout this work, including the market class and species of the tomato accessions, the number of independent sequences with the same name, and the average sequenced depth across filtered SNP variants for our mapping to the SL4.0 genome

Entry Name | Tomato Type | Species | Sequence from | Number of Sequenced Accessions | Mean SNP Variant Depth
Devon Surprise | Heirloom | S. lycopersicum | This study | 1 | 21.0
Campbell 1943 | Early processing | S. lycopersicum | This study | 1 | 20.0
NC EBR 1 | Fresh market | S. lycopersicum | This study | 1 | 15.6
NC EBR 2 | Fresh market | S. lycopersicum | This study | 1 | 11.2
NC 1 CELBR | Fresh market | S. lycopersicum | This study | 2 | 17.0, 24.3
CU191357 | Fresh market | S. lycopersicum | This study | 1 | 20.5
CU201041 | Processing | S. lycopersicum | This study | 1 | 18.8
PI 126445 | Wild species | S. habrochaites | This study | 1 | 5.5
CU151095-146 | Fresh market | S. lycopersicum | This study | 1 | 16.7
CU151011-146 | Fresh market | S. lycopersicum | Boyce Thompson Institute | 1 | 27.9
Hawaii 7998 | Fresh market | S. lycopersicum | The Ohio State University | 2 | 10.8, 4.5
OH7536 | Processing | S. lycopersicum | The Ohio State University | 1 | 11.2
OH08-7663 | Processing | S. lycopersicum | The Ohio State University | 1 | 11.9
OH88119 | Processing | S. lycopersicum | The Ohio State University | 1 | 8.3
M82 | Processing | S. lycopersicum | SRP045767 | 1 | 5.0
NC 84173 | Fresh market | S. lycopersicum | SRP045767 | 1 | 5.8
Heinz 1706 | Processing | S. lycopersicum | SRP045767 | 1 | 4.4
Brandywine | Heirloom | S. lycopersicum | ERP004618 | 2 | 33.4, 6.5
Yellow Pear | Heirloom | S. lycopersicum | SRP094624 | 1 | 5.1

Classification and demographic history of sequenced accessions—We summarized the contents of the sequence dataset by performing hierarchical agglomerative clustering of all 775 whole-genome sequences. The resulting clusters broadly agreed with the known demography for tomato and its wild relatives. When the sequences were projected onto a 3-dimensional principal component analysis (PCA) space (Figure 4.1), the data were roughly circumscribed by S. pimpinellifolium, S. lycopersicum var. cerasiforme, cultivated heirloom tomato accessions, and relatives of tomato from the green-fruited clade. The primary axis of variation reflected genetic differentiation between cultivated tomato and its wild progenitor S. pimpinellifolium. The second principal component reflected additional variation within the red-fruited clade, including that among the canning/processing and fresh market classes. The third principal component reflected the divergence between the green and red-fruited Solanum clades. The green-fruited wild species cluster incorporated all accessions of S. chmielewskii, S. arcanum, S. neorickii, S. huaylasense, S. peruvianum, S. corneliomuelleri, S. chilense, S. habrochaites, and S. pennellii. The red and yellow-fruited wild cluster had accessions of S. galapagense, S. cheesmaniae, S. pimpinellifolium, and S. lycopersicum var. cerasiforme. Most of the genetic variance in the dataset was within S. pimpinellifolium and S. lycopersicum var. cerasiforme, for which there was also considerable overlap and ambiguity of cluster membership, reflecting the complex history of tomato domestication. The heirloom tomato cluster had the least genetic variation, despite comprising 35.0% of the accessions in the dataset.
Greater genetic diversity within the fresh market and processing tomato clusters reflected efforts by breeders to improve disease resistance and horticultural performance through interspecific hybridization and introgression.

Figure 4.1 Whole-genome classification of 775 tomato and related wild species accessions projected onto a 3-dimensional PCA space. Samples were classified by Ward agglomerative hierarchical clustering (k=10) and clusters were named according to majority rules taxonomic passport information

Evaluation of a windowed haplotype classification method—We developed a custom haplotype analysis method for the identification of homologous haplotypes that is both simple and computationally efficient. Briefly, sequences are windowed and stepped according to user specifications. Next, a hierarchical agglomerative clustering model is built using Ward distance. Sequences in a window are then clustered together into homologous haplotypes if their pairwise distance falls under a distance threshold. The program determines the distance threshold, d, by iterating through a user-specified range and identifying the value of d that maximizes the mean silhouette coefficient for all sequences. This approach allows the clustering algorithm to account for unequal information content across genomic windows. The code is built on Scikit-learn’s clustering module and will be available for download as a Python 3 command-line script. The optimal range for d depends on the size of the genomic window under consideration. Because the genetic information content and mean pairwise distance among samples increase with window size, the optimal value of d that returns the maximal number of meaningful clusters will also increase (Figure 4.S1). A value of d that is too small for a genomic window may cause homologous haplotypes to be split into separate clusters because of minor genotyping errors or residual heterozygosity in seed lots.
A d value that is too large can result in the clustering of divergent haplotypes. Visualization of d values returned by the analysis gives insight into the chosen d range, with an ideal d distribution, in our experience, being uniform to somewhat left-skewed (Figure 4.S2). We evaluated our haplotype analysis methodology by detecting introgressions in the well-characterized breeding line CU17NBL that is known to have six S. pennellii LA0716 introgressions across five chromosomes (MA Mutschler, unpublished). Using the cluster data from our windowed analysis, we defined LA0716 introgressions as those where the haplotypes from breeding line CU17NBL clustered with LA0716 but not with two cultivated breeding lines without LA0716 introgressions: M82 and NC 1 CELBR. With a 100 Kb window size, 25 Kb step size, and d range of {2-80}, the clustering algorithm identified 12 putative LA0716 introgressions, which is greater than the true number of introgressions (Figure 4.S3). All distance thresholds returned for the erroneous windows fell in the top 10% of the d distribution, indicating little distinguishing genetic information in these windows. Seven putative introgressions were detected, including all six known introgressions, when the input parameters were changed to a 250 Kb window size, 100 Kb step size, and d range of {2-100}. With a window size of 500 Kb or greater, a step size of 100 Kb, and d range {20-200}, we detected all but the smallest known introgression on chromosome 7. Thus, larger window sizes showed decreased sensitivity to the identification of smaller introgressions, but greater accuracy in the detection of known introgressions.

Early blight resistance underlying EB-9 in modern tomatoes traces to Devon Surprise—We found compelling evidence for introgression from Devon Surprise in modern early blight resistant breeding lines.
To demonstrate the value of our windowed haplotype method over SNP-based methods, we began by identifying SNPs in the genome where an allele was both 1) shared among landmark accessions along the putative EB-9 introgression pathway (Figure 4.2A) and 2) absent from three tomatoes susceptible to early blight stem lesions: NC EBR 1, NC 84173, and Brandywine. In total, 616 SNPs across twelve chromosomes fit this specified pattern (Figure 4.2B). In contrast, our windowed haplotype clustering method found just three haplotypes fitting the pattern (Figure 4.2C). One of these haplotypes fell within the previously defined EB-9 interval on chromosome 9 (Figure 4.2D). Chromosome-level visualizations showed the decay of the chromosome 9 introgression from Devon Surprise as it was transferred through Campbell 1943 into a modern cultivated background. The other two haplotypes fell on chromosomes 8 and 12 and did not coincide with resistance QTL (Figure 4.S4). However, these haplotypes were also homologous with OH08-7663, which we had used as a QTL mapping parent.

Figure 4.2 Fine-mapping EB-9 stem and foliar early blight resistance by comparative sequencing. A. EB-9 resistance is hypothesized to be derived from the heirloom Devon Surprise. Landmark tomato lines along the EB-9 introgression pathway are shown in orange (many generations are omitted). B. Visualization of SNP variants (orange bars) in the tomato genome that fit the expected pattern of introgression (i.e., an allele is shared among all resistant tomatoes from the hypothesized pedigree in panel A but is absent from the early blight stem-susceptible controls NC EBR 1, NC 84173, and Brandywine). C. Haplotypes that fit the pattern of introgression based on a custom 250 Kb sliding window analysis are colored orange, showing homology and putative introgression from Devon Surprise. D. Zoomed-in visualizations of the three putative introgressions from Devon Surprise.
Red lines indicate the low-confidence introgression limits for the window (window center ± ½*window step size), while blue lines indicate the outer window edges. The prior QTL mapping boundaries for EB-9 are shown. E. Evidence of shared haplotypes for 20 tomatoes relevant to this study, including early blight stem resistant lines, lines with foliar early blight resistance only, and a few famous tomato lines. The hierarchical tree contains bootstrap support values for the average EB-9 genomic interval estimates (see Table 4.2).

There was evidence for introgression between EB-9 breeding lines and S. habrochaites PI 126445, which was also reflected in the haplotype tree (Figure 4.2E). However, as S. habrochaites was the only wild species included in this localized test, the data merely suggest a wild species origin for EB-9, rather than a cultivated one. The inferred EB-9 tree (Figure 4.2E) closely approximated the putative breeding pedigree (Figure 4.2A), reflecting modifications to the introgression over decades of breeding. Interestingly, our haplotype analysis indicated that Devon Surprise and Ailsa Craig overlapped for just a portion of the EB-9 interval (Figure 4.2E). Estimation of the whole-genome species tree for a relevant subset of accessions confirmed a close genetic relationship between Devon Surprise and Ailsa Craig (Figure 4.S5), supporting the assertion that Devon Surprise is a natural mutant of Ailsa Craig.

Delineation of the ancestral Devon Surprise introgression boundaries enabled fine mapping of the EB-9 resistance locus. Introgression boundaries were determined by two methods, with broad agreement. First, we used SNP-based pairwise chromosome painting to estimate the EB-9 boundary. Three contrasts were examined between Devon Surprise and one of three accessions lacking stem lesion resistance (Table 4.2). All three contrasts delimited the ancestral EB-9 haplotype between SL4.0ch09:62,599,611 and SL4.0ch09:62,945,798 (Table 4.2; Figure 4.S6).
These boundaries represent an 81.4% reduction over the EB-9 interval from QTL mapping. The introgression boundaries were also estimated using our haplotype clustering method. Using a 250 Kb window size, we delimited EB-9 to between SL4.0ch09:62,452,852 and SL4.0ch09:63,002,852, which represents a 70.4% reduction in the QTL interval (Figure 4.2D). Visualization of the pairwise cluster-based homology with Devon Surprise reveals that NC 1 CELBR was segregating for the upper portion of the Devon Surprise introgression (NC 1 CELBR was sequenced twice), delimiting its upper bound (Figure 4.2E). The lower boundary was delimited by CU151095-146, which had lost the lower portion of the Devon Surprise introgression.

To investigate the effect of varying the windowing size on our clustering approach to introgression boundary estimation, we repeated the analysis for EB-9 using window sizes of 100 Kb, 250 Kb, 500 Kb, and 1 Mb. There was consistent evidence for Devon Surprise ancestry at EB-9 regardless of the window size. However, varying the parameter affected the trade-off between clustering accuracy and the estimated introgression size. A window size of 100 Kb resulted in noisy haplotype data, with many discontinuous haplotypes showing homology to Devon Surprise. This noise posed a challenge to accurate boundary estimation. We addressed this issue by setting a minimum of 15 SNPs per window for the clustering analysis, ensuring that haplotype clusters were not estimated for data-poor genomic windows. Larger window sizes gave increasingly accurate estimations of haplotype homology, but also resulted in wider and increasingly conservative introgression intervals (Table 4.2). We found the 250 Kb window size to be a good compromise for our data, as reflected by the small EB-9 interval in Table 4.2, the clear homology visualizations in Figure 4.2E, and the desirable distribution of d scores (Figure 4.S2).
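The percent reductions quoted for EB-9 can be reproduced with a short calculation. The sketch below is illustrative only: the function name `interval_reduction` is hypothetical, and the coordinates are the EB-9 boundaries quoted in the text above together with the QTL mapping boundaries from Table 4.2.

```python
def interval_reduction(refined_start, refined_end, qtl_start, qtl_end):
    """Percent reduction of a refined interval relative to the original QTL interval."""
    refined_size = refined_end - refined_start
    qtl_size = qtl_end - qtl_start
    return 100.0 * (1.0 - refined_size / qtl_size)


# EB-9 QTL mapping boundaries (bp on SL4.0 chromosome 9), from Table 4.2
QTL_EB9 = (61_819_509, 63_679_761)

# Pairwise chromosome painting boundaries quoted in the text
painting = interval_reduction(62_599_611, 62_945_798, *QTL_EB9)

# 250 Kb windowed haplotype clustering boundaries quoted in the text
clustering = interval_reduction(62_452_852, 63_002_852, *QTL_EB9)

print(f"painting: {painting:.1f}%  clustering: {clustering:.1f}%")
```

The same function applies to the EB-5 intervals, provided the interval size is first adjusted as described in footnote d of Table 4.2 where the ancestral introgression extends beyond the QTL boundary.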
A survey of all 775 sequences identified 48 accessions that shared homology with Devon Surprise for at least one candidate EB-9 interval (Figure 4.S7). Of the 48 sequences, nine belonged to the landmark breeding lines already known to have stem lesion resistance. Four additional accessions shared homology with Devon Surprise for all seven of the candidate intervals investigated (see Table 4.2) and are predicted to have EB-9 early blight resistance. These accessions were the heirloom tomatoes Gardner’s Delight (two sequences), Monplaisir, Yellow Perfection, and Katinka Cherry. Twenty and sixteen accessions clustered with Devon Surprise in 3-5 and 1-2 of the candidate intervals, respectively, and may have EB-9 stem lesion resistance. All accessions that had similar sequences to Devon Surprise at EB-9 were either S. lycopersicum or S. lycopersicum var. cerasiforme, except for one sequence belonging to S. pimpinellifolium PI 370093. Fine-scale windowing (250 Kb) of the full EB-9 interval for these sequences revealed at least 7 sub-haplotypes with distinctive patterns of homology (Figure 4.S7).

Table 4.2 Comparison of EB-5 and EB-9 QTL boundaries as determined from three different mapping approaches. Traditional QTL mapping in populations derived from the cross of CU151095-146 and OH08-7663 gave initial boundaries for the early blight resistance QTL. Pairwise chromosome painting was used to delimit the ancestral introgression boundaries from Devon Surprise (EB-9) and Hawaii 7998 (EB-5), narrowing the QTL intervals. A custom windowed haplotype clustering methodology gave similar introgression boundaries. All positions are base pairs in the SL4.0 tomato reference genome sequence

| Approach | EB-5 (Chr. 5) Upper Boundary | EB-5 Lower Boundary | EB-5 QTL Interval | EB-9 (Chr. 9) Upper Boundary | EB-9 Lower Boundary | EB-9 QTL Interval |
| QTL Mapping (a) | | | | | | |
| Minimum boundaries | 62,700,265 | 63,842,577 | 1,142,312 | 61,819,509 | 63,679,761 | 1,860,252 |
| Pairwise Ancestry Painting (b) | | | | | | |
| Devon Surprise vs. NC 84173 | - | - | - | 62,599,611 | 62,943,349 | 343,738 |
| Devon Surprise vs. Yellow Pear | - | - | - | 62,599,611 | 62,943,349 | 343,738 |
| Devon Surprise vs. Heinz 1706 | - | - | - | 62,599,611 | 62,943,349 | 343,738 |
| Hawaii 7998 vs. CU151095-146 | 62,728,321 | 63,401,903 | 673,582 | - | - | - |
| Hawaii 7998 vs. Yellow Pear | 62,699,938 | 63,401,903 | 519,858 | - | - | - |
| Hawaii 7998 vs. Heinz 1706 | 62,417,592 | 63,401,903 | 701,638 (d) | - | - | - |
| Windowed Clustering (c) | | | | | | |
| 100 Kb sliding window | 62,412,891 | 63,412,891 | 712,626 (d) | 62,540,352 | 62,940,352 | 400,000 |
| 250 Kb sliding window | 62,350,391 | 63,200,391 | 500,126 (d) | 62,452,852 | 63,002,852 | 550,000 |
| 500 Kb sliding window | 62,350,391 | 63,450,391 | 750,126 (d) | 62,252,852 | 63,052,852 | 800,000 |
| 1 Mb sliding window | 62,350,391 | 63,650,391 | 950,126 (d) | 61,877,852 | 63,627,852 | 1,750,000 |

a. Boundaries determined from Anderson et al. 2021
b. Pairwise ancestry painting looks for evidence of introgression from one entry relative to a contrasting entry for each SNP. We present three contrasts for EB-5 and EB-9 for comparative purposes.
c. A conservative range is given based on the physical position of the outer window edges
d. In this case, the upper boundary of the ancestral introgression extends beyond the previously determined upper QTL boundary, so the upper QTL boundary was subtracted from the lower one

Experimental validation of predicted EB-9 resistance—We successfully predicted the EB-9-mediated stem lesion resistance phenotype in publicly available germplasm. In comparison to the predicted-susceptible fraction, predicted-resistant accessions demonstrated strong reductions in stem lesion and collar rot disease (p < 0.001). There were no statistical differences among the fractions we predicted would be resistant with low, medium, or high confidence (Figure 4.3). The mean disease for individual accessions also varied (p < 0.001) (Table 4.S2).

Figure 4.3. Boxplot of the mist chamber validation experiment of EB-9 stem resistance. There were 5 replications of each accession/experiment.
Disease ratings of stem lesions were on a scale of 0-100%. Each tomato accession was either susceptible and had no EB-9 stem resistance, or had high, medium, or low confidence of resistance based on similarity to Devon Surprise. ANOVA results suggested a significant difference between mean disease ratings across the different resistance phenotypes (p < 0.001). The letters above the boxes indicate that only the susceptible accessions differed significantly from the accessions with predicted resistance, based on a Tukey adjusted 95% confidence level.

Early blight and bacterial spot resistance traces to introgressions from Hawaii 7998—Our cluster-based introgression detection method identified several genomic windows shared among EB-5 breeding lines and absent from the early blight susceptible tomatoes OH88119, NC 84173 and Brandywine (Figure 4.4A; Figure 4.4B). One of these haplotypes fell within the QTL interval for EB-5 (Figure 4.4C). It was possible to trace the decay in the Hawaii 7998 introgression through OH08-7663 into the fresh market breeding line CU191357.

Figure 4.4. Fine-mapping EB-5 foliar early blight resistance by comparative sequencing. A. EB-5 resistance was hypothesized to be derived from S. pimpinellifolium, but we found little evidence for such a putative donor. Landmark tomato lines along the EB-5 resistance breeding pathway are shown in orange. B. Haplotypes that fit the pattern of introgression (i.e., an allele is shared among all resistant tomatoes from the hypothesized pedigree in panel A but is absent from the early blight susceptible controls OH88119, NC 84173 and Brandywine) based on a 250 Kb sliding window analysis are colored orange, showing homology and putative introgression from Hawaii 7998. C. Zoomed-in visualizations of the two putative introgressions from Hawaii 7998.
Red lines indicate the low-confidence introgression limits for the window (window center ± ½*window step size), while blue lines are located at the window edges. The prior QTL mapping boundaries for EB-5 are shown. D. Evidence of shared haplotypes for 20 tomatoes relevant to this study, including early blight resistant lines and famous tomatoes. The hierarchical tree contains bootstrap support values for the average EB-5 genomic interval estimates (see Table 4.2).

The EB-5 QTL interval was refined by delimiting the ancestral Hawaii 7998 introgression boundaries in resistant breeding lines using pairwise SNP chromosome painting and haplotype clustering. Pairwise SNP contrasts gave estimated upper boundaries for EB-5 between SL4.0ch05:62,417,592 and SL4.0ch05:62,728,321, depending on the contrast, while the lower boundary was consistently estimated at SL4.0ch05:63,401,903 (Table 4.2, Figure 4.S8). These intervals represent a 38.6 to 54.5% reduction in the EB-5 interval relative to those obtained from QTL mapping. Our haplotype clustering analysis with a 250 Kb window size gave an EB-5 interval between SL4.0ch05:62,350,391 and SL4.0ch05:63,200,391 (Figure 4.4C). The upper boundary of the ancestral introgression was shared by the resistant breeding lines OH08-7663, CU191357, and CU201041, while the lower boundary was delimited by CU191357 (Figure 4.4D). The upper haplotype boundary exceeded that obtained from QTL mapping. With the QTL boundary marking the upper limit of the EB-5 interval, the refined EB-5 QTL interval represents a 56.2% reduction relative to QTL mapping. Varying the windowing parameters, as done for EB-9, gave similar results, with the 250 Kb and 500 Kb window sizes offering the best compromise between clustering accuracy and overly conservative haplotype intervals.

We found little evidence for EB-5 resistance in accessions from outside of the Cornell or OSU breeding programs.
Using our haplotype clustering method, we identified accessions from outside our breeding programs with homology to Hawaii 7998 for the refined EB-5 intervals. However, these accessions clustered with Hawaii 7998 for just one or two of the seven EB-5 intervals investigated (Figure 4.S9). Furthermore, fine-scale (250 Kb) windowing of the full EB-5 interval revealed little continuous homology with Hawaii 7998, suggesting that these haplotypes were relatively divergent.

The chromosome 5 introgression from Hawaii 7998 in modern breeding lines also overlapped with the putative location of the Rx-3 bacterial spot (Xanthomonas sp.) disease resistance locus. Fine-map data for Rx-3 have not been published, but the QTL is known to be located between approximately 61.9 Mb and 63.2 Mb on chromosome 5 of SL4.0 (DF Francis, pers. comm.). Both OH7536 and OH08-7663 have Rx-3 resistance tracing to Hawaii 7998 (Yang and Francis 2005; Sim et al. 2015). Given the partially overlapping Hawaii 7998 introgressions in these two lines, we used our haplotype analysis with a 250 Kb window size to delineate the Rx-3 interval to between SL4.0ch05:62,850,391 and SL4.0ch05:63,400,391. We believe that CU191357 and CU201041 also have Rx-3 resistance, which could further narrow the interval, but neither of these lines has yet been tested against the bacterial spot pathogen.

Several early blight resistant breeding lines were homologous with Hawaii 7998 for a large centromeric chromosome 11 haplotype that contains the bacterial spot quantitative resistance locus QTL-11. Using our haplotype analysis with a 250 Kb window size, we estimated the shared haplotype among breeding lines known to have QTL-11 to extend from SL4.0ch11:13,059,026 to SL4.0ch11:48,509,026 (Figure 4.S10). Because this QTL is hypothesized to be derived from S. pimpinellifolium, we looked among all S. pimpinellifolium accessions in our dataset for homology with Hawaii 7998 for the centromeric haplotype. One S.
pimpinellifolium accession, LA0722, was homologous with Hawaii 7998 for the entire haplotype window (Figure S11, Supplement 3). We predicted the QTL-11 centromeric haplotype in a total of 58 sequences, 38 of which were clustered with Hawaii 7998 for all seven of the QTL-11 intervals we investigated. Most of these accessions were modern cultivated tomatoes, including NC 1 CELBR, M82, Micro Tom, Florida 7060, Florida 8059, Peto 9543, Jelly Bean Hybrid, and Mountain Spring VFF Hybrid (Figure 4.S11).

Limited evidence for introgressions from S. habrochaites PI 126445 in modern breeding lines—We observed limited evidence for shared PI 126445 introgressions in the genomes of early blight resistant breeding lines (Figure 4.5). Our cluster-based introgression detection method failed to identify any haplotypes common among the landmark breeding lines from the putative introgression pathway (Figure 4.5A) and PI 126445 with our default window size of 250 Kb. To increase the detection sensitivity, we reduced the window size to 100 Kb. This analysis revealed homologous haplotypes on chromosomes 10 and 11 that were shared among breeding lines with putative PI 126445 ancestry but were absent from NC 84173, Brandywine, and NC EBR 2 (Figure 4.5B). Neither introgression overlapped with our previously mapped early blight resistance QTL. We further investigated the likelihood of an S. habrochaites introgression underlying these haplotypes using principal component analysis. Breeding lines with putative PI 126445 ancestry received PCA coordinates for the chromosome 10 haplotype that were close to other S. habrochaites accessions in the dataset and relatively distant from most cultivated tomatoes, suggesting an S. habrochaites ancestry. In contrast, principal component analysis for the chromosome 11 haplotype placed the resistant breeding lines alongside many cultivated accessions.
In addition, the chromosome 11 haplotype fell within the centromeric haplotype with homology to Hawaii 7998, described above. Because we saw little evidence of recombination near the chromosome 11 centromere, this result is probably spurious. To determine whether a PI 126445 region was introgressed into NC EBR 1 but subsequently lost during breeding, we repeated the 100 Kb windowed haplotype search using a relaxed search pattern. We sought haplotypes from PI 126445 that were shared with just NC EBR 1 but not with NC 84173, Brandywine, and NC EBR 2. This pattern returned additional putative introgressions from PI 126445 on chromosomes 2, 9, 10, and 12 (Figure 4.5C). Principal component investigation revealed the chromosome 12 introgression was the most likely of these haplotypes to be derived from S. habrochaites.

Figure 4.5. Looking for evidence of S. habrochaites PI 126445 introgressions in early-blight resistant breeding lines. A. PI 126445 is thought to be the donor of early blight foliar resistance in many modern tomato breeding lines. Landmark tomato lines along the early blight resistance breeding pathway are shown in orange. B. Haplotypes that fit the pattern of introgression (i.e., an allele is shared among all resistant tomatoes from the hypothesized pedigree in panel A but is absent from tomatoes without PI 126445 in their pedigree, NC 84173, Brandywine, and NC EBR 2) based on a 100 Kb sliding window analysis are colored orange, showing homology and putative introgression from PI 126445. C. Evidence for introgressions in NC EBR 1 that are not in NC 84173, Brandywine, and NC EBR 2. D. Evidence of shared haplotypes for 20 tomatoes relevant to this study.

Identification of putative causative loci—Boundaries for EB-5 and EB-9 were confirmed visually with IGV, showing greater SNP density and a conserved variant pattern within the introgression boundaries for tomatoes containing the proposed EB-5 or EB-9 introgressions.
We identified 82 SNP variants within EB-5, 27 of which fell within 12 predicted genes potentially underlying resistance. Within EB-9, 90 SNP variants were identified with 16 variants located in eight putative genes. The remaining SNPs were in intergenic regions (Table 4.S3; Table 4.S4). We considered a gene in the EB-5 or EB-9 intervals to possibly underlie early blight disease resistance if it contained SNP variants that were mutually exclusive in the known resistant and susceptible sets of accessions presented above. The predicted functions of such genes were varied, as were the functional effects of the variants. Within EB-5, Solyc05g053980.1 putatively encoded a plant resistance protein, while others encoded enzymes including gibberellin 2-oxidase 1, a galactosyltransferase family protein, a protein phosphatase-2C, a DNA (Cytosine-5)-methyltransferase, and a serine hydroxymethyltransferase (Table 4.3). Two genes encoded plant self-incompatibility proteins. Putative protein-coding genes containing variants on chromosome 9 encoded two potassium transporters, a metal tolerance protein C1, a cation efflux protein, a peptide chain release factor 1, a eukaryotic translation initiation factor 4G, an F-box protein, and a 2-oxoglutarate and Fe(II)-dependent oxygenase protein (Table 4.4). Many of the variants were in introns and classified as modifiers. Some variants were missense variants with predicted moderate effects, and a smaller number of variants were either synonymous or located in the 5’ untranslated region, with low putative impacts on gene function (Table 4.3; Table 4.4).

Table 4.3 Highlighted variants and gene annotations from SnpEff that fall within the refined EB-5 interval (SL4.0ch05:62566094 - SL4.0ch05:63401898) for the SL4.0 genome and ITAG4.1 gene annotation

| Genes | Gene Location (bp in Chr. 5) | SNP Variant Location (bp in Chr. 5) | Variant Description | Predicted Variant Impact | Predicted Function |
| Solyc05g053080.2 | 62602656-62605054 | 62602681 | 5' UTR | Low | Unknown |
| | | 62602843 | Missense | Moderate | |
| | | 62602847 | Missense | Moderate | |
| | | 62603260 | Intron | Modifier | |
| | | 62603437 | Intron | Modifier | |
| | | 62603449 | Intron | Modifier | |
| | | 62603689 | Intron | Modifier | |
| | | 62603792 | Intron | Modifier | |
| | | 62603828 | Intron | Modifier | |
| | | 62603916 | Intron | Modifier | |
| Solyc05g053260.3 | 62784717-62801391 | 62792542 | Intron | Modifier | DNA (Cytosine-5)-methyltransferase DRM2 |
| Solyc05g053280.3 | 62803504-62809045 | 62805073 | Intron | Modifier | Galactosyltransferase family protein |
| Solyc05g053290.3 | 62811779-62814389 | 62811946 | Missense | Moderate | Protein phosphatase-2C |
| Solyc05g053340.5 | 62856750-62863385 | 62858495 | Intron | Modifier | Gibberellin 2-oxidase 1 |
| | | 62858789 | Intron | Modifier | |
| Solyc05g053450.3 | 62911678-62913123 | 62912425 | Intron | Modifier | Late embryogenesis abundant protein 1-like |
| | | 62912458 | Intron | Modifier | |
| | | 62912463 | Intron | Modifier | |
| | | 62912551 | Intron | Modifier | |
| | | 62912599 | Intron | Modifier | |
| Solyc05g160450.1 | 62938728-62939307 | 62938903 | Missense | Moderate | Plant self-incompatibility S1 |
| Solyc05g160460.1 | 62940755-62941326 | 62941209 | Missense | Moderate | Plant self-incompatibility S1 |
| | | 62941241 | 5' UTR | Low | |
| Solyc05g053600.3 | 63054527-63061342 | 63059261 | Intron | Modifier | Pleiotropic drug resistance protein |
| Solyc05g053770.4 | 63182158-63190019 | 63189871 | 5' UTR | Low | Myb-like protein X |
| Solyc05g053810.3 | 63231432-63235896 | 63233342 | Intron | Modifier | Serine hydroxymethyltransferase |
| Solyc05g053980.1 | 63362638-63363006 | 63362833 | Synonymous | Low | Plant resistance protein |

Table 4.4 Highlighted variants and gene annotations from SnpEff that fall within the refined EB-9 interval (SL4.0ch09:62599611 - SL4.0ch09:62943349) for the SL4.0 genome and ITAG4.1 gene annotation

| Genes | Gene Location (bp in Chr. 9) | SNP Variant Location (bp in Chr. 9) | Variant Description | Predicted Variant Impact | Predicted Function |
| Solyc09g160100.1 | 62625587-62625973 | 62625766 | Missense | Moderate | F-box protein |
| Solyc09g074650.3 | 62684784-62691573 | 62689139 | Intron | Modifier | Peptide chain release factor 1 |
| Solyc09g074740.2 | 62770894-62773833 | 62771619 | Intron | Modifier | Cation efflux family protein |
| | | 62771967 | Intron | Modifier | |
| Solyc09g074750.3 | 62776208-62780230 | 62780127 | Missense | Moderate | Metal tolerance protein C1 |
| Solyc09g161650.1 | 62797984-62800125 | 62798499 | Intron | Modifier | Eukaryotic translation initiation factor 4G |
| | | 62798512 | Intron | Modifier | |
| | | 62798613 | Intron | Modifier | |
| | | 62798884 | Intron | Modifier | |
| | | 62799362 | Intron | Modifier | |
| | | 62799480 | Intron | Modifier | |
| | | 62799508 | Intron | Modifier | |
| | | 62800102 | Missense | Moderate | |
| Solyc09g074790.3 | 62817971-62822507 | 62819042 | Missense | Moderate | Potassium transporter |
| Solyc09g074800.2 | 62835326-62840489 | 62839284 | Missense | Moderate | Potassium transporter |
| Solyc09g074920.3 | 62933032-62937363 | 62937049 | Missense | Moderate | 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein |

Several intergenic variants were clustered between genes potentially underlying resistance. For example, within EB-5, 16 variants resided between Solyc05g053900.4.1 (putative function: eukaryotic aspartyl protease family protein) and Solyc05g053910.1 (putative function: phospholipase A1) (Tables S2 and S3, Supplement 1). Within EB-9, 17 intergenic variants existed between Solyc09g160100.1 (putative function: F-box protein) and Solyc09g074580.1 (putative function of protein: glutaredoxin). Another cluster of 15 variants was located between Solyc09g161650.1 (putative function: Eukaryotic translation initiation factor 4G) and Solyc09g074780.3 (putative function: Protein indeterminate-domain 2).

Polymorphic SNPs within the EB-5 and EB-9 ancestral haplotypes—Breeding for early blight resistance at EB-5 or EB-9 may be more reliable if trait-linked genetic markers are located within the ancestral haplotypes from Hawaii 7998 or Devon Surprise in modern breeding lines.
These haplotypes had shown no signs of recombination over many breeding generations and are thus linked to the causative resistance loci. We identified high-quality SNPs that are specific to the ancestral haplotypes based on whether they matched the expected allele patterns given the hypothesized introgression pathways discussed above; their positions and flanking sequences are reported in Tables 4.S3 and 4.S4. Several SNPs fell in gene annotations and have the potential to be causal; they are annotated as such. The alleles for all 775 sequences in the dataset are also provided so that markers can be screened for their usefulness in related genetic backgrounds.

Discussion

Introgression has left characteristic traces of DNA scattered throughout the genomes of modern tomatoes (Causse et al. 2013; Menda et al. 2014; Blanca et al. 2015). Defining introgressions in modern breeding lines presents breeders with several advantages, including the ability to predict transgressive segregation from the introgressions and ancestry of the parents and the opportunity to fine-map loci using the accumulation of ancestral recombinations. Furthermore, valuable introgressions in modern breeding lines are more readily deployable than those from wild species, as linkage drag on horticultural traits has often been reduced or eliminated through breeding. Defining introgressions can also give insight into the reliability of trait-linked genetic markers, since genomic regions that have remained unbroken over decades of breeding are tightly linked to the causative gene(s). Therefore, markers within or flanking the ancestral introgressions of modern lines should be reliable in the context of molecular breeding.

Efficient haplotype-based introgression detection—Several methods have been proposed for the detection of introgressions in genome sequences. The simplest approach involves pairwise chromosomal painting (Fu et al. 2015; Der Sarkissian et al. 2015).
However, this approach describes alleles relative to just two contrasting donors, on which the results are highly dependent. Chromosome painting must be repeated for several pairwise contrasts to estimate introgression boundaries for complex, multi-parent breeding germplasm. Introgressions can alternatively be found by identifying stretches of high sequence polymorphism relative to a reference genome and then cross-referencing these polymorphic regions with those of an introgression-free tomato such as Yellow Pear (Menda et al. 2014; Strickler et al. 2015). However, this approach only detects introgressions from wild species. The popular ABBA-BABA test statistic can offer evidence for introgression from a specific ancestor (Green et al. 2010). However, we found that this statistical approach lacks the sensitivity to detect cryptic introgressions from chromosome-scale data and was highly dependent on the designated outgroup. Furthermore, the ABBA-BABA test can be unreliable at the local ancestry level (Martin et al. 2015). Several other parametric methods that infer local ancestry or tree topology could be adapted to detect introgression (Huelsenbeck and Ronquist 2001; Sankararaman et al. 2008; Maples et al. 2013; Martin and Belleghem 2017; Salter-Townshend and Myers 2019). But these sophisticated methods can be slow for large numbers of taxa and may require the user to supply unknown information like the number of mixing groups, taxa clusters, genetic maps, or reference panels. Furthermore, these methods typically require specialized input file formats that are atypical of genomic breeding data. To address these challenges, we used a simple non-parametric hierarchical clustering method to group haplotypes along a sliding window. By imposing logical requirements relating to the putative ancestry (and resistance or susceptibility) on the resulting cluster data, we identified putative introgressions tracing to specific ancestors.
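For reference, the ABBA-BABA comparison mentioned above reduces to counting two site patterns among three ingroup taxa and an outgroup. The following is a minimal sketch of Patterson's D, not the implementation used in this study; the function name `patterson_d` and the 0/1 encoding (0 = ancestral allele shared with the outgroup, 1 = derived allele) are assumptions made for illustration.

```python
def patterson_d(p1, p2, p3):
    """Return Patterson's D for three ingroup taxa at biallelic sites.

    Each argument is a sequence of 0/1 values, one per site, where
    0 is the ancestral (outgroup) allele and 1 is the derived allele.
    This encoding is an illustrative assumption.
    Returns None when no ABBA or BABA sites are present.
    """
    # ABBA: P2 and P3 share the derived allele, P1 is ancestral.
    abba = sum(1 for a, b, c in zip(p1, p2, p3) if (a, b, c) == (0, 1, 1))
    # BABA: P1 and P3 share the derived allele, P2 is ancestral.
    baba = sum(1 for a, b, c in zip(p1, p2, p3) if (a, b, c) == (1, 0, 1))
    if abba + baba == 0:
        return None
    return (abba - baba) / (abba + baba)
```

A positive D indicates an excess of derived alleles shared between P2 and P3, and a negative D an excess shared between P1 and P3. Because the statistic pools sites, few informative sites fall within any short window, which is consistent with the loss of reliability at the local ancestry level noted above.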
Our method takes as input a standard chromosome-level VCF and can handle genomic variants for hundreds of sequences thanks to its reliance on the efficient clustering algorithm in the Python library Scikit-learn. A key feature of our approach is that it leverages the available genetic information in each window to determine the optimal threshold for grouping haplotypes. This feature offers several advantages. First, the method does not require the user to pre-specify the number of clusters to output. Instead, the user can broadly calibrate the analysis to the desired level of clustering sensitivity by inputting a range of distance thresholds. Second, the optimal number of haplotype clusters can vary across genomic windows as a function of the underlying genetic variance. This enables the method to adapt to varying genetic diversity and recombination frequency across windows (without specification of genetic distances). Lastly, the clustering accuracy benefits from a greater number of input sequences.

Characterization and prediction of early blight disease resistance in sequenced accessions—Our work clarified the ancestry of early blight disease resistance in several modern tomato lines, offering insights that will enable breeders to better utilize available tomato genetic resources. Whole genome sequencing of a small number of resistant accessions allowed us to trace stem and foliar resistance underlying EB-9 back more than 80 years to the early 20th century cultivar Devon Surprise. Similarly, we traced early blight resistance from EB-5 to the mid-century breeding line Hawaii 7998, and to the same haplotype that contains Rx-3 bacterial spot resistance (Yang and Francis 2005). We used the ancestral introgression boundaries to fine-map the resistance loci and to report polymorphic sequences, specific to these haplotypes, that can be used to reliably select for early blight resistance in tomato.
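The sliding-window clustering idea described above can be illustrated with a short sketch. This is not the code used in this study: it assumes genotypes have already been read from the VCF into a samples-by-SNPs matrix of alternate-allele counts, and it assumes the silhouette score as the criterion for choosing a threshold within each window, which the text does not specify. All function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score


def cluster_window(genotypes, thresholds):
    """Cluster samples in one window, trying each distance threshold.

    genotypes: (n_samples, n_snps) array of alternate-allele counts.
    thresholds: candidate distance cut-offs supplied by the user.
    Returns the labels with the highest silhouette score (an assumed
    criterion), or a single cluster if no threshold splits the samples.
    """
    n_samples = genotypes.shape[0]
    best_labels = np.zeros(n_samples, dtype=int)
    best_score = -1.0
    for t in thresholds:
        # n_clusters=None lets the threshold decide the cluster count.
        model = AgglomerativeClustering(
            n_clusters=None, distance_threshold=t, linkage="average"
        )
        labels = model.fit_predict(genotypes)
        n_clusters = len(set(labels))
        # The silhouette score is only defined for 2..n_samples-1 clusters.
        if 2 <= n_clusters < n_samples:
            score = silhouette_score(genotypes, labels)
            if score > best_score:
                best_labels, best_score = labels, score
    return best_labels


def sliding_window_clusters(genotypes, window_size, step, thresholds):
    """Yield (start_index, labels) for each SNP window along a chromosome."""
    n_snps = genotypes.shape[1]
    for start in range(0, n_snps - window_size + 1, step):
        window = genotypes[:, start:start + window_size]
        yield start, cluster_window(window, thresholds)


def shares_haplotype(labels, donor_idx, sample_idx):
    """True if a sample falls in the same window cluster as the donor."""
    return bool(labels[donor_idx] == labels[sample_idx])
```

In this sketch, a putative introgression from a donor such as Devon Surprise would correspond to a run of consecutive windows in which `shares_haplotype` is true for a resistant line and false for the susceptible lines. Leaving the number of clusters unset and supplying only a range of thresholds lets the cluster count differ from window to window.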
Our analysis predicted EB-9 resistance in 48 of 764 sequenced accessions from outside our pedigree, suggesting that early blight stem lesion resistance is common in tomato. Fine-scale homology detection revealed a distinct set of sub-haplotypes for EB-9 that will enable further fine-mapping of the resistance locus in our upcoming work. We hypothesize that only a subset of these haplotypes will contain the causative resistance locus, as several were nonhomologous with Devon Surprise for the genomic window containing the marker solcap_snp_sl_29188, which we previously determined to be tightly linked to EB-9. Most of the predicted accessions were S. lycopersicum var. cerasiforme, but one was classified as S. pimpinellifolium. This is unsurprising, as several accessions of S. pimpinellifolium are early blight resistant (Martin and Hepperly 1987; Thirthamallappa and Lohithaswa 2000; Ashrafi and Foolad 2015). Unfortunately, the last major screen of tomato germplasm for stem lesion/collar rot resistance (that we are aware of) was done by C.F. Andrus and colleagues in 1942. Of the 115 accessions they screened, 36 were tolerant or resistant to stem lesions. Most were from England and Europe, and included both Ailsa Craig and Devon Surprise, which we confirmed to be highly related. In our study, a majority of the predicted EB-9 accessions were from the Universidad Politécnica de Valencia, Spain, adding to the idea that early blight stem resistance is prevalent among European tomatoes. Intriguingly, the genomes of Devon Surprise and Ailsa Craig are homologous for just a small portion of the refined EB-9 interval, presenting an opportunity to narrow the EB-9 interval further by confirming stem resistance in these tomatoes. Mist chamber experiments demonstrated that we could accurately predict EB-9 resistance in gene bank accessions, eliminating the need for large-scale donor screening (Figure 4.).
To our surprise, there was no correlation between our perceived level of confidence in our cluster-based prediction and the resistance phenotype. Nevertheless, accession groups predicted to have EB-9 resistance with low, medium, or high confidence were more resistant to early blight stem disease than was the predicted-susceptible group. The ability to predict important phenotypes with efficiency and accuracy brings enhanced value to publicly available sequence and germplasm resources. Additionally, we believe that cluster-based haplotype prediction is extendable to other biological phenotypes and systems outside the context of applied plant breeding.

We found the EB-5 resistance haplotype to be rare among sequenced accessions. This was unexpected, as the breeder of Hawaii 7998, Dr. J.C. Gilbert, was known to have been intermating tomato and S. pimpinellifolium for the purposes of disease resistance. Therefore, we expected to find evidence of a S. pimpinellifolium source for EB-5. Resequencing additional S. pimpinellifolium accessions might help to uncover the EB-5 resistance source. Further refinement of the EB-5 interval would also help to trace the introgression history of EB-5 and to predict EB-5 in additional sequenced accessions.

Curiously, we saw little evidence of introgression from S. habrochaites PI 126445 in modern tomato. We detected possible PI 126445 introgressions on chromosomes 10 and 11 of early blight resistant breeding lines, but only after lowering the sensitivity of the detection method. The most promising of these loci was the putative introgression on chromosome 10. However, our prior QTL mapping population would have been segregating for this putative introgression (as it was absent from OH08-7663), but we did not find a QTL associated with early blight resistance on this chromosome. Breeding done by Dr.
Gardner to eliminate linkage drag associated with PI 126445 introgressions might have reduced the size of the introgression below the limit of detection for our method. Alternatively, the PI 126445 accession we sequenced could be different from that used in breeding. Finally, genotyping challenges stemming from the comparison of divergent genomes may have influenced our results.

One shortcoming of QTL mapping in biparental populations is the limited genetic base for QTL detection. In our prior work, we mapped early blight resistance in a population founded by the processing tomato breeding line OH08-7663 and the fresh market line CU151095-146. As both tomatoes are regionally adapted inbreds with some degree of early blight resistance, it is possible that our QTL mapping missed important resistance loci that were shared by both parents and were therefore undetectable in our work. Here, we identified two such introgressions from Devon Surprise that could be involved in early blight resistance on chromosomes 8 and 12 (Figure S4, Supplement 3). Any association between these haplotypes and early blight resistance can only be determined through disease trials. Nevertheless, our haplotype analysis method demonstrates that detecting introgressions in parental lines can help to broaden the impact and relevance of narrowly focused QTL mapping papers.

Prevalence of the QTL-11 haplotype among sequenced accessions—Hawaii 7998 has a centromeric chromosome 11 QTL (QTL-11) that confers broad-spectrum quantitative resistance to bacterial spot. To our surprise, we detected homology between the centromeric QTL-11 haplotype in Hawaii 7998 and as many as 58 accessions, including LA0722 and breeding lines from the Cornell, NCSU, and University of Florida breeding programs (Figure S12). Importantly, this centromeric haplotype is homologous between Hawaii 7998, CU151095-146, and OH08-7663.
The OSU breeding program recently introgressed QTL-11 into OH08-7663 to enhance bacterial spot resistance in that line (Sim et al. 2015). Subsequently, we used flanking SNPs to transfer QTL-11 from OH08-7663 into the fresh market background of CU151095-146 (Anderson et al. 2021). Upon completing the lines, we failed to detect enhanced resistance to bacterial spot in the fresh market background. Our analysis can explain this peculiar result because it suggests that QTL-11 was already in CU151095-146, even though we never intentionally bred for bacterial spot resistance, nor knowingly crossed with a QTL-11 donor. Thus, there is a clear benefit to characterizing the cryptic introgressions in parental breeding materials.

Identification of possible causative loci underlying EB-5 and EB-9—Refinement of the EB-5 interval empowered the identification of possible causative resistance loci. Of particular interest within EB-5 was Solyc05g053980, which encodes a plant resistance protein (Table 4.3). While Solyc05g053980 could underlie early blight EB-5 disease resistance, it may instead underlie Rx-3 bacterial spot resistance, as we found that these two loci have overlapping positional intervals and are likely from the same Hawaii 7998 introgression. Furthermore, Rx-3 is thought to be a classic R gene conferring hypersensitive resistance, which contrasts with the quantitative nature of EB-5 resistance (Yang and Francis 2005). Additional experiments are necessary to determine whether any relationship exists between Solyc05g053980 and either bacterial spot or early blight disease resistance.

Five predicted genes within EB-5 encoded enzymes (Table 4.3). Several of these enzymes may play a role in signaling pathways and in triggering plant defense mechanisms. One putative gene encodes a cytosine-5 DNA methyltransferase.
This enzyme plays an important role in DNA methylation, genome protection and regulation, gene expression, secondary metabolism, and plant development (Wang et al. 2016). Previous studies have shown that UDP-glycosyltransferases (UGTs) glycosylate metabolites and phytohormones when plants encounter biotic and abiotic stresses. Some UGTs may play a role in jasmonic acid and abscisic acid (ABA) signaling pathways. Several UGTs have also been studied in the context of the functional role they play in the hypersensitive response (Le Roy et al. 2016; Rehman et al. 2018). The gene encoding a phosphatase-2C enzyme may also play a role in ABA signaling, responding to abiotic and biotic stresses, and immune activation and suppression (Sugimoto et al. 2014; Singh et al. 2016). The putative gene encoding a serine hydroxymethyltransferase has been shown to play a role in the photorespiratory pathway, which could also have an impact on both abiotic and biotic stress. In experiments with Arabidopsis, serine hydroxymethyltransferase SHMT1 was shown to help restrict cell death induced by pathogens (Moreno et al. 2005). Any of these genes could be involved in quantitative resistance to early blight disease but would require additional experiments to elucidate a link.

Less is known about the remaining genes potentially underlying EB-5 resistance. For example, a gene encoding gibberellin 2-oxidase 1 was identified. The enzyme is involved in the catabolic pathway for gibberellins, which influence plant growth and development, including fruit set and seed development (Chen et al. 2016). Few studies directly address the possible role of this enzyme in disease resistance. There was also a gene of unknown function, Solyc05g053080, that contained 10 SNP variants that were only present in the known resistant set of accessions (Table 4.3).

Fewer putative genes within EB-9 were identified, but several appear to encode proteins connected to signaling pathways and defense responses (Table 4.4).
A putative gene for an F-box protein was identified. In previous work on tomato and tobacco, a conserved F-box protein was shown to help regulate plant hormone signaling and to be involved in plant defense responses (van den Burg et al. 2008). Another putative gene encodes the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase enzyme, which may also help to control defense responses to necrotrophs and herbivorous insects. Not only does the enzyme play a role in oxygenation and hydroxylation reactions in plants, but some proteins in the family can hydroxylate and deactivate gibberellic acid, auxin, and defense hormones such as salicylic acid and jasmonic acid (Caarls et al. 2017).

Other putative genes within EB-9 encode potassium transporters, a cation efflux protein, a metal tolerance protein, and a eukaryotic translation initiation factor. Potassium helps mediate osmotic pressure and the opening and closing of stomata. Membrane potential and enzyme activity are also regulated, in part, by potassium transporters. There may be any number of ways that potassium transport, potassium availability, stomatal closure, or physiological responses could influence resistance to A. linariae (Melotto et al. 2017; Ragel et al. 2019). Metal tolerance proteins are cation efflux transporters that play roles in homeostasis, but little is known about their role in defense against fungal pathogens (Ricachenevsky et al. 2013). Eukaryotic translation initiation factors play important roles as susceptibility factors for virus infection, but less is known about the role they might play in early blight disease resistance (Piron et al. 2010).

Conclusion

We advanced the study of early blight resistance in tomato by narrowly defining two valuable QTL underlying early blight disease in modern cultivated breeding lines.
This was accomplished not by traditional fine mapping, but by pairing historical pedigree information with publicly available genetic resources, including germplasm maintained by gene banks and breeders and the growing library of re-sequenced Solanum genomes. Our efficient method to identify homologous haplotypes in breeding materials and to predict local homology in public germplasm could be extended to any number of crops, provided the existence of a high-quality pedigree and sufficient whole genome sequence resources.

We identified several genes potentially related to early blight resistance within the boundaries of the ancestral EB-5 and EB-9 introgressions. Because so little is known about the early blight pathosystem (and quantitative resistance to fungal pathogens more generally), further experiments are necessary to identify causal links between the genic polymorphisms we have highlighted and in planta early blight disease resistance. Future investigations into the expression levels of promising genes within EB-5 and EB-9 during disease progression would offer insight into the early blight pathosystem and improve our understanding of quantitative disease resistance.

Acknowledgements

The funding for this work came from the Schmittau-Novak Small Grants Program. Thank you to Dr. Randy Gardner, NCSU, and Dr. Esther van der Knaap, UGA, for providing seed of relevant tomato breeding lines and accessions. Thank you to Dr. Gregory Martin, BTI, for providing sequence for CU151011-14.
REFERENCES

100 Tomato Genome Sequencing Consortium, Aflitos, S., Schijlen, E., de Jong, H., de Ridder, D., Smit, S., Finkers, R., Wang, J., Zhang, G., Li, N., Mao, L., Bakker, F., Dirks, R., Breit, T., Gravendeel, B., Huits, H., Struss, D., Swanson-Wagner, R., van Leeuwen, H., van Ham, R.C.H.J., Fito, L., Guignier, L., Sevilla, M., Ellul, P., Ganko, E., Kapur, A., Reclus, E., de Geus, B., van de Geest, H., Te Lintel Hekkert, B., van Haarst, J., Smits, L., Koops, A., Sanchez-Perez, G., van Heusden, A.W., Visser, R., Quan, Z., Min, J., Liao, L., Wang, X., Wang, G., Yue, Z., Yang, X., Xu, N., Schranz, E., Smets, E., Vos, R., Rauwerda, J., Ursem, R., Schuit, C., Kerns, M., van den Berg, J., Vriezen, W., Janssen, A., Datema, E., Jahrman, T., Moquet, F., Bonnet, J., Peters, S. 2014. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing. Plant J. 80:136–148.
Adhikari, P., Oh, Y., Panthee, D.R. 2017. Current status of early blight resistance in tomato: an update. Int. J. Mol. Sci. 18:2019.
Anderson, T.A., Zitter, S.M., De Jong, D.M., Francis, D.M., Mutschler, M.A. 2021. Cryptic introgressions contribute to transgressive segregation for early blight resistance in tomato. Theor. Appl. Genet. https://doi.org/10.1007/s00122-021-03842-x
Ashrafi, H., Foolad, M.R. 2015. Characterization of early blight resistance in a recombinant inbred line population of tomato: II. Identification of QTLs and their co-localization with candidate resistance genes. Adv. Stud. Biol. 7:149–168.
Barksdale, T.H. 1971. Field evaluation for tomato early blight resistance. Plant Dis. Report.
Barksdale, T.H., Stoner, A.K. 1973. Segregation for horizontal resistance to tomato early blight. Plant Dis. Report.
Barksdale, T.H., Stoner, A.K. 1977. A study of the inheritance of tomato early blight (Alternaria solani) resistance (fungal diseases). Plant Dis. Report.
Blanca, J., Montero-Pau, J., Sauvage, C., Bauchet, G., Illa, E., Díez, M.J., Francis, D., Causse, M., van der Knaap, E., Cañizares, J. 2015. Genomic variation in tomato, from wild ancestors to contemporary breeding accessions. BMC Genomics 16:257.
Browning, B.L., Zhou, Y., Browning, S.R. 2018. A one-penny imputed genome from next-generation reference panels. Am. J. Hum. Genet. 103:338–348.
Browning, S.R., Browning, B.L. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81:1084–1097.
Caarls, L., Elberse, J., Awwanah, M., Ludwig, N.R., Vries, M. de, Zeilmaker, T., Wees, S.C.M.V., Schuurink, R.C., Ackerveken, G.V. 2017. Arabidopsis JASMONATE-INDUCED OXYGENASES down-regulate plant immunity by hydroxylation and inactivation of the hormone jasmonic acid. Proc. Natl. Acad. Sci. U. S. A. 114:6388–6393.
Causse, M., Desplat, N., Pascual, L., Le Paslier, M.-C., Sauvage, C., Bauchet, G., Bérard, A., Bounon, R., Tchoumakov, M., Brunel, D., Bouchet, J.-P. 2013. Whole genome resequencing in tomato reveals variation associated with introgression and breeding events. BMC Genomics 14:791.
Chen, S., Wang, X., Zhang, L., Lin, S., Liu, D., Wang, Q., Cai, S., El-Tanbouly, R., Gan, L., Wu, H., Li, Y. 2016. Identification and characterization of tomato gibberellin 2-oxidases (GA2oxs) and effects of fruit-specific SlGA2ox1 overexpression on fruit and seed growth and development. Hort. Res. 3:1–9.
Chifman, J., Kubatko, L. 2014. Quartet inference from SNP data under the coalescent model. Bioinformatics 30:3317–3324.
Chifman, J., Kubatko, L. 2015. Identifiability of the unrooted species tree topology under the coalescent model with time-reversible substitution processes, site-specific rate variation, and invariable sites. J. Theor. Biol. 374:35–47.
Cingolani, P., Platts, A., Wang, L.L., Coon, M., Nguyen, T., Wang, L., Land, S.J., Lu, X., Ruden, D.M. 2012.
A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6:80–92.
Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., McVean, G., Durbin, R. 2011. The variant call format and VCFtools. Bioinformatics 27:2156–2158.
Der Sarkissian, C., Ermini, L., Schubert, M., Yang, M.A., Librado, P., Fumagalli, M., Jónsson, H., Bar-Gal, G.K., Albrechtsen, A., Vieira, F.G., Petersen, B., Ginolhac, A., Seguin-Orlando, A., Magnussen, K., Fages, A., Gamba, C., Lorente-Galdos, B., Polani, S., Steiner, C., Neuditschko, M., Jagannathan, V., Feh, C., Greenblatt, C.L., Ludwig, A., Abramson, N.I., Zimmermann, W., Schafberg, R., Tikhonov, A., Sicheritz-Ponten, T., Willerslev, E., Marques-Bonet, T., Ryder, O.A., McCue, M., Rieder, S., Leeb, T., Slatkin, M., Orlando, L. 2015. Evolutionary genomics and conservation of the endangered Przewalski’s horse. Curr. Biol. 25:2577–2583.
Fernandez-Pozo, N., Menda, N., Edwards, J.D., Saha, S., Tecle, I.Y., Strickler, S.R., Bombarely, A., Fisher-York, T., Pujar, A., Foerster, H., Yan, A., Mueller, L.A. 2015. The Sol Genomics Network (SGN)—from genotype to phenotype to breeding. Nucleic Acids Res. 43:D1036–D1041.
Fu, Q., Hajdinjak, M., Moldovan, O.T., Constantin, S., Mallick, S., Skoglund, P., Patterson, N., Rohland, N., Lazaridis, I., Nickel, B., Viola, B., Prüfer, K., Meyer, M., Kelso, J., Reich, D., Pääbo, S. 2015. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524:216–219.
Gao, L., Gonda, I., Sun, H., Ma, Q., Bao, K., Tieman, D.M., Burzynski-Chang, E.A., Fish, T.L., Stromberg, K.A., Sacks, G.L., Thannhauser, T.W., Foolad, M.R., Diez, M.J., Blanca, J., Canizares, J., Xu, Y., Knaap, E. van der, Huang, S., Klee, H.J., Giovannoni, J.J., Fei, Z. 2019.
The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor. Nat Genet 51:1044–1051.
Gardner, R. 1988. NC EBR-1 and NC EBR-2 early blight resistant tomato breeding lines. HortScience 23:779–781.
Gardner, R.G. 1990. Greenhouse disease screen facilitates breeding resistance to tomato early blight. HortScience 25:222–223.
Gardner, R.G., Panthee, D.R. 2010. NC 1 CELBR and NC 2 CELBR: early blight and late blight-resistant fresh market tomato breeding lines. HortScience 45:975–976. https://doi.org/10.21273/HORTSCI.45.6.975
Green, R.E., Krause, J., Briggs, A.W., Maricic, T., Stenzel, U., Kircher, M., Patterson, N., Li, H., Zhai, W., Fritz, M.H.-Y., Hansen, N.F., Durand, E.Y., Malaspinas, A.-S., Jensen, J.D., Marques-Bonet, T., Alkan, C., Prüfer, K., Meyer, M., Burbano, H.A., Good, J.M., Schultz, R., Aximu-Petri, A., Butthof, A., Höber, B., Höffner, B., Siegemund, M., Weihmann, A., Nusbaum, C., Lander, E.S., Russ, C., Novod, N., Affourtit, J., Egholm, M., Verna, C., Rudan, P., Brajkovic, D., Kucan, Ž., Gušic, I., Doronichev, V.B., Golovanova, L.V., Lalueza-Fox, C., Rasilla, M. de la, Fortea, J., Rosas, A., Schmitz, R.W., Johnson, P.L.F., Eichler, E.E., Falush, D., Birney, E., Mullikin, J.C., Slatkin, M., Nielsen, R., Kelso, J., Lachmann, M., Reich, D., Pääbo, S. 2010. A draft sequence of the Neandertal genome. Science 328:710–722.
Hosmani, P.S., Flores-Gonzalez, M., Geest, H. van de, Maumus, F., Bakker, L.V., Schijlen, E., Haarst, J. van, Cordewener, J., Sanchez-Perez, G., Peters, S., Fei, Z., Giovannoni, J.J., Mueller, L.A., Saha, S. 2019. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. bioRxiv 767764.
Huelsenbeck, J.P., Ronquist, F. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755.
Le Roy, J., Huss, B., Creach, A., Hawkins, S., Neutelings, G. 2016.
Glycosylation is a major regulator of phenylpropanoid availability and biological activity in plants. Front. Plant Sci. 7:735.
Li, H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio]
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079.
Lin, T., Zhu, G., Zhang, J., Xu, X., Yu, Q., Zheng, Z., Zhang, Z., Lun, Y., Li, S., Wang, X., Huang, Z., Li, Junming, Zhang, C., Wang, T., Zhang, Yuyang, Wang, A., Zhang, Yancong, Lin, K., Li, C., Xiong, G., Xue, Y., Mazzucato, A., Causse, M., Fei, Z., Giovannoni, J.J., Chetelat, R.T., Zamir, D., Städler, T., Li, Jingfu, Ye, Z., Du, Y., Huang, S. 2014. Genomic analyses provide insights into the history of tomato breeding. Nat Genet 46:1220–1226.
Maiero, M., Ng, T.J., Barksdale, T. 1990. Genetic resistance to early blight in tomato breeding lines. HortScience 25:344–346.
Maiero, M., Ng, T.J., Barksdale, T.H. 1989. Combining ability estimates for early blight resistance in tomato. J. Am. Soc. Hortic. Sci. 114(1):118–121.
Malinsky, M., Matschiner, M., Svardal, H. 2020. Dsuite - fast D-statistics and related admixture evidence from VCF files. Mol. Ecol. Res.
Maples, B.K., Gravel, S., Kenny, E.E., Bustamante, C.D. 2013. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93:278–288.
Martin, F.W., Hepperly, P. 1987. Sources of resistance to early blight, Alternaria solani, and transfer to tomato, Lycopersicon esculentum. 1 71:85–95.
Martin, G.B., Brommonschenkel, S.H., Chunwongse, J., Frary, A., Ganal, M.W., Spivey, R., Wu, T., Earle, E.D., Tanksley, S.D. 1993. Map-based cloning of a protein kinase gene conferring disease resistance in tomato. Science 262:1432–1436.
Martin, S.H., Belleghem, S.M.V. 2017. Exploring evolutionary relationships across the genome using topology weighting.
Genetics 206:429–438.
Martin, S.H., Davey, J.W., Jiggins, C.D. 2015. Evaluating the use of ABBA–BABA statistics to locate introgressed loci. Mol. Biol. Evol. 32:244–257.
Menda, N., Strickler, S.R., Edwards, J.D., Bombarely, A., Dunham, D.M., Martin, G.B., Mejia, L., Hutton, S.F., Havey, M.J., Maxwell, D.P., Mueller, L.A. 2014. Analysis of wild-species introgressions in tomato inbreds uncovers ancestral origins. BMC Plant Biology 14.
Moreno, J.I., Martín, R., Castresana, C. 2005. Arabidopsis SHMT1, a serine hydroxymethyltransferase that functions in the photorespiratory pathway influences resistance to biotic and abiotic stress. Plant J. 41:451–463.
Nash, A.F., Gardner, R.G. 1988. Heritability of tomato early blight resistance derived from Lycopersicon hirsutum PI 126445. J. Am. Soc. Hortic. Sci.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, É. 2011. Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830.
Piron, F., Nicolaï, M., Minoïa, S., Piednoir, E., Moretti, A., Salgues, A., Zamir, D., Caranta, C., Bendahmane, A. 2010. An induced mutation in tomato eIF4E leads to immunity to two Potyviruses. PLOS ONE 5:e11313.
Poplin, R., Ruano-Rubio, V., DePristo, M.A., Fennell, T.J., Carneiro, M.O., Auwera, G.A.V. der, Kling, D.E., Gauthier, L.D., Levy-Moonshine, A., Roazen, D., Shakir, K., Thibault, J., Chandran, S., Whelan, C., Lek, M., Gabriel, S., Daly, M.J., Neale, B., MacArthur, D.G., Banks, E. 2018. Scaling accurate genetic variant discovery to tens of thousands of samples. bioRxiv 201178.
Pritchard, F., Porte, W. 1921. Collar-rot of tomato. J. Agric. Res. 21:179–184.
R Core Team. 2020. A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org.
Ragel, P., Raddatz, N., Leidi, E.O., Quintero, F.J., Pardo, J.M. 2019. Regulation of K+ Nutrition in Plants. Front Plant Sci 10:281.
Rehman, H.M., Nawaz, M.A., Shah, Z.H., Ludwig-Müller, J., Chung, G., Ahmad, M.Q., Yang, S.H., Lee, S.I. 2018. Comparative genomic and transcriptomic analyses of Family-1 UDP glycosyltransferase in three Brassica species and Arabidopsis indicates stress-responsive regulation. Sci Rep 8:1875.
Reynard, G., Andrus, C. 1944. Inheritance of resistance to the collar rot phase of Alternaria solani of tomato. Phytopathology 35:25–36.
Ricachenevsky, F.K., Menguer, P.K., Sperotto, R.A., Williams, L.E., Fett, J.P. 2013. Roles of plant metal tolerance proteins (MTP) in metal storage and potential use in biofortification strategies. Front Plant Sci 4:144.
Robinson, J.T., Thorvaldsdóttir, H., Wenger, A.M., Zehir, A., Mesirov, J.P. 2017. Variant review with the Integrative Genomics Viewer. Cancer Res 77:e31–e34.
Rotem, J. 1994. The Genus Alternaria: Biology, Epidemiology, and Pathogenicity. APS Press.
Salter-Townshend, M., Myers, S. 2019. Fine-scale inference of ancestry segments without prior knowledge of admixing groups. Genetics 212:869–889.
Sankararaman, S., Sridhar, S., Kimmel, G., Halperin, E. 2008. Estimating local ancestry in admixed populations. Am. J. Hum. Genet. 82:290–303.
Sim, S.-C., Robbins, M.D., Wijeratne, S., Wang, H., Yang, W., Francis, D.M. 2015. Association analysis for bacterial spot resistance in a directionally selected complex breeding population of tomato. Phytopathology 105:1437–1445.
Singh, A., Pandey, A., Srivastava, A.K., Tran, L.-S.P., Pandey, G.K. 2016. Plant protein phosphatases 2C: from genomic diversity to functional multiplicity and importance in stress management. Crit Rev Biotechnol 36:1023–1035.
Strickler, S.R., Bombarely, A., Munkvold, J.D., York, T., Menda, N., Martin, G.B., Mueller, L.A. 2015. Comparative genomics and phylogenetic discordance of cultivated tomato and close wild relatives. PeerJ 3:e793.
Sugimoto, H., Kondo, S., Tanaka, T., Imamura, C., Muramoto, N., Hattori, E., Ogawa, K., Mitsukawa, N., Ohto, C. 2014. Overexpression of a novel Arabidopsis PP2C isoform, AtPP2CF1, enhances plant biomass production by increasing inflorescence stem growth. J Exp Bot 65:5385–5400.
Swofford, D. 2003. PAUP* Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.
Thirthamallappa, Lohithaswa, H.C. 2000. Genetics of resistance to early blight (Alternaria solani Sorauer) in tomato (Lycopersicon esculentum L.). Euphytica 113:187–193.
Thorvaldsdóttir, H., Robinson, J.T., Mesirov, J.P. 2013. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14:178–192.
Tieman, D., Zhu, G., Resende, M.F.R., Lin, T., Nguyen, C., Bies, D., Rambla, J.L., Beltran, K.S.O., Taylor, M., Zhang, B., Ikeda, H., Liu, Z., Fisher, J., Zemach, I., Monforte, A., Zamir, D., Granell, A., Kirst, M., Huang, S., Klee, H. 2017. A chemical genetic roadmap to improved tomato flavor. Science 355:391–394. https://doi.org/10.1126/science.aal1556.
van den Burg, H.A., Tsitsigiannis, D.I., Rowland, O., Lo, J., Rallapalli, G., MacLean, D., Takken, F.L.W., Jones, J.D.G. 2008. The F-Box Protein ACRE189/ACIF1 regulates cell death and defense responses activated during pathogen recognition in tobacco and tomato. Plant Cell 20:697–719.
Wang, P., Gao, C., Bian, X., Zhao, S., Zhao, C., Xia, H., Song, H., Hou, L., Wan, S., Wang, X. 2016. Genome-wide identification and comparative analysis of cytosine-5 DNA methyltransferase and demethylase families in wild and cultivated peanut. Front Plant Sci 7:7.
Woudenberg, J.H.C., Truter, M., Groenewald, J.Z., Crous, P.W. 2014. Large-spored Alternaria pathogens in section Porri disentangled. Stud Mycol 79:1–47.
Yang, W.C., Francis, D.M. 2005. Marker-assisted selection for combining resistance to bacterial spot and bacterial speck in tomato. J. Am. Soc. Hortic. Sci.
130:716- 721. Zitter, T., Drennan J. 2007. Using host resistance and light fungicides to control early blight on tomato. Plant Dis Manag Rep. 2:60. Zitter, T., Drennan, J. 2008. Using host resistance and lite fungicides to control early blight and Septoria leaf spot on tomato. Plant Dis. Manag. Rep. 2:61. Zitter, T.A., Drennan, J.L., Mutschler, M.A., Kim, M.J. 2005. Control of early blight of tomato with genetic resistance and conventional and biological sprays. Acta Hortic. 695:181–190. 2016. Obituaries. In: Delaware Valley Ornithological Club. https://dvoc.org/wp/wp- content/uploads/2016/12/C72_73Obituaries.pdf. 187 SUPPLEMENTAL MATERIALS Table 4.S1. Detailed Information on all tomato lines included in this study, including taxonomic information, principal component analysis coordinates, and whole-genome cluster identity (.csv) 188 Table 4.S2. Results of mist chamber validation experiment of EB-9 stem resistance. Disease ratings of stem lesions were on a scale of 0-100%. Each tomato accession was either susceptible and had no EB-9 stem resistance, or had high, medium, or low confidence of resistance based on similarity to Devon Surprise. ANOVA results suggested a significant difference between mean disease ratings across the different tomato genotypes (p < 0.001). 
Genotype | Predicted Resistance | Rating (%) | Group¹ | Seed Provenance²
OH7663 | Susceptible | 49 | a | Cornell tomato breeding program
Cherokee purple | Susceptible | 39 | ab | Commercial
Green Zebra | Susceptible | 33.5 | abc | Commercial
NC84173 | Susceptible | 29 | abcd | Cornell tomato breeding program
Gardener’s delight | High confidence | 25.2 | abcd | Commercial
Azoychka Russian | Low confidence | 22.8 | abcd | Commercial
Ailsa Craig | Susceptible | 13.5 | bcd | Cornell tomato breeding program
Snowberry | Medium confidence | 10.5 | bcd | Commercial
LA1712 | Low confidence | 10.2 | bcd | TGRC
Wheatley's Frost Resistant | Low confidence | 10.2 | bcd | Commercial
BGV007901 | Low confidence | 10 | bcd | Esther van der Knaap, University of Georgia
Yellow Perfection | High confidence | 7.5 | cd | Commercial
Yellow Pear | Susceptible | 4.9 | cd | Commercial
Merida | Medium confidence | 3 | cd | TGRC
Xol Laguna | Medium confidence | 2.9 | cd | TGRC
BGV007911 | Low confidence | 2.5 | d | Esther van der Knaap, University of Georgia
BGV007860 | Medium confidence | 1.4 | d | Esther van der Knaap, University of Georgia
BGV007872 | Medium confidence | 1.2 | d | Esther van der Knaap, University of Georgia
201041 | High confidence | 0.8 | d | Cornell tomato breeding program
191357 | High confidence | 0.7 | d | Cornell tomato breeding program
LA1320 | Medium confidence | 0.7 | d | TGRC
BGV007857 | Medium confidence | 0.6 | d | Esther van der Knaap, University of Georgia
BGV007921 | Low confidence | 0.5 | d | Esther van der Knaap, University of Georgia
BGV012640 | Medium confidence | 0.5 | d | Esther van der Knaap, University of Georgia
Cuba Plum | Low confidence | 0.5 | d | TGRC
BGV007990 | Medium confidence | 0.4 | d | Esther van der Knaap, University of Georgia
Green Gage | Low confidence | 0.4 | d | Commercial
BGV007909 | Low confidence | 0.3 | d | Esther van der Knaap, University of Georgia
BGV007918 | Low confidence | 0.3 | d | Esther van der Knaap, University of Georgia
BGV007871 | Medium confidence | 0.2 | d | Esther van der Knaap, University of Georgia
Katinka Cherry | High confidence | 0.2 | d | Commercial
Monplaisir | High confidence | 0.2 | d | Commercial
Trujillo | Low confidence | 0.2 | d | TGRC
BGV008051 | Medium confidence | 0.1 | d | Esther van der Knaap, University of Georgia
Gold Currant | Low confidence | 0.1 | d | Commercial
PI370093 A | Low confidence | 0.1 | d | GRIN
BGV007989 | Low confidence | 0 | d | Esther van der Knaap, University of Georgia
BGV008354 | Medium confidence | 0 | d | Esther van der Knaap, University of Georgia
Campbell 1943 | High confidence | 0 | d | Cornell tomato breeding program
Devon Surprise | High confidence | 0 | d | Cornell tomato breeding program
Lemon Drop | Medium confidence | 0 | d | Commercial
PE-63 | Low confidence | 0 | d | TGRC

¹ The letters in the group column designate which tomato accessions are statistically different or the same, based on a Tukey adjusted 95% confidence level.
² Seed was either obtained by the Cornell tomato breeding program, commercially, through GRIN (USDA-ARS Genetic Resource Collection), TGRC (Tomato Genetics Resource Center at University of California-Davis), or from Dr. Esther van der Knaap at the University of Georgia.

Table 4.S3. Full list of gene annotations that fall within the refined EB-5 interval SL4.0ch05:62566094 - SL4.0ch05:63401898 (.csv)

Table 4.S4. Full list of gene annotations that fall within the refined EB-9 interval SL4.0ch09:62599611 - SL4.0ch09:62943349 (.csv)

Figure 4.S1. Silhouette scores and the corresponding number of clusters as a function of d (distance merging threshold) for three genomic window sizes. The data are from 775 tomato sequences beginning at SL4.0 Chromosome 9 position 62,512,575. When the genetic distance between two samples falls under a distance threshold (d) during hierarchical clustering, these samples are clustered together. The average pairwise distance among samples increases with the size of the genomic window, as a larger window captures more genotypic information. As a result, the optimal value of d will also increase. A value of d that is too low will result in too many clusters, while a d that is too large will give too few clusters.
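The relationship between d, the number of clusters, and the silhouette score can be illustrated with a minimal Python sketch. This is not the pipeline used in this study: the six toy genotype vectors, the allele-dosage distance, the average-linkage merging rule, and all function names are assumptions made purely for illustration.

```python
from itertools import combinations

# Hypothetical toy data: allele dosage (0, 1, 2) at 8 sites for 6 samples.
samples = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [2, 2, 2, 2, 2, 2, 2, 2],
    [2, 2, 2, 2, 2, 2, 2, 1],
    [2, 2, 2, 2, 2, 2, 1, 1],
]

def genetic_distance(a, b):
    """Mean absolute difference in allele dosage across sites (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cluster_by_threshold(samples, d):
    """Average-linkage agglomerative clustering; clusters are merged only
    while the closest pair of clusters is nearer than the threshold d."""
    clusters = [[i] for i in range(len(samples))]

    def linkage(c1, c2):
        dists = [genetic_distance(samples[i], samples[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best >= d:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j

    labels = [0] * len(samples)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels

def silhouette(samples, labels):
    """Mean silhouette score, or None when it is undefined
    (fewer than 2 clusters, or every sample in its own cluster)."""
    n, k = len(samples), len(set(labels))
    if k < 2 or k >= n:
        return None
    total = 0.0
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(
                    genetic_distance(samples[i], samples[j]))
        own = by_cluster.get(labels[i])
        if not own:  # singleton clusters contribute 0 by convention
            continue
        a = sum(own) / len(own)
        b = min(sum(v) / len(v) for lab, v in by_cluster.items() if lab != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n

# Sweep a range of d values and report cluster count and silhouette score.
for d in (0.05, 0.15, 0.5, 3.0):
    labels = cluster_by_threshold(samples, d)
    print(d, len(set(labels)), silhouette(samples, labels))
```

In this toy example the smallest d leaves every sample in its own cluster, the largest collapses all samples into a single cluster, and an intermediate value recovers the two groups with the highest silhouette score.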
Thus, the user sets a range of d values for the algorithm to investigate that balances classification accuracy and sensitivity.

Figure 4.S2. A. Distribution of clustering distance thresholds (d) as determined from the hierarchical classification algorithm for all genomic windows on chromosome 9. The analysis included 775 individual sequences for tomatoes and wild relatives. B. Genome-wide distribution of algorithmically determined d thresholds for a sliding window size of 250 Kb and step size of 100 Kb for 775 genome sequences.

Figure 4.S3. The haplotype clustering methodology was evaluated by attempting to detect introgressions in a well-known breeding line, CU17NBL. Tomato CU17NBL is known to have the following introgressions from Solanum pennellii LA0716: a 1.5 Mb introgression on chromosome 2, a 58.4 Mb upper and a 1.4 Mb lower introgression on chromosome 3, a 260 Kb introgression on chromosome 7, a 3.5 Mb introgression on chromosome 8, and a 1.9 Mb introgression on chromosome 10. The locations of known introgressions are shown at top. Haplotype detection was repeated with four window sizes, shown below. Haplotypes with homology to LA0716 are colored; bar widths indicate the size of the detected introgression.

Figure 4.S4. Evidence for the chromosome 8 and 12 haplotype homology with Devon Surprise among early blight resistant breeding lines and other accessions.

Figure 4.S5. A. Genetic windows that are shared between Devon Surprise and Ailsa Craig according to our data. Haplotypes with homology to Devon Surprise are colored orange. B. Coalescent species tree hierarchical clustering of the whole genome sequences for a relevant subset (to aid visualization only) of sequences, indicating strong support for the similarity between Devon Surprise and the two independent sequences of Ailsa Craig.

Figure 4.S6. Pairwise ancestry painting of the EB-9 interval for the contrasts: A. Devon Surprise vs. NC 84173, B. Devon Surprise vs. Heinz 1706, and C. Devon Surprise vs. Yellow Pear.

Figure 4.S7. Accessions predicted to have EB-9 resistance based on similarity to Devon Surprise. Seven haplotype windows were tested, ranging in size from the red bars (smallest, 300 Kb) to the x-axis limits (largest, 750 Kb). High confidence accessions were grouped with Devon Surprise in 6-7 of these windows, while medium confidence accessions were included in 3-5 windows, and low-confidence accessions in 1-2 windows. Fine-scale sub-haplotype evidence is shown for a 250 Kb sliding window analysis with a 25 Kb step size. Windows clustering with Devon Surprise are colored. The most predictive marker from prior QTL mapping work is shown as a black bar (solcap_snp_sl_29188). Three accessions, NC 1 CELBR, Gardner’s Delight, and PI 370093, were represented twice in the dataset.

Figure 4.S9. Pairwise ancestry painting for the contrasts: A. Hawaii 7998 vs. CU151095-146, B. Hawaii 7998 vs. Heinz 1706, and C. Hawaii 7998 vs. Yellow Pear.

Figure 4.S10. Accessions predicted to have EB-5 resistance based on haplotype similarity to Hawaii 7998. Seven haplotype windows were tested, ranging in size from the red bars (smallest, 375 Kb) to the x-axis limits (largest, 750 Kb). The black bar is the most predictive marker for EB-5 from prior QTL mapping (solcap_snp_sl_231). High confidence accessions were grouped with Hawaii 7998 in 6-7 of these windows, while medium confidence accessions were included in 3-5 windows, and low-confidence accessions in 1-2 windows. Fine-scale sub-haplotype evidence is shown for a 25 Kb sliding window analysis with a 25 Kb step size. Windows clustering with Hawaii 7998 are colored. There are two sequences for Hawaii 7998 in our dataset, the first supplied by Dr. David Francis (OSU).

Figure 4.S11. Accessions with homology (orange color) for the chromosome 11 centromeric region with Hawaii 7998 by our clustering analysis.
Seven haplotype windows were tested, ranging in size from the red bars (smallest, 32.5 Mb) to the x-axis limits (largest, 38.1 Mb). High confidence accessions were grouped with Hawaii 7998 in 6-7 of these windows, while medium confidence accessions were included in 3-5 windows, and low-confidence accessions in 1-2 windows. Fine-scale (sub-)haplotype evidence is shown for a 250 Kb sliding window analysis with a 100 Kb step size. There are two sequences for Hawaii 7998 in our dataset, the first supplied by Dr. David Francis (OSU).

Figure 4.S12. Evidence for chromosome 11 homology (orange color) between Hawaii 7998 and S. pimpinellifolium accessions in the dataset using a 250 Kb window size and 100 Kb step size.

CONCLUSION

My projects centered on diseases of solanaceous plants, with particular focus on tomatoes. The research on late blight also applied to potato. While the topics are diverse, working with multiple pathosystems proved to be a worthwhile endeavor for me. I learned diverse lab techniques, the projects required different data analysis methods, and each project required different troubleshooting. Underlying themes include study of effectors, effector diversity, and understanding the genetic underpinnings of disease resistance from the perspective of the plant. Planting resistant cultivars of tomatoes is an important management strategy, and several chapters in this thesis focused on understanding pathogen diversity such that a genetically diverse group of isolates can be used when determining the efficacy of host resistance genes. I also learned more about population genetics and comparative genomics. I appreciated a change in perspective in a collaborative project with a plant breeding graduate colleague Taylor Anderson. I learned a whole new vocabulary and way of thinking for the early blight project. Much can still be learned from each of my projects, however. Below I will briefly detail some potential future directions.

Prior to the Passalora fulva study in this thesis, little was known about the tomato leaf mold pathogen in the United States, even though P. fulva is an important model fungus. My work will help to better track pathogen populations and aid in management. Future work can involve more thorough sampling to build a larger collection of isolates from each sampling location. The statistical analysis that follows can then be more robust (Grunwald et al. 2017). Towards the end of my PhD, I sought the expertise of Dr. Sandeep Sharma and transformed a P. fulva isolate based on the technique described in Anco et al. 2009. I now have a collection of GFP-tagged isolates. I was able to perform virulence assays and begin to examine how the fungus colonizes using confocal microscopy. With more time, transformation of a few Cladosporium spp. would be useful for trying to understand what the fungi are doing on the tomato leaf surfaces. It would be helpful to determine what role they play on tomato because there are few descriptions in the literature. Because Cladosporium species were isolated from P. fulva lesions in nearly every attempt prior to using a double sterilization technique, future studies to better understand the role these Cladosporium species may play in symptom development would be interesting. We observed that none of the Cladosporium species collected from leaf mold lesions caused disease on tomato when isolates were inoculated on healthy tomato foliage, but we have not done co-inoculation studies. If another fluorescent tag such as mCherry could be used in the transformation of a Cladosporium isolate collected from a leaf mold lesion, we could examine what is happening after co-inoculations with a GFP-tagged P. fulva isolate. This would help to understand if there is some sort of association between the fungi, or if the Cladosporium spp. are merely saprophytes.

Substantial data were generated from the PenSeq target enrichment sequencing of effector genes for 12 isolates of Phytophthora infestans. The next logical step is validation. Unfortunately, given strict regulations, I could not easily obtain potato varieties from the UK with R3a, R3b, or R1 to validate my PenSeq results in planta. Even less is known about the tomato resistance genes that interact with pathogen effector genes (Nowicki et al. 2012). As new characterization of resistance genes and their effector complements emerges, the PenSeq data will offer new insights. The PenSeq tool is robust, but understanding the expression of resistance genes is also an important step in understanding exactly how the pathogen deploys the effectors. PenSeq can be performed on cDNA (Lin et al. 2020), and this could be a logical next step for a closer examination of effector diversity within the US-23 clonal lineage. To make the work more robust, future work could involve collecting leaf disks right before and then at several timepoints during infection, extracting RNA, and doing RT-qPCR to understand expression levels of several effector genes. [...] helpful another graduate student.

The comparative genomics project on early blight resistance in tomato will offer helpful tools to breeders trying to breed for more durable resistance. The next step is to make SNP markers available based on our findings. We have also compiled a database of 775 tomato whole genome sequences that will be publicly available.

As my graduate study ends, I reflect with fondness on many aspects of my experience. I appreciated most the importance of collaboration. This proved to be one of the best parts of my graduate experience. Learning to ask for help, finding people who will ask tough questions about your research, and asking for people to review manuscripts or grant proposals is worthwhile and something that I will carry with me.

REFERENCES

Grunwald, N. J., Everhart, S. E., Knaus, B. J., and Kamvar, Z. N. 2017. Best practices for population genetic analyses. Phytopathology. 107:1000–1010.
Anco, D. J., Kim, S., Mitchell, T. K., Madden, L. V., and Ellis, M. A. 2009. Transformation of Phomopsis viticola with the green fluorescent protein. Mycologia. 101:853–858.
Nowicki, M., Foolad, M. R., Nowakowska, M., and Kozik, E. U. 2012. Potato and tomato late blight caused by Phytophthora infestans: An overview of pathology and resistance breeding. Plant Dis. 96:4–17.
Lin, X., Armstrong, M., Baker, K., Wouters, D., Visser, R.G.F., Wolters, P.J., Hein, I., Vleeshouwers, V.G.A.A. 2020. Identification of Avramr1 from Phytophthora infestans using long read and cDNA pathogen-enrichment sequencing (PenSeq). Molecular Plant Pathology. 21:1502–1512.

APPENDIX

TEACHING A FIRST YEAR UNDERGRADUATE WRITING SEMINAR ON FOOD AND AGRICULTURE

Introduction

The John S. Knight Institute at Cornell offers teaching experiences to graduate student instructors interested in designing and teaching first-year writing seminars (FWS). In this appendix, I reflect on my experiences teaching a first-year writing seminar to undergraduates. Teaching is a crucial skill and being able to teach and communicate clearly is of great importance to scientists. Communicating scientific ideas must be prioritized as society becomes ever more polarized. Good teaching takes experience and time for refinement, and I have experienced many times over the ways that teachers enriched my life. Everyone deserves good teachers and professors, but teaching takes practice and thought, and I knew I needed more experience. As someone interested in a career in academia with research and teaching components, the opportunity to gain additional teaching skills that I could not obtain from only my TA responsibilities proved to be one of my most valuable graduate school experiences. The appendix will explore my process of designing and teaching a FWS titled PLSCI1105: Eating is an Agricultural Act.
As a point of reference, I will share my syllabus and my writing assignment sequence. The design and implementation of my course was not unlike the process I take to design, plan, and execute scientific experiments. To begin with, I took WRIT7100: Teaching Writing with fellow graduate instructors. We read about how to teach, discussed pedagogical techniques, and designed and shared activities and lesson plans. I chose central themes for the course, planned units, lessons, and writing assignments. We then shared feedback during each step of the process, and our classwork was facilitated by an instructor in the Knight Institute. In the extra time afforded by the pandemic during the lockdown from laboratory research, I also read supplemental literature on teaching, I took a course through Cornell’s Center for Teaching Innovation called Teaching and Learning in the Diverse Classroom, and I read most of the books and articles I wanted to incorporate in my course. After my course began, I treated each class and activity as an experiment. I took notes on lesson plans, and after class, I reflected on activities that went well and activities that needed additional refinement. This teaching journal was not unlike a lab notebook. Finally, student essays and frequent reflections on the process of writing offered tangible feedback for me as an instructor. I could see how well students were engaging and thinking about central themes and I could examine writing progress. I could then refine future lesson plans. My course concluded with student reflections, and after the course ended, I received student evaluations that were administered by the Knight Institute, as measurable outcomes.

Why did I teach a first-year writing seminar?

When I first heard about the possibility of designing and teaching a writing seminar, it sounded terrifying but also exciting.
As someone who is excited about interdisciplinary study, I was intrigued and appreciative that this was an opportunity available to graduate students. My TA experience in PLPPM 2010: Magical Mushrooms, Mischievous Molds was valuable, but I wished for more freedom to design my own activities and assignments. I also looked back fondly on my experience teaching at an urban environmental learning center in Milwaukee and as a participant in the Cornell Graduate Student School Outreach Program, teaching a sequence of classes on microbes in kindergarten, third grade, and sixth grade classes. My favorite classes to teach were ones that I helped to design, refine, and reflect on. Throughout graduate school, I participated in GET SET teaching workshops, an active learning discussion group, ALS 6014: Theater Techniques for Enhancing Teaching and Public Speaking, and a science communication workshop. Not only did I gain some valuable teaching skills, but I met friends and contacts across disciplines, and I felt more connected and able to navigate graduate school.

What were the central themes of my course?

After expressing an interest early on in my graduate career, I began to imagine the underpinnings of a course. A central goal of my course was to expose students to different subjects related to the plant sciences. What better way than to have food and agriculture be the entryway? As my own study of plants and microbes has progressed, I have found I had some broader questions about food and agriculture. What is sustainable agriculture? What injustices exist within our food systems? How can agriculture become more equitable for farmers, farmworkers, and consumers? What role should biotechnology play in shaping future agricultural practices? Designing and teaching a class allowed me to engage with these questions for a semester. I also wanted the theme of science communication to be a common thread.
The course was divided into several central themes: 1) (Dis)Connection from the Land, 2) How Plants Shape Us and We Shape Plants, 3) How Are New Fruits and Vegetables Developed?, and 4) Foods of the Future: How Must Agriculture Change? I couldn’t ignore questions of environmental degradation, racial injustice, and inequities within the food system, and as we explored these topics, I learned that my students were even more thirsty to discuss these topics than I imagined. Students were interested in majors in viticulture and enology, nutrition, labor relations, animal science, environmental science, economics, engineering, astrophysics, and plant sciences. They also came from diverse backgrounds. Stepping back to think more deeply about food and agriculture was rewarding and thought-provoking in ways I didn’t anticipate. Student perspectives now shape my own understandings as well.

How was the course structured?

As we delved into course materials, I wanted students to realize that few things can be understood in black and white terms and that the ways in which we communicate matter. I wanted students to learn how to research ideas and topics using reputable sources found most easily in databases rather than search engines. To begin with, I wanted students to learn how to summarize other people’s ideas in an engaging way that avoided plagiarism. After a few weeks, we thought about how to enter the conversation ourselves and how to artfully incorporate one’s own ideas. We thought about effective analysis and how to develop pointed claims. As students developed their own claims, I wanted them to think about how they could artfully substantiate a claim with evidence. I also wanted students to think about how they could incorporate counterarguments and potential rebuttals to develop their ideas further. A class debate and position paper on genetically engineered crops helped students grapple with opposing perspectives and substantiating claims with evidence.
Students left behind the old parameters of the 5-paragraph essay to delve into writing an Op-Ed, a position paper, a proposal, and many shorter reflections. To provide structure, I assigned and referenced optional readings from Graff and Birkenstein’s They Say/I Say: The Moves that Matter in Academic Writing, and Hjortshoj’s Transition to College Writing. Many students realized the joy of writing for different audiences and different purposes and the delight of thinking of writing as a form of storytelling. Writing is challenging but also a creative endeavor that gives meaning and helps us make sense of the world.

Select Course Materials for PLSCI1105: “Eating is an Agricultural Act”

Parts of the syllabus and the central writing assignments

Course description: Farmer and writer Wendell Berry suggests that “eating is an agricultural act.” Although we make decisions about food every day, few of us know what is involved in growing fruits and vegetables. In this course, we will consider how plants have shaped us and we have shaped plants through domestication processes and through plant breeding. We will consider what it takes to keep plants healthy and the ways in which scientists must adjust to a rising population and a changing climate. We will debate about whether genetically modified crops should play an important role in the future, and we will consider what sustainable foods of the future could look like. Through an examination of writings by farmers, historians, scientists, and chefs, we will think critically about what goes into growing the fruits and vegetables we see neatly lining our grocery stores. We will synthesize source materials, build arguments, and communicate scientific ideas to a variety of audiences.

Specific Learning Outcomes

In this course, you will:
1. Summarize a variety of sources ranging from scientific literature to popular science articles and books in the plant sciences.
2. Analyze contrasting texts to cultivate more robust perspectives on food and plants through careful consideration of the complexities associated with agriculture and food systems in the United States.
3. Engage in conversation with multiple sources by synthesizing the ideas of other writers in connection with your own ideas.
4. Construct logical arguments that are supported by evidence.
5. Engage respectfully and constructively with peers during in-class and small group discussions, and during peer-review sessions.
6. Reference and cite sources appropriately.

Written Components of the Course: There will be four units in this course. Each unit will contain a major writing assignment. Within each unit, there will be opportunities for shorter reflection pieces on the writing process and on readings we are doing in class. Guidelines for each assignment will be posted on “Assignments” in Canvas. Refer to the guidelines on Canvas when preparing your essays.

The Writing Sequence
Writing Assignment 1: Preliminary Diagnostic Essay
Writing Assignment 2: Summary Paper
Writing Assignment 3: Op-Ed Article
Writing Assignment 4: Position/Debate Paper
Writing Assignment 5: Proposal and Reflection
*Students received feedback on drafts of the underlined papers before submitting final versions after peer-revisions.

Unit 1: Short essay in week one (~2 pages)
Based on a short writing prompt that I will provide in the first week, you will write a 2-page essay. This essay will allow me to get a sense of writing strengths and areas where I can provide more focused instruction.

Summary paper (~2 pages)
You will choose one reading from our first unit to create a summary paper. This will be an opportunity to practice working with quotations, framing textual evidence, and paraphrasing important ideas without unintentionally plagiarizing other people’s ideas. You will consider the author’s over-arching claims and then choose one section to focus in on.
For this section, you will describe the evidence they used to substantiate their claims. You will provide enough detail to create an engaging paper but not so much that readers feel like they are just re-reading the original article. This assignment will prepare you for future assignments in which you incorporate your own voice.

Unit 2: Op-Ed Article (3 pages)
The ways in which scientists communicate with the public matter tremendously. Often, the public feels as though scientists live in an abstract bubble. People often feel distrustful of and alienated from scientists. How do we bridge the gap between scientist and citizen? One way is through communication. Communication leads to more informed citizens, and this has downstream effects on policy and research funding. One way to share ideas that are controversial or not well-understood is through an Op-Ed style article. As the author of an Op-Ed, you have an opportunity to share an idea or opinion with a broad audience. Your ideas may challenge readers. Based on one class reading, you will create an Op-Ed style article for the New York Times or the Cornell Sun. You will write an engaging article for the public. You will narrow down to one topic. Like our first writing assignment, you will summarize a small number of key points made in the readings. You will then go a step further and incorporate your own perspective in conversation with these other ideas. This assignment is intended to be a creative and more informal exercise. In later assignments, we will focus our attention on more formal methods of analysis and synthesis. At the end of the unit, you will share a short reflection piece about the writing, reading, and peer-review sessions.

Unit 3: Position Paper (5 pages)
In previous assignments, we summarized other people’s ideas, before incorporating our own perspectives in conversation with a limited number of other sources in an Op-Ed article.
We will now consider more complex writing moves as we debate about genetically modified plants. GMOs are often spoken about with great passion, but few people have taken the time to think critically about the benefits or risks associated with them. To analyze contrasting texts and perspectives, you will write a position paper following our in-class debate on genetically modified plants. Prior to the debate, you will begin considering the contrasting perspectives related to genetically modified plants from reputable sources. You will reflect on your own perspective and how your perspective can be in conversation with other perspectives. Through multiple sessions in small groups, you will continue to refine a position and build up claims and supporting evidence. As the drafts develop, you will form debate groups and share drafts as part of the debate preparation. You will compile information from your papers to prepare for an in-class debate. Guidelines will be shared about the format of the debate. Following the debate, you will further refine your draft into a polished position paper. In the end, your paper will incorporate a compelling thesis and logical supporting arguments. At the end of the unit, you will share a short reflection piece about the writing, reading, and peer-review sessions.

Unit 4: Proposal and Final Reflection (~4-5 pages)
Your final writing assignments will consist of two parts.
Part 1: Based on our readings, discussions, and your own interests, propose one solution for how agriculture can become more sustainable and equitable. You may approach this assignment from any direction you choose. You might consider questions of environmental sustainability, social justice, climate change, increasing population, and more. As a helpful reference, please offer a definition for “sustainability”, as definitions vary between people. Check Canvas for additional details. Aim for a proposal of approximately 3 pages.
Visuals, art, poetry, or other creative expression can be incorporated for extra credit.

Part 2: For the second component, you will be a researcher of your writing and reference your previous assignments as a portfolio. Notice whether there were certain course themes and ideas that returned in many of your assignments. In your reflection of two pages, address the following points:
• Were there particular topics that continued to excite you or give you pause? How did your perspectives on food and agriculture change across the assignments as we incorporated more ideas and writers into our discussions? How did the writing assignments change your perspectives on food and agriculture?
• Discuss what your writing strengths were and areas that you will continue to work on. What are strategies for working on these skills?
• Were there particular in-class activities, readings, or techniques that were beneficial in your formation as a writer? What skills do you hope to carry into your writing in the future?
You can respond to each of these questions in a new section.

Course Schedule that was followed throughout the course

In the following section, I provide a brief description of the activities for each class and of how the writing assignments unfolded.

Unit 1: Dis(Connection) from the Land

Week 1: Th, Sept. 3
Class activities:
• First day of class!
• What is agriculture?
• Go over the syllabus and Canvas page
• What makes an inclusive classroom?
• Introduce first assignment
• Review syllabus
Homework (due next class):
Read (for Sept. 8): The Gift of Strawberries chapter from Robin Wall Kimmerer's Braiding Sweetgrass: Indigenous Wisdom, Scientific Knowledge, and the Teachings of Plants.
Read Wendell Berry’s essay, “The Pleasures of Eating.” https://emergencemagazine.org/story/the-pleasures-of-eating/ This will be read for the diagnostic essay.
Write: If you haven’t already, complete a first-week survey in “Quizzes.” Also introduce yourself to our class in the “Discussions” tab of Canvas.

Week 2: T, Sept. 8
Class activities:
• Time for questions about the syllabus or writing assignment
• Discuss Dr. Kimmerer’s essay in small and large groups.
• What are the main components of a good summary?
Homework (due next class):
Read Chapters 1-3 of the “Soil” section from Dan Barber’s The Third Plate. For context, I’ve also included the introduction of the book. Feel free to just skim the introduction to get a sense for the layout and overarching goals of the book.
Due: By Wednesday, Sept. 9 at 5 pm, please send me your first short essay. See details in “Assignments.”

Week 2: Th, Sept. 10
Class activities:
• Discuss Chapters 1-3 of The Third Plate.
• Introduce summary paper expectations
• How do you incorporate textual evidence into an essay?
• How do you paraphrase rather than plagiarize?
• Review how to work with quotations and MLA format with small-group activity
Homework (due next class):
Read (for Tues., Sept. 15): Chapters 4-6 of the “Soil” section from Dan Barber’s The Third Plate.
Skim the first 4 pages of the scientific review on the soil microbiome. Don’t spend more than 30 minutes on this article.
Due: Tues., Sept. 15 at 1 pm, respond to one of the discussion questions for the readings. Post your response in the “Discussions” tab of Canvas in the specified thread. Alternatively, share your own discussion question. Respond to one other person.

Week 3: T, Sept. 15
Class activities:
• Discuss Chapters 4-6 of The Third Plate. Chapter 6 can be skimmed.
• Briefly talk about the complexities of soil
• What makes a good peer-review group? Discuss expectations for peer-review on Thursday
Homework (due next class):
Readings: No readings for Thursday. Use this time to draft your short summary paper.
Due for Thursday: Upload a rough draft of your summary paper to “Summary Paper Draft” in “Assignments.”

Week 3: Th, Sept. 17
Class activities:
• Writing workshop day
Homework (due next class):
Readings (due on Tuesday): Read the introduction of Farming While Black by Leah Penniman. Also look at one article (links in Canvas) on unequal land access in the United States (you choose) or listen to the 1691 podcast episodes 1 and 2 if you need a reading break.
Writing: In “Discussions”, post one question from any of the readings or the podcast. We'll use these to shape our discussion on Tuesday.

Unit 1 Preliminary (Diagnostic) Essay

Why are we writing an essay so soon?
All first-year writing students will be writing a short essay early in the semester. Take some time for this essay because it will help to focus our time together. It will allow us to draw our attention to areas of writing that remain challenging. In this assignment, you will be asked to analyze and reflect on a short essay by Wendell Berry. Take some time to first read and reflect on the piece. As you craft your essay, you will consider Berry’s central claims in the essay, and you will provide your own reflection as well. This essay is ungraded. Have some fun, but do take some time to create a compelling piece of writing. At the conclusion of this course, we will come full circle and reflect again on Berry’s work and our own writing.

Assignment Parameters
The writer and farmer Wendell Berry lives in Kentucky. He has written books, essays, and works of fiction centered on agriculture and rural life. Read the short essay “The Pleasures of Eating.” Reading this essay shortly after our first class will give you more time to compose your essay. In an approximately 2-3-page essay, consider the following prompts:
• What does Wendell Berry mean when he tells people to “eat responsibly”? How does Berry support these claims?
• Do you believe Berry’s claims are substantiated?
• Is there anything you do to eat responsibly? How might your ideas be similar to or different from suggestions made by Berry?

Unit 1 Summary Paper

You will choose one reading from our first unit to summarize. Using a single source, you will articulate the central points of a piece of writing.
You will consider the author’s larger goals for the work and the evidence they used to substantiate their claims. Provide enough detail to create an engaging paper but not so much that readers feel like they are just re-reading the original article. This assignment will prepare you for future assignments in which you incorporate your own analysis and perspective. You will also have an opportunity to work with quotations, frame textual evidence, and paraphrase important ideas. Please take some time to write your first draft and incorporate some substantial revisions into your final paper. I will look at the initial draft as I review the final paper. Learning how to revise is one of the most important skills of this course.

What will you be doing?
• Prepare a ~2.5-3-page paper summarizing the author's central claims.
• Make sure your summary captures what the author is doing and saying, rather than what you believe.
• Highlight the evidence the author provides and connect it to the author's central claims.
• Provide some context. Briefly describe who the author is, why they may be writing, where the article takes place, and other details that could help the reader get acquainted with the piece. Assume your readers are not familiar with the piece you are summarizing.
• Look at the peer-review checklist below for additional expectations.

Class readings to choose from:
You may incorporate readings that we have already discussed thus far in the course.
The Gift of Strawberries, Robin Wall Kimmerer
The Third Plate, Dan Barber. Choose one of the sets of chapters below.
• Focus on chapters 2 and 3
• Focus on chapters 4 and 5

Format of the essay
• ~2.5-3 pages double-spaced essay
• Font such as Times New Roman or Calibri
• 12-point font
• Include name, date, and assignment above the title
• Reference the class reading that was used. In-text citations, page numbers of references, and a works cited page are required for the single source.
If ideas are paraphrased or quoted from the class reading, in-text citations are also required.
• Use MLA format for consistency
• Include an interesting title

Checklist for Revisions

Over-arching goals
1. Are the central goals of the piece concisely summarized early in the summary essay?
2. Does the writer summarize the central points of the chapter(s) in an engaging way?
3. Would the summary be interesting for a reader who hasn’t read the chapter?

Essay organization and structure
1. Does the writer convey the over-arching goals of the chapter(s) in a concise way in the introduction or shortly after? This should be a point of orientation for the reader.
2. Does each paragraph begin with a strong topic sentence?
3. Does the order of paragraphs in the essay make sense? Do ideas build off one another, or should the writer consider moving paragraphs around? (It is common that as we get going on our essays, our best ideas emerge towards the end. As part of the revision process, we often need to shift paragraphs around.)
4. Is each body paragraph cohesive and concise?
5. Are there sentences that aren’t relevant?

Textual evidence and paraphrasing
1. Is textual evidence cited appropriately using MLA format? Double check Purdue OWL or another resource if unsure.
2. Are quotations framed and incorporated with proper explanation?
3. Are large quotations used sparingly?
4. Do the quotations add to the overall objective of summarizing the central goals of the piece?
5. When paraphrasing is used, are ideas cited?
6. Does a works cited section exist at the end? Since you’ll be citing a single source, it can be on the same page as the final paragraph.

Unit 2: How Plants Shape Us and We Shape Plants

Week 4: T, Sept. 22
Class activities:
• Essay check-in
• Who has access to land in the United States? Why do BIPOC have less access?
• Small and large group discussions on access to land and land ownership.
• Introduce new unit
Homework (due next class):
Readings (due on Thursday): We will transition to our second unit on how humans have domesticated plants. We’ll also ask whether plants have shaped us in any way. Read “A Babel of Corn” from The Story of Corn by Betty Fussell.
Writing (due on Thursday): What is one vegetable, fruit, or grain that you don't know much about? Think of something that you enjoy, a plant that is an important staple in your diet, or something that is important to your family.
• Find one reputable scientific source describing where this plant came from. A helpful keyword could include "domestication."
• Post your response in "Discussions" before class on Sept. 24.
The summary paper is due by Thursday, Sept. 24 at midnight.

Week 4: Th, Sept. 24
Class activities:
• Why do we eat the plants we eat?
• What makes a reputable source?
• How should you begin to incorporate other people’s ideas into your writing in non-academic writing?
Homework (due next class):
Readings (due on Tuesday, Sept. 29): Read “The Language of Science” from The Story of Corn by Betty Fussell.
Writing (due on Tues., Sept. 29): Choose two or three possible topics to write an Op-Ed on. Keep the ideas for Tuesday’s class.
For our assignment over the next three weeks, you will go through the process of writing an op-ed, from crafting an argument and drafting, to making revisions. Assignment instructions will be posted shortly. The final due date will be during the week of October 12.
The summary paper is due by Thursday, Sept. 24 at midnight. Remember to include your 0.5-1-page reflection of the writing process and your experience with the first unit. Guidelines are described in the assignment.

Week 5: T, Sept. 29
Class activities:
• Discuss Fussell reading on corn
• What changes did plant scientists make to corn throughout the 1900s?
• What was the significance of hybrid corn?
• Introduce Op-Ed assignment
• What are components of an Op-Ed?
• Library database scavenger hunt activity on Canvas
Homework (due next class):
Readings (due on Thursday, Oct. 1): Read your assigned short journal article. There will be three unique articles and six reading groups.
Writing (submit Op-Ed topic idea on Friday, Oct. 2): Choose one op-ed topic from your list of 2-3 that you came up with for class on Tuesday. Spend a bit of time thinking about the main ideas you hope to incorporate and jot them down in a place you can reference later.
When you are finished, send me an idea for your topic and a few sentences describing some of the arguments or ideas you hope to incorporate in your Op-Ed by Friday, Oct. 2 at midnight in "Assignments." It can be a rough sketch of ideas that is in flux. This just gives me an opportunity to provide feedback before you devote significant time to the writing process.

Week 5: Th, Oct. 1
Class activities:
• What are best practices for reading scientific journals?
• Reading group activity on journal article
• Prepare a short Google Slides presentation on your journal article to present to class:
1. Who wrote your journal article and when was it written? Is your article a primary source or a secondary source?
2. What were the overarching claims of the article?
3. Very generally, what evidence did the authors provide to support their claims?
4. Why is the information in the paper important?
5. How should we read papers that contain research that is still in a state of flux?
Homework (due next class):
Readings (due on Tuesday, Oct. 6): Listen to the following podcast: Slavery & Soul Food: African Crops and Enslaved Cooks in the History of Southern Cuisine: https://digpodcast.org/2020/07/26/soul-food/
Writing: Submit topic idea on Friday, Oct. 2 for your Op-Ed. Choose one op-ed topic from your list of 2-3 that you came up with for class on Tuesday.
For October 6: Start freewriting 1-2 pages of your Op-Ed article. We’ll discuss how to build arguments in the context of an Op-Ed on Tuesday.
On the discussion board, share one discussion question related to the podcast by class on Tuesday, Oct. 6. I’ve added a reminder for this.

Week 6: T, Oct. 6
Class activities:
• How do you build a good argument?
• Class discussion of podcast
• Discuss Op-Ed topics in small groups
Homework (due next class):
Readings (due on Thurs., Oct. 8): Robin Wall Kimmerer chapters
• People of Corn, People of Light
• The Teachings of Grass
Writing: Begin crafting the body paragraphs of your Op-Ed draft. Focus less on the introduction and conclusion.

Week 6: Th, Oct. 8
Class activities:
• What makes a good Op-Ed introduction?
• Suggestions for Op-Ed organization
• Discussion of Braiding Sweetgrass chapters
• Small group discussion on topics, sources, and organization of op-eds
Homework (due next class):
Readings (due on Tues., Oct. 13): Botany of Desire, Michael Pollan
• Intro and Apple section
• Skim intro. Read the first 40 pages quickly and focus on the last 15 pages most.
Writing: Begin drafting your Op-Ed. A rough draft will be due on Thursday, Oct. 15 for peer-review discussions.

Week 7: T, Oct. 13
Class activities:
• Discussions of Apple section from Botany of Desire
• Reflect on the questions posted in Canvas.
• Clarify Op-Ed rough draft deadline for Thursday
• Recap of second unit
Homework (due next class):
Readings (due on Thurs., Oct. 15): No readings due. Focus on drafting your Op-Ed.
Writing: As you continue your drafting work, think about how you will add your introductions and conclusions. A rough draft will be due on Thursday, Oct. 15 for peer-review discussions. Submit on Canvas in “Assignments.” Feel free to schedule a short Zoom meeting if you are feeling stuck.

Week 7: Th, Oct. 15
Class activities:
• Peer-review session
• Download peer-review reference sheet from the Canvas module for Oct. 15.
• Submit a draft before class, so peer-review groups can be made.
Homework (due next class):
Readings (due on Tues., Oct. 20): Watch the two videos posted on Canvas on plant breeding, including the video with UW Madison professor Irwin Goldman. Reflect on the questions posted in the modules before and after watching the videos. No need to turn anything in, but keep this free-writing work. This background will be helpful as we delve into our new unit and begin discussing genetic engineering in comparison to traditional breeding strategies.
Writing: Begin the process of revision of your Op-Ed articles while peer-review comments are fresh. A final submission will be due on October 26 at midnight.

Unit 2 Writing Assignment: Op-Ed Article

Rationale
Now that you have completed your summary essay, you will incorporate your own perspective in an Op-Ed article. You will choose a relevant topic and enter conversation with 2-3 sources. You can choose class readings or reputable sources outside our course material from the first or second unit. This is an opportunity to experiment with different writing moves in a creative manner. Your task will be to create an engaging essay on a topic that few people have engaged with.

What will you be doing and why?
The ways in which scientists communicate with the public matter tremendously. Often, the public feels as though scientists live in an abstract bubble. People often feel distrustful and alienated from scientists. How do we bridge the gap between scientist and citizen? One way is through communication. Communication leads to more informed citizens, and this has downstream effects on policy and research funding. One way to share ideas that are controversial or not well-understood is through an Op-Ed article. As the author of an Op-Ed, you have an opportunity to share an idea or opinion with a broad audience. As you write, consider who your target audience is and how you want to challenge readers’ perspectives on the topic. You will create an Op-Ed style article for the New York Times or the Cornell Sun. You will write an engaging article for the public. You will incorporate your own perspective in conversation with three other sources (maximum of 5). This assignment is intended to be a creative exercise. You are welcome to submit your Op-Ed to a newspaper.
In later assignments, we will focus our attention on more formal methods of analysis and synthesis.

What is an Op-Ed?
We will review components of an Op-Ed in class, but for additional reference: https://projects.iq.harvard.edu/files/hks-communications-program/files/new_seglin_how_to_write_an_oped_1_25_17_7.pdf

Questions to guide the drafting process
1. Does your introduction have a hook?
2. Is your overarching claim or argument conveyed concisely at the beginning?
3. How will you convey to the reader that this topic is important and matters? Think about the inverted triangle formula we discussed in class where you start more broadly with a hook and make your central point, before narrowing in on claims and evidence.
4. Are your central claims conveyed clearly and is textual evidence provided to support these points?
5. Does your concluding section reiterate the main argument again? Think of ways to provoke readers to think more about your topic.

Class readings to consider: You may incorporate readings that we have already discussed thus far in the course.

Format of final assignment
• 700-1000 words (~3 to 4 pages double-spaced)
• Reference class readings that were used. In-text citations, page numbers of references, and a bibliography are not required for this assignment, but if ideas were paraphrased or quoted from our class readings, the title and author of the reading should be referenced.
• Include name, date, and title at the top of your essay
• Include an interesting and catchy title
• Provide a ~1-page reflection when submitting the final Op-Ed article about your experience writing the Op-Ed. What was challenging about this assignment? Who was your target audience and how was writing for a target audience easy or challenging? What was rewarding about the writing process? How was the peer-review process?
• Guidelines will be available on Canvas

Questions to guide the revision process

Macro-level details
1. Does your introduction have a hook?
2. Is your overarching idea or argument conveyed concisely at the beginning?
3. How will you convey to the reader that this topic is important and matters?
• Think about the inverted triangle formula we discussed in class. Try to start more broadly with a hook and your central point. Then begin narrowing down on your central claims.
4. Do you have concise and clear topic sentences that help the reader anticipate what they will read in the subsequent section?
5. Are your central claims conveyed clearly?
6. Did you provide textual evidence to support your claims?
• Is the textual evidence paraphrased properly? Are direct quotations incorporated correctly with proper MLA formatting of in-text citations?
• Did you avoid plagiarism?
• Check out the Turnitin tool as you submit your piece. You may resubmit your piece as many times as you like before the due date if you are using this tool. I won't read your essays before the due date.
7. How have you incorporated a counterargument and a rebuttal to re-substantiate your claim?
8. Does your concluding section reiterate the main argument again?
9. How do you provoke readers to think more about your topic as your piece ends?

Micro-level considerations
Once you feel confident in your overall organization, claims, and research, think about ways you might improve your writing at the sentence level.
1. Mechanics: Is the paper formatted correctly? Have you incorporated quotations properly? Is spelling free of errors? Have you used proper capitalization? Have you used the apostrophe correctly?
2. Syntax: How is the grammar of your piece? If you read your paper out loud, can you catch strange word order issues? Are there inconsistencies in verb tense? What about dangling modifiers? See example:
Incorrect: To improve her essay, each page was proofread.
Correct: To improve her essay, Lily proofread each page.
3. Punctuation: Have you correctly used commas (,), colons (:), semicolons (;), apostrophes ('), and hyphens (-)? Remember, colons, semicolons, and hyphens can also add interest and variation to sentences.
4. Register: Is the degree of formality or informality of vocabulary or syntax consistent across the essay?
5. Style: Are the word choice and sentence patterns varied and engaging to a reader? Style can also mean discussions on choice of register.

Unit 3: How are new fruits and vegetables developed?

Week 8: T, Oct. 20
Class activities:
• Discussion of plant breeding
• What is the difference between “traditional” breeding methods and methods that incorporate genetic modification?
Homework (due next class):
Readings (due on Thurs., Oct. 22): Read first chapter of Mendel in the Kitchen titled “Against the Ways of Nature”
Writing: Finish a more complete draft of your Op-Ed.
• Consider the organization. Is your main idea obvious to the reader and introduced early?
• Are the ideas in paragraphs organized by strong topic sentences?
• Make sure you are paraphrasing and not pulling text from articles without putting the ideas into your own words
• Now is the time to add citations properly if you haven’t already.
• Begin to look more at the sentence-level details. Check out the Canvas page for components to consider.
For Thursday, come prepared to read your essay out loud in new groups. I will provide suggestions for providing feedback.

Week 8: Th, Oct. 22
Class activities:
• Discuss Mendel in the Kitchen reading on golden rice.
• What are possible criticisms?
• Peer-review activity: Share your essay with 1-2 other peers.
Homework (due next class):
Readings (due on Tues., Oct. 27): Go through the interactive article on Al Jazeera about Bt eggplant and read the first four pages of a scientific review on Bt eggplant in Bangladesh. Links in Canvas.
In preparation for class on Tues.: Locate one reputable article that criticizes genetically engineered eggplant in developing countries.
Writing: Submit Op-Ed assignment by midnight on Monday, Oct. 26. Remember to include the short reflection based on the questions in the assignment document.

Week 9: T, Oct. 27
Class activities:
• Bt eggplant discussion
• Watch short video clip
• Learn about Rainbow papaya in Hawaii
• What are critiques of GE crops?
• What will the remaining weeks of the semester look like?
Homework (due next class):
Readings (due on Thurs., Oct. 29): Read about public perception of GM crops from a study conducted by the Pew Research Center. See links on Canvas.
Read New Yorker article on Indian activist Vandana Shiva: https://www.newyorker.com/magazine/2014/08/25/seeds-of-doubt
Writing: Choose a position on the following question (and explain why in a small paragraph): In light of a changing climate and an increasing population, do genetically engineered food crops offer a sustainable and equitable solution for helping to feed the world? You can agree or disagree completely or choose a position in the middle. Find two reputable sources to back up your claim. Submit your idea on the discussion board in the appropriate forum.

Week 9: Th, Oct. 29
Class activities:
• How does the public perceive GE crops?
• Discussion of New Yorker article on Indian activists and surveys from the Pew Institute.
• Go into break-out rooms to briefly begin preparing for the in-class debate.
• Download debate information on Canvas for Oct. 29.
Homework (due next class):
Readings (due on Tues., Nov. 3): Read the short popular science article from Science on gene editing in China and view a short video from Wall Street Journal.
Writing: Follow the prompts on Canvas to help prepare for our debate next week. On Tuesday, you will work in small groups for ~30 minutes.

Week 10: T, Nov. 3
Class activities:
• Check-in
• Short discussion of newer genetic engineering techniques
• Debate preparations in small groups
Homework (due next class):
Readings (due on Thurs., Nov. 5): No readings due for Thurs., Nov. 5. If things are feeling hard or overwhelming at the moment, take a look at the opportunities for extra credit on Canvas for Nov. 3.
Writing: Prepare for debate on Nov. 5. To spread out the work, begin to put some ideas on paper in preparation to draft your position paper. See prompts on Canvas.

Week 10: Th, Nov. 5
Class activities:
• In-class debate on genetically engineered crops!
• Mid-Year Survey
Homework (due next class):
Readings (due on Tues., Nov. 10): Read chapters 25-30 from Barber’s The Third Plate.
Writing (due Tues., Nov. 10): Begin freewriting the main arguments of your position paper, as well as possible rebuttals. We’ll have more detailed in-class discussions about the analysis and synthesis components of the paper next week.

Week 11: T, Nov. 10
Class activities:
• Discussion of Chapters 25-30 from The Third Plate.
• Recap of genetic engineering
• Transition to the 4th Unit!
• What must agriculture look like in the future?
Homework (due next class):
Readings (Tues., Nov. 10): Read the remaining chapters from the “Seeds” section of Barber’s book.
Writing (due Thurs., Nov. 12): Continue to free-write and draft your main body paragraphs in preparation for a full draft due by Nov. 19.

Week 11: Th, Nov. 12
Class activities:
• Discussion of Chapters 30-31 from The Third Plate.
• Short writing workshop with peers
• Address the following questions in discussion when discussing your draft:
  • What is your topic/main claim?
  • What are your central arguments?
  • What evidence will you use to substantiate your claims?
  • How are you incorporating counterarguments?
  • How might you organize your draft?
Homework (due next class):
Readings: No readings due until Dec. 1.
A few writing tips: If you are interested in thinking more about the process of writing, check out “They Say, I Say” in the “Writing Resources” section of the Modules.
An optional video/short article/podcast/surprise will be posted for each day we aren’t in class. This is just for fun/extra credit.
In preparation for Dec. 1 class, listen to one of the SARE podcast episodes on sustainable agriculture. https://www.sare.org/resources/our-farms-our-future-podcast/
Writing from Nov. 12-Dec. 1
• Prepare a rough draft of body paragraphs for Nov. 17-19.
• Submit on Canvas in “Assignments”.
• Complete your 2 peer-reviews by Dec. 26.
• Begin revisions on paper
• Schedule conference with Martha

Unit 3 Writing Assignment: Position Paper Information

Rationale
In previous assignments, you summarized other people’s ideas before incorporating your own perspective in conversation with a limited number of other sources in an Op-Ed article. You will now consider more complex writing moves as you debate genetically engineered (GE) plants. GE plants are often spoken about with great passion, but few people have taken the time to think critically about the benefits or risks associated with them. During this unit, you will take a position and make claims that are supported by evidence. The central question we are asking is:

If you consider a changing climate and an increasing population, do genetically engineered food crops offer a sustainable and equitable solution for helping to feed the world?

As you begin to formulate your own position, an in-class debate will allow you to explore the nuances of a variety of perspectives and counterarguments. After the debate, you will return to the initial ideas you have generated for your position paper with fresh eyes. You will make substantial revisions to several drafts of your paper. The revision process will allow you to submit a cohesive paper with tighter arguments and supporting evidence.

What will you be doing and why?
To analyze contrasting texts and perspectives, you will write a position paper following our in-class debate on genetically modified plants. We have been considering the contrasting perspectives related to genetically modified plants from reputable sources. You will reflect on your own perspectives and how you can place your own perspective in conversation with other perspectives. Following the debate, you will further refine initial reflections and ideas into a cohesive draft. Your final paper will incorporate a compelling thesis and logical supporting arguments.
Again, at the end of the unit, you will share a short reflection piece about the writing, reading, and peer-review sessions.

Format of final assignment
• 5-6 pages double-spaced with a complete introduction and conclusion
• In-text citations, page numbers of references, and the works cited in MLA format are required
• Remember to cite any ideas that are not your own, including quotations, summaries of other sources, and paraphrasing of another writer’s ideas
• Aim to incorporate at least 3-4 outside sources
  o Remember that the readings we did in class also count
• Include name, date, and title at the top of your essay
• Include an interesting and catchy title

Unit 4: Foods of the Future: How Must Agriculture Change?

Week 13: T, Dec. 1
Class activities:
• Discuss the SARE podcast episode you listened to in small groups.
• How would you define sustainable agriculture?
Homework (due next class):
Readings for Dec. 3: No readings. Check out the short series of videos on farmworkers. These videos will spur our discussion on Thursday about the people who grow and harvest our food.
Writing for the week of Nov. 30
• Begin to revise your position papers.
• Look at comments made by peers and Martha
• Focus on the larger organization and then zoom into sentence-level details.
• See if you can make long sentences clearer, more active, and more concise.

Week 13: Th, Dec. 3
Class activities:
• Who is growing and harvesting the food that we eat?
• Discuss videos we watched on farmworkers before and during the pandemic.
• What was one thing you didn’t know much about?
• What are some ways that farm work can be made more equitable?
• In small groups, locate one shorter article on farmworkers. Think about something you don’t know much about.
• Prepare a short 2–3-minute presentation summarizing the article you chose. Also come up with a discussion question to pose to the class. You can include visuals and share your screen if you like. Some possible themes:
1. Workers’ rights
2. Pandemic
3. Food justice
4. United Farmworkers’ Movement
5. Farmworkers in US
6. Farmworkers in other countries
7. Immigration challenges
8. Environmental justice
Homework (due next class):
No readings for Dec. 8.
On the Canvas discussion board, share one story of a food tradition or recipe. Respond to at least one other person with a question or comment.
If you are interested in thinking more about the process of writing, check out “They Say, I Say” in the “Writing Resources” section of the Modules.
Writing for the week of Nov. 30
• Continue to revise your position papers.
• Look at comments made by peers and Martha
• Focus on the larger organization and then zoom into sentence-level details.
• See if you can make long sentences clearer, more active, and more concise.

Week 14: T, Dec. 8
Class activities:
• NO CLASS
• Writing workshop day. See Module for Dec. 8 for a checklist of what to work on.
• There is an optional open Zoom meeting/office hour from 7:30 pm to 8:30 pm.
Homework (due next class):
No readings due for Thursday, Dec. 10. We’ll spend the first 20 minutes of Thursday’s class on a short Unit 3 reflection.
Prepare to discuss the food tradition or recipe you shared on the Discussion board for Thursday. You can share a visual or recipe if you like.
Writing: Focus on completing your position paper final draft. Check out the final proposal and reflection assignment. Assignment parameters are on Canvas.

Week 14: Th, Dec. 10
Class activities:
• How do our food traditions and family recipes shape us?
• Share a food tradition or specific recipe that reminds you of your family or where you are from
Homework (due next class):
Writing: Begin drafting your final proposal and reflection. Come prepared to share a bit about the topic of your proposal on Tuesday.

Week 15: T, Dec. 15
Class activities:
• Final course wrap-up
• Reflect on central course themes together
• Complete final course evaluation
Homework:
Writing: Complete final proposal and reflection for Dec. 21.

Final Writing Assignment: Proposal and Reflection

Your final writing assignment will consist of two parts.
Part 1) Based on our readings, discussions, and your own interests, propose one solution for how agriculture can become more sustainable. Afterwards, reflect on one thing that you can do to eat more responsibly.
• You may approach this assignment from any direction you choose. You can consider questions of environmental sustainability, social justice, climate change, an increasing population, or other approaches.
• As a helpful reference, please offer a definition for “sustainability,” as definitions vary between people.
• Zooming in again to the level of the individual eater, reflect for a second time on what you now think it means to "eat responsibly." Do you agree with your early summary and analysis of Berry's essay The Pleasures of Eating?
• Aim for a proposal/reflection of at least 2.5-3 pages. Visuals, art, poetry, or other creative expression may be incorporated for extra credit.

Part 2) For the second component, you will be a researcher of your writing and reference your previous assignments as a portfolio. Compile your writing and notice whether there were certain course themes and ideas that returned in many of your assignments. Feel free to organize this reflection in any way you choose.

Reflection on course themes
• Were there particular topics that continued to excite you?
• How did your perspectives on food and agriculture change throughout the course?
• How did the readings, writing assignments, and discussions change your perspectives on food and agriculture?
• What is one topic you want to learn more about?

Reflection on your writing
• Discuss what your writing strengths are and areas that you will continue to work on
• Were there particular in-class activities, readings, or techniques that were beneficial in your writing progress?
• What skills do you hope to carry into your writing in the future?
• What skills will you continue to work on?
• Aim for a reflection of at least 2 pages.
The importance of reflection when learning to write
After each unit, students reflected on their writing for the unit, participation, and effort. This was a requirement laid out in the grading agreement. This informal writing was also a way for students to reflect on the process of learning, reading, and writing during each step of the class. Students also offered suggestions for how the course might be improved and noted aspects of the course that they found helpful. I found these reflections to be very helpful and I adjusted my lesson plans accordingly. If students were struggling with something, the reflections were also an opportunity to maintain open communication. As the semester progressed, I began giving time in class to complete the end-of-unit reflections. The quality of the reflections improved for some students.

Student Evaluations and Comments
I received mid-term and final evaluations. At the time of the mid-term evaluations, I received constructive feedback and overwhelmingly positive comments. This was a pleasant surprise. A few students did feel that the reading load was becoming more than they could handle, so I scaled back readings at the end of the semester. Students appreciated small-group discussions and activities. Taking this into account, I continued to incorporate team activities and I tried to create new activities to keep the class engaged. Due to heavy reading in other classes, students were appreciative when I assigned podcasts or videos instead of readings. I made significant changes to my original course plans after the Thanksgiving break to accommodate students who were feeling increasingly overwhelmed by busy course schedules and travel away from campus for the final weeks of the semester. Students finished their final weeks of the semester off-campus. For many students, this was an adjustment, and a few students communicated this to me and asked for small extensions on assignments.
The question remains whether my writing course was a meaningful learning opportunity for students, both in terms of learning more about food and agriculture and about writing. The course evaluations seem to show that most students who took the survey (13/17) appreciated and learned a great deal from the course (Table A.1). I received few negative comments. Some comments included:

[Martha] made the class super inclusive and engaging through free-writing and in-class discussions. Truly one of the best professors I've had during the past three semesters here at Cornell.

Not only was she the most understanding professor I had (in terms of the difficult transition to school and back home), but she was also the ONLY understanding professor I had this semester. She made the class super inclusive and engaging through free-writing and in-class discussions. Truly one of the best professors I've had during the past three semesters here at Cornell.

The writing assignments over the course of the semester were interesting and helped me develop my voice as a writer. They also helped me better understand the differences and appropriate balance between summary and personal analysis depending on the type of assignment. The teacher feedback through comments on papers as well as through conferences helped me to pinpoint some of my weaknesses as a writer that I would have likely overlooked otherwise.

The class give me the chance to write more formal essays like summary essay and some more opinion-based op-ed and position essay. I learn about formal academic writing, how to write formal citation, how to utilize resources to support my point of view.

The critique I received in the final evaluations was that a student didn’t feel peer-review groups were as helpful as they could be. Other students found the peer review activities to be one of the best features of the course.
I did provide checklists to guide students, but I noticed that not every student put the same amount of time into reviewing peers’ writing. This might have been frustrating for students who put more time into peer review activities and expected the same from classmates reading their essays. I also felt that checking in on small groups over Zoom was a bit intrusive. In a physical classroom, it would be much easier to roam around providing quick feedback. Teaching a physical class would be a great way to expand my teaching skills further.

Table A.1. Final PLSCI 1105 course evaluations collected in December of 2020 by the Knight Institute.
(1 = strongly agree; 2 = agree; 3 = agree somewhat; 4 = disagree; 5 = strongly disagree)

No.   Mean  StDevP  Count   1   2   3   4   5   Survey question
N01   1.08  0.27    13     12   1   0   0   0   In this seminar, online or in person, the instructor managed the COVID-disrupted environment effectively
N02   1.08  0.27    13     12   1   0   0   0   In this seminar, the instructor fostered a learning environment where I felt respected and empowered to participate.
N03   1.00  0.00    13     13   0   0   0   0   In this seminar, the workload was appropriate for a 3-credit class
N04   1.08  0.27    13     12   1   0   0   0   In this seminar, informal/preparatory writing work helped me engage with the readings and draft an essay.
N05   1.38  0.84    13     10   2   0   1   0   In this seminar, we spent an appropriate amount of time focusing on writing.
N06   1.23  0.58    13     11   1   1   0   0   In this seminar, we spent an appropriate amount of time focusing on revising.
N07   1.31  0.72    13     11   0   2   0   0   In this seminar, the instructor provided helpful feedback on papers.
N08   1.00  0.00    13     13   0   0   0   0   In this seminar, I had sufficient opportunities to confer individually with the instructor.
N09   1.38  0.74    13     10   1   2   0   0   In this seminar, individual conferences were helpful.
N10   1.23  0.58    13     11   1   1   0   0   In this seminar, the instructor supported my development as a student.
N11   1.38  0.62    13      9   3   1   0   0   In class, in conferences, and in paper comments, the instructor emphasized developing strong, evidence-based arguments.
N12   1.31  0.61    13     10   2   1   0   0   In class, in conferences, and in paper comments, the instructor emphasized focusing an essay on a significant problem, hypothesis, thesis, argument, or idea.
N13   1.31  0.82    13     11   1   0   1   0   In class, in conferences, and in paper comments, the instructor emphasized organization: paragraph structure, transitions, etc.
N14   1.15  0.36    13     11   2   0   0   0   In class, in conferences, and in paper comments, the instructor emphasized working with source material properly.
N15   1.08  0.27    13     12   1   0   0   0   In class, in conferences, and in paper comments, the instructor emphasized revising essays to enhance the reader’s experience of interest, clarity and persuasiveness.
N16   1.31  0.61    13     10   2   1   0   0   In class, in conferences, and in paper comments, the instructor emphasized editing essays to eliminate surface flaws.

Final Reflection on the Course
As my course ended, it took me a few weeks to recalibrate. Without the structure of preparing for class, revising course plans, teaching, and engaging with students, I again felt some of the isolation imposed by the pandemic. It was then that I realized that the opportunity to teach not only helped me gain new skills but also energized me during a difficult year. My evenings and weekends were taken up with coursework, so I could spend my weekdays working on research. I was concerned that I would feel crunched for time, but I found that doing both teaching and research was manageable and even beneficial for my productivity. Designing and then teaching a course was creative. When I can do creative work outside of research, I find it translates to more creativity in the lab and as I think critically about my research.
Based on student feedback, I believe my students also found that my course provided structure, thought-provoking readings, and questions that will remain with them for a long time. My students are coming of age in a polarized world at a time of global pandemic. My students voted for the first time this November. I could sense some urgency in the ways in which some of my students felt like they must be agents of change, and as a result, they devoured course readings and engaged in discussions with an energy that I didn’t anticipate. It motivated me to be a better instructor.

An observation that struck me often was that students were most engaged when I wove in assignments and readings that grappled with intersecting questions related to social justice. As we learned about soil, we also discussed indigenous relationships with land. Through readings by Leah Penniman in Farming While Black, we talked about the skills and contributions enslaved Africans made in shaping agricultural practices in the United States, even though much of this history is now buried. We talked about culinary traditions of marginalized groups that are now being appropriated by upscale restaurants. We considered the anti-racism work that must continue to make agriculture more equitable. It is no coincidence that BIPOC own so little land in the United States. We also thought about how history seems to repeat itself, as farmworkers feed the country but receive little healthcare, income, housing, and human rights. I learned a great deal from my students and know that this will be reflected in my future thinking, research, and teaching.

Conclusion
I’m grateful for the opportunity to teach a seminar to undergraduates. I didn’t realize I would be teaching in the middle of a pandemic, in the middle of political strife, and at a time when we are reckoning with racial injustice.
I take satisfaction in knowing that I might have helped some students develop their writing, study skills, and critical thinking. I also have my students to thank for helping me to gain more confidence in my public speaking and teaching, all while thinking more critically about agriculture and the plant sciences. I leave this experience with new excitement, new ideas, and most importantly, hope.