The Inference of Selective Sweep Parameters from their Genomic Footprint
dc.contributor.author | Vasconcellos Caldas, Ian | |
dc.date.accessioned | 2022-10-31T16:21:01Z | |
dc.date.available | 2022-10-31T16:21:01Z | |
dc.date.issued | 2022-08 | |
dc.identifier.other | VasconcellosCaldas_cornellgrad_0058F_13249 | |
dc.identifier.other | http://dissertations.umi.com/cornellgrad:13249 | |
dc.identifier.uri | https://hdl.handle.net/1813/112079 | |
dc.description | 123 pages | |
dc.description.abstract | A selective sweep happens when an adaptive allele increases in frequency and spreads through the population; this mode of natural selection has been implicated in rapid adaptation relevant to agriculture, medicine, and human evolution. Understanding the role of sweeps in nature requires knowledge of their evolutionary parameters, such as the strength of selection driving a sweep and the number and origin of adaptive alleles. The overarching goal of this thesis is to discover whether these evolutionary parameters are identifiable from the characteristic patterns of genetic variation that a sweep leaves in the genome (its "footprint"). For that purpose, three systems of known rapid evolution events across the tree of life are explored in detail: the evolution of pesticide resistance in Drosophila melanogaster; the evolution of anti-malarial drug resistance in the malaria parasite Plasmodium falciparum in Thailand; and the evolution of dehydration resistance in the human Turkana population of northern Kenya. I propose and develop a method of inferring sweep parameters based on simulated datasets and supervised machine learning that can be applied to any organism, requiring only a sample from a single population at a single point in time. Application of the method to the three study systems shows that the trained machine learning models perform well at inferring sweep strength and the origin of adaptive alleles, recovering the known evolutionary parameters of control loci. Thus, I provide evidence that sweep parameters are indeed identifiable from present-day sweep signatures. I find that known selective sweeps are often driven by very strong selection, with selection coefficients on the order of 10% or higher, and that soft sweeps, with more than one adaptive haplotype, are common across organisms. These results contribute to the understanding of patterns of adaptation in nature, and open the door to a promising field of computational inference of sweep parameters, which complements the identification of selective sweep candidates and can guide experimental follow-up efforts. | |
dc.language.iso | en | |
dc.subject | adaptation | |
dc.subject | machine learning | |
dc.subject | selective sweep | |
dc.title | The Inference of Selective Sweep Parameters from their Genomic Footprint | |
dc.type | dissertation or thesis | |
thesis.degree.discipline | Computational Biology | |
thesis.degree.grantor | Cornell University | |
thesis.degree.level | Doctor of Philosophy | |
thesis.degree.name | Ph. D., Computational Biology | |
dc.contributor.chair | Clark, Andrew | |
dc.contributor.chair | Messer, Philipp | |
dc.contributor.committeeMember | Boyko, Adam R. | |
dcterms.license | https://hdl.handle.net/1813/59810.2 | |
dc.identifier.doi | http://doi.org/10.7298/v9gj-0a53 |