dc.contributor.author: Vasconcellos Caldas, Ian
dc.date.accessioned: 2022-10-31T16:21:01Z
dc.date.available: 2022-10-31T16:21:01Z
dc.date.issued: 2022-08
dc.identifier.other: VasconcellosCaldas_cornellgrad_0058F_13249
dc.identifier.other: http://dissertations.umi.com/cornellgrad:13249
dc.identifier.uri: https://hdl.handle.net/1813/112079
dc.description: 123 pages
dc.description.abstract: A selective sweep happens when an adaptive allele increases in frequency and spreads through the population; this mode of natural selection has been implicated in rapid adaptation relevant to agriculture, medicine, and human evolution. Understanding the role of sweeps in nature involves the knowledge of their evolutionary parameters, such as the strength of selection driving a sweep and the number and origin of adaptive alleles. The overarching goal of this thesis is to discover whether these evolutionary parameters are identifiable from the characteristic patterns of genetic variation that a sweep leaves in the genome (its "footprint"). For that purpose, three systems of known rapid evolution events across the tree of life are explored in detail: the evolution of pesticide resistance in Drosophila melanogaster; the evolution of anti-malarial drug resistance in the malaria parasite Plasmodium falciparum in Thailand; and the evolution of dehydration resistance in the human Turkana population of northern Kenya. I propose and develop a method of inferring sweep parameters based on simulated datasets and supervised machine learning that can be applied to any organism, requiring only a sample from a single population at a single point in time. Application of the method to the three study systems shows that the trained machine learning models perform well at inferring sweep strength and the origin of adaptive alleles, recovering known parameters of evolutionary history of control loci. Thus, I provide evidence that sweep parameters are indeed identifiable from present-day sweep signatures. I find that known selective sweeps are often driven by very strong selection, with selection coefficients on the order of 10% or higher, and that soft sweeps, with more than one adaptive haplotype, are common across organisms. These results contribute to the understanding of patterns of adaptation in nature, and open the door to a promising field of computational inference of sweep parameters, which complements the identification of selective sweep candidates and can guide experimental follow-up efforts.
dc.language.iso: en
dc.subject: adaptation
dc.subject: machine learning
dc.subject: selective sweep
dc.title: The Inference of Selective Sweep Parameters from their Genomic Footprint
dc.type: dissertation or thesis
thesis.degree.discipline: Computational Biology
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph. D., Computational Biology
dc.contributor.chair: Clark, Andrew
dc.contributor.chair: Messer, Philipp
dc.contributor.committeeMember: Boyko, Adam R.
dcterms.license: https://hdl.handle.net/1813/59810.2
dc.identifier.doi: http://doi.org/10.7298/v9gj-0a53