Genome-Wide Patterns Of Population Structure And Ancestry Among Continental And Admixed Populations
Population genetics seeks to use genetic data to illuminate patterns of human diversity, to investigate how populations are related, and to provide insights into population history, such as migration events and population sizes. Furthermore, an understanding of population genetics is necessary to disentangle population structure from genetic associations with traits, in order to learn how genes affect phenotype or to perform disease association mapping. I use high-density single nucleotide polymorphism (SNP) data to examine population structure in humans among several worldwide populations. I show that principal components analysis (PCA) and STRUCTURE, a Bayesian clustering method, are able to resolve structure among continents as well as illuminate substructure within Europe, South Asia, and East Asia. In an analysis of 12 West African populations, I demonstrate that population structure within the West African samples reflects linguistic relationships and geographical distances, and also shows signals of the Bantu expansion. I proceed to focus on several questions involving populations of mixed ancestry, or admixed populations. First, I introduce a new method for inferring individual ancestry along the genome, or "local ancestry". This method leverages principal components analysis to allow computationally efficient ancestry estimation using high-density SNP data. I apply this method to a sample of African Americans and observe a large range of ancestry proportions across individuals in this panel. I find that the African Americans have a greater proportion of African ancestry on the X chromosome than on the autosomes, consistent with a greater female African and male European ancestry contribution. Since previous studies have suggested a West African ancestral population of African Americans, I use estimates of African and European segments of the genome to examine which of 12 West African populations is closest to the African ancestral population.
I find that, consistent with the West African results of previous studies and historical records, the African regions of African American genomes show the lowest genetic divergence from the West African populations Igbo, Brong, and Yoruba, which are non-Bantu Niger-Kordofanian speaking populations. Hispanic/Latino (HL) populations possess a complex genetic structure reflecting recent admixture among Native American, European, and West African populations. I estimate ancestry among five Hispanic/Latino populations (Mexico, Ecuador, Colombia, Puerto Rico, and Dominican Republic) and illuminate how patterns of ancestry differ among them. These differences among HL populations reflect geographic proximity to slave trade routes and ports, European colonization, and historical migrations. I show a consistent sex bias in ancestry proportions across all five HL populations, with higher Native American and lower European ancestry on the X chromosome compared to the autosomes. The ancestry difference between the X chromosome and the autosomes suggests a bias toward Native American female and European male ancestry contributions in all five HL populations, and is further supported by Y chromosome and mitochondrial DNA haplotyping. Lastly, I discuss challenges in identifying the closest Native American ancestral population to the HL populations, such as poor Native American population sampling or substructure within the Americas. However, I am able to show that the Nahua (for Meso-American populations) and the Quechua (for South American populations) are the two populations least differentiated from the Native American segments of the HL individuals.
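As a rough illustration of why an ancestry difference between the X chromosome and the autosomes points to sex-biased admixture, the sketch below uses a simplified textbook model. It is not the analysis performed in this thesis. It assumes that females carry two thirds of the X chromosomes and half of the autosomes in a population, so that autosomal ancestry is A = (f + m) / 2 and X-linked ancestry is X = (2f + m) / 3, where f and m are the fractions of female and male founders drawn from a given source population. The function name and the input values are hypothetical and chosen only for illustration.

```python
def sex_specific_contributions(autosomal: float, x_linked: float) -> tuple[float, float]:
    """Solve A = (f + m) / 2 and X = (2f + m) / 3 for f and m.

    Assumption (simplified model, not taken from the thesis): ancestry
    proportions have reached their long-run expectation, so the X chromosome
    weights female founders 2/3 and male founders 1/3.
    """
    female = 3.0 * x_linked - 2.0 * autosomal
    male = 4.0 * autosomal - 3.0 * x_linked
    return female, male


if __name__ == "__main__":
    # Hypothetical ancestry proportions from one source population.
    autosomal_ancestry = 0.50
    x_ancestry = 0.60
    f, m = sex_specific_contributions(autosomal_ancestry, x_ancestry)
    print(f"female contribution: {f:.2f}, male contribution: {m:.2f}")
```

Under this model, X-linked ancestry above the autosomal value implies that the source population contributed more through females than through males. Real estimates would also need to account for the timing of admixture and for sampling error, which this sketch ignores.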
Population genetics; Population structure; Admixture inference
Bustamante, Carlos D.
Keinan, Alon; Clark, Andrew
Ph.D. of Biometry
Doctor of Philosophy
dissertation or thesis