Population Genetic Analysis of Entire Genomes, from SNP Discovery to Genome-Wide Scans for Selection
The analysis of molecular genetic data has driven the fields of molecular biology, genetics, population genetics, and quantitative genetics for over half a century. Only recently, though, has technology advanced to the point where molecular genetic data can be acquired cheaply and efficiently for the entire genomes of several individuals, enabling scientists to conduct genome-wide comparisons among individuals or population samples and to ask comprehensive questions about the nature of genetic variation in extant populations and the evolutionary forces in each population's history that generated and influenced this variation. Several challenges stand in the way of using these new technologies successfully, however, and in most cases both experimental optimization of laboratory protocols and the customization or de novo implementation of computational and statistical analysis methods are required to obtain adequate results. Even when the raw physical data acquired by these technologies have been successfully rendered into biologically meaningful molecular genetic data, the analysis of these large, genome-wide datasets is formidable and again requires advanced, customized methods to ask biologically motivated questions and produce conclusive results that may not have been obtainable without complete genome information. Here, I discuss two main technologies for the acquisition of genome-wide molecular data, next-generation sequencing and fixed-array, highly multiplexed SNP genotyping, and the challenges in applying them in plant systems. Additionally, I demonstrate a population genetic analysis for the detection of recent selective sweeps in four subpopulations of Oryza sativa (cultivated Asian rice) and one population of Oryza rufipogon (wild Asian rice), using genome-wide molecular data acquired by next-generation sequencing.
Combining an improved, accurate statistical method for detecting selection in population genomic analysis with genome-wide data from each of these subpopulations allowed the extent and location of selective sweeps in the Oryza sativa subpopulations and in the wild progenitor, Oryza rufipogon, to be quantified and compared for the first time. The results reveal that each cultivated subpopulation appears to have a largely unique and independent history of selection and domestication, although several alleles advantageous for rice cultivation that originated and were selected in one subpopulation have been introduced into other subpopulations by introgression.
dissertation or thesis