Uncovering variation in the repetitive portions of genomes to elucidate transposable element and satellite evolution
McGurk, Michael Peter
Eukaryotic genomes are replete with repeated sequence, in the form of transposable elements (TEs) dispersed throughout genomes and as large stretches of tandem repeats (satellite arrays). Neutral and selfish evolution likely explain their prevalence, but repeat variation can impact function by altering gene expression, influencing chromosome segregation, and even creating reproductive barriers between species. Yet, while population genomic analyses have illuminated the function and evolution of much of the genome, our understanding of repeat evolution lags behind. Tools that uncover population variation in non-repetitive portions of genomes often fail when applied to repetitive sequence. To extend structural variant discovery to the repetitive component of genomes we developed ConTExt, employing mixture modelling to discover structural variation in repetitive sequence from the short read data that commonly comprises available population genomic data. We first applied ConTExt to investigate how mobile genetic parasites can transform into megabase-sized tandem arrays, as some satellites clearly originated as TEs. Making use of the Global Diversity Lines, a panel of Drosophila melanogaster strains from five populations, this study revealed an unappreciated consequence of transposition: an abundance of TE tandem dimers resulting from TEs inserting multiple times at the same locus. Thus, the defining characteristic of TEs—transposition—regularly generates structures from which new satellite arrays can arise, and we further captured multiple stages in the emergence of satellite arrays ongoing in a single species. We then investigated the complex array of processes which shape TE evolution, focusing on the putatively domesticated HeT-A, TAHRE, and TART (HTT) elements that maintain the telomeres of Drosophila. To provide context, we compared HTT variation to that of other TE families with known properties. 
Our results suggest that differences between HTT variation and other TE families largely result from the rapid sequence turnover at telomeres. We further suggest that the localization of the HTTs to the telomere reflects a successful evolutionary strategy rather than pure domestication. However, we find evidence that susceptibility to host regulation varies among HTTs and across populations, suggesting that despite constituting the mechanism of telomere maintenance, the HTTs remain in conflict with the genome like any other TE.
Supplemental file(s) description: Supplemental Table S1, Supplemental Table S2, Supplemental Table S3, Supplemental Table S4
Repetitive DNA; Satellite DNA; Transposable elements; Evolution & development; Genetics; Bioinformatics
Barbash, Daniel A.
Clark, Andrew; Soloway, Paul
Biochemistry, Molecular and Cell Biology
Ph.D., Biochemistry, Molecular and Cell Biology
Doctor of Philosophy
Attribution-NonCommercial 4.0 International
dissertation or thesis