Exploring the Human Gut Microbiome: Statistical Methods, Computation, and Applications in Metagenomics

Author
New, Felicia
Abstract
The microbial species living within the human gastrointestinal tract, together with their associated genomes, are known collectively as the human gut microbiome. The human gut microbiome is an integral part of human health. There is some evidence that human genomic variation is associated with differences in the composition of the gut microbiome, with potential health effects. For example, mutations in NOD2, a gene associated with Crohn’s disease, and mutations in MEFV, the gene that causes familial Mediterranean fever, are associated with compositional shifts in certain bacterial phyla. By jointly analyzing the genomes and the metagenomes of individuals in a population, together with health or phenotype data, we can uncover the connection between the two and how they relate to health outcomes. To investigate these questions, I use shotgun metagenomic sequencing data, along with genotype and phenotype information, for 250 adult female twins from TwinsUK. To understand the link between the gut microbiome’s composition and functions and human health outcomes, I apply classical statistical and machine learning methods to identify features of the gut microbiome that can predict host diseases and phenotypes. The most notable results concern anxiety symptoms in twin pairs who are discordant for anxiety: 175 genes were found to be enriched in the twins without anxiety and absent in those with anxiety. Using strain-level metagenomic analyses, I identify the source of these genes as a species within the genus Azospirillum.
Studies of the impact of host genetics on gut microbiome composition have mainly focused on individual host variants, without considering their collective impact or the specific functions of the gut microbiome. To assess the aggregate role of human genetics in gut microbiome composition and function, I use the Tweedie distribution to model gene and species abundances in metagenomic data, and I apply sparse canonical correlation analysis, a multivariate data integration method, to identify correlations between overall host genetics and the composition of the gut microbiome or its composite functions.
Description
162 pages
Date Issued
2021-12
Subject
Biostatistics; Computational Biology; Genomics; Metagenomics; Microbiome; Tweedie
Committee Chair
Brito, Ilana Lauren
Committee Member
Messer, Philipp; Clark, Andrew
Degree Discipline
Genetics, Genomics and Development
Degree Name
Ph. D., Genetics, Genomics and Development
Degree Level
Doctor of Philosophy
Rights
Attribution-NonCommercial-ShareAlike 4.0 International
Type
dissertation or thesis