Understanding Metabolic Disease and Dietary Interventions Through Statistical Analysis of Metabolomics Data
Metabolomics data analysis is crucial to understanding complex biological systems: it provides an instantaneous snapshot of cellular and physiological processes and reveals the metabolic component of diseases and other phenotypes. However, the vast amounts of data produced by metabolomics technologies, together with the intricate relationships underlying metabolic processes, make interpretation virtually impossible without advanced statistical tools. This dissertation synthesizes three interrelated but distinct studies that collectively contribute to a holistic approach to metabolomics data analysis, advancing both scientific research and medical discovery. The first study addresses the limitations of current dimensionality reduction methods. Metabolomics data can be made more interpretable via dimensionality reduction, which deconvolutes datasets into underlying metabolic processes. However, current state-of-the-art methods are largely incapable of detecting nonlinearities in metabolomics data. Using deep learning, specifically variational autoencoders, we outperform traditional methods, capturing nonlinear relationships and uncovering signal transferability between datasets, which indicates universal metabolic processes present in blood. The second study turns to the functional interpretation of metabolite relationships within metabolomics and multi-omics datasets. Existing clustering methods often overlook the hierarchical nature of biological systems, constraining downstream analysis and interpretation. To address this, we introduce AutoFocus, a hierarchical clustering method. Applied to multi-platform datasets, AutoFocus uncovers multi-omic modules associated with type 2 diabetes and Alzheimer’s disease at various scales, providing a more nuanced understanding of disease pathways.
Finally, because metabolomics data is highly reflective of an individual’s environment, our third study uses metabolomics to examine the impact of lifestyle interventions, specifically the ketogenic diet (KD). KD has shown promise as a therapy for epilepsy, metabolic syndrome, and cancer, but its effects have not been well characterized mechanistically. Through a KD intervention trial profiling both blood and cerebrospinal fluid (CSF) metabolomics, we identify significant metabolic impacts of the diet on amino acid metabolism, cholesterol, and inflammation, highlighting the diet’s potential in managing Alzheimer’s disease risk and predisposition to other metabolic disorders. In sum, this dissertation presents a multi-faceted approach to metabolomics data analysis. The collective findings deepen our understanding of the intricate relationships inherent in metabolomics data and offer new perspectives on disease metabolism and therapeutic intervention.