DECIPHERING MULTI-LAYER FUNCTIONAL EFFECTS OF GENOMIC VARIANTS IN HUMAN DISEASES
Access Restricted
Access to this document is restricted. Some items have been embargoed at the request of the author, but will be made publicly available after the "No Access Until" date.
Abstract
Every individual carries millions of genomic variants relative to a reference genome, but only a small fraction of these variants have a significant impact on human disease. A central challenge in human genomics research lies in identifying these large-effect variants and understanding how subtle changes in DNA translate into disease phenotypes. This requires unraveling the complex intermediate layers through which genetic alterations exert their functional effects across multiple scales. In this dissertation, I present my research on bridging the gap between genetic variation and disease phenotypes, which is fundamental to advancing personalized medicine and targeted therapies. In Chapter 2, I present several computational approaches to identify functional disease-associated variants, leveraging genomic background mutability models, 3D protein structural information, transcription-based enhancer identification strategies, and enhancer-gene linkage mapping approaches. In Chapter 3, I develop NetFlow3D, a unified, end-to-end, 3D structurally informed protein interaction network propagation framework that systematically maps the multiscale mechanistic effects of somatic mutations in cancer. NetFlow3D anisotropically propagates the impacts of spatial clusters of mutations on 3D protein structures across the protein interaction network, with propagation guided by the specific 3D structural interfaces involved, to identify significantly interconnected network "modules", thereby uncovering key biological processes driving cancer. In Chapter 4, I establish an integrative framework to investigate the etiology of autism spectrum disorder (ASD), which combines: (i) a gene-centric statistical model integrating coding and noncoding evidence of rare variant association, (ii) likely altered protein-protein interactions (PPIs), as revealed by the presence of damaging de novo missense variants on their 3D structural interfaces, and (iii) the topology of the PPI network.
The integration of noncoding data has nearly doubled the analytical power of gene discovery and has uncovered an emerging class of potential ASD pathways. In summary, the theme of my thesis is identifying disease-associated variants by leveraging diverse biological data and combining their complementary insights to decipher the complex mechanisms underlying human diseases. The core principle of my approach is to integrate these separate insights into unified frameworks whose architectures closely align with the underlying biology, thereby converging relevant signals while filtering out noise and, at the same time, systematically unraveling the intermediate layers through which subtle genetic changes translate into observable disease phenotypes.
Committee Member
Booth, James