Cornell University Library

eCommons

SINGLE NUCLEOTIDE RESOLUTION MODELING OF TRANSCRIPTION INITIATION

File(s)
He_cornellgrad_0058F_15331.pdf (24.13 MB)
Permanent Link(s)
https://doi.org/10.7298/tt86-xc15
https://hdl.handle.net/1813/121031
Collections
Cornell Theses and Dissertations
Author
He, Adam
Abstract

Deciphering how cis-regulatory DNA sequences encode transcriptional regulation, and how noncoding genetic variation disrupts these processes, is a central challenge in human genetics. This thesis develops deep learning models to characterize the sequence basis of transcriptional regulation and highlights methods for improving variant effect prediction. First, I present CLIPNET, a base-pair resolution model of transcription initiation trained on population-scale PRO-cap data. CLIPNET reveals a combinatorial grammar in which transcriptional activators and core promoter motifs act synergistically to determine transcription initiation, with activators primarily modulating initiation levels and core promoter elements specifying initiation sites. The model further identifies DPR motifs and AT-rich candidate TFIID-binding sequences as prevalent determinants of transcription initiation in TATA-less promoters. Next, I show that training sequence-to-function models on functional genomic data with matched personal genomes substantially improves prediction of the molecular impact of genetic variants. Variant effect representations learned in this framework transfer across cell types and experimental readouts. Collectively, this work advances understanding of the cis-regulatory code of transcription and establishes strategies for improving variant effect prediction.

Description
113 pages
Date Issued
2025-12
Keywords
deep learning
enhancers
genomics
transcription initiation
transcriptional regulation
variant effect prediction
Committee Chair
Danko, Charles
Committee Member
Lis, John
Feschotte, Cedric
Yu, Haiyuan
Degree Discipline
Computational Biology
Degree Name
Ph.D., Computational Biology
Degree Level
Doctor of Philosophy
Rights
Attribution 4.0 International
Rights URI
https://creativecommons.org/licenses/by/4.0/
Type
dissertation or thesis
