ENHANCER-CENTRIC DISSECTION OF CIS-

REGULATORY LOGIC IN HUMAN CELLS 

 
A Dissertation 

Presented to the Faculty of the Graduate School 

of Cornell University 

In Partial Fulfillment of the Requirements for the Degree of 

Doctor of Philosophy 

 
by 

Zhou Zhou 

December 2025


© 2025 Zhou Zhou


ENHANCER-CENTRIC DISSECTION OF CIS-REGULATORY LOGIC 

IN HUMAN CELLS 

 
Zhou Zhou, Ph. D. 

Cornell University 2025 

 
Enhancers are essential cis-regulatory elements that orchestrate cell-type-

specific gene expression. While conceptually simple, their mechanisms of action are 

remarkably complex. At the level of individual elements, active enhancers typically 

consist of a central transcription factor binding region flanked by a pair of divergent 

core promoters. On a broader genomic scale, individual enhancers frequently 

collaborate with other regulatory elements to achieve precise and robust transcriptional 

control. In this thesis, I aim to bridge these two layers of enhancer regulation through a 

combination of genome engineering, high-throughput screening, and functional 

genomics. I first dissected the sequence–function relationship of a model long-range 

human enhancer, “eNMU,” which drives an extraordinary ~10,000-fold activation of its 

target gene NMU from 94 kb away. Systematic dissection guided by the divergent 

transcription model revealed extensive transcription factor synergy at this enhancer and 

uncovered a complex interplay between the core divergently transcribed enhancer unit 

and surrounding cis-regulatory elements. Notably, these include intrinsically inactive 

facilitators that augment and buffer enhancer output, as well as an adjacent retroviral 

long terminal repeat (LTR) promoter that acts to repress enhancer activity. These two 


emerging modes of cis-acting logic may be broadly utilized across the genome, 

suggesting that the complexity of enhancer regulation has been significantly 

underestimated. In a parallel line of investigation, I identified a heat shock–induced, 

HSF1-bound distal enhancer and systematically characterized ~200 HSF1-bound distal 

elements using a massively parallel episomal reporter assay and the sensitive PRO-cap 

assay. This analysis revealed that heat-induced enhancer transcription is a prerequisite 

for heat-induced enhancer activity. Together, these findings offer new insights into the 

multilayered mechanisms by which enhancers function and are themselves regulated. 



BIOGRAPHICAL SKETCH 

Zhou Zhou was born and raised in the city of Nanchang, People’s Republic of 

China. She attended the Attached Middle School of Jiangxi Normal University in her 

hometown. In 2018, she earned her B.S. in Cell and Molecular Biology from the Chinese 

University of Hong Kong in Hong Kong SAR. After obtaining her bachelor’s degree, 

Zhou Zhou joined the graduate program of Biochemistry, Molecular and Cell Biology 

at Cornell University to pursue a doctoral degree in molecular biology, where she 

explored the intersection of enhancer biology, transcription regulation, and genome 

engineering, under the mentorship of Professor John T. Lis.  



 
To my mentors and friends—thank you for being both. 



ACKNOWLEDGMENTS 

I am deeply grateful to my parents for their unconditional love throughout my 

academic journey, to my grandparents and cousins for unwavering support, and to my 

cats Bubble and Sushi for being a constant source of comfort. To my childhood friends, 

Lianghui (Lily) Li, Mengyuan (Vimy) Wan, Ziya Zhong, and Ximeng (Simon) Fan—

my soul-sisters—your blessings walk with me, always. Especially to Lily—thank you 

for being by my side, from our kindergarten days to beautiful sunsets on Libe Slope. I’ll 

always remember those two springs we spent together, and your unwavering belief in 

me.    

To my college friends Zheng (April) Wang, Yinuo Hu, Yue (Anna) Gao, Jinci 

(Tina) Liu, Jiabin (Jessica) Chen, and Yifan (Evian) Yao: though oceans apart, our bond 

remains unshaken. I cherish our enduring friendship. 

To the wonderful friends in Ithaca: Tong, thank you for countless joyful 

memories and for always grounding me in reality. Elif, my brilliantly artistic roommate, 

thank you for the laughter, the tears, and the deep trust we shared. My exceptional cohort 

mates—Xinchen Chen, Fangyu Wang, and Yiwen Qin—thank you for growing 

alongside me. Xinchen, in particular, your support during my darkest days is something 

I will always carry with gratitude. Haining Chen, Junke Zhang, Kunlin Li, You Chen, 

Li Yao, Shuran Wang, and others—thank you for your companionship, wisdom, and 

genuine care for me. 

To my lab peers Alex and Philip: your dedication, discipline, and kindness have 

been so humbling and inspiring to me. I have learned so much from you both. To Janis—



thank you for putting up with my messiness and for being the lab mom we all rely on. 

To our brilliant past and present postdocs Anni, Erin, Eric, Takuya, Jin (Yu Lab), Jinjoo 

(Yu Lab), Sagar, Gopal, Yuko, and Raphael for your warm help and insightful discussions. 

To our current graduate students Jaret, Yiyang, Cara, James, Miliarys, Brent, Adam, 

Katya, and Yang—and to alumni Nate, Lina, Kara, Julius, Mike, and Jawaher—for the 

camaraderie and shared journey. To my undergrad colleagues—Zining, Kiara, Albert, 

Rachel, and Jessica—thank you for your enthusiasm, hard work, and the joy of working 

together. 

I am especially grateful to my committee members, Andrew and Charles. 

Andrew, your encouragement during the formative years of my Ph.D. meant more than 

you may realize; it was a light in times of doubt. Thank you also for broadening my 

scientific horizons beyond the lab by involving me in the faculty search committee. 

Charles, thank you, along with Adam He, for sharing expertise in computational biology 

and for your steady support throughout my thesis journey. 

To the mentors who shaped me most profoundly—Abdullah and Judhajeet. 

Words fall short of capturing what you mean to me. You welcomed me into the lab and 

became the reason I stayed. Judhajeet, thank you for teaching me to experiment with 

rigor, and for always listening with patience and guiding me with care. Your 

companionship during the pandemic was a lifeline, and your emotional support gave me 

hope when I needed it most. Abdullah, your bold ideas and sharp thinking have left a 

lasting mark on my scientific taste. Thank you for pushing me to grow, even when it 

brought tears, and for your unwavering care, always offered with a golden heart. I feel 



incredibly lucky to have had you both not only as my mentors, but also as my role 

models and lifelong friends. 

And to John—my advisor, mentor, and comrade in science—thank you for 

believing in me through every struggle, detour, and moment of doubt. Thank you for 

nurturing in me the confidence to think independently, the courage to fight 

perfectionism, and the curiosity to remain an active learner. Your optimism, clarity, and 

commitment to mentorship have been a guiding light. You’ve shown me how science 

can be both serious and fun—a lesson I’ll carry forward always. 

Finally, thank you to Cornell and Ithaca—for the tranquil beauty and the 

peaceful freedom to grow. I’ve often complained about the isolation and solitude, but 

in truth, it gave me the space I needed to find myself. I know I’ll miss it wherever I go. 

 

TABLE OF CONTENTS 

BIOGRAPHICAL SKETCH

ACKNOWLEDGMENTS

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

Chapter 1 Introduction

1.1 A Historical Perspective on Enhancer Discovery and Characterization

1.1.1 The Emergence of Enhancer Biology: When Cis-Regulatory Code Met Trans-Acting Factors (1980s–1990s)

1.1.2 The Genomics Revolution: Epigenomic Signatures of Active Enhancers (2000s–2010s)

1.1.3 The Power of Modern Molecular Biology and Genetics: Quantifying and Predicting Enhancer Activity (2010s–present)

1.2 The Enigma of Enhancer–Promoter Communication

1.2.1 Spatial Connectivity between Enhancers and Promoters

1.2.2 Biochemical Compatibility between Enhancers and Promoters

1.3 Revisiting the Regulatory Complexity of Enhancers

1.3.1 Reflection on What Makes an Enhancer

1.3.2 Transcriptional Regulatory Hubs: Interplay Among Cis-Acting Elements

Chapter 2 Robust regulatory interplay of enhancers, facilitators, and promoters in a native chromatin context

2.1 Abstract

2.2 Introduction

2.3 Results

2.3.1 eNMU landing pad as a powerful system to study enhancer function at the native locus

2.3.2 Screening for functional units and motifs in eNMU

2.3.3 Interplay of regulatory factor binding at eNMU

2.3.4 TF-specific regulation of chromatin accessibility and nascent transcription

2.3.5 Facilitator e2 universally confers enhancer robustness

2.3.6 A 3D regulatory hub of enhancer, promoter and facilitators of NMU

2.3.7 Dynamics of eNMU regulation during erythroid differentiation

2.3.8 A putative LTR promoter as a built-in negative regulatory element for enhancer activity

2.4 Discussion

2.5 Methods

2.6 Acknowledgements

Chapter 3 Investigating the necessity of enhancer transcription for enhancer function in a heat-inducible system

3.1 Abstract

3.2 Introduction

3.3 Results

3.3.1 Evidence of an HSF1-bound, heat-inducible enhancer

3.3.2 Systematically testing enhancer activity of a library of HSF1-bound candidate elements using eSTARR-seq

3.3.3 Induced transcription initiation at “untranscribed” elements detected by PRO-cap

3.4 Discussion

3.5 Methods

3.6 Acknowledgements

Chapter 4 Conclusion and Perspectives

Appendix A A Flexible Protein-Tagging Strategy for Mapping CDK9 Chromatin Occupancy and Nuclear Proximity Interactome

 

LIST OF FIGURES 

Fig. 1.1 Molecular architecture (A) (adapted from Tippens et al., 2020) and epigenomic features (B) of active human enhancers.

Fig. 1.2 The transcription cycle of RNA Pol II at human genes (adapted from Fuda et al., 2009).

Fig. 2.1 eNMU is composed of an autonomous enhancer e1 and a facilitator e2.

Fig. 2.2 eNMU specifically regulates NMU gene transcription in K562.

Fig. 2.3 eNMU landing pad as a powerful system to study enhancer function at the native locus.

Fig. 2.4 The HCR-FlowFISH screen for functional sequences in eNMU.

Fig. 2.5 Key functional units and motifs in eNMU.

Fig. 2.6 Multiplicative effects of double mutants revealed by FlowFISH screen.

Fig. 2.7 Interplay of regulatory factor binding at eNMU.

Fig. 2.8 TF-specific regulation of chromatin accessibility at the enhancer and promoter of NMU.

Fig. 2.9 TF-specific regulation of chromatin accessibility at the enhancer and promoter of NMU.

Fig. 2.10 Testing CRISPR-validated heterologous K562 dTREs at the eNMU locus.

Fig. 2.11 Facilitator e2 universally confers enhancer robustness.

Fig. 2.12 A 3D regulatory hub of enhancer, promoter and facilitators of NMU.

Fig. 2.13 Additional epigenomic features of enhancer, promoter and facilitators.

Fig. 2.14 Dynamics of eNMU regulation during erythroid differentiation.

Fig. 2.15 A putative LTR promoter as a built-in negative regulatory element for enhancer activity.

Fig. 3.1 Homozygous deletion of eTAX1BP1 abolishes heat inducibility of TAX1BP1 expression.

Fig. 3.2 eTAX1BP1 activates episomal reporter gene expression in response to heat shock.

Fig. 3.3 Four classes of HSF1-bound elements for eSTARR-seq testing.

Fig. 3.4 Systematically testing heat-induced enhancer activity of HSF1-bound candidate elements using eSTARR-seq.

Fig. 3.5 PRO-cap detects induced transcription initiation at “untranscribed” elements upon HS.

Fig. A.1 Schematic of Bxb1-mediated flexible protein-tagging strategy for CDK9.

Fig. A.2 Western blot analysis of tagged CDK9 clonal cell lines.

Fig. A.3 Probing physical interaction between CDK9-FLAG-APEX2 and Cyclin T1 by Co-IP analysis.

Fig. A.4 Nuclear isolation effectively eliminates endogenously biotinylated proteins.

Fig. A.5 Optimizing CDK9-EGFP ChIP-seq with different crosslinking conditions.



LIST OF TABLES 

Table 3.1 qPCR primer sequences used in this study

Supplementary Table S1. (separate spreadsheet) Sequences of all mutants analyzed 

in Chapter 2. 

Supplementary Table S2. (separate spreadsheet) Sequences of all oligonucleotides 

and dsDNA fragments synthesized in Chapter 2. 

Supplementary Table S3. (separate spreadsheet) Raw read counts and calculated 

activity scores for individual barcodes from the HCR-FlowFISH screen in Chapter 

2. 

Supplementary Table S4. (separate spreadsheet) Public datasets and their sources 

used in Chapter 2. 

Supplementary Table S5. (separate spreadsheet) Information and eSTARR-seq-

measured enhancer activity of HSF1-bound elements tested in Chapter 3.  



CHAPTER 1 INTRODUCTION 

In the human genome, RNA polymerase II (Pol II) transcribes all ~20,000 

protein-coding genes along with several important classes of non-coding RNAs, 

including long non-coding RNAs (lncRNAs) and microRNA (miRNA) precursors. A 

central question in biology is how different cell types, responding to diverse signaling 

cues, establish and maintain their unique Pol II transcriptional programs—a process 

fundamental to development and disease. While gene promoters provide the core 

sequence architecture required to initiate Pol II transcription, many genes rely on an 

additional class of cis-regulatory elements, enhancers, to achieve cell-type-specific 

transcription activation. This thesis focuses on enhancers and aims to offer new 

conceptual insights into enhancer mechanisms in human cells. In this introductory 

chapter, I begin by reviewing the historical progression of enhancer identification and 

characterization. I then explore the longstanding enigma of enhancer–promoter 

communication, highlighting the field’s continuously evolving understanding. Finally, 

I discuss several key questions in enhancer biology, including the functional features of 

enhancers and the complex interplay among multiple cis-regulatory elements. 

1.1 A Historical Perspective on Enhancer Discovery and Characterization  

1.1.1 The Emergence of Enhancer Biology: When Cis-Regulatory Code Met Trans-

Acting Factors (1980s–1990s)  

The story of enhancer biology is deeply intertwined with the broader advances 

in molecular biology and genomics technologies, whose impact has extended across 



numerous areas of research. What makes enhancer biology uniquely fascinating is its 

paradoxical nature—simple in concept, yet complex in action—a theme this mini-

review seeks to illustrate. The term “enhancer” was first introduced in 1981 by Walter 

Schaffner’s group (Banerji et al., 1981), who demonstrated using transfected plasmids 

in HeLa cells that a 72-bp simian virus 40 (SV40) DNA sequence could markedly enhance the 

expression of a rabbit β-globin reporter gene when inserted several kilobases upstream 

or downstream in cis, in either orientation relative to the promoter. This seminal finding 

led to the classical definition of enhancers as DNA elements that can activate 

transcription from a promoter in a position- and orientation-independent manner. 

Shortly thereafter, the first mammalian enhancers were discovered in rearranged mouse 

immunoglobulin genes by several independent groups (Banerji et al., 1983; Gillies et al., 

1983; Neuberger, 1983; Queen & Baltimore, 1983; Picard & Schaffner, 1984). 

Importantly, the immunoglobulin enhancers exhibited activity exclusively in lymphoid 

cells but not in other cell types, suggesting that enhancer function is governed by the 

interplay between DNA sequence and the cell-type-specific repertoire of trans-acting 

factors. In the ensuing years, a growing body of evidence substantiated this idea: in vivo 

(Schöler & Gruss, 1984; Mercola et al., 1985) and in vitro (Wildeman et al., 1984; 

Sassone-Corsi et al., 1985; Schöler & Gruss, 1985) competition assays showed that 

enhancer-driven reporter activity could be titrated by introducing excess enhancer DNA 

in trans, indicating that enhancer function depends on a limited pool of trans-acting 

factors; in vivo dimethyl sulfate (DMS) protection assays (Ephrussi et al., 1985; Church 

et al., 1985), together with systematic mutagenesis and in vitro DNase I footprinting 

experiments (Zenke et al., 1986; Wildeman et al., 1986; Davidson et al., 1986), further 



mapped functional sequence motifs corresponding to the DNA contact sites of 

regulatory proteins. Around the same time, the labs of Philip Sharp and David Baltimore 

employed a combination of heparin-Sepharose chromatography, electrophoretic 

mobility shift assays, and methylation interference analysis to achieve the first 

identification of nuclear factors interacting with the immunoglobulin enhancers, 

including NF-κB, a transcription factor later found to be a central regulator of immune 

response (Singh et al., 1986; Weinberger et al., 1986; Sen & Baltimore, 1986; Staudt et 

al., 1986).  

Although constrained by the experimental tools and the limited number of 

characterized enhancers at the time, the complexity of enhancer regulation quickly 

became evident. First, enhancers were found to harbor multiple classes of sequence 

motifs (Zenke et al., 1986)—some exhibiting cell-type-specific protein occupancy, 

others more ubiquitous (Davidson et al., 1986; Sen & Baltimore, 1986). Second, closely 

homologous motifs within an enhancer could be bound by distinct factors in a given cell 

type (Weinberger et al., 1986; Sen & Baltimore, 1986). Third, a single motif could be 

occupied by different factors in different cell types (Staudt et al., 1986). Fourth, 

enhancer activity could be negatively regulated (Borrelli et al., 1984; Hen et al., 1985; 

Mitchell et al., 1987) or stimulated by inducible factor binding in response to signaling 

cues (Payvar et al., 1983; Treisman, 1985; Staudt et al., 1986). Fifth, the apparent 

discrepancy between in vitro and in vivo binding data suggested that chromatin 

accessibility may influence transcription factor occupancy in living cells (Sen & 

Baltimore, 1986). Together, these findings strongly implied that enhancers play a 

critical role in orchestrating complex, tissue-specific gene expression programs. 



By 1990, it was well established that sequence-specific transcription factors 

(TFs) and general transcription factors (TFIIA, B, D, E, and F) (Buratowski et al., 1989) 

represent two distinct classes of transcriptional regulators. The general transcription 

factors are essential for basal transcription in vitro, whereas sequence-specific TFs 

further augment transcriptional output, especially in response to signaling cues. Most 

TFs share a basic structure consisting of a DNA binding domain and an acidic activating 

domain. Interestingly, experiments showed that introducing an artificially high amount 

of an activator B (or its activating domain) in vitro could suppress the transcriptional 

stimulation of a heterologous gene driven by activator A (Gill & Ptashne, 1988; M. E. 

Meyer et al., 1989; Martin et al., 1990), but not the basal transcription mediated by 

TFIID (Berger et al., 1990). This “squelching” effect implied the existence of an 

intermediary “adaptor” protein shared between the two activators (reviewed in Ptashne 

& Gann, 1990). A major milestone in the search for these “adaptors” was the discovery of 

the Mediator complex in yeast by the Kornberg group (Kelleher et al., 1990; Flanagan 

et al., 1991; Y. J. Kim et al., 1994), which was later shown to be conserved and essential 

in metazoans (reviewed in Bourbon et al., 2004). We now know that Mediator is a huge 

multiprotein complex that interacts with a broad range of TFs via distinct interfaces 

(reviewed in Abdella et al., 2021). Once recruited by TFs, it forms a stable complex 

with the Pol II preinitiation complex (PIC) and facilitates TFIIH-mediated phosphorylation 

of Ser5 on the Pol II C-terminal domain (CTD), a critical step in promoter escape (detailed 

in recent cryo-EM studies by Abdella et al., 2021 and Rengachari et al., 2021).  

 Another central coactivator in the context of enhancers is p300/CBP. These 

two closely related paralogs were originally identified through their interactions with 



the adenovirus E1A oncoprotein (Yee & Branton, 1985; P. Whyte et al., 1989; Stein et 

al., 1990; Eckner et al., 1994) and the transcription factor CREB (Chrivia et al., 1993), 

respectively. Unlike the multi-subunit Mediator complex, p300 and CBP are each a single 

polypeptide containing multiple functional domains that interact with a wide array of TFs, making 

them capable of integrating diverse signaling pathways at the chromatin level (reviewed in 

Janknecht & Hunter, 1996; Shiama, 1997; Goodman & Smolik, 2000). Importantly, 

p300/CBP also harbors a histone acetyltransferase (HAT) domain, which acetylates 

lysine residues on histone tails of H3 (Jin et al., 2011) and H4, as well as many non-

histone transcriptional regulators (reviewed in Dancy & Cole, 2015). This acetylation 

function is a key mechanism by which p300/CBP promotes transcriptional activation.             

1.1.2 The Genomics Revolution: Epigenomic Signatures of Active Enhancers 

(2000s–2010s)   

The completion of the Human Genome Project (Lander et al., 2001; 

International Human Genome Sequencing Consortium, 2004), along with the launch of 

the ENCODE project in 2003 (ENCODE Project Consortium, 2004), ushered in a new 

era of human functional genomics. Thanks to the pilot phase of ENCODE, which 

focused on characterizing functional sequences within 1% of the human genome 

(Birney et al., 2007), a number of classical biochemical and molecular biology assays 

were adapted into high-throughput, sequencing-based formats to study chromatin 

structure and transcription regulation. These included DNase-chip (Crawford et al., 

2006) and ChIP-chip (Heintzman et al., 2007), which later evolved into DNase-seq 

(Boyle et al., 2008) and ChIP-seq (Visel et al., 2009; Creyghton et al., 2010). These 

studies collectively identified several hallmark chromatin signatures of active 



enhancers, such as DNase I hypersensitivity (DHS), binding of the coactivator p300, 

enrichment of histone modifications H3K27ac and H3K4me1, and relative depletion of 

H3K4me3 (Heintzman et al., 2007; Visel et al., 2009; Creyghton et al., 2010). In 

contrast, gene promoters were shown to display the opposite pattern of H3 

methylation—enriched for H3K4me3 but depleted for H3K4me1 (Heintzman et al., 

2007). However, subsequent studies challenged the functional importance of H3K4me1, 

showing that it is dispensable for enhancer function (Dorighi et al., 2017), and that 

strong enhancers can also exhibit H3K4me3 enrichment instead of H3K4me1 (Core et 

al., 2014; Henriques et al., 2018), thus questioning the utility of H3K4me1 as a 

predictive enhancer mark. 

Another hallmark of active enhancers is bidirectional transcription, first 

discovered in mouse neuronal cells using total RNA-seq (T.-K. Kim et al., 2010). Due 

to the limited abundance and rapid turnover of enhancer RNAs (eRNAs), detection of 

enhancer transcription was significantly improved by nascent RNA sequencing 

approaches, particularly nuclear run-on–based assays GRO-cap and PRO-cap (Kruesi 

et al., 2013; Kwak et al., 2013; reviewed in Yao et al., 2022), which selectively enrich 

for 5′-capped nascent transcripts to map genome-wide transcription initiation events. 

Importantly, GRO-cap analysis in human cells revealed a strikingly similar molecular 

architecture shared by enhancers and promoters: an upstream nucleosome-depleted TF 

binding region flanked by two divergent core promoters that initiate bidirectional Pol II 

transcription (Core et al., 2014). This unified architectural model (Figure 1.1A) laid the 

groundwork for precise enhancer unit annotation in later studies (Tippens et al., 2020; 

Yao et al., 2022). 



We now arrive at a (relatively) complete picture of the epigenomic landscape of 

active enhancers (Figure 1.1B). Mechanistically, these features are highly 

interconnected and collectively contribute to enhancer function. For instance, in 

addition to recruiting coactivators like Mediator and p300, TFs also establish or 

maintain DHS by working in concert with ATP-dependent chromatin remodeling 

complexes (Ho & Crabtree, 2010) and/or acting as pioneer factors that bind condensed 

nucleosomes and open up chromatin, sometimes even independently of ATP-dependent 

enzymes (Cirillo et al., 2002). Histone acetylation mediated by p300 and other HATs 

loosens compacted chromatin structure and also recruits bromodomain-containing 

epigenetic readers like BRD4 (Dey et al., 2003), which in turn promotes transcriptional 

activation by recruiting the pause release factor P-TEFb (Positive Transcription 

Elongation Factor b) (Jang et al., 2005; Yang et al., 2005). Due to the complex interplay 

among these epigenomic features, no single mark is sufficient to quantitatively predict 

enhancer strength. Instead, they typically offer qualitative correlations that imply 

enhancer function in their native chromatin contexts.    



 
Fig. 1.1 Molecular architecture (A) (adapted from Tippens et al., 2020) and 
epigenomic features (B) of active human enhancers.   

 
1.1.3 The Power of Modern Molecular Biology and Genetics: Quantifying and 

Predicting Enhancer Activity (2010s–present)    

Quantitative assessment of enhancer activity is fundamental to deciphering the 

cis-regulatory code. In 2012, massively parallel reporter assays (MPRAs) (Melnikov et 

al., 2012; Patwardhan et al., 2012) were developed as high-throughput extensions of the 

classical enhancer identification method (Banerji et al., 1981). In a typical MPRA, a 

library of synthesized candidate enhancers (each associated with multiple unique 

barcodes) is cloned upstream or downstream of a promoter-driven reporter gene, and 

enhancer activity is calculated as the RNA/DNA barcode ratio, either in an episomal or 

chromosomal setting (reviewed in Klein et al., 2020). A major innovation followed with 



the introduction of STARR-seq (self-transcribing active regulatory region sequencing) 

(Arnold et al., 2013), which enables genome-wide profiling of enhancer activity by 

inserting randomly fragmented genomic DNA into the 3′ UTR of a reporter gene. This 

design allows active elements to self-transcribe and get sequenced as part of the mRNA 

output. Advances in DNA synthesis technologies now allow these assays to scale up 

dramatically, testing up to billions of synthetic sequences to probe the regulatory syntax 

of enhancers (discussed in Section 1.3.1), though ultra-high library complexity poses 

challenges for quantitative accuracy (Sahu et al., 2022). It is important to note that 

MPRAs and STARR-seq measure intrinsic enhancer potential in artificial, heterologous 

contexts, which may not fully reflect an element’s behavior in its native chromatin 

environment (Arnold et al., 2013).  
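
The RNA/DNA barcode ratio described above can be illustrated with a minimal sketch. It assumes per-barcode read counts are already available; the function name, the input layout, the library-size normalization, the pseudocount, and the use of a median across barcodes are illustrative choices rather than the procedure of any particular published MPRA pipeline.

```python
import math
from collections import defaultdict
from statistics import median


def mpra_activity(barcode_counts, barcode_to_element, pseudocount=1.0):
    """Return per-element activity as the median log2(RNA/DNA) over its barcodes.

    barcode_counts: dict mapping barcode -> (rna_count, dna_count)  (hypothetical layout)
    barcode_to_element: dict mapping barcode -> candidate enhancer name
    """
    # Normalize by library size so ratios are comparable across sequencing depths.
    rna_total = sum(rna for rna, _ in barcode_counts.values())
    dna_total = sum(dna for _, dna in barcode_counts.values())

    per_element = defaultdict(list)
    for barcode, (rna, dna) in barcode_counts.items():
        if dna == 0:
            continue  # barcode not detected in the DNA library; ratio is undefined
        rna_norm = (rna + pseudocount) / rna_total
        dna_norm = (dna + pseudocount) / dna_total
        per_element[barcode_to_element[barcode]].append(math.log2(rna_norm / dna_norm))

    # Summarize each element by the median across its independent barcodes.
    return {element: median(values) for element, values in per_element.items()}
```

Summarizing over multiple barcodes per element is what makes the measurement robust to barcode-specific effects; the median is only one reasonable summary, assumed here for simplicity.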

The discovery of the CRISPR-Cas9 system (Jinek et al., 2012) and its rapid 

adaptation for (epi)genome editing in mammalian cells (Cong et al., 2013; Mali et al., 

2013; Qi et al., 2013) have revolutionized modern genetics and enabled high-throughput 

functional interrogation of candidate regulatory elements in their native genomic 

contexts. One landmark example is the Cas9-mediated saturating mutagenesis of the 

human BCL11A erythroid enhancer (Bauer et al., 2013; Canver et al., 2015a), which 

pinpointed its critical functional sequences and directly informed the development of 

Casgevy, the first FDA-approved CRISPR-based gene therapy, which targets this enhancer to reactivate 

fetal hemoglobin production in sickle cell disease patients. Beyond Cas9-mediated 

mutagenesis, dCas9-KRAB–mediated CRISPR interference (CRISPRi) approaches 

have also been combined with various readouts to map enhancer–gene relationships in 

a high-throughput manner. These include growth-based dropout screens (Fulco et al., 



2016), FlowFISH-based measurements (Fulco et al., 2019; Reilly et al., 2021), and 

single-cell RNA-seq (Gasperini et al., 2019), resulting in the identification of hundreds 

of functional enhancer–gene pairs in the human erythroleukemia cell line K562 (an 

ENCODE Tier 1 cell line model). However, it is important to note that CRISPRi screens 

primarily assess the necessity of individual enhancers in their native contexts and may 

overlook redundant or compensatory regulatory elements, potentially underestimating 

the full scope of cis-regulatory networks.       

The explosion of high-throughput sequencing data in the genomics era has 

catalyzed the development of numerous machine learning models for sequence-based 

prediction of TF binding (Alipanahi et al., 2015; Avsec, Weilert, et al., 2021) and 

enhancer function (Zhou & Troyanskaya, 2015; Kelley et al., 2016; Avsec, Agarwal, et 

al., 2021). More recently, deep learning approaches have enabled the rational design of 

synthetic enhancers with cell-type- or cell-state-specific regulatory activity (de Almeida 

et al., 2024; Gosai et al., 2024; Taskiran et al., 2024; Frömel, Rühle, Bernal Martinez, 

et al., 2025), holding great promise for future therapeutic applications. However, a 

recent in situ mutagenesis study generated quantitative gold-standard datasets to 

evaluate the performance of various deep learning models and revealed substantial 

variability in their predictive accuracy—particularly poor performance in modeling 

distal enhancer activity (Martyn et al., 2025). These findings underscore the need for a 

deeper mechanistic understanding of how sequence and epigenomic features contribute 

to enhancer function. 

 

1.2 The Enigma of Enhancer–Promoter Communication 

Much like a romantic relationship, enhancer–promoter communication depends 

on two key aspects: physical proximity and (bio)chemical compatibility. And just as 

with human relationships, we still do not fully understand what makes an enhancer and 

a promoter ‘click’. This section reviews both established knowledge and unresolved 

questions, highlighting ongoing debates and areas of active investigation. 

1.2.1 Spatial Connectivity between Enhancers and Promoters  

CTCF/Cohesin-Insulated Topologically Associating Domains 

Enhancers can regulate gene expression over large genomic distances—

sometimes spanning megabases from their target promoters. For example, the limb-

specific ZRS enhancer lies ~1 Mb away from its cognate gene Shh in both mouse and 

human genomes (Lettice et al., 2003), and two CRISPRi-identified MYC enhancers 

reside ~1.8 Mb from the MYC promoter in K562 cells (Fulco et al., 2016). Despite this 

huge linear distance, enhancer–promoter (E–P) communication requires physical 

proximity in 3D nuclear space, implying that chromatin must adopt specific folding 

patterns to bring regulatory elements together. The development of chromosome 

conformation capture (3C) technologies—first 3C (Dekker et al., 2002), then its high-

throughput versions 5C (Dostie et al., 2006) and Hi-C (Lieberman-Aiden et al., 2009)—

revolutionized the study of genome architecture. Applying 5C and Hi-C to human and 

mouse cells revealed that mammalian genomes are partitioned into megabase-scale 

topologically associating domains (TADs), typically anchored by the insulator protein 

CTCF and largely conserved between cell types and species (Dixon et al., 2012; Nora 



et al., 2012; Rao et al., 2014). This has been further supported by orthogonal imaging 

experiments using DNA FISH and super-resolution microscopy (Nora et al., 2012; 

Bintu et al., 2018). Disruption of TAD boundaries—such as by genomic 

rearrangements—can lead to miswiring of enhancer–gene interactions and ectopic gene 

activation, contributing to developmental disorders and disease (Lupiáñez et al., 2015). 

Furthermore, incorporating Hi-C contact frequencies markedly enhanced model 

accuracy in predicting CRISPRi-validated enhancer–gene links, arguing for an 

important role of 3D genome architecture in E–P communication (Fulco et al., 2019).       
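
The model referred to here is commonly known as the Activity-by-Contact (ABC) model. The sketch below illustrates its core idea under simplifying assumptions: each candidate element's activity (taken here as the geometric mean of DHS and H3K27ac signal) is multiplied by its contact frequency with the gene's promoter, and the product is normalized by the sum over all candidate elements considered for that gene. The dictionary keys and the function name are hypothetical, and the sketch omits the windowing and signal normalization steps of the published method.

```python
from math import sqrt


def abc_scores(elements):
    """Compute Activity-by-Contact style scores for the candidate elements of one gene.

    elements: list of dicts with hypothetical keys
        'name'     element identifier
        'dhs'      DNase I hypersensitivity signal
        'h3k27ac'  H3K27ac ChIP-seq signal
        'contact'  Hi-C contact frequency with the gene's promoter
    Returns a dict mapping element name -> fraction of the total activity-by-contact.
    """
    weighted = {}
    for element in elements:
        # Activity: geometric mean of the accessibility and acetylation signals.
        activity = sqrt(element["dhs"] * element["h3k27ac"])
        weighted[element["name"]] = activity * element["contact"]

    total = sum(weighted.values())
    if total == 0:
        return {name: 0.0 for name in weighted}
    # Each score is the element's share of the summed activity-by-contact.
    return {name: value / total for name, value in weighted.items()}
```

Because the scores are fractions of a gene-level total, a strong element in frequent contact with the promoter can dominate even when many weak elements lie closer in linear distance.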

Mechanistically, it has been proposed that CTCF and cohesin localize to 

convergently oriented CTCF-binding sites (CBSs) to mediate chromatin loop formation 

through a process called loop extrusion (Sanborn et al., 2015). In line with this model, 

deletion or inversion of CBSs can reconfigure E–P looping and dysregulate gene 

expression (de Wit et al., 2015; Y. Guo et al., 2015). Surprisingly, however, acute 

depletion of CTCF or cohesin eliminates TADs globally, yet only induces modest 

immediate transcriptional changes, as detected at both the stable mRNA level (RNA-

seq) and the nascent RNA level (PRO-seq) (Nora et al., 2017; Rao et al., 2017). This 

discrepancy between the individual mutational analysis and global factor loss has 

sparked intense investigation. Several recent studies have provided insights into this 

conundrum. First, nucleosome-resolution Micro-C mapping in mammalian cells 

revealed fine-scale spatial organizations among enhancers and promoters within TADs, 

which remain largely intact upon CTCF or cohesin loss but are sensitive to acute 

transcriptional inhibition (Hsieh et al., 2020, 2022). Interestingly, cohesin, but not 

CTCF, plays an additional role in facilitating target search and chromatin binding of 



TFs (Hsieh et al., 2022). Second, live-cell imaging experiments showed that fully 

extruded chromatin loops are rare and transient (present only ~2–3% of the time, with 

a median lifespan of ~10–30 min), while the partially extruded loop state dominates 

(~92% of the time) (Gabriele et al., 2022). This is consistent with the FISH-based 

chromatin tracing analysis, which revealed that TAD boundaries are highly variable at 

the single-cell level, with CBSs serving as preferential anchoring sites. Upon cohesin 

loss, TAD structures persist in single cells, but the preferential positioning of TAD 

boundaries at CBSs is erased at the population level (Bintu et al., 2018). Third, single-

cell omics and imaging analyses demonstrated that cohesin loss leads to widespread, 

stochastic co-activation of genes across TADs (Dong et al., 2024). Together, these 

findings reveal the dynamic and probabilistic nature of CTCF/cohesin-mediated 

chromatin loops and underscore the insulating function of TADs. However, the 

persistence of E–P contacts following rapid CTCF/cohesin depletion suggests the 

existence of additional mechanisms underlying E–P communication. This is further 

supported by the finding that the mouse Sox2 super-enhancer can bypass artificially 

introduced, CTCF-mediated insulation boundaries to activate its target gene across a 

distance of ~100 kb (Chakraborty et al., 2023).  

Transcription Factor-Mediated Chromatin Looping 

Beyond CTCF/cohesin, can TFs themselves directly determine E–P 

interactions? A classical and well-characterized example is the chromatin looping 

between DNase I hypersensitive sites in the active β-globin locus in mouse erythroid 

cells (Tolhuis et al., 2002). The discovery that this looping is mediated by the lineage-



specifying TF GATA1 introduced a new paradigm of TF-driven E–P contacts (Vakoc 

et al., 2005). Subsequent forced-looping experiments revealed convincingly that 

GATA1-mediated looping requires the cofactor LDB1, which harbors a self-association 

domain and promotes chromatin interactions through dimerization (Deng et al., 2012). 

Notably, rapid depletion of LDB1 disrupts hundreds of long-range loops between 

regulatory elements, many of which form independently of CTCF and cohesin 

(Aboreden et al., 2025). In addition to erythroid-specific genome organization, LDB1 

has also been implicated in a novel class of cis-acting elements termed Range EXtenders 

(REXs), which are critical for the extreme long-range function of limb enhancers 

(Bower et al., 2025). These elements are enriched for motifs recognized by LIM-

homeodomain TFs, which are known interactors of LDB1. Another illustrative case 

comes from developing Drosophila embryos, where the pioneering GAGA factor 

(GAF) binds to a subset of tethering elements (Batut et al., 2022; Levo et al., 2022) and 

mediates regulatory element proximity through its N-terminal oligomerization domain 

(X. Li et al., 2023). These findings underscore a growing recognition that lineage-

specific TFs and their cofactors can orchestrate E–P communication independently of 

classical architectural proteins, warranting further investigation into the diversity, 

mechanisms, and cell-type specificity of TF-mediated chromatin looping.                

Dynamics of Enhancer–Promoter Interactions   

Live-cell imaging of E–P interactions has emerged as a powerful approach in 

enhancer biology, offering direct insights into the dynamic relationship between spatial 

proximity and transcriptional output. However, visualizing specific enhancers and 

promoters in living cells remains technically challenging, often requiring genome 



engineering or tiling of fluorescently labeled dCas9-sgRNA ribonucleoprotein 

complexes. Interestingly, findings from these studies have been somewhat conflicting. 

Chen et al. (2018), examining an E–P pair separated by 150 kb in live Drosophila embryos, 

and Zhu et al. (2025), examining an E–P pair separated by 195 kb in human U2OS cells, 

both observed increased E–P proximity during transcriptional 

activation, along with enhanced spatial confinement and temporal stability of 

interactions. In contrast, Alexander et al. (2019) found no evidence of increased spatial 

proximity during enhancer-driven Sox2 activation in mouse embryonic stem cells 

(mESCs), aligning with the “stirring” model proposed by Gu et al. (2018), which 

described transcription-coupled increases in both the average E–P distance and the 

mobility of the mouse Fgf5 enhancer. Together, these conflicting findings may reflect 

the complexity and context-dependence of E–P dynamics, pointing to the need for 

improved live-cell imaging strategies and broader sampling across loci and cell types.

  
1.2.2 Biochemical Compatibility between Enhancers and Promoters 

Physical proximity between enhancers and promoters is not always sufficient to 

ensure cell-type-specific E–P communication. It has long been recognized that 

enhancers can bypass nearby genes to selectively regulate distant genes. A classical 

example is the Drosophila dpp enhancers that regulate the dpp gene residing 20–35 kb 

away while skipping over the immediately adjacent genes Slh and oaf (Merli et al., 

1996). Recent enhancer interactome data in mouse embryonic tissues also reveal that 

~61% of enhancers bypass their adjacent promoters; notably, about half of these skipped 

promoters are inactive and marked by CpG methylation, while the other half remain 



accessible and transcriptionally active (Z. Chen et al., 2024). These findings strongly 

suggest that E–P communication involves some biochemical selectivity beyond mere 

spatial proximity. Using STARR-seq, Zabidi et al. (2015) demonstrated that distinct sets 

of enhancers preferentially activate housekeeping or developmental core promoters in 

Drosophila cells, with each enhancer class relying on different TFs. In mammalian cells, 

however, such selectivity has been more difficult to discern. On one hand, studies have 

shown that both human enhancers and core promoters exhibit distinct cofactor 

preferences (Neumayr et al., 2022; Bell et al., 2024), and that different coactivators vary 

in their binding specificity for different activation domains of TFs (DelRosso et al., 

2024). On the other hand, large-scale MPRA testing of E–P combinations in human and 

mouse cells has struggled to identify clear rules of biochemical compatibility at the 

DNA sequence level. The mouse study reported a broad spectrum of E–P compatibility, 

from striking specificity to broad promiscuity (Martinez-Ara et al., 2022), whereas the 

human study found largely promiscuous E–P compatibility (Bergman et al., 2022). 

Nevertheless, subtle differences in enhancer responsiveness between housekeeping and 

developmental human promoters have been observed, which seem to be mediated by 

several specific TFs such as GABPA and YY1 (Bergman et al., 2022). 

Since both active enhancers and promoters are transcribed, their functional 

compatibility may also depend on coordination during key rate-limiting steps of 

transcription. In mammalian cells, transcription by Pol II is a multi-step process 

orchestrated by numerous factors. Here, we broadly divide the early stages of 

the transcription cycle into two major steps (Figure 1.2). Step 1 involves chromatin opening, 

PIC assembly, initiation, and promoter escape to the proximal pause site. Step 2 involves 



release of paused Pol II into productive elongation—a distinct rate-limiting step with 

important functional implications (reviewed in Adelman & Lis, 2012). E–P 

compatibility may influence either or both of these steps. For example, promoters may 

differ in their reliance on specific transcription initiation complexes, such as those 

incorporating TBP-related factors (TRFs) or tissue-specific TBP-associated factors 

(TAFs) (Hochheimer & Tjian, 2003), potentially constraining their responsiveness to 

enhancers that recruit distinct TRFs or TAFs. Additionally, recent work has shown that 

different human core promoters are differentially constrained at Step 1 or Step 2, with 

each step preferentially activated by distinct sets of cofactors (Bell et al., 2024). Thus, 

optimal E–P communication may require complementation, whereby enhancers deliver 

the cofactors necessary to relieve the promoter’s rate-limiting step. Supporting this, 

“anti-pause” enhancers associated with BRD4 and JMJD6 have been identified (Liu et 

al., 2013), and certain TFs, such as c-Myc (Rahl et al., 2010) and HSF1 (Lis et al., 2000; 

Duarte et al., 2016; Mahat, Salamanca, et al., 2016), promote pause release by recruiting 

P-TEFb either directly or indirectly. However, a comprehensive and systematic 

evaluation of the Step 1/Step 2 complementation hypothesis across diverse E–P pairs is 

still lacking, highlighting a promising direction for future investigation.   



 
Fig. 1.2 The transcription cycle of RNA Pol II at human genes (adapted from Fuda 
et al., 2009).  

 
1.3 Revisiting the Regulatory Complexity of Enhancers 

In this section, I discuss current perspectives and key unresolved questions in 

enhancer biology. Topics include the sequence and functional features of enhancers and 

the concept of transcriptional regulatory hubs. 

1.3.1 Reflection on What Makes an Enhancer 

Motif Syntax of Enhancers 

Although TF binding sites act as the atomic units for enhancer function, 



enhancers generally exhibit flexible motif grammar without strict requirements for 

specific motif combinations, spacing, or orientation—except in the cases of homo- or 

hetero-dimers (Sahu et al., 2022). This flexibility challenges the classical enhanceosome 

model exemplified by the human interferon-β enhancer, which requires the formation 

of a higher-order TF–DNA complex with rigid motif composition and positioning 

(Thanos & Maniatis, 1995; Panne et al., 2007). Alternative models have since been 

proposed to better explain enhancer function (reviewed in Spitz & Furlong, 2012). The 

“billboard” model posits that enhancers act as modular information platforms with fixed 

TF composition but flexible motif arrangement (Kulkarni & Arnosti, 2003; Arnosti & 

Kulkarni, 2005). Going further, the “TF collective” model allows flexibility in both 

motif combination and organization, proposing that protein–protein interactions among 

TFs enable their collective recruitment to enhancers that may contain only a subset of 

their individual motifs (Junion et al., 2012).           

MPRA studies have revealed a spectrum of combinatorial TF interactions—

ranging from sub-additive to super-additive transcriptional outputs (Grossman et al., 

2017; Sahu et al., 2022). However, these studies often focus on strong consensus motifs. 

It is important to keep in mind that developmental enhancers frequently rely on 

suboptimal TF binding sites to help ensure cell-type-specific function while avoiding 

ectopic gene activation (Farley et al., 2015; Kribelbauer et al., 2019; F. Lim et al., 2024). 

This trade-off between potency and specificity suggests weak multivalent interactions 

among TFs may be central to enhancer logic in development. Future efforts should aim 

to systematically characterize a broader range of low-affinity motif combinations.          
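
The terms sub-additive and super-additive presuppose an additive benchmark. A minimal sketch of one such benchmark follows, assuming activities are expressed on a linear scale relative to a no-motif control (baseline of 1.0). The function name, the tolerance band, and the choice of an additive rather than a multiplicative null model are illustrative assumptions, not the definitions used in the cited studies.

```python
def classify_interaction(single_a, single_b, combined, baseline=1.0, tolerance=0.1):
    """Classify a two-motif combination against an additive expectation.

    single_a, single_b: reporter activity with each motif alone (linear scale)
    combined: reporter activity with both motifs present
    baseline: activity of the control construct without either motif
    tolerance: fractional band around the expectation treated as 'additive'
    """
    # Additive expectation: baseline plus the individual gains over baseline.
    expected = baseline + (single_a - baseline) + (single_b - baseline)

    if combined > expected * (1 + tolerance):
        return "super-additive"
    if combined < expected * (1 - tolerance):
        return "sub-additive"
    return "additive"
```

For example, motifs that individually yield 3-fold and 5-fold activity give an additive expectation of 7-fold; a measured 20-fold output would be labeled super-additive under these assumptions.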



In line with this flexible and degenerate motif syntax, it is not surprising to find 

that mammalian enhancers evolve rapidly, much faster than promoters (Villar et al., 

2015), although some enhancers remain ultraconserved across species (Dickel et al., 

2018; Snetkova et al., 2021). Recent work suggests that even highly diverged enhancer 

sequences between mouse and chicken can remain “indirectly conserved”—preserving 

functional output through positional conservation and shuffling of TF binding sites 

(Phan et al., 2025), reinforcing the concept of soft syntax in enhancer architecture. 

Furthermore, the pervasive contribution of transposable elements—particularly long 

terminal repeats (LTRs)—to the human enhancer repertoire (A. Y. Du et al., 2024; 

Thurman et al., 2012) offers a rich substrate for studying enhancer evolution and 

sequence features, despite technical challenges posed by their repetitive nature.           

Function and Regulation of Enhancer Transcription  

Another well-established feature of enhancers is their divergent transcriptional 

activity, although its precise functional role remains incompletely understood. First of 

all, enhancer transcription reflects sufficient local activator concentration and successful 

assembly and engagement of the transcriptional machinery—both essential 

prerequisites for promoter activation. Additionally, divergent transcription may (1) 

generate negative supercoiling that facilitates DNA unwinding (Wu & Sharp, 2013), (2) 

maintain an open chromatin environment through nucleosome eviction, (3) alter 

chromatin dynamics to favor enhancer–promoter communication (H. Chen et al., 2018; 

Gu et al., 2018; Zhu et al., 2025), and (4) promote transcriptional hub formation via the 

intrinsically disordered Pol II C-terminal domain and other coactivators (discussed in 

Section 1.3.2), which may be further reinforced by eRNAs themselves (Sartorelli & 



Lauberth, 2020).  

Is enhancer transcription regulated similarly to promoter transcription, or is it simply a byproduct 

of chromatin opening and (co)activator recruitment? Enhancers contain core promoter 

elements (CPEs) (Tippens et al., 2020), and also go through Pol II pausing and release 

steps, although the transcription quickly terminates due to enriched polyadenylation 

signals and depleted splice sites (Henriques et al., 2018; Fitz et al., 2020; Core et al., 

2014). Notably, using a combination of nascent transcriptome sequencing assays 

mNET-seq and TT-seq, the Cramer group introduced the concept of “pause-initiation 

limit” (Gressel et al., 2017) and showed that enhancer transcription is generally not 

constrained by promoter-proximal pausing and can be activated without the P-TEFb 

kinase CDK9 activity (Gressel et al., 2019). Future efforts to classify enhancers based 

on their pause release constraints may shed light on the Step 1/Step 2 complementation 

model discussed in Section 1.2.2.      

The unified divergent transcription architecture of enhancers and promoters 

(Core et al., 2014; Tippens et al., 2020) raises an intriguing question: can enhancers act 

as promoters, and vice versa? Mikhaylichenko et al. (2018) developed a dual transgenic 

assay in Drosophila embryos and found that transcription directionality correlates with 

regulatory function—bidirectional enhancers can act as weak promoters, and 

bidirectional promoters (but not unidirectional ones) often exhibit strong enhancer 

activity. Supporting this fluidity, some lncRNA promoters have also been shown to act 

as enhancers for their neighboring genes (Engreitz et al., 2016). Together, these findings 

underscore the flexibility and complexity of cis-regulatory element function. 



1.3.2 Transcriptional Regulatory Hubs: Interplay Among Cis-Acting Elements 

Beyond the combinatorial action of TFs at individual enhancers, gene activation 

is often orchestrated by multiple enhancers working in concert. Enhancers can act 

redundantly to ensure robustness of gene expression, a widespread phenomenon 

observed in both Drosophila—where such elements are known as shadow enhancers 

(Hong et al., 2008; Perry et al., 2010; Frankel et al., 2010)—and during mouse 

development (Osterwalder et al., 2018). Enhancers may also function additively or 

super-additively, such as in the nested MYC enhancer network, where closely residing 

enhancers drive high expression additively, while distantly residing enhancers synergize 

to reinforce transcriptional robustness (Lin et al., 2022). A particularly notable concept 

is that of super-enhancers—large clusters of enhancers densely occupied by lineage-

determining transcription factors and the Mediator complex (W. A. Whyte et al., 2013). 

These regulatory regions drive cell identity programs and are enriched for disease-

associated genetic variants (Hnisz et al., 2013). It is well established that, by 

recruiting high concentrations of activators, coactivators, and Pol II—many containing 

intrinsically disordered, low-complexity domains—super-enhancers can drive the 

formation of phase-separated transcriptional condensates (Sabari et al., 2018; M. Du et 

al., 2024). Phase separation was once considered a key mechanism of enhancer function, 

enabling compartmentalization of transcriptional machinery. However, recent findings 

reveal that it is the optimum levels of multivalent interactions among low-complexity 

domains, rather than phase separation per se, that drive full gene activation—and in 

some cases, condensate formation may even impede this process (Chong et al., 2022; 

Trojanowski et al., 2022).  



Emerging evidence also reveals functional hierarchies within super-enhancers 

or functionally linked cis-regulatory elements. That is, despite their similar 

epigenomic signatures, certain constituent enhancers act as the “seed” elements 

nucleating regulatory activity, whereas others play accessory roles to amplify the seed 

activity (Shin et al., 2016; Huang et al., 2018; Thomas et al., 2021; Brosh et al., 2023; 

Blayney et al., 2023). Notably, Blayney et al. (2023) and Brosh et al. (2023) identified 

the extreme cases of facilitators within super-enhancers, which lack intrinsic enhancer 

activity themselves yet potentiate the function of classical autonomous enhancers. These 

findings support the idea of transcriptional regulatory hubs—higher-order assemblies 

instructed by “seed” elements, with accessory elements contributing molecular 

“stickiness” via multivalent interactions or reinforcing spatial connectivity. Consistent 

with this model, multiway chromatin interactions have been observed through both 3C-

based methods (Allahyar et al., 2018) and imaging analysis (Bintu et al., 2018). As 

genetic, molecular, and imaging tools continue to evolve, future studies should aim to 

functionally classify seed and accessory elements and dissect their mechanistic interplay 

within complex enhancer networks. 

 

CHAPTER 2 ROBUST REGULATORY INTERPLAY OF ENHANCERS, 

FACILITATORS, AND PROMOTERS IN A NATIVE CHROMATIN CONTEXT 

2.1 Abstract 

Enhancers are gene-distal cis-regulatory elements that drive cell-type-specific 

gene expression. While significant progress has been made in identifying enhancers and 

characterizing their epigenomic features, much less effort has been devoted to 

elucidating mechanistic interactions among clusters of functionally linked regulatory 

elements within their endogenous chromatin contexts.  Here, we developed a novel 

recombinase-mediated genome rewriting platform and applied our divergent 

transcription architectural model to understand how a long-range human enhancer 

confers a remarkable 10,000-fold activation to its target gene, NMU, at its native locus. 

Our systematic dissection reveals transcription factor synergy at this enhancer and 

highlights the interplay between a divergently transcribed core enhancer unit and 

emerging new types of cis-regulatory elements—notably, intrinsically inactive 

facilitators that augment and buffer core enhancer activity, and an adjacent retroviral 

long terminal repeat promoter that represses enhancer activity. We discuss the broader 

implications of our focused study on enhancer mechanisms and regulation genome-

wide. 

2.2 Introduction 

Since their discovery over four decades ago (Banerji et al., 1981; Moreau et al., 

1981), enhancers have been recognized as abundant and essential cis-regulatory 

elements that recruit transcription factors (TFs) to activate target gene promoters from 



a distance, often in a cell-type-specific manner. Owing to their pivotal roles in 

development and disease, numerous individual laboratories and major consortia like 

ENCODE (ENCODE Project Consortium et al., 2020) have made extensive efforts to 

identify and characterize enhancers across diverse cell types and tissues. Traditional 

hallmarks of active enhancers include TF and coactivator binding, DNase I 

Hypersensitivity, and histone modifications such as H3K27ac, H3K4me1 and 

H3K4me3 (Heintzman et al., 2009). Later, widespread RNA Polymerase II (Pol II) 

transcription emerged as another key indicator of enhancer activity (T.-K. Kim et 

al., 2010). We previously developed the nuclear run-on–based assays, GRO-cap and 

PRO-cap (Kruesi et al., 2013; Kwak et al., 2013), which selectively enrich for 5′-capped 

nascent RNAs to map genome-wide transcription initiation events with high sensitivity 

and specificity. Applying GRO-cap to human cells revealed a unified molecular 

architecture shared by enhancers and promoters, featuring a central nucleosome-

depleted TF binding region flanked by two divergent core promoters that initiate 

bidirectional Pol II transcription (Core et al., 2014). This unit definition enables precise 

delineation of enhancer boundaries and offers a robust framework for accurate enhancer 

annotation (Tippens et al., 2020; Yao et al., 2022). However, functional dissection of 

enhancers within the paradigm of divergent transcription architecture remains 

limited.              

  Beyond the correlative features, recent technological advancements have 

further established two powerful types of high-throughput screening methods to directly 

quantify enhancer activity. Gain-of-function assays, such as massively parallel reporter 

assays (MPRAs) (Melnikov et al., 2012; Patwardhan et al., 2012) and self-transcribing 



active regulatory region sequencing (STARR-seq) (Arnold et al., 2013), measure 

elements’ intrinsic enhancer potential based on their ability to drive reporter gene 

transcription. While these assays allow impressive genome-wide scalability, they rely 

on assaying DNA sequences outside of their endogenous chromatin contexts, which 

may compromise physiological relevance and introduce false positives (Arnold et al., 

2013). In contrast, CRISPR-based loss-of-function screens (Fulco et al., 2016, 2019; 

Gasperini et al., 2019) assess enhancer necessity and preserve the native spatial 

relationships between enhancers and promoters, but they are complicated by variable 

perturbation efficiency, potential off-target effects, and imprecise definition of element 

boundaries. Moreover, human genes are frequently regulated by ensembles of enhancers 

and related elements that can act redundantly (Kvon et al., 2021), additively, or 

synergistically (Bothma et al., 2015; Carleton et al., 2017; Lin et al., 2022; Thomas et 

al., 2021), representing additional layers of complexity. Therefore, despite the 

exponential rise in the number of experimentally nominated cis-regulatory elements, the 

mechanisms governing their functional logic are still poorly understood.       

  To address these limitations and provide orthogonal insights, recombinase-

mediated genome rewriting (Blayney et al., 2023; Brosh et al., 2021, 2023) has emerged 

as a powerful strategy. Through precise replacement of genomic regions and targeted 

manipulation of individual or combinatorial elements, these approaches allow 

comprehensive interrogation of entire loci of interest and enable functional analysis in 

their native genomic contexts, uncovering hierarchical relationships among a cluster of 

elements. Furthermore, they embrace serendipitous discovery of previously 

unrecognized cis-regulatory behavior. For instance, by engineering the α-globin super-

enhancer in mouse erythroid cells, Blayney et al. (2023) recently identified a novel class 

of distal regulatory elements, termed facilitators, which lack intrinsic enhancer activity 

but potentiate the function of autonomous enhancers. While conceptually intriguing, the 

broader prevalence of facilitators beyond super-enhancers and the molecular 

underpinnings of enhancer–facilitator interactions remain largely unexplored. 

  In this study, we developed a novel recombinase-mediated platform to 

systematically dissect a potent distal human enhancer at its native locus, guided by our 

architectural model of enhancer organization (Core et al., 2014; Tippens et al., 2020). 

Through detailed TF motif mutagenesis and integrative genomics analysis, we 

uncovered intricate crosstalk at both the trans-acting factor and the cis-acting element 

levels. We demonstrate that a core enhancer region, precisely demarcated by a divergent 

transcription pattern, acts as the intrinsic activating unit for target gene expression. This 

core enhancer activity is further modulated by surrounding facilitators and a promoter-

like element, which display distinct molecular signatures and exert positive and negative 

influences, respectively. We propose that such highly interconnected regulatory 

networks are broadly utilized across the genome to ensure precise and robust control of 

transcriptional output.  

2.3 Results 

2.3.1 eNMU landing pad as a powerful system to study enhancer function at the 

native locus 

Neuromedin U (NMU) is a neuropeptide that has been implicated in various 

physiological processes including erythropoiesis (Gambone et al., 2011). In the triploid 


human erythroleukemia cell line K562, Gasperini et al. (2019) identified a critical 

enhancer of NMU (hereafter eNMU) in a CRISPR interference (CRISPRi) screen. This 

enhancer is located ~94 kb away from the NMU gene promoter, and its homozygous 

deletion, without negatively affecting cell growth, led to a remarkable 100% reduction 

in NMU expression by RNA-seq (Gasperini et al., 2019). Tippens et al. (2020) further 

refined the boundary of eNMU based on our unified molecular architecture model of 

transcriptional regulatory elements (Andersson et al., 2015; Core et al., 2014) and 

divided eNMU into two divergently transcribed sub-elements e1 and e2 (Figure 2.1A). 

Homozygous CRISPR knockouts showed that deleting e1, the 453-bp sub-element with 

higher DNase I Hypersensitivity (DHS), reduced gene expression by 10,000-fold 

(0.01% of WT) by quantitative reverse transcription PCR (RT-qPCR)—the same level 

as deleting full eNMU (Figure 2.1B). Precision Run-On and Sequencing (PRO-seq) 

confirmed that ΔeNMU and Δe1 abolished nascent transcription at both the enhancer 

and the target gene without affecting other genes nearby (Figures 2.1, C, D, and 2.2). 

Hence, e1 is essential for transcription initiation while e2 alone in the genome is 

completely inactive. Surprisingly, deletion of this intrinsically inactive 503-bp e2 

element reduced NMU mRNA to only ~5% of the WT level (Figure 2.1B), with decreased

PRO-seq signal at e1 and the NMU gene, highlighting that e1 acts as a canonical autonomous

enhancer for NMU but requires the facilitator element (Blayney et al., 2023) e2 to 

achieve maximal activation.  


 
Fig. 2.1 eNMU is composed of an autonomous enhancer e1 and a facilitator e2.  

(A) Epigenomic landscape of eNMU and its sub-elements e1 and e2 in K562 cells, 
showing DNase I hypersensitivity (DHS) and H3K27ac (ENCODE Project 
Consortium et al., 2020), and GRO-cap–defined transcription start sites (TSSs) (Core 
et al., 2014). (B) NMU mRNA levels (RT-qPCR) in independent cultures of WT, 
ΔeNMU, Δe1 and Δe2 cell lines. Black dots = data from Tippens et al., 2020 (GAPDH 
normalized); red dots = data from this study (ACTB normalized). (C) PRO-seq signal 
at the NMU–eNMU locus in the same cell lines as (B). Tracks represent merged 
biological replicates (n = 2). (D) Relative NMU gene body read counts from PRO-seq 
in (C). 


 
Fig. 2.2 eNMU specifically regulates NMU gene transcription in K562.  

PRO-seq (3′-end) tracks of WT, ΔeNMU, Δe1, Δe2 and eNMU LP cell lines across a
1-Mb region around NMU; insets show the full NMU gene and eNMU loci. 
Highlighted regions indicate NMU promoter (P) and eNMU (E). Note that at the 
eNMU locus in LP Clone E5, the prominent PRO-seq signal downstream (to the 
right) of the Bxb1 LP cassette originates from readthrough transcription driven by 
the strong EF1α promoter within the selection cassette. However, transcriptional
activity of the LP did not exhibit any enhancer function to activate the distal NMU
gene. Tracks represent merged biological replicates (n = 2 independent cultures).
WT K562 GRO-cap data (Core et al., 2014) shown as the TSS reference. 

 
The huge dynamic range of eNMU regulation from a distal site and the 

intriguing cooperativity between its sub-elements e1 and e2 warrant a comprehensive 

interrogation of its sequence features and molecular mechanisms. To this end, we used

CRISPR to knock in a landing pad at a single allele of the native eNMU locus in the 

K562 ΔeNMU cell line (Figure 2.3, A and B). This landing pad, modified from 

Matreyek et al. (2020), harbors a Bxb1 recombinase-mediated exchange cassette 

containing a battery of selection markers, including a constitutively expressed Blue 

Fluorescent Protein (BFP). Co-transfection of a Bxb1-expressing plasmid and a 

barcoded payload plasmid library of eNMU mutants leads to the loss of selection 

markers and a stable, irreversible integration of individual elements at the landing pad 

locus. The resulting BFP− recombinant population is then subjected to NMU 

hybridization chain reaction fluorescence in situ hybridization coupled with flow 

cytometry (HCR-FlowFISH) (Reilly et al., 2021) to resolve the effects of individual 

mutants on NMU expression as a measure of enhancer activity (Figure 2.3A). As an 

initial test of the functionality of our system, we integrated the full-length eNMU

sequence into two independently isolated landing pad (LP) clones and observed ~5% 

recombination efficiency in both cases as measured by BFP loss (Figure 2.3C). 

Importantly, for both LP clones, NMU expression was rescued by ~3,000-fold in BFP− 

recombinant cells compared to the parental, NMU-inactive LP cells (Figure 2.3D and 

2.2), consistent with a single allele providing ~1/3 of the 10,000-fold activation by three alleles of eNMU (Figure

2.1B). Subsequent testing of HCR-FlowFISH on eNMU- and e2-recombinant 


populations showed a clear separation of their NMU RNA FISH signals (Figure 2.3E). 

Therefore, we have successfully established an efficient LP-based workflow that would 

allow us to characterize eNMU in its native chromatin context. 

 
Fig. 2.3 eNMU landing pad as a powerful system to study enhancer function at 
the native locus. 

(A) Workflow of the eNMU landing pad system to measure enhancer activity of a 
barcoded element library. (B) LP copy number analysis by quantitative PCR on 
genomic DNA (BFP DNA vs. 3-allele control locus). (C) Bxb1 recombination 
efficiency measured by BFP loss in two independent LP clones using flow cytometry. 
(D) Rescue of NMU expression by inserting eNMU into the landing pad (LP); n = 2 
independent LP clones. (E) Validation of HCR-FlowFISH in the same LP clones as in (C–
D); ACTB served as the housekeeping gene control. 

 
2.3.2 Screening for functional units and motifs in eNMU 

We next set out to design a systematic mutagenesis scheme for eNMU using a 

combination of unbiased and targeted perturbations that complement each other. 

Building on our previous finding that enhancer transcription contributes to activity 

(Tippens et al., 2020), we generated tiling deletions for e1 and e2 to remove clusters of 

transcription start sites (TSSs) (Figure 2.4A). In parallel, considering the central role of 

TFs in shaping enhancer function, we curated a list of TF motifs for targeted 

mutagenesis by (1) intersecting all possible TF binding sites (TFBSs) from the JASPAR 

database (Castro-Mondragon et al., 2022) with K562 ChIP-seq peaks from the UCSC 

Genome Browser (Perez et al., 2025) overlapping eNMU, and (2) filtering this candidate 

set to retain only those motifs corresponding to K562-expressed TFs and located within 

regions of enriched ChIP-seq signals (Methods). This led to a total of 95 motifs 

corresponding to 26 TFs (Figure 2.4A). We then merged highly similar motifs such as 

AP-1 and NFE2 and mutated the two most conserved bases in each motif by transversion 

(C. Guo et al., 2017; Kircher et al., 2019; Kosicki et al., 2024) (A↔C, T↔G, 

Supplementary Table S1) while ensuring minimal interference with overlapping motifs 

to the best of our ability. Finally, we devised a “mix-and-match” cloning strategy 

(Figure 2.4B; Methods) to construct a barcoded mutant library where all the motif 

occurrences for a given TF were altered only in e1, e2, or both. This approach allowed 

for maximal disruption of TF binding within the distinct contexts of the two sub-

elements. Taken together, our library contains 83 elements (77 eNMU-derived 

sequences and 6 exogenous controls) associated with 328 unique barcodes—on average 

4 barcodes per element. Following library integration and recombinant cell selection, 


we performed FlowFISH and sorted cells into 8 bins based on NMU signal intensity, 

using ACTB as an internal control (Figure 2.4C). We then sequenced enhancer barcodes 

in each bin and calculated an activity score for each barcode using a weighted average 

approach (Figures 2.4, D and E). Activity scores from biological replicates showed a 

strong correlation (Pearson’s r = 0.91, Figure 2.4F) and aligned closely with RT-qPCR 

measurements for a select set of mutants (Pearson’s r = 0.97, Figure 2.4G), confirming 

that our assay faithfully captured mutant activities.  
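The weighted-average scoring step can be illustrated with a minimal Python sketch. The exact bin weights and normalization are given in the Methods and are not reproduced here. The sketch assumes, purely for illustration, that the 8 bins are valued 1 through 8 in order of increasing NMU signal and that per-bin barcode counts have already been normalized for sequencing depth and for the number of cells sorted into each bin. The function name `activity_score` and the example counts are hypothetical.

```python
from typing import Optional, Sequence


def activity_score(
    bin_counts: Sequence[float],
    bin_values: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average of bin values, weighted by a barcode's count in each bin.

    bin_counts: normalized barcode abundance per sorting bin (lowest to highest signal).
    bin_values: value assigned to each bin; defaults to 1..N (an assumption of this sketch).
    """
    if bin_values is None:
        bin_values = list(range(1, len(bin_counts) + 1))
    if len(bin_values) != len(bin_counts):
        raise ValueError("bin_values and bin_counts must have the same length")
    total = sum(bin_counts)
    if total <= 0:
        raise ValueError("barcode has no counts in any bin")
    # Each bin contributes its value in proportion to the barcode's share in that bin.
    return sum(c * v for c, v in zip(bin_counts, bin_values)) / total


if __name__ == "__main__":
    # Hypothetical barcode found mostly in the high-signal bins.
    high = [0, 0, 0, 1, 2, 10, 30, 57]
    # Hypothetical barcode found mostly in the low-signal bins.
    low = [60, 25, 10, 5, 0, 0, 0, 0]
    print(activity_score(high), activity_score(low))
```

Under these assumptions, a barcode confined to the top bin scores 8 and a barcode spread evenly across all bins scores 4.5, so higher scores correspond to stronger enhancer activity.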


 
Fig. 2.4 The HCR-FlowFISH screen for functional sequences in eNMU.  

(A) Full eNMU mutant design separating overlapping motifs of different TFs. (B) 
Gibson assembly workflow for constructing the barcoded eNMU mutant library. (C) 
Flow cytometry binning strategy using ACTB as an internal control for cell size,
transcription level, and staining efficiency. (D) Barcode distribution across 8 sorting
bins for two example elements in the mutant library: WT_eNMU and e2_mSTAT5. 
(E) Calculation of activity scores using a weighted average of barcode distributions. 
(F) Correlation of median barcode activity scores between biological replicates (n = 
2 independent LP clones subjected to recombination and FlowFISH). Pearson’s 
correlation coefficient (r) is shown. (G) Correlation between FlowFISH-measured 
activity scores and RT-qPCR quantifications of select mutants, based on median 
activity values from each assay. Pearson’s correlation coefficient (r) is shown. 

 
  Our analysis of tiling deletions in e1 showed that a divergently transcribed 

region e1.1–e1.3 delineated an activating unit, where e1.2—encompassing the DHS 

signal summit—marked the core of e1 activity (Figure 2.5B, Δe1.2 versus Δe1). 

Unexpectedly, the well transcribed e1.4 acted as a repressing unit, as its deletion led to 

an increase in NMU expression. This observation was bolstered by finer deletions Δe1.5 

to Δe1.8, which revealed that the transition between activation and repression lay 

between e1.6 and e1.7. In fact, the first 201 bp of e1 (retained in Δe1.9) was sufficient to capture

all of its activity, likely due to the loss of both positive and negative elements in e1.9. 

In contrast, tiling deletions in the facilitator e2 identified a simpler functional core e2.3 

(Figure 2.5B, Δe2.3 versus Δe2), with the other segments showing modest effects. 

Overall, fold changes in double deletions of an e1 segment and an e2 segment were 

multiplicative (i.e., log-additive) of single deletions, except for Δe1.2+Δe2.3, which fell 

below the dynamic range of FlowFISH (Figures 2.6, A and C). These findings highlight 

a modular nature of eNMU’s molecular architecture with largely independent activating 

and repressing features. 
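The multiplicative (log-additive) expectation for double mutants can be written out as a short Python sketch. It assumes that fold changes are expressed relative to WT_eNMU and that the expected effect of a double mutant is the sum of the log2 fold changes of the two single mutants, as described for Figure 2.6C. The function names and the numerical values are hypothetical and serve only to show the arithmetic.

```python
import math


def log2_fold_change(mutant_activity: float, wt_activity: float) -> float:
    """log2 ratio of mutant to wild-type activity (both must be positive)."""
    if mutant_activity <= 0 or wt_activity <= 0:
        raise ValueError("activities must be positive")
    return math.log2(mutant_activity / wt_activity)


def expected_double_log2fc(single_a_log2fc: float, single_b_log2fc: float) -> float:
    """Expected double-mutant effect if the two single effects multiply."""
    return single_a_log2fc + single_b_log2fc


def deviation_from_additivity(
    observed_double_log2fc: float,
    single_a_log2fc: float,
    single_b_log2fc: float,
) -> float:
    """Observed minus expected log2 fold change; 0 means perfectly log-additive."""
    return observed_double_log2fc - expected_double_log2fc(
        single_a_log2fc, single_b_log2fc
    )


if __name__ == "__main__":
    # Hypothetical single mutants retaining 50% and 25% of WT activity.
    a = log2_fold_change(0.5, 1.0)
    b = log2_fold_change(0.25, 1.0)
    # A multiplicative double mutant would retain 0.5 * 0.25 = 12.5% of WT.
    print(expected_double_log2fc(a, b))
```

When the expected value falls below the lowest activity the assay can resolve, as for Δe1.2+Δe2.3, the deviation cannot be interpreted, because the observed value is limited by the detection floor rather than by the biology.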

Examination of the motif mutagenesis results in e1 or e2 showed <50% 

reduction of enhancer activity in most cases (Figures 2.5 and 2.6B). Consistent with the 


deletion results, the key TF motifs for e1, namely GATA1 and RUNX1 motifs, are 

located within or near the core region e1.2, while the essential motifs for e2—the 

STAT5 motifs—are clustered in the core e2.3, accounting for nearly all its function. TF 

contributions were context-specific, as exemplified by GATA1 being much more 

critical for e1 than for e2. Similarly, double mutants where the same TF binding was 

disrupted in both e1 and e2 exhibited multiplicative effects (Figures 2.6, B and C), 

suggesting that the cooperativity between e1 and e2 is not driven by a single TF type 

but likely involves multiple different TFs.  

 
Fig. 2.5 Key functional units and motifs in eNMU. 

(A) Overview of deletions and targeted TF binding sites (TFBS) in the eNMU 
mutagenesis screen. (B) Activity scores of select mutants measured by HCR-FlowFISH.
Each dot represents the score of an element-specific barcode from either
of the two biological replicates. Dashed lines indicate median scores of control
elements WT_eNMU, Δe1, and Δe2.  

 
Fig. 2.6 Multiplicative effects of double mutants revealed by FlowFISH screen.  

(A) FlowFISH-measured activity scores of single and double deletions, together with 
additional exogenous control elements. (B) FlowFISH-measured activity scores of all 
TF motif mutations in e1, e2, or both. Note that the e1_e2_mCEBPB mutant 
exhibited higher enhancer activity than WT_eNMU, which may be attributed to an 
LTR promoter-mediated mechanism as discussed later. (C) Linear regression of 
observed log2 fold changes in double mutants vs. expected additive effects (sum of 
log2 fold changes from corresponding single mutants). R² from linear regression is 
shown. Dashed 1:1 line indicates perfect additivity. Highlighted outlier: expected 
effect of the Δe1.2+Δe2.3 mutant fell below FlowFISH’s detection range, preventing 
assessment of additivity. 

 
2.3.3 Interplay of regulatory factor binding at eNMU 

To validate the findings on the key motifs and enable clean functional analysis 

downstream, we generated single cell recombinant clones harboring WT_eNMU, 

e1_mRUNX1, e1_mGATA1, and e2_mSTAT5. RT-qPCR quantifications of NMU 

expression showed marked reductions in the mutant clonal lines, corroborating our 

FlowFISH results (Figure 2.7A). Individual mutation of the left and right RUNX1 

motifs revealed that the two sites acted additively, with a stronger contribution from the 

left site, consistent with its higher motif score (JASPAR scores 632 versus 335, 

Supplementary Table S1). Substituting the GATA1 motif with a RUNX1 motif (e1_G-

to-R) failed to rescue e1_mGATA1’s phenotype, suggesting GATA1 as an 

indispensable factor for eNMU function. Conversely, replacing the RUNX1 motifs with 

GATA1 (e1_R-to-G) only partially rescued e1_mRUNX1’s phenotype, suggesting that 

the combination of GATA1 and RUNX1 is particularly potent in the context of e1. We 

also noticed that the modest effect of mutating the right RUNX1 motif—located in the 

segment e1.5—was not sufficient to explain the substantial decrease in enhancer activity 

in the Δe1.5 mutant (Figure 2.5). A closer examination of this region uncovered two 

strong Retinoic Acid Receptor Alpha (RARA)/Retinoid X Receptor Alpha (RXRA) 

motifs (Figure 2.5A, orange hatched boxes), which overlapped the right RUNX1 motif 

and were initially overlooked due to the absence of publicly available ChIP-seq data. 

Mutating both RARA/RXRA motifs without disrupting the RUNX1 site validated their 

crucial function (Figure 2.7A, e1_mRARA). Therefore, we have established that four 

distinct TF motifs, including e1’s RUNX1, GATA1, and RARA/RXRA motifs and e2’s 

STAT5 motifs, are pivotal for eNMU activity.  


 
Fig. 2.7 Interplay of regulatory factor binding at eNMU. 

(A) NMU mRNA levels measured by RT-qPCR in single cell-derived recombinant 
clones; bars = median. Exact mutant sequences are listed in Supplementary Table S1 
(separate file). (B–D) ChIP-qPCR of TF binding in select mutants: GATA1 (B) and 
RUNX1 (C) at e1; STAT5 (D) at e1 and e2 (n = 2 independent single cell clones per 
mutant). Statistical significance assessed using one-way ANOVA with Dunnett’s 
post hoc test vs. WT_eNMU (**, p < 0.01; ***, p < 0.001). Public ENCODE ChIP-seq 
tracks (fold change over control) (ENCODE Project Consortium et al., 2020) shown 
as references: GATA1, ENCFF334KVR; RUNX1, ENCFF654QOE; STAT5A, 
ENCFF171KLX. (E) Left: p300 ChIP-seq profiles at the eNMU locus in the indicated 
mutants from (B–D); tracks represent merged biological replicates (n = 2). Track 
colors indicate specific motif disruptions: grey = WT_eNMU, green = e1_mRUNX1,
red = e1_mGATA1, blue = e2_mSTAT5. Colored boxes below tracks indicate
locations of disrupted TF motifs. Right: p300 signal at eNMU vs. NMU mRNA in 
matched single cell clones. (F) Schematic of regulatory factor interplay at eNMU. 
Note that, unlike normal physiological conditions where STAT5 proteins are 
activated in response to cytokine signaling (Tóthová et al., 2021), K562 cells express 
the constitutively active oncogenic BCR-ABL fusion protein that drives persistent 
STAT5 phosphorylation, dimerization and activation (de Groot et al., 1999; Weber-
Nordt et al., 1996).      

 
We next investigated how the motif alterations functionally affected TF 

occupancy. ChIP-qPCR assays revealed that GATA1 and RUNX1 binding at e1 was 

significantly impaired not only by disruption of their own motifs, but also by mutations 

in each other’s motifs (Figures 2.7, B and C), indicating cooperative binding of these 

two factors. STAT5 binding at e2 was largely self-driven, as demonstrated by its 

significant reduction in the e2_mSTAT5 mutant but minor changes in the e1 mutants 

(Figure 2.7D, right panel). Interestingly, STAT5 binding at e1 was also affected by e2’s 

STAT5 mutations (Figure 2.7D, left panel), suggesting that the facilitator element e2 

might boost e1’s activity by promoting STAT5 binding at e1. Collectively, these results 

demonstrate extensive synergy in TF occupancy at eNMU (Figure 2.7F).  

Since p300 is a known coactivator for GATA1 (Boyes et al., 1998), RUNX1 

(Kitabayashi et al., 1998), and STAT5 (Pfitzner et al., 1998), we also performed p300 

ChIP-seq to examine its recruitment in the mutant clones. The RUNX1 and GATA1 

mutants exhibited prominent reductions in p300 binding at eNMU which correlated with 

NMU downregulation (Figure 2.7E). In contrast, e2’s STAT5 mutations only led to a 

mild decrease in p300 occupancy despite marked reduction in NMU expression, 

suggesting that p300 recruitment is not the primary mechanism for STAT5-mediated 


facilitator function in eNMU. 

2.3.4 TF-specific regulation of chromatin accessibility and nascent transcription 

To gain a deeper understanding of TF-specific regulation of chromatin structure, 

we performed ATAC-seq on a select set of critical mutants. Focusing on the eNMU 

locus first, we found distinct changes in chromatin accessibility patterns that seem to be

related to the positional context of the disrupted motifs (Figure 2.8A). Disruption of the 

GATA1 motif (e1_mGATA1) and the stronger RUNX1 motif (e1_mRUNX1-L)—both 

situated near the DHS summit—resulted in a broad reduction in ATAC-seq peak height 

across e1, suggesting that these motifs act as the nucleation sites for chromatin opening. 

This aligns with previous reports linking GATA1 and RUNX1 to the recruitment of

the SWI/SNF chromatin remodeling complex (Bakshi et al., 2010; Kadam & Emerson, 

2003). Mutating both RUNX1 motifs (e1_mRUNX1) reduced both peak height and 

peak width at e1, indicating a more severe defect in chromatin decompaction and 

consistent with its greatest loss in enhancer activity (Figure 2.7A). In contrast, crippling 

the more distal RARA/RXRA motifs led to an asymmetrical loss of accessibility on the 

right flank, while the open chromatin state to their left was likely maintained by GATA1 

and RUNX1. Mutations in e2’s STAT5 motif cluster, which resides even further from 

the DHS center, mainly decreased e2’s accessibility with subtle shrinkage in e1’s peak, 

suggesting that chromatin opening is not the primary mechanism by which STAT5 

facilitates e1’s enhancer activity.  

At the NMU promoter, all mutants exhibited reduced chromatin accessibility 

(Figure 2.8B), consistent with the observed decreases in gene expression. In contrast, 


the control GAPDH gene showed nearly identical accessibility across mutants, 

demonstrating the reproducibility of our data. Notably, RUNX1 seemed to play an 

additional role in enhancer–promoter communication, as its motif disruptions 

specifically affected another NMU promoter-proximal ATAC-seq peak (Figure 2.8B, 

black arrowheads). Overall, ATAC-seq pattern changes were highly consistent across 

biological replicates (independent single cell clones) (Figure 2.8C) and highlight the 

unique contributions of individual TFs to the chromatin landscape at the enhancer and 

promoter of NMU. 


 
Fig. 2.8 TF-specific regulation of chromatin accessibility at the enhancer and 
promoter of NMU.  

(A–B) ATAC-seq signal at eNMU (A), NMU promoter, and GAPDH control locus 
(B) in select eNMU mutants. In (B), black arrows highlight the proximal ATAC-seq 
peak only affected in the RUNX1 motif mutants. Tracks represent merged biological 
replicates (n = 2 independent single cell clones). (C) ATAC-seq signal at eNMU, 
NMU promoter and GAPDH control locus for two independent single cell-derived 
clones (Rep 1 and Rep 2) of the eNMU mutants in (A–B). Colored boxes below tracks
indicate locations of disrupted TF motifs. Fine vertical lines indicate positions of 
GRO-cap–defined TSSs (WT K562) (Core et al., 2014).  

 
To study TF-specific regulation of nascent transcription at base-pair resolution, 

we performed PRO-seq on the same set of clones and plotted 5′ positions of PRO-seq 

reads at the eNMU locus to estimate its TSS usage (Figure 2.9A). WT_eNMU 

integration recapitulated the TSS pattern observed in our published K562 GRO-cap data 

(Core et al., 2014), validating our methodology. Across the mutants, we found varying 

degrees of signal reduction, yet the patterns of divergent transcription at e1 and 

predominantly unidirectional transcription at e2 were largely preserved. Notably, e2’s 

transcription was diminished not only by its own STAT5 mutations, but also in the e1 

mutants where e2’s STAT5 binding was only mildly affected (Figure 2.7D, right panel). 

Such decoupling of transcriptional activity from STAT5 occupancy suggests that 

STAT5 alone is insufficient to drive Pol II initiation at e2. Instead, STAT5 appears to 

act as an effector that mediates e2’s dependence on e1. Together, these findings depict 

a highly interconnected transcriptional landscape at the eNMU locus.  

Finally, we examined nascent transcription changes at the NMU gene by plotting 

the conventional 3′ ends of PRO-seq reads to represent the locations of paused and 

elongating Pol II (Figures 2.9, B and C). All the mutants showed pronounced signal 

reductions in both the NMU promoter pause region (TSS to TSS+250 bp, Figure 2.9B, 

dashed box) and further downstream into the gene body, consistent with their steady-

state mRNA levels (Figure 2.7A). Importantly, the pausing index (PI), defined as the 

ratio between the pause region and gene body read densities (Core et al., 2008) 

(Methods), increased 2- to 3-fold in all the mutants compared to WT_eNMU. This 

points to a defective pause release mechanism at the NMU promoter, regardless of 

specific TF binding at eNMU. Therefore, in addition to its essential role in transcription 


initiation as demonstrated in the ΔeNMU cell line (Figure 2.1C), eNMU also regulates 

Pol II pause release at its target promoter, likely through cofactors shared among its 

critical TFs.   
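The pausing index calculation can be sketched in Python as follows. The sketch takes the pause region as TSS to TSS+250 bp, as stated above, and treats the gene body window as an input because its exact definition is given in the Methods and is not reproduced here. The read counts and the 10-kb gene body length in the example are hypothetical.

```python
def read_density(read_count: float, length_bp: int) -> float:
    """Reads per base pair over a window."""
    if length_bp <= 0:
        raise ValueError("window length must be positive")
    return read_count / length_bp


def pausing_index(
    pause_reads: float,
    pause_len_bp: int,
    body_reads: float,
    body_len_bp: int,
) -> float:
    """Ratio of pause-region read density to gene-body read density."""
    body_density = read_density(body_reads, body_len_bp)
    if body_density == 0:
        raise ValueError("gene body has no reads; pausing index is undefined")
    return read_density(pause_reads, pause_len_bp) / body_density


if __name__ == "__main__":
    # Hypothetical counts: both windows lose signal in the mutant, but the
    # gene body loses proportionally more, so the pausing index rises.
    wt = pausing_index(pause_reads=500, pause_len_bp=250, body_reads=1000, body_len_bp=10_000)
    mutant = pausing_index(pause_reads=100, pause_len_bp=250, body_reads=80, body_len_bp=10_000)
    print(wt, mutant, mutant / wt)
```

The example shows why an increased pausing index is compatible with reduced signal in both windows: the index reports the ratio of the two densities, so it rises whenever the gene body signal falls more steeply than the pause region signal.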

 
Fig. 2.9 TF-specific regulation of nascent transcription at the enhancer and
promoter of NMU.

(A–B) PRO-seq signal at eNMU (A) and NMU promoter (B) in select eNMU mutants. 
In (B), bar = NMU pausing index of merged replicates, dots = pausing index of 
individual replicates. (C) PRO-seq tracks at the full NMU and GAPDH (control) 
genes in the same mutants as in (A–B). Tracks represent merged biological replicates
(n = 2 independent single cell clones). Colored boxes below tracks indicate locations
of disrupted TF motifs. Fine vertical lines indicate positions of GRO-cap–defined 
TSSs (WT K562) (Core et al., 2014). 

 
2.3.5 Facilitator e2 universally confers enhancer robustness 

The extensive crosstalk between e1 and e2 revealed by our functional analysis 

raised the question of the generality of e2’s facilitator function. To investigate this, we

selected eight divergently transcribed, CRISPR-validated distal transcriptional 

regulatory elements (dTREs) in K562, which showed large effect sizes in the original 

perturbation studies (Figures 2.10, A and B). We assessed their ability to drive NMU 

expression by recombining each dTRE into the eNMU landing pad, either as standalone 

elements or fused with e2 (Figure 2.11A). When integrated alone, all the dTREs 

elevated NMU mRNA levels above the baseline of e2 only, although the magnitude of 

their effects varied drastically (Figure 2.11B). A closer examination revealed a 

reasonably good correlation between the dTREs’ intrinsic activities at the eNMU locus 

and the total number of GRO-cap reads at their endogenous loci (Pearson’s r = 0.85, 

Figure 2.10C), suggesting nascent transcription as a reliable indicator of enhancer 

function. Notably, fusing e2 to the dTREs amplified their activities in every case, with 

weak elements experiencing greater boosts than strong ones, thereby reducing the 

variation in their effects. Quantitatively, the intrinsic activities of the dTREs and the 

amplifications rendered by e2 closely followed a linear log-log distribution (i.e., power-

law relationship) (Figure 2.11C). These observations illustrate a universal buffering 

function of the facilitator e2 in preventing ultra-low gene expression levels.      
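A linear relationship on log-log axes corresponds to a power law of the form amplification = c × (intrinsic activity)^k, where a negative exponent k means that weaker elements receive larger boosts. The following Python sketch fits such a relationship by ordinary least squares on log10-transformed values. It is an illustration of the general technique rather than the exact analysis used for Figure 2.11C, and the data points are synthetic values generated from a known power law, not measurements from this study.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Fit y = c * x**k by least squares on log10(x), log10(y).

    Returns (k, c). All x and y values must be positive.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    k = sxy / sxx                      # slope on log-log axes = exponent
    c = 10 ** (mean_y - k * mean_x)    # intercept on log-log axes = log10(c)
    return k, c


if __name__ == "__main__":
    # Synthetic intrinsic activities and amplifications following y = 100 * x**-0.5.
    intrinsic = [1.0, 10.0, 100.0, 1000.0]
    amplification = [100.0 * v ** -0.5 for v in intrinsic]
    k, c = fit_power_law(intrinsic, amplification)
    print(k, c)
```

In this synthetic example the fit recovers an exponent close to −0.5, meaning that a 100-fold stronger element would receive a 10-fold smaller amplification. Compression of the activity range of this kind is what the buffering interpretation describes.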


 
Fig. 2.10 Testing CRISPR-validated heterologous K562 dTREs at the eNMU locus. 

(A) Summarized information of selected K562 dTREs from previous studies. Effect 
sizes obtained from Fulco et al. (2019), except for NMU e1, which is from Tippens
et al. (2020). (B) Native genomic contexts of each dTRE; tested regions highlighted
in light blue. Track scales are consistent across dTRE regions, except for GRO-cap
(Core et al., 2014), which uses an individually indicated scale. Detailed sources and 
accession information are provided in Supplementary Table S4. (C) Correlation 
between GRO-cap read counts at dTREs and their intrinsic enhancer activity (−e2) 
at the eNMU locus. Pearson’s correlation coefficient (r) and corresponding p-value 
are shown. 

 
Fig. 2.11 Facilitator e2 universally confers enhancer robustness.  

(A) Workflow to test e2’s facilitator function on heterologous K562 dTREs using the 
eNMU landing pad. (B) Enhancer activity of dTREs in the absence or presence of e2, 
measured by RT-qPCR; e2 only serves as the baseline. n = 2 independent 
recombination experiments. (C) Correlation between intrinsic activity of elements 
(−e2) and the fold change with e2 fusion. Pearson’s correlation coefficient (r) and 
corresponding p-value are shown. (D) NMU mRNA levels measured by RT-qPCR in 
single cell-derived recombinant clones (n ≥ 4) of e1_WT vs. e1_mGATA1 in the 
absence or presence of e2; e2 only serves as the baseline. Error bars = ± SEM. (E) 
PRO-seq signal at e1, NMU promoter and GAPDH control locus in e1_WT and 
e1_mGATA1 clones lacking e2. Tracks represent merged biological replicates (n = 2 
independent single cell clones). Fine vertical lines indicate positions of GRO-cap–
defined TSSs (WT K562) (Core et al., 2014).    

 
We next asked whether e2’s buffering effect could still apply to a mutated e1 

element. Given the critical role of e1’s GATA1 motif in TF cooperativity (Figure 2.7), 

we compared the enhancer activities of e1_WT and e1_mGATA1 with or without e2, 

again in single cell-derived clones. In the absence of e2, disruption of the GATA1 motif 

completely abolished e1 activity, as reflected by the baseline mRNA level (Figure 

2.11D) and undetectable nascent transcription at both e1 and the NMU promoter (Figure 

2.11E). Notably, the presence of e2 restored nascent transcription (Figure 2.9, 

e1_mGATA1) and rescued the mRNA level by a striking 2,000-fold (Figure 2.11D), in 

stark contrast to the 15-fold increase observed for the active enhancer e1_WT. This 

aligns with the power-law behavior of the heterologous dTREs and highlights the 

importance of facilitators in safeguarding enhancer robustness against disruptive 

mutations. 

2.3.6 A 3D regulatory hub of enhancer, promoter and facilitators of NMU       

In addition to the eNMU region located 94 kb upstream of NMU, CRISPRi 

screens by Gasperini et al. (2019) and Reilly et al. (2021) identified four additional 

candidate NMU “enhancers” at 30.5, 35, 87, and 97.6 kb upstream with varying effect 

sizes (Figure 2.12A, purple highlights). We noted that these elements essentially 

function as facilitators—similar to e2—rather than autonomous enhancers, as they 

failed to activate NMU transcription in the absence of e1 (Figures 2.1, B and C, Δe1). 

In line with this notion, ATAC-seq analysis of the CRISPR deletion lines showed that 

Δe1, but not Δe2, substantially reduced chromatin accessibility across all the facilitators 

and the NMU promoter to levels comparable to ΔeNMU (Figure 2.12A, insets). This 

underscores the hierarchical relationship between the core enhancer e1 and other 


regulatory elements. Furthermore, public high-resolution intact Hi-C data (ENCODE 

Project Consortium et al., 2020) shows a distinct stripe pattern anchored at eNMU 

extending towards the NMU promoter (Figure 2.12B, E–P stripe), suggesting that 

eNMU actively scans across the 94 kb region and makes widespread contacts. Strong 

focal interactions, indicated by dot-like patterns, are observed between the NMU 

promoter and F1, eNMU, F3, as well as a distal CTCF/cohesin peak, and also between 

F1′ and eNMU (Figure 2.12B, black arrowheads). Consistently, an independent

lower-resolution Hi-C study (Rao et al., 2014) reveals elevated contact frequencies between

almost every pair of the regulatory elements (Figure 2.13A). These findings together 

hint at the presence of a spatial regulatory hub for the NMU gene (Figure 2.12D).   


 
Fig. 2.12 A 3D regulatory hub of enhancer, promoter and facilitators of NMU.  

(A) ATAC-seq signal at the NMU–eNMU locus in WT, ΔeNMU, Δe1 and Δe2 cell 
lines, highlighting NMU promoter, eNMU, and facilitators (F1, F2, and F3 from 
Gasperini et al. (2019); F1 and F1′ from Reilly et al. (2021)). Tracks represent merged 
biological replicates (n = 2 independent cultures). (B–C) Public intact Hi-C (B) and 
ChIP-seq (B and C) (ENCODE Project Consortium et al., 2020; X. Guo et al., 2020) 
at the NMU–eNMU locus in K562. Intact Hi-C is shown at 300-bp resolution, with 
black arrows indicating pairwise contacts between NMU promoter, facilitators, and 
eNMU; ChIP-seq tracks display signal p-values. Detailed sources and accession 
information are provided in Supplementary Table S4. (D) Schematic model 
illustrating a 3D regulatory hub of enhancer–promoter–facilitator interactions at the 
NMU–eNMU locus. 


 
Fig. 2.13 Additional epigenomic features of enhancer, promoter and facilitators. 

(A) Public Hi-C (Rao et al., 2014) and ChIP-seq tracks of CTCF and RAD21 
(ENCODE Project Consortium et al., 2020) at the NMU–eNMU locus in K562. Hi-C 
is shown at 5-kb resolution, with dashed lines and open circles marking pairwise 
contacts between NMU promoter, facilitators, and eNMU. Note that some contact 
anchors may not align perfectly with the regulatory elements, possibly due to the 
limited resolution of this Hi-C dataset. (B) Expanded ChIP-seq tracks (ENCODE 
Project Consortium et al., 2020; X. Guo et al., 2020) displaying signal p-values. Grey 
box highlights the NMU promoter and its proximal region, with a zoomed-in view 
shown on the top right. A separate inset on the right zooms in at the eNMU region, 
shown at the same scale as the full locus, except where otherwise indicated. Detailed 
sources and accession information are provided in Supplementary Table S4.  

 
To explore potential mechanisms underlying the 3D hub formation and 

facilitator function, we analyzed the epigenomic landscape across the entire NMU–

eNMU locus, leveraging the vast amount of experimental data available for K562 

(Figures 2.12C and 2.13B) (Core et al., 2014; ENCODE Project Consortium et al., 2020; 



X. Guo et al., 2020). The paucity of the structural proteins CTCF/cohesin at eNMU and 

its facilitators prompted us to examine the binding of another independent looping 

factor, the LDB1 complex (Aboreden et al., 2025; Song et al., 2007), at these loci. In 

erythroid cells, the non-DNA-binding transcription cofactor LDB1 forms a stable

complex with the transcription factors GATA1, TAL1, and E2A/TCF3 (Wadman et al., 1997),

which drives chromatin looping via dimerization of LDB1’s self-association domain 

(Deng et al., 2012). Indeed, eNMU and two of the facilitators, F1′ and F2, are well 

occupied by the LDB1 complex (Figures 2.12C and 2.13B). Although the NMU 

promoter itself is not bound by LDB1, its downstream proximal DHS peak exhibits low 

levels of TAL1/TCF3/LDB1 binding (Figure 2.13B, grey box and inset for NMU 

promoter). Interestingly, instead of GATA1, these factors seem to complex with 

RUNX1 at this site, which has been reported as an alternative binding partner of TAL1 

(Wilson et al., 2010) and LDB1 (Gilmour et al., 2018; Meier et al., 2006). Of note, we 

observed decreased accessibility at this promoter-proximal peak exclusively in the 

e1_mRUNX1 mutants in Figure 2.8B (black arrowheads), suggesting that RUNX1 

binding at eNMU communicates with the RUNX1-containing LDB1 complex near the 

NMU promoter.  

Beyond the LDB1 complex, F1′ and F2 also show modest enrichment for 

STAT5 binding (Figure 2.12C), which may contribute to their crosstalk with e1, as 

observed for the facilitator e2. In contrast, the strongest facilitator F3 is predominantly 

occupied by AP-1 factors along with appreciable binding of the SWI/SNF subunit 

SMARCA4 (Figure 2.13B). Nevertheless, signals for other coactivators (p300, BRD4, 

NCOA1), as well as DHS and H3K27ac, are evidently lower at all four facilitators 



compared to eNMU. While H3K4me3 is primarily enriched at the NMU promoter, all 

the regulatory loci display comparable levels of H3K4me1, an enhancer mark that has 

been shown to facilitate enhancer–promoter interactions (Kubo et al., 2024). Finally, 

the minimal GRO-cap signals detected at F1–F3 (Figure 2.12C), together with the 

dispensability of TSSs in e2 (Figure 2.5, Δe2.2, Δe2.4), support the notion that active 

transcription is not a defining feature of facilitators, thereby solidifying our divergent 

transcription model for canonical autonomous enhancers. Taken together, our 

integrative analysis of the constellation of cis-regulatory elements at the NMU–eNMU 

locus highlights their spatial connectivity and distinctive epigenomic signatures, 

providing mechanistic insights into their action. 

2.3.7 Dynamics of eNMU regulation during erythroid differentiation  

The remarkable regulatory network of eNMU in K562 led us to explore its 

function under normal physiological conditions. Given the transcriptomic similarity 

between K562 cells and early erythroid precursors (Ulirsch et al., 2016), the 

documented role of NMU peptide in early erythropoiesis (Gambone et al., 2011), and 

the known hematopoietic functions of key TFs acting at eNMU (Chanda et al., 2013; 

M. J. Chen et al., 2009; Grebien et al., 2008; Nuez et al., 1995; Okuda et al., 1996; 

Perkins et al., 1995; Pevny et al., 1991; Socolovsky et al., 1999; Tóthová et al., 2021), 

we examined the well-established ex vivo erythroid differentiation model of human 

hematopoietic stem and progenitor cells (HSPCs) (Hu et al., 2013; J. Li et al., 2014) 

(Figure 2.14A). Reanalysis of published RNA-seq datasets (An et al., 2014; D. Li et al., 

2023; Schulz et al., 2019) shows that NMU is among the most significantly upregulated 

genes during the differentiation of HSPCs into erythroid precursors (proerythroblasts) 



(Figure 2.14B), with expression increasing by over two orders of magnitude (Figures 

2.14, C and G). This aligns with a recent single-cell multiomics study that reported NMU 

induction during early erythropoiesis of human hematopoietic progenitors (X. Zhang et 

al., 2024). Importantly, NMU induction is accompanied by a progressive increase in the 

signals of H3K27ac, GATA1, and RUNX1 (D. Li et al., 2023) and in chromatin accessibility

(Schulz et al., 2019) at the eNMU locus (Figures 2.14, D and F), supporting eNMU as 

a developmental enhancer of NMU. Furthermore, small interfering RNA (siRNA) 

knockdown of GATA1 significantly reduces NMU expression (D. Li et al., 2023) (Figure 

2.14E), mirroring the e1_mGATA1 mutant effect in K562 (Figure 2.7A). Together, 

these findings highlight the physiological relevance of our results obtained from the 

immortalized erythroid cell line model K562. 



 
Fig. 2.14 Dynamics of eNMU regulation during erythroid differentiation.  

(A) Stages of HSPC erythroid differentiation analyzed in D. Li et al. (2023) (red) and 
Schulz et al. (2019) (blue). (B) Volcano plot showing genome-wide expression 
changes between HSPC and Ery-Pre stages, reanalyzed from D. Li et al. (2023) RNA-
seq data. Horizontal and vertical blue lines mark adjusted p = 0.05 and log₂ fold 



changes of ±1, respectively. Red dots highlight the top 10 most significantly 
upregulated genes, including the key erythroid markers HBG1 and HBG2 (β-like 
globin genes). (C) NMU expression changes during early erythropoiesis, reanalyzed 
from D. Li et al. (2023) RNA-seq data (n = 3). (D) CUT&RUN signal of H3K27ac, 
GATA1 and RUNX1 at the NMU–eNMU locus during early erythropoiesis. Tracks 
show one representative biological replicate from D. Li et al. (2023). (E) GATA1 and 
NMU expression changes following 24-hr siRNA knockdown of GATA1 in Ery-Pro 
cells, reanalyzed from D. Li et al. (2023) RNA-seq data (n = 2). (F) ATAC-seq signal 
at the same locus as in (D) throughout the full HSPC differentiation time course. 
Tracks show merged biological replicates (n = 2) from Schulz et al. (2019). (G) NMU 
expression changes during the same stages as in (F), reanalyzed from An et al. (2014) 
and Schulz et al. (2019) RNA-seq data (n = 3). 

 
Further examination of the facilitator loci throughout the full differentiation time 

course (Schulz et al., 2019) reveals that facilitators F2 and F3 acquire discernible 

ATAC-seq signals when eNMU accessibility surges (Figure 2.14F), consistent with 

their eNMU-dependent behavior in K562 (Figure 2.12A). Interestingly, the strong 

facilitator F3 becomes even more accessible during the final stages of erythropoiesis, 

despite a sharp decline in eNMU accessibility and NMU expression. This decoupling of 

the enhancer–facilitator hierarchy suggests that facilitators work in concert with 

enhancers in a stage-specific manner and are insufficient to substitute for enhancers in 

the temporal control of gene expression. 

2.3.8 A putative LTR promoter as a built-in negative regulatory element for 

enhancer activity    

After scrutinizing the activating motifs in eNMU, we shifted our focus to the 

repressing segment e1.4 (Figure 2.5) to investigate its functional characteristics. The 

predominantly unidirectional transcription pattern immediately caught our attention, as 

opposed to the balanced divergent transcription in the positive regulatory region e1.1–



e1.3 (Figure 2.15A). Interestingly, the entire e1 element corresponds to a MER72 Long 

Terminal Repeat (LTR) of the ERV1 endogenous retrovirus family. Sequence alignment 

with the MER72 consensus from the Dfam database (Storer et al., 2021) revealed a 

conserved, weak TATA box variant (CATAA) located 31 bp upstream of the TSS in 

e1.4, along with a conserved polyadenylation (poly A) signal downstream, matching the 

typical architecture of an LTR promoter (Medstrand et al., 2001) (Figure 2.15B). It is 

thus likely that e1.4 serves as a putative LTR promoter, while e1.1–e1.3 functions as its

corresponding LTR enhancer. The LTR promoter may compete with the NMU promoter 

for LTR enhancer activity, which could explain the de-repression of the NMU gene

observed in Δe1.4. To test this hypothesis, we mutated the KLF/SP motifs in either the 

LTR promoter or the LTR enhancer and assessed their effects in single cell clones 

(Figure 2.15C). We chose KLF/SP motifs because (1) disrupting all of them in e1 greatly 

reduced NMU expression in our FlowFISH screen (Figure 2.5B), and (2) several of them 

are located closely upstream of TSSs in e1 (Figure 2.15A), a position generally 

associated with transcription activation (Duttke et al., 2024). Indeed, KLF/SP mutations 

in the LTR promoter caused a 1.5-fold increase in NMU expression (Figure 2.15C, 

LTRpr_mKLF), mirroring the effect of Δe1.4. By comparison, KLF/SP mutations in the 

LTR enhancer or across the entire e1 region dramatically decreased gene expression. 

We also attempted to strengthen the LTR promoter by optimizing its core promoter 

elements (CPEs), specifically the TATA box and the Initiator (Inr) motif. As predicted 

by the competition model, this mutant caused a slight downregulation of NMU 

expression (Figure 2.15C, mCPE_up). However, attempts to weaken these CPEs 

showed only a neutral effect, likely because these elements are inherently weak at driving



transcription initiation, supporting a dominant role of the KLF/SP motifs in the LTR 

promoter function.  

 
Fig. 2.15 A putative LTR promoter as a built-in negative regulatory element for 
enhancer activity.  

(A) Transcription-related sequence features of e1. (B) Sequence alignment between 
e1 and the MER72 LTR consensus. (C) NMU mRNA levels measured by RT-qPCR 
in single cell-derived recombinant clones; bars = median. (D–G) ATAC-seq signal at 



eNMU (D), NMU promoter, and GAPDH control locus (E); PRO-seq signal at eNMU 
(F) and NMU promoter (G) in select eNMU mutants. In (G), bar = NMU pausing 
index of merged replicates, dots = pausing index of individual replicates. Tracks 
represent merged biological replicates (n = 2 independent single cell clones). Colored 
boxes below tracks indicate locations of disrupted TF motifs. Fine vertical lines 
indicate positions of GRO-cap–defined TSSs (WT K562) (Core et al., 2014). (H) 
Proposed competition model between the LTR promoter and the NMU promoter. 

 
To exclude the possibility that the KLF/SP sites in e1.4 act as repressor motifs, 

we performed ATAC-seq in both the LTR promoter and the LTR enhancer mutants. 

This revealed highly localized accessibility reductions confined to the respective 

mutated regions (Figure 2.15D), confirming the activating function of KLF/SP in both 

contexts. However, accessibility at the NMU promoter changed in opposite directions 

(Figure 2.15E), supporting the idea that the putative LTR promoter and enhancer operate 

as distinct regulatory elements for NMU, likely through a promoter competition 

mechanism (Figure 2.15H).  

Finally, we examined the nascent transcription profiles of the KLF/SP mutants 

by PRO-seq. At the eNMU locus, the LTR promoter mutant showed a nearly identical 

pattern to WT_eNMU, suggesting unperturbed Pol II recruitment (Figure 2.15F). 

Furthermore, the pausing index of NMU also remained unaltered due to the proportional 

increase in the pause region and gene body reads (Figure 2.15G). Conversely, the LTR 

enhancer mutant resembled other e1 mutants studied in Figure 2.9, exhibiting reduced 

eNMU and NMU signals, along with a doubled pausing index. These observations 

thereby raise an interesting possibility: promoter competition could provide a unique 

advantage in modulating target gene transcription while maintaining normal Pol II 

pause–release dynamics. 
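
For reference, a pausing index is conventionally computed as the ratio of read density in the promoter-proximal pause region to read density in the gene body. The sketch below illustrates only that arithmetic and is not the analysis pipeline used in this study: the function name, window lengths, and read counts are all hypothetical. It shows why a proportional increase in both windows leaves the index unchanged, whereas a selective loss of gene body signal raises it.

```python
def pausing_index(pause_reads, pause_len, body_reads, body_len):
    """Ratio of read density in the pause region to read density in the gene body."""
    if pause_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    if body_reads <= 0:
        raise ValueError("gene body reads must be positive")
    pause_density = pause_reads / pause_len
    body_density = body_reads / body_len
    return pause_density / body_density


# Hypothetical window sizes (bp) and read counts, for illustration only.
PAUSE_LEN, BODY_LEN = 250, 10_000

wt = pausing_index(500, PAUSE_LEN, 2_000, BODY_LEN)

# A 1.5-fold increase in both windows leaves the index unchanged.
scaled = pausing_index(750, PAUSE_LEN, 3_000, BODY_LEN)

# Halving only the gene body reads doubles the index.
body_halved = pausing_index(500, PAUSE_LEN, 1_000, BODY_LEN)

print(wt, scaled, body_halved)
```

Under these assumed numbers, the first two calls give the same index and the third gives twice that value, matching the qualitative behavior described for the LTR promoter and LTR enhancer mutants, respectively.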



2.4 Discussion 

In this study, we established a novel landing pad platform to systematically 

interrogate the molecular architecture of the potent long-range enhancer eNMU at its 

native locus. Through detailed functional dissections of key mutants and integrative 

mining of public datasets, we uncovered several recurring themes supported by multiple 

lines of evidence: (1) TFs exert unique and cooperative functions in a context-specific 

manner; (2) facilitators depend on the core enhancers while ensuring robustness of their 

enhancer partners; and (3) divergent transcription accurately demarcates active enhancer

units. Collectively, our findings illuminate an intricate and coordinated interplay among 

distinct classes of cis-regulatory elements—enhancers, facilitators, and promoters—that 

underpins precise transcriptional regulation.         

Previous enhancer studies have primarily employed approaches such as random 

mutagenesis (Canver et al., 2015b; Kircher et al., 2019; Melnikov et al., 2012; 

Patwardhan et al., 2012), tiling disruptions (Kosicki et al., 2024; Martyn et al., 2025; 

Roh et al., 2024), and specific motif manipulations (Frömel, Rühle, Martinez, et al., 

2025; Georgakopoulos-Soares et al., 2023; Grossman et al., 2017; R. P. Smith et al., 

2013) to identify functional features within regulatory elements. In contrast, our study 

applied a distinct framework to dissect eNMU, grounded in our divergent transcription-

based unit definition of active human enhancers (Core et al., 2014). Building on the 

foundational work of Gasperini et al. (2019) and Tippens et al. (2020), this approach 

enabled us to progressively refine the bona fide NMU enhancer unit from the full eNMU 

region to its sub-element e1, and ultimately to a minimal, divergently transcribed LTR 

enhancer core (Figure 2.15). Importantly, the core enhancer activity is modulated by the 



surrounding sequence features within eNMU—specifically, augmented by the 

intrinsically inactive facilitator element e2 and repressed by the adjacent 

unidirectionally transcribed LTR promoter. These findings reveal the regulatory 

complexity of the eNMU locus and highlight the strength of our divergent transcription 

model in precisely delineating functional enhancer units, thereby guiding the future 

classification of diverse distal regulatory elements.                            

Unlike previously described facilitators identified in the context of hyper-active 

super-enhancers (Blayney et al., 2023; Brosh et al., 2023), the presence of multiple 

facilitators associated with a typical enhancer across the ~100-kb NMU–eNMU region 

raises the possibility that many CRISPRi-identified “enhancers” may in fact function as 

facilitators. Notably, the eNMU-associated facilitators exhibited virtually no intrinsic 

activity even when present all together in the genome (0.01% of WT expression, Figure 

2.1B, Δe1), consistent with their inherently weak, enhancer-dependent accessibility 

patterns (Figure 2.12A). This stands in contrast to the facilitators within super-

enhancers, which tend to display strong signals of open chromatin, TF/coactivator 

binding, and Pol II recruitment—features likely contributing to their residual intrinsic 

enhancer activity (Blayney et al., 2023). We speculate that a continuum of enhancer 

potential exists along the enhancer–facilitator spectrum, with the eNMU-associated 

facilitators situated at the extreme low-activity end. The lack of intrinsic activity is 

crucial in confining facilitator function to potentiating and buffering pre-established 

enhancers (Figure 2.11) while preventing ectopic gene activation. With the continuing 

advances in genome engineering technologies, it will be increasingly important to 

interrogate all candidate regulatory elements simultaneously within their native 



chromatin hub environments, to distinguish autonomous enhancers from affiliated 

facilitators and to better understand their mechanistic interplay.    

In addition to the enhancer–facilitator axis, the LTR enhancer–promoter axis 

within eNMU represents another potentially widespread and underappreciated mode of 

cis-regulatory behavior, especially considering the high abundance of LTRs in the 

human genome and their nearly 10% representation of all ENCODE candidate cis-

regulatory elements (cCREs) (A. Y. Du et al., 2024). By leaving the enhancer intact, the 

LTR promoter can compete with the gene promoter without disrupting normal Pol II 

pause–release dynamics, instead simply siphoning transcriptional activity toward itself. 

This regulatory strategy enables fine-tuning of target gene expression, particularly when 

the LTR promoter harbors motifs for developmental stage-specific TFs. Moving 

forward, a genome-wide search for functional unidirectional TSSs, including but not 

limited to those derived from LTR promoters, will be crucial for constructing a more 

comprehensive map of transcriptional networks.  

How do these functionally distinct classes of cis-regulatory elements coordinate 

to achieve precise and robust transcriptional regulation? We propose that the answer lies 

in the combinatorial action and synergy of a repertoire of trans-acting TFs and cofactors. 

At the core LTR enhancer unit, we observed strong cooperative binding of key TFs 

GATA1 and RUNX1 (Figures 2.7, B and C), despite their motifs being separated by 

~40 bp—a spacing that likely limits direct protein–protein interactions. This suggests 

that indirect mechanisms (Morgunova & Taipale, 2017; Spitz & Furlong, 2012) may 

underlie their cooperation, including DNA conformational changes (Panne et al., 2007), 

co-binding to a shared cofactor or a multiprotein complex (Spitz & Furlong, 2012), and 



nucleosome-mediated collaborative competition (Adams & Workman, 1995;

Doughty et al., 2024; Miller & Widom, 2003; Mirny, 2010). At the adjacent LTR 

promoter, activating KLF/SP family TFs played a context-specific role to compete for 

enhancer activity (Figure 2.15). At the facilitator e2, STAT5 binding to a tandem array 

of five motifs critically amplified e1’s enhancer activity, despite modest impact on e1’s 

chromatin accessibility, transcription initiation, and p300 recruitment (Figures 2.7–2.9). 

It is worth noting that e2’s own accessibility and transcription depended not only on its 

own STAT5 binding, but also on the integrity of e1’s key TF binding (Figures 2.8 and 

2.9). This peculiar behavior of STAT5 echoes the recently proposed concept of 

“context-only” TFs (Kribelbauer-Swietek et al., 2024), which do not provide DNA 

access themselves but instead amplify the activity of “context-initiator” TFs by 

establishing cooperative environments. These two classes of TFs partner promiscuously 

without requiring close motif proximity, consistent with e2’s universal buffering effect 

on various heterologous enhancers (Figure 2.11). We speculate that STAT5 binding at 

e2 fosters multivalent interactions (Chong et al., 2022; Trojanowski et al., 2022) with 

e1-bound TFs via its intrinsically disordered C-terminal transactivation domain (C. P. 

Lim & Cao, 2006), thereby enhancing the “stickiness” of the regulatory hub. 

Nonetheless, we cannot exclude the possibility that STAT5 engages some unique 

coactivators which have yet to be identified. 

In addition to the disordered C-terminal transactivation domain, STAT5 

contains an N-terminal oligomerization domain that allows tetramerization of active 

STAT5 dimers on tandemly linked motifs (John et al., 1999; W. K. Meyer et al., 1997). 

This oligomerization extends STAT5’s DNA binding specificity to low-affinity sites 



(Soldaini et al., 2000), which may explain the ~30% residual binding observed upon 

mutating two conserved bases in all five STAT5 motifs at e2 (Figure 2.7D, right panel). 

Such oligomerization might also facilitate spatial connectivity between eNMU and the 

STAT5-bound facilitators F1′ and F2—reminiscent of GAGA-associated factor (GAF)

oligomerization at a subset of tethering elements in developing Drosophila embryos (X. 

Li et al., 2023), which, despite lacking intrinsic enhancer activity, are essential for long-

range enhancer–promoter communication (Batut et al., 2022). In parallel, dimerization 

of the LDB1 complex bound at F1′ and F2, the enhancer e1, and the NMU promoter

(Figures 2.12 and 2.13) may further promote chromatin contacts between these loci. 

Together, these potential mechanisms suggest a broader architectural role for facilitators 

in organizing 3D regulatory hubs independent of CTCF or cohesin, underscoring an

exciting avenue for future investigation.  

A final noteworthy observation from our functional analysis is that, despite 

substantial variation in chromatin accessibility pattern at eNMU across different motif 

mutants (Figure 2.8A), the transcriptional output and Pol II pause–release dynamics 

(pausing index) of NMU were altered to similar extents (Figures 2.7A, 2.9B, and 2.9C). 

Such decoupling between enhancer accessibility and gene activation is consistent with 

prior findings (Dogan et al., 2015; Doughty et al., 2024) and highlights the importance 

of specific TF inputs and their associated cofactors in driving functionally productive 

enhancer–promoter communication. Future work should aim to define the full repertoire 

of these regulatory components and the steps of transcription that they influence (such 

as chromatin opening, Pol II initiation, and pause release).  



In summary, we conducted a rigorous in situ dissection of a robust long-range 

enhancer at unprecedented architectural resolution, providing experimental evidence 

that resonates with and extends current models of enhancer function. The intricate 

crosstalk among spatially and functionally linked cis-regulatory elements—including 

enhancers, facilitators, and promoters—underscores the importance of a holistic 

framework to decode their mechanistic interplay. We anticipate that our efficient and 

versatile recombinase-mediated genome rewriting platform will serve as a powerful tool 

to drive these efforts forward.  

Limitations of the study 

Our motif mutagenesis approach could not definitively identify the functional 

TFs acting at eNMU. For instance, both GATA1 and GATA2 are well expressed in 

K562 cells and bind similar/identical motifs, making it difficult to distinguish their 

individual contributions. We attributed the observed effects to GATA1 in our study, 

because it is the most highly expressed GATA family factor in K562 (Karlsson et al., 

2021) and the master regulator of erythropoiesis. Similarly, we did not determine the 

exact TFs binding the critical RARA/RXRA motifs, given the low abundance of RARA 

and RXRA proteins (Grande et al., 2001; Karlsson et al., 2021) and the presumed 

absence of retinoic acid signaling in K562 under standard culture conditions. It is 

possible that other nuclear receptors recognize and bind these motifs. In addition, our 

mutagenesis screen may have missed some functional motifs, as it relied on the 

availability of ChIP-seq data to confirm TF binding. Furthermore, while we made every 

effort to avoid disrupting overlapping motifs, some degree of interference was 

unavoidable—for example, between the right RUNX1 motif and the adjacent 



RARA/RXRA motif. As we introduced only a single version of the transversion 

mutations, we also cannot completely exclude the possibility of inadvertently creating 

novel TF binding sites, despite efforts to minimize matches to known motifs. Finally, 

although e2 displayed power-law buffering behavior across eight heterologous 

enhancers, larger-scale studies are warranted to fully capture the complexity of 

enhancer–facilitator synergism.   

2.5 Methods 

Cell lines and culture 

Parental wildtype K562 cells, an immortalized erythroleukemia cell line isolated 

from the bone marrow of a 53-year-old female patient with chronic myelogenous 

leukemia (CML), were obtained from the American Type Culture Collection (ATCC)

(ATCC Number CCL-243) by the Yu lab and generously provided to us. Genetically 

modified, homozygous eNMU deletion lines (ΔeNMU, Δe1, and Δe2) were also kind 

gifts of the Yu lab. All the other engineered K562 cell lines, including the eNMU 

landing pad lines and single cell-derived recombinant clones, were generated in this

study (see below). All the K562 lines were cultured in RPMI 1640 media supplemented 

with GlutaMAX (Gibco) and 10% heat-inactivated FBS (Avantor) at 37°C with 5% CO₂

in a humidified sterile incubator. Cell density was maintained between 0.1 and 1 × 10⁶

cells/mL, and mycoplasma testing was performed routinely.   

Transfection and cell sorting  

All transfection experiments in K562 cells were carried out using Lonza’s 

Nucleofector 2b device and the Nucleofection Kit V, following the manufacturer’s



instructions. Specifically, one single cuvette was used to transfect 1 million cells with a 

total of 5 µg plasmid DNA; for co-transfection of two plasmids, 2.5 µg of each plasmid 

was used. All cell sorting experiments were performed on the Sony MA900

Multi-Application Cell Sorter using a 100-μm chip (catalog no. LE-C3210; Sony).

Single-copy eNMU landing pad cell line construction 

To CRISPR knock in the Bxb1 landing pad at the eNMU locus in an eNMU-

null background, we first amplified the genomic region surrounding the eNMU locus in 

the ΔeNMU cell line, inserted it into a HindIII-linearized pEGFP-N1 vector via Gibson 

assembly, and Sanger sequenced individual colonies to determine the exact allelic 

sequences. Based on the obtained sequences, we designed four sgRNAs using 

CHOPCHOP (Labun et al., 2019) and cloned each sgRNA into the pX330 vector 

(Addgene plasmid # 42230) following its standard protocol, i.e., restriction-ligation 

cloning of annealed sgRNA oligos into BbsI-linearized pX330 backbone. Left and right 

homology arms, each 1 kb in length, were also designed and PCR amplified from K562

genomic DNA. Two intermediate plasmids were constructed prior to assembling the 

homology directed repair (HDR) donor plasmid: first, the attP1 and attP2 gBlocks (IDT, 

Supplementary Table S2) were inserted into a vector backbone; second, an EF1a 

promoter fragment and the BFP-2A-iCasp9-2A-BlastR cassette, PCR amplified from 

the pFL7_pLenti-pTet-Bxb1-BFP-2A-iCasp9-2A-BlastR_pCMV-rtTA3 plasmid (a 

kind gift from the Grimson lab) (Matreyek et al., 2020), were inserted between the attP1

and attP2 sites; third, the left and right homology arms, along with the entire attP1-

EF1a-BFP-2A-iCasp9-2A-BlastR-attP2 cassette, were inserted into a pUC19 vector 



backbone to generate the final donor plasmid. All three cloning steps were performed 

using Gibson assembly. The second intermediate plasmid was constructed to enable 

preliminary testing of Bxb1 recombination in a plasmid context prior to chromosomal 

integration (data not shown).    

Each pX330-sgRNA plasmid was then co-transfected with the donor plasmid 

into the ΔeNMU K562 cell line. After the episomal BFP signal had faded, CRISPR knock-in

efficiency was assessed by gain of stable BFP expression. Three out of four sgRNAs 

produced a significant BFP+ population compared to the donor-only negative control. 

Single cells from these three populations were sorted into 96-well plates to derive clonal 

cell lines.  

Outgrown single cell clones were first screened by genotyping PCR to identify 

those with heterozygous landing pad (LP) integration. Genomic DNA from a subset of 

candidate clones was purified by phenol-chloroform extraction, and a qPCR-based copy 

number analysis was performed by comparing BFP DNA Ct values to a control locus 

known to exist in three alleles in K562. Confirmed single-copy LP clones were further 

evaluated based on the percentage of BFP+ cells and Bxb1 recombination efficiency 

(see below). Two clonal lines, E5 and D17, were selected for subsequent experiments.               
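
The qPCR-based copy number estimate described above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the exact analysis performed: it assumes both amplicons amplify with roughly 100% efficiency (a factor of 2 per cycle) and are measured from the same genomic DNA sample; the function name and the Ct values are hypothetical. Only the three-allele control locus comes from the text.

```python
def estimate_copy_number(ct_target, ct_control, control_copies=3, efficiency=2.0):
    """Estimate target copy number relative to a control locus of known copy number.

    ct_target      -- Ct of the target amplicon (here, BFP)
    ct_control     -- Ct of the control locus measured in the same gDNA sample
    control_copies -- known copies of the control locus (three alleles in K562)
    efficiency     -- assumed per-cycle amplification factor (2.0 = 100% efficiency)
    """
    delta_ct = ct_target - ct_control
    # Each additional cycle needed by the target implies `efficiency`-fold less template.
    return control_copies * efficiency ** (-delta_ct)


# Hypothetical Ct values: the target crosses threshold ~1.585 cycles after the control,
# which corresponds to about one-third of the control template, i.e. about one copy.
copies = estimate_copy_number(ct_target=25.585, ct_control=24.0)
print(round(copies, 2))
```

In practice the estimate is only as good as the efficiency assumption, so clones with values near integer boundaries would warrant additional confirmation.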

Bxb1 recombination efficiency and eNMU rescue experiment   

To construct the attB-containing payload plasmid, the attB1 and attB2 gBlocks 

(IDT, Supplementary Table S2) were first inserted into a vector backbone to generate 

an intermediate plasmid. An EF1a promoter fragment and an EGFP or mCherry 

fragment were then introduced into the linearized intermediate plasmid. For all 



subsequent individual element cloning (i.e., excluding eNMU mutant library cloning), 

this parental attB1-EF1a-EGFP/mCherry-attB2 plasmid was digested with BmtI and 

BspEI (NEB) and the EF1a-EGFP/mCherry cassette was replaced with intended 

elements. All cloning steps were performed using Gibson assembly. 

To evaluate Bxb1 recombination efficiency and test the functionality of the 

eNMU landing pad, we co-transfected the pFL9_pCAG-NLS-HA-Bxb1 plasmid 

(Addgene # 51271, a kind gift from the Grimson lab, transiently expressing Bxb1 

recombinase) and the attB1-eNMU-attB2 payload plasmid into the LP cell lines. About 

7 days post-transfection, the percentage of BFP− cells stabilized and was

measured on the Sony MA900 cell sorter compared to a no-payload negative control. 

Both E5 and D17 LP clones consistently exhibited 4–10% BFP loss across independent

experiments, with Clone E5 showing slightly higher recombination efficiency. The 

recombinant BFP− cells were further sorted as bulk populations and propagated for 

another 10–14 days to allow stable NMU reactivation. Cells were then harvested for

RNA extraction and RT-qPCR analysis (see below) to confirm the rescue of NMU gene 

expression.    

eNMU mutant library design, cloning, and integration 

Given the central role of TFs in enhancer function, we first sought to dissect how 

specific TF binding events contribute to eNMU activity by maximizing both the extent 

and specificity of TF binding disruption. To this end, we aimed to (1) curate a list of 

motifs for TFs that are expressed in K562 cells and exhibit motif-specific binding 

supported by public ChIP-seq data, and (2) introduce point mutations across all motif 



occurrences of each selected TF. Specifically, we retrieved all available K562 ChIP-seq 

peaks that overlap the eNMU region (hg38 coordinate = chr4:55729891–55730846) 

using the UCSC Table Browser tool (Karolchik et al., 2004). We removed entries 

corresponding to non-sequence-specific cofactors and TFs not expressed in K562, based 

on ENCODE (ENCODE Project Consortium et al., 2020) polyA plus RNA-seq data 

(accession: ENCSR000CPH) using a TPM > 1 threshold. Binding motifs for the 

remaining TFs were then obtained from the JASPAR 2022 database (Castro-Mondragon 

et al., 2022), with a few taken from the CIS-BP database (Weirauch et al., 2014) (see

Supplementary Table S1). These motifs were further manually reviewed and filtered to 

retain only those located under a ChIP-seq peak. For TFs with motifs that perfectly 

overlap at least once—such as AP-1/NFE2 and KLF1/SP1—we grouped and treated 

them as a single TF. To maximize disruption of TF binding while minimizing 

unintended effects on adjacent motifs, we identified the two most conserved bases in 

each motif using position frequency matrices (PFMs) from the JASPAR 2022 database 

(Castro-Mondragon et al., 2022) and introduced transversion mutations (A↔C, T↔G). 

We chose this transversion scheme because it has been shown to be more effective than 

alternative mutagenesis schemes (Kircher et al., 2019; Kosicki et al., 2024).   
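
The motif mutagenesis rule can be sketched as follows. This is an illustrative reconstruction, not the script used in this study: it assumes that "most conserved" is scored by per-position information content computed from the PFM counts, and the function names and the toy six-position matrix are hypothetical (it is not a JASPAR matrix). Only the transversion scheme (A↔C, T↔G) and the choice of two positions per motif come from the text.

```python
import math

# Transversion scheme stated in the text: A<->C, T<->G.
TRANSVERSION = {"A": "C", "C": "A", "T": "G", "G": "T"}


def information_content(column):
    """Information content (bits) of one PFM column given as {base: count}."""
    total = sum(column.values())
    ic = 2.0  # maximum for a four-letter alphabet
    for count in column.values():
        if count > 0:
            p = count / total
            ic += p * math.log2(p)
    return ic


def mutate_site(site, pfm, n_positions=2):
    """Apply transversions at the n most conserved positions of a motif occurrence."""
    if len(site) != len(pfm):
        raise ValueError("site and PFM must have the same length")
    ranked = sorted(range(len(pfm)),
                    key=lambda i: information_content(pfm[i]),
                    reverse=True)
    targets = set(ranked[:n_positions])
    return "".join(TRANSVERSION[base] if i in targets else base
                   for i, base in enumerate(site))


# Hypothetical toy PFM (counts per position) for a GATA-like site.
toy_pfm = [
    {"A": 60, "C": 10, "G": 10, "T": 20},
    {"A": 0, "C": 0, "G": 100, "T": 0},
    {"A": 100, "C": 0, "G": 0, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 90},
    {"A": 95, "C": 5, "G": 0, "T": 0},
    {"A": 50, "C": 0, "G": 50, "T": 0},
]

mutant = mutate_site("AGATAA", toy_pfm)
print(mutant)
```

With this toy matrix, the two invariant positions (the G and the following A) are the ones altered. Restricting changes to two bases per occurrence keeps the footprint of each mutation small, which is the stated rationale for limiting effects on adjacent motifs.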

To complement the targeted motif mutagenesis, we designed tiling deletions 

across the 956-bp eNMU region, each spanning ~100-bp intervals within the sub-

elements e1 (first 453 bp) and e2 (last 503 bp). These deletions were intended to 

encompass GRO-cap–defined TSSs and TF motif clusters, resulting in segments e1.1–

e1.4 and e2.1–e2.4. Additional segments e1.5–e1.9 were included to help resolve critical 

sequence features within e1.3 and e1.4. Detailed information on all mutated motifs and 



deleted segments is listed in Supplementary Table S1 (separate file). 

Given the functional distinction between e1 (enhancer) and e2 (facilitator), we 

aimed to introduce mutations in either e1, e2, or both to dissect their individual 

contributions and cooperative interactions. To achieve this, we employed a “mix-and-

match” cloning strategy (Figure 2.4B). For TF motif mutagenesis, mutant versions of 

e1 and e2 were synthesized separately by Twist Bioscience as dsDNA fragments, with 

all occurrences of a given TF’s motif mutated simultaneously. Each mutated e1 element 

was paired with either a wildtype e2 or a mutated e2 of the same TF type, and vice versa. 

Each pair was Gibson assembled with two half-backbone fragments: a fixed attB1-

containing fragment and an attB2-containing fragment carrying a unique 8-bp random 

barcode generated by PCR. Tiling deletion constructs were built using the same cloning 

strategy, except that e1 deletion fragments were PCR amplified from pre-existing 

mutant plasmids created using the Q5 site-directed mutagenesis kit (NEB) in earlier 

experiments, rather than synthesized. Wildtype eNMU, Δe1, and Δe2 constructs were 

included as controls with known enhancer activities. Additionally, six exogenous 

sequences from a published STARR-seq library (Tippens et al., 2020), kindly provided 

by the Yu lab, were also cloned as controls. These included the 584-bp CMV enhancer 

(CMV584), commonly used as a positive control in episomal enhancer reporter assays, 

and several non-regulatory open reading frames (ORFs), including EGFP and four 

human ORFs (ORF56714, ORF52920, ORF54588, and ORF55756). In total, 83 

individual Gibson assembly reactions were performed and transformed into NEB Stable 

competent E. coli cells (prepared using the Mix & Go! E. coli transformation kit from 

Zymo Research). 



For each Gibson assembly transformation, 8 colonies were picked and cultured 

overnight in deep-well 96-well plates. Colony PCR was performed on 1:20 water-

diluted liquid cultures to screen for positive insertions using Q5 High-Fidelity 2X 

Master Mix (NEB) with primers ZZ041 and ZZ044 (Supplementary Table S2), which 

amplify the insertion from regions flanking the Bxb1 recombination sites. The PCR 

program was: initial denaturation 98°C for 5 min; 30 cycles of 98°C for 10 s, 61°C for 

30 s, 72°C for 39 s; and final extension 72°C for 5 min. Positive PCR amplicons (~1.3 

kb) were then purified using homebrew SPRI beads (Boswell, 2020) (0.7× bead ratio) 

and subjected to Sanger sequencing to verify element sequences and determine element-

barcode associations. In total, we identified 328 unique barcodes corresponding to the 

83 elements. These confirmed liquid cultures were pooled together for Maxiprep (Zymo 

Research) to extract the plasmid library.  

Four million LP cells of clone E5 or D17 (biological replicates) were 

transfected with the plasmid library and the pFL9_pCAG-NLS-HA-Bxb1 plasmid to 

achieve a minimum coverage of 200× for each unique barcode representation in the 

recombinant population. On Day 7 post-transfection, BFP− cells were sorted at a 

minimum coverage of 200× and expanded for another 14 days to allow full activation 

of NMU. The recombinant cells were then subjected to HCR-FlowFISH.  
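For orientation, the coverage target above can be turned into a minimum cell number. The short Python sketch below only restates that arithmetic using the 328 barcodes and 200× coverage given in the text; the function name is illustrative, and uniform barcode representation is an assumption.

```python
def min_cells_for_coverage(n_barcodes, coverage):
    """Minimum number of cells needed so that each barcode is represented
    `coverage` times on average (ASSUMES uniform barcode representation)."""
    return n_barcodes * coverage

# 328 unique barcodes sorted at a minimum of 200x coverage.
min_sorted_cells = min_cells_for_coverage(328, 200)
print(min_sorted_cells)
```

Because real libraries are never perfectly uniform, this number is a lower bound rather than a target.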

HCR-FlowFISH and sequencing library preparation  

To measure enhancer activity of individual elements within the pooled 

recombinant population, HCR-FlowFISH was performed according to the published 

protocol (Reilly et al., 2021) with minor modifications. We first obtained HCR probe 



sets and fluorescent hairpins from Molecular Instruments for the target gene NMU (B1 

hairpin, Alexa Fluor 647 or AF647) and the internal control gene ACTB (B2 hairpin, 

Alexa Fluor 488 or AF488). Note that the NMU probes were custom-designed in the 

published study (Reilly et al., 2021) while the ACTB probes were pre-designed and 

optimized by Molecular Instruments. FISH probing was performed in strict accordance 

with the published protocol (Reilly et al., 2021), including all solution volumes and 

centrifugation parameters. Briefly, 20 million recombinant cells of each biological 

replicate were fixed with 4% formaldehyde in PBST (1× PBS, 0.1% Tween 20) at room 

temperature for 1 h and washed 4 times with PBST. Following a 10 min to 24 h 

incubation in cold 70% ethanol at 4°C, cells were washed twice with PBST and 

incubated with the pre-warmed Probe Hybridization Buffer at 37°C for 30 min. HCR 

probes for NMU and ACTB were added together to cells to reach a final concentration 

of 4 nM per probe. The samples were then incubated overnight with agitation in a 37°C 

hybridization oven. On the next day, cells were washed 5 times with Probe Wash Buffer, 

once with 5× SSCT (5× SSC, 0.1% Tween 20), and pre-amplified in the 

Amplification Buffer for 30 min at room temperature. Snap-cooled hairpins were 

diluted in the Amplification Buffer and then added to the pre-amplified samples to reach 

a final hairpin concentration of 60 nM. Samples were incubated with rotation in a dark 

room overnight at room temperature. On the next day, 5× volume of 5× SSCT was added 

to the samples before centrifugation and removal of the hairpin amplification solution. 

Cells were then washed 6 times with 5× SSCT before final resuspension in PBS at 

a density of 10 million cells/mL. The samples were filtered through a 35 µm Cell 

Strainer cap into a 5 mL polystyrene tube (Corning) before sorting.        



Cells were sorted into 8 bins (2 rounds of 4-way sorting) based on the 

AF647/AF488 ratio (Figure 2.4C) at a minimum of 500× barcode coverage 

per bin to ensure robust representation in the sequencing library. Sorted cells, together 

with the unsorted background sample, were pelleted and resuspended in 400 µL of ChIP 

lysis buffer (50 mM Tris-HCl, pH 8, 10 mM EDTA, 1% SDS), and de-crosslinked 

overnight at 65°C with 1000 rpm shaking. Samples were then treated with RNase A 

(Thermo Scientific) and Proteinase K (Invitrogen) before phenol-chloroform extraction 

of genomic DNA (gDNA). Sequencing libraries were prepared by two rounds of PCR 

using Q5 High-Fidelity 2X Master Mix (NEB). The 1st round PCR was performed on 

the recovered gDNA corresponding to a minimum of 200× barcode coverage, using 

primers ZZ145 and ZZ146 (Supplementary Table S2) to specifically amplify the 8-bp 

barcodes from the eNMU genomic locus. A maximum of 500 ng gDNA was used as 

input in a 50 µL PCR reaction. The PCR program was: initial denaturation 98°C for 

3 min; 11 cycles of 98°C for 10 s, 65°C for 30 s, 72°C for 1 min; and final extension 

72°C for 5 min. The PCR products were then purified using homebrew SPRI beads 

(1.5× bead ratio) to remove unused primers, followed by the 2nd round PCR with 

standard Illumina Nextera primers to append sequencing library indices and flow cell 

adaptors to the amplicons. The PCR program was: initial denaturation 98°C for 30 s; 11 

cycles of 98°C for 10 s, 67°C for 30 s, 72°C for 20 s; and final extension 72°C for 5 min. 

Final PCR products were purified using the MinElute PCR purification kit (Qiagen) and 

DNA concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo 

Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI 

platform (2 × 80 bp paired-end sequencing). 



Individual element testing at the eNMU landing pad 

For downstream functional analysis, critical eNMU mutants identified in the 

FlowFISH screen were cloned into the attB1-attB2 plasmid backbone without any 

element barcode. Several additional related mutants were designed and generated, 

whose sequences are listed in Supplementary Table S1 (separate file). These elements 

were integrated individually into the eNMU landing pad, and the BFP− recombinants 

were sorted as single cells into 96-well plates to establish clonal cell lines as independent 

biological replicates. Three to four weeks after sorting, cells expanded to sufficient 

numbers for crude gDNA extraction (Gasperini et al., 2019) and genotyping PCR to 

confirm element insertion: briefly, ~0.2 million cells were pelleted and concentrated in 

20 µL of culture media in a 0.5-mL PCR tube, mixed with 40 µL of Quick Extract buffer 

(10 mM Tris-HCl, pH 8.5, 0.45% Tween-20, 4 mg/mL proteinase K), and incubated at 

65°C for 6 min and 98°C for 2 min; 1 µL of the crude gDNA extract was used as input 

in the genotyping PCR following the standard Phusion polymerase protocol (NEB) with 

homemade Phusion polymerase and primers ZZ104 and ZZ105, which amplify the 

insertion from regions flanking the Bxb1 recombination sites. The PCR program was: 

initial denaturation 98°C for 3 min; 30 cycles of 98°C for 10 s, 58°C for 30 s, 72°C for 

30 s; and final extension 72°C for 5 min. Positive amplicons (1132 bp) were purified 

using homebrew SPRI beads (0.7× bead ratio) and verified by Sanger sequencing to 

confirm sequence integrity. Of note, no mutations were observed in any of the single 

cell-derived clones, demonstrating the genomic stability of K562 cells. The verified 

clonal lines were subjected to RT-qPCR analysis to measure their NMU expression 

levels.  



For heterologous enhancer testing at the eNMU locus, we selected candidate 

elements from a previously curated list of CRISPR-validated distal regulatory elements 

in K562 cells (Supplementary Table 6a of Fulco et al. (2019)), prioritizing those with 

large effect sizes. The selected elements were PCR amplified from K562 gDNA and 

inserted with or without the e2 element into the attB1-attB2 plasmid backbone. 

Recombinant cells were sorted as bulk populations to measure enhancer activity by RT-

qPCR. Of note, the parental LP cell lines exhibited a basal BFP− fraction 

(0.6–2%), meaning that the sorted BFP− population included some non-recombinant LP 

cells. To estimate the true recombinant fraction, we subtracted the %BFP− in the no-

payload control from that in the Bxb1+payload transfection and divided the difference by 

the total %BFP− in the Bxb1+payload transfection. NMU expression measured by RT-

qPCR was corrected based on this estimated true recombinant fraction for each element 

integration.     
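The recombinant-fraction estimate described above can be written out as a small Python sketch. The fraction itself follows the text; the final correction step (dividing the measured expression by the fraction) is an assumption, since the text does not give the exact formula, and it presumes that non-recombinant LP cells contribute negligible NMU signal. Function names and example numbers are hypothetical.

```python
def true_recombinant_fraction(pct_bfp_neg_payload, pct_bfp_neg_control):
    """(%BFP- with Bxb1+payload minus %BFP- in no-payload control),
    divided by the total %BFP- with Bxb1+payload."""
    if pct_bfp_neg_payload <= 0:
        raise ValueError("payload %BFP- must be positive")
    return (pct_bfp_neg_payload - pct_bfp_neg_control) / pct_bfp_neg_payload

def corrected_expression(measured_expression, recombinant_fraction):
    """ASSUMPTION: non-recombinant cells express ~no NMU, so the bulk
    measurement is diluted by the non-recombinant fraction."""
    if recombinant_fraction <= 0:
        raise ValueError("recombinant fraction must be positive")
    return measured_expression / recombinant_fraction

# Hypothetical example: 10% BFP- after transfection, 2% basal BFP-.
fraction = true_recombinant_fraction(10.0, 2.0)
corrected = corrected_expression(40.0, fraction)
print(fraction, corrected)
```

With these made-up inputs, 80% of the sorted cells are estimated to be true recombinants, and a bulk measurement of 40 would correspond to about 50 in a pure recombinant population.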

Quantitative reverse transcription polymerase chain reaction (RT-qPCR) 

K562 cells were lysed with the TRIzol Reagent (Invitrogen), and RNA was 

isolated using the Direct-zol RNA miniprep kit (Zymo Research) with 15 min DNase 

treatment on column. Reverse transcription was performed using M-MuLV RT (NEB 

M0253L) and Random Primer Mix (NEB S1330S) following the manufacturer’s 

instructions. Real-time quantitative PCR (qPCR) was carried out with a custom 

protocol: 1/10 volume of cDNA, 1× Phusion HF Buffer (NEB), 500 nM of each primer, 

200 μM dNTPs (Thermo Fisher), 0.7× SYBR Green I (Invitrogen), and 1/100 dilution 

of homemade Phusion polymerase. All qPCR reactions were run in technical triplicates 

in 10 μL volumes in 384-well plates on a Roche LightCycler 480 Instrument II with the 



following program setting: initial denaturation 98°C for 2 min; 45 cycles of 98°C for 

10 s, 58°C for 20 s and 72°C for 30 s; melt curve 98°C for 5 s, 55°C for 1 min, ramp to 

98°C at 0.11°C/s; and cool down to 40°C. NMU expression was normalized to the 

housekeeping gene ACTB using the 2^(−ΔΔCT) method (Livak & Schmittgen, 2001). 

Primers used for RT-qPCR are listed in Supplementary Table S2 (separate file). 
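As a worked illustration of the normalization named above, the following Python sketch computes relative expression with the standard ΔΔCT formula of Livak & Schmittgen (2001), which assumes roughly 100% amplification efficiency for both amplicons. The Ct values and the choice of calibrator sample are hypothetical.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^(-ddCt) method.
    Each argument is a list of technical-replicate Ct values.
    *_cal arguments belong to the calibrator (e.g. a control sample)."""
    delta_sample = mean(ct_target) - mean(ct_reference)
    delta_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    ddct = delta_sample - delta_calibrator
    return 2.0 ** (-ddct)

# Hypothetical triplicate Ct values: NMU (target) and ACTB (reference).
fold = relative_expression(
    ct_target=[22.0, 22.1, 21.9],
    ct_reference=[16.0, 16.0, 16.0],
    ct_target_cal=[30.0, 30.0, 30.0],
    ct_reference_cal=[16.0, 16.0, 16.0],
)
print(fold)
```

Here the sample's ΔCT is 8 cycles lower than the calibrator's, which corresponds to a 256-fold higher relative NMU level.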

Chromatin Immunoprecipitation (ChIP) 

ChIP experiments were conducted using two independently derived single cell 

clones as biological replicates for each of the four genotypes of interest: WT_eNMU, 

e1_mGATA1, e1_mRUNX1, e2_mSTAT5. For GATA1 and RUNX1 ChIP, cells were 

washed twice with ice-cold PBS and crosslinked with 1% formaldehyde (Electron 

Microscopy Sciences) at room temperature for 10 min before quenching by 200 mM 

glycine at room temperature for 5 min. For STAT5 and p300 ChIP, cells were first 

crosslinked with 2 mM disuccinimidyl glutarate (Santa Cruz) at room temperature for 

30 min, washed 3 times with PBS and then crosslinked with 1% formaldehyde at room 

temperature for 5 min before quenching with 200 mM glycine at room temperature for 

5 min. Two additional PBS washes were performed, and cell pellets were lysed with 

Farnham Lysis Buffer (5 mM PIPES, pH 8, 85 mM KCl, 0.5% NP40, 10 mM glycine, 

1× Thermo Scientific Pierce Protease Inhibitor) on ice for 20 min. After centrifugation 

and supernatant removal, the nuclear pellet was resuspended in RIPA Lysis Buffer (10 

mM Tris-HCl, pH 8, 150 mM NaCl, 1 mM EDTA, 1% NP-40, 0.5% sodium 

deoxycholate, 0.1% SDS, 1× Thermo Scientific Pierce Protease Inhibitor) and incubated 

on ice for 10 min. Sonication was carried out using a Diagenode Bioruptor device at 

High Setting, 30 sec on/30 sec off for three rounds of 10-min cycles to shear chromatin 



to a size of 100–300 bp. The lysate was then clarified by centrifugation at 20,000 r.c.f., 

4°C for 15 min, of which 2% was kept as ChIP input.  

The following antibodies were used for IP: GATA1, Abcam ab11852; RUNX1, 

Abcam ab23980; STAT5, R&D Systems AF2168; p300, Abcam ab14984; normal 

rabbit IgG control, Cell Signaling Technology 2729S; normal mouse IgG1 control, 

Santa Cruz sc-3877. Each IP used 4 million cells/4 µg antibody/40 µL Dynabeads 

Protein A (for rabbit IgG) or Protein G (for mouse IgG1) (Thermo Scientific). Beads 

were washed three times with 5 mg/mL BSA in PBS and incubated with corresponding 

antibodies at 4°C for 6 h to overnight with rotation. Another three BSA/PBS washes 

were performed to remove unbound antibodies, and the clarified chromatin lysate was 

added to the beads and incubated overnight at 4°C with rotation. Beads were then 

washed three times with each of the following buffers: Low Salt Wash Buffer (20 mM 

Tris-HCl, pH 8, 2 mM EDTA, 150 mM NaCl, 1% Triton X-100, 0.1% SDS), High Salt 

Wash Buffer (20 mM Tris-HCl, pH 8, 2 mM EDTA, 500 mM NaCl, 1% Triton X-100, 

0.1% SDS), and LiCl Wash Buffer (10 mM Tris-HCl, pH 8, 1 mM EDTA, 250 mM LiCl, 

1% NP-40, 1% sodium deoxycholate). After one final wash with TE Buffer (10 mM 

Tris-HCl, pH 8, 1 mM EDTA), chromatin was eluted from beads by two rounds of 

incubation with ChIP elution buffer (1% SDS, 0.1 M sodium bicarbonate). Each 

incubation involved 15 min shaking at 1200 rpm at 65°C, followed by 15 min rotation 

at room temperature. The eluates and input samples were treated with RNase A at 37°C 

for 30 min, de-crosslinked at 65°C overnight with 900 rpm shaking, followed by 

Proteinase K treatment at 45°C for 2 h with shaking. DNA was purified using MinElute 

PCR Purification Kit (Qiagen). 



For GATA1, RUNX1 and STAT5 ChIP, qPCR was performed on the purified 

input and eluate samples to measure enrichment of TF binding at the eNMU locus. A 

negative control locus was also probed to estimate background signal of non-specific 

pull-down. Primers used for ChIP-qPCR are listed in Supplementary Table S2 (separate 

file). Since ChIP eluates were in low abundance and could be difficult to quantitate, 

qPCR was carried out with a custom 10× reaction mix from previous studies (Lutfalla 

& Uze, 2006; Rebouissou et al., 2022) that provides high sensitivity and specificity. The 

10× reaction mix composition was: 400 mM 2‐amino‐2‐methyl‐1,3‐propanediol (pH 

adjusted to 8.3 using HCl), 50 mM KCl, 30 mM MgCl2, 0.09% Brij C10, 0.15% Brij 

58, 500 μg/mL BSA, 300 μM dNTPs, 16.24% glycerol (v/v), 1/3000 SYBR Green I 

(10,000× stock), and 0.4 U/μL Platinum Taq DNA polymerase (Invitrogen). Final ChIP-

qPCR condition was optimized to be: 1/10 volume of ChIP material, 1× custom reaction 

mix, 500 nM of each primer, 170 μM dNTPs, 0.35× SYBR Green I. All qPCR reactions 

were run in technical triplicates in 10 μL volumes in 384-well plates on a Roche 

LightCycler 480 Instrument II with the following program setting: initial denaturation 

95°C for 10 min; 45 cycles of 95°C for 10 s, 60°C for 8 s and 72°C for 14 s; melt curve 

95°C for 5 s, 45°C for 30 s, ramp to 95°C at 0.11°C/s; and cool down to 40°C. Serial 

dilutions of one input sample were included in each qPCR run to generate standard curves 

for each primer set, from which the amplification efficiency (E) was calculated. ChIP 

enrichment was then determined as percent input using the following equation: 

%Input = 100% × E^(Ct_input – Ct_ChIP) × Input Fraction 

where E represents the qPCR amplification efficiency (ranging from 1.8 to 2.0 

in our experiments), and Input Fraction refers to the proportion of total chromatin lysate 



used for the input (2%, or 0.02, in our case).  
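The percent-input calculation above can be sketched in Python as follows. The efficiency is derived from the slope of a standard curve (Ct against log10 of relative input amount) using the common relation E = 10^(−1/slope); that relation is a standard convention assumed here, not a detail stated in the text, and all Ct values are hypothetical.

```python
def amplification_efficiency(log10_amounts, cts):
    """Estimate E from a standard curve by least-squares regression of Ct
    on log10(relative template amount). E = 10 ** (-1 / slope)."""
    n = len(cts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amounts, cts))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    slope = sxy / sxx
    return 10.0 ** (-1.0 / slope)

def percent_input(ct_input, ct_chip, efficiency, input_fraction=0.02):
    """%Input = 100 x E^(Ct_input - Ct_ChIP) x input fraction."""
    return 100.0 * efficiency ** (ct_input - ct_chip) * input_fraction

# Hypothetical 10-fold dilution series of one input sample.
log10_amounts = [0.0, -1.0, -2.0, -3.0]
cts = [20.0 + 3.321928 * k for k in range(4)]  # ideal doubling per cycle
E = amplification_efficiency(log10_amounts, cts)

# Hypothetical ChIP sample amplifying 2 cycles later than the 2% input.
enrichment = percent_input(ct_input=25.0, ct_chip=27.0, efficiency=E)
print(round(E, 3), round(enrichment, 3))
```

With an efficiency near 2.0, a ChIP sample that crosses threshold two cycles after the 2% input corresponds to roughly 0.5% of input.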

For p300 ChIP, sequencing libraries were prepared using a Tn5 tagmentation-

based protocol. Briefly, 20 μL tagmentation reactions were set up with 0.35 ng of ChIP 

DNA or 1 ng of input DNA, 1× TAPS-DMF Buffer (10 mM TAPS-NaOH, pH 8.5, 5 

mM MgCl2, 10% DMF), and 1 μL of 1:15 diluted homemade Tn5 transposase (Spektor 

et al., 2019) (a kind gift from Dr. Roman Spektor). The reaction was incubated at 55°C 

for 10 min, and 2 μL of 1% SDS was added immediately, followed by another 55°C 

incubation for 7 min to strip off Tn5 from DNA. Post-tagmentation PCR was carried 

out in 100 μL volume with 10 μL of the tagmentation reaction, 1× Phusion HF Buffer, 

200 μM dNTPs, 400 nM of each Illumina Nextera index primer, and 1 μL of homemade 

Phusion polymerase. The PCR program was: initial extension 72°C for 3 min; initial 

denaturation 98°C for 30 s; 13 cycles of 98°C for 10 s, 63°C for 30 s, 72°C for 3 min; and 

final extension 72°C for 5 min. PCR products were purified first using the MinElute 

PCR purification kit (Qiagen), followed by an additional cleanup with homemade SPRI 

beads (1.5× bead ratio) to ensure complete removal of unused primers. DNA 

concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo 

Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI 

platform (2 × 80 bp paired-end sequencing).                    

ATAC-seq 

For TF motif mutants, ATAC-seq was performed on two independently derived 

recombinant single cell clones as biological replicates. For WT K562 and CRISPR 

deletion cell lines (ΔeNMU, Δe1, and Δe2) obtained from the Yu lab (Tippens et al., 



2020), ATAC-seq was conducted on two independent cultures from the same clonal 

source, as only one deletion clone was available for the genotypes ΔeNMU and Δe1. 

ATAC-seq was performed on 50,000 K562 cells with all buffer compositions and 

reaction conditions following the published Omni-ATAC protocol (Corces et al., 2017) 

unless otherwise specified. Briefly, cells were pelleted, washed once with ice-cold PBS, 

resuspended in ice-cold Lysis Buffer, and incubated on ice for 3 min. Upon addition of 

Wash Buffer and gentle inversion, nuclei were pelleted, and supernatant (cytoplasm) 

was discarded. Nuclei were then resuspended gently in 50 μL transposition reaction mix 

containing 1 μL homemade Tn5 transposase and incubated at 37°C for 30 min with 

1000 rpm shaking. DNA was purified using the MinElute PCR purification kit (Qiagen) 

and eluted in 21 μL volume. The entire product (~20 μL) was mixed with 2.5 μL of each 

Nextera index primer (25 μM) and 25 μL NEBNext High-Fidelity 2X PCR Master Mix 

and subjected to a first round of PCR: initial extension 72°C for 5 min; initial 

denaturation 98°C for 30 s; 5 cycles of 98°C for 10 s, 63°C for 30 s, 72°C for 1 min. To 

determine additional cycles needed to avoid over-amplification, a qPCR analysis was 

performed using 5 μL of the first-round PCR reaction (Buenrostro et al., 2015). For all 

ATAC-seq libraries, 3–4 additional cycles of amplification were performed on the 

remaining 45 μL PCR reaction, followed by sequential cleanup using the MinElute PCR 

purification kit (Qiagen) and homebrew SPRI beads (1.5× bead ratio). DNA 

concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo 

Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI 

platform (2 × 80 bp paired-end sequencing) or Illumina NovaSeq X Plus platform (2 × 

150 bp paired-end sequencing).            



PRO-seq 

PRO-seq was performed on the same clonal lines used for ATAC-seq, as well 

as on two independently derived recombinant single cell clones harboring e1_WT or 

e1_mGATA1 integration without e2. All buffer compositions and reaction conditions 

followed the published protocol of Mahat et al. (2016) unless otherwise specified. 

Briefly, 5 million K562 cells were mixed with 250,000 Drosophila S2 cells (5% spike-

in), pelleted at 1000 r.c.f. for 5 min at 4°C, washed once with ice-cold PBS, resuspended 

with ice-cold permeabilization buffer at a density of 1 million cells/mL, and incubated 

on ice for 5 min. Cells were washed twice with the same volume of ice-cold 

permeabilization buffer, and resuspended in 100 μL storage buffer before immediate 

nuclear run-on or flash-freezing in liquid nitrogen for long-term storage at –80°C. Prior 

to the run-on reaction, 40 μL Dynabeads MyOne Streptavidin C1 beads (Thermo Fisher) 

per sample were pre-washed sequentially with Hydrolysis Buffer (0.1 N NaOH + 50 mM 

NaCl), High Salt Wash Buffer, and Binding Buffer. Pre-washed beads were 

resuspended in 60 μL Binding Buffer per sample. The nuclear run-on reaction was 

performed at 37°C for 5 min with a final concentration of 20 µM each of Biotin-11-

CTP, Biotin-11-UTP, ATP, and GTP. Following RNA extraction by TRIzol LS 

(Invitrogen) and RNA fragmentation by base hydrolysis, 30 μL pre-washed C1 beads 

and 30 μL Binding Buffer were added to the ~60 μL RNA sample, and bead binding 

was performed at room temperature for 20 min on a rotational device. Beads were then 

washed twice using 500 μL High Salt Wash Buffer and once using 500 μL Low Salt 

Wash Buffer, with tube swap after each wash. Biotinylated RNA was eluted from beads 

by TRIzol extraction, and 3′ RNA adaptor ligation was performed in a total volume of 



20 μL with 5 µM final adaptor concentration and 2 μL T4 RNA Ligase I, High 

Concentration (NEB). The reaction was incubated at 20°C for 4 h and held at 4°C 

overnight. On the next day, 50 μL Binding buffer and 30 μL pre-washed C1 beads were 

added to the reaction, and another round of bead binding and bead washing was 

performed as described above. Subsequent 5′ enzymatic modifications of RNA were 

performed on beads with a reaction volume of 20 μL assuming 1 μL bead volume: 5′ 

decapping reaction involved 1 μL RppH (NEB) and 1-h incubation at 37°C; 5′ hydroxyl 

repair involved 1 μL T4 PNK (NEB) and 1-h incubation at 37°C. Beads were then 

washed once with 300 μL Binding Buffer, and 5′ RNA adaptor ligation was performed 

on beads in a total volume of 20 μL with 5 µM final adaptor concentration and 2 μL T4 

RNA Ligase I, High Concentration (NEB), incubated at room temperature for 1 h on a 

rotational device. Beads were washed twice using 500 μL High Salt Wash Buffer and 

once using 500 μL Low Salt Wash Buffer, with tube swap after each wash. RNA was 

eluted from beads by TRIzol extraction and resuspended in 13 μL RT resuspension mix 

(8 μL DEPC H2O, 4 μL of 10 μM Illumina RP1 primer, 1 μL of 10 mM dNTPs). RNA 

was denatured at 65°C for 5 min and snap cooled on ice, and 7 μL RT master mix was 

added to the sample (4 μL of 5× RT Buffer, 1 μL of 100 mM DTT, 1 μL Invitrogen 

SUPERase·In RNase Inhibitor, 1 μL Thermo Scientific Maxima H Minus Reverse 

Transcriptase). RT reaction program was: 50°C for 30 min, 65°C for 15 min, 85°C for 

5 min, hold at 4°C. The resulting cDNA was diluted with an equal volume of DEPC 

H2O. A test amplification was performed on 1:4 serial dilutions of 2 μL cDNA sample 

and run on 6% native PAGE TBE gel to determine the optimal PCR cycle number (N). 

Final amplification was performed in 100 μL volume (32.5 μL DEPC H2O, 20 μL of 5× 



Phusion HF Buffer, 20 μL of 5 M betaine, 2.5 μL of 10 µM Illumina RP1 primer, 2.5 

μL of 10 µM Illumina indexing RPI-n primer, 2.5 μL of 10 mM dNTPs, 1 μL homemade 

Phusion polymerase, 19 μL cDNA sample). PCR program was: initial denaturation 

95°C for 2 min; 5 cycles of 95°C for 30 s, 56°C for 30 s, 72°C for 30 s; N cycles of 

95°C for 30 s, 65°C for 30 s, 72°C for 30 s; final extension 72°C for 5 min. PCR products 

were sequentially purified using the MinElute PCR purification kit (Qiagen) and 

homebrew SPRI beads (1.5× bead ratio) to remove all unused primers. DNA 

concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo 

Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI (2 

× 80 bp paired-end sequencing) or Illumina NovaSeq 6000/X Plus platform (2 × 150 bp 

paired-end sequencing). Note that the 3′ and 5′ RNA adaptors each contain a 6-nt unique 

molecular identifier (UMI) to enable accurate identification of PCR duplicates in 

downstream bioinformatic analysis. The complete sequences of the adaptors can be 

found in Judd et al. (2020).       

Sequencing data analysis  

Next-Generation Sequencing (NGS) data preprocessing: For all NGS 

data, the quality of FASTQ files was first assessed using FastQC (LaMar, 

2015), and the Illumina sequencing adaptors were trimmed using fastp (S. Chen et al., 

2018). Note that one of the ATAC-seq samples was sequenced at 2 × 150 bp instead of 

2 × 80 bp. To ensure consistency across samples, the raw FASTQ files for this sample 

were trimmed to 80 bp prior to any analysis. 

Activity score calculation of HCR-FlowFISH: First, 8-bp element barcodes were 



extracted from the trimmed FASTQ Read 1 using fastx_trimmer (Gordon, 2013/2010) 

with flags -Q33 -f 23 -l 30. The FASTQ format was converted into FASTA format using 

seqtk (H. Li, 2012/2023) with the command “seqtk seq -a -q20 -n N”. Occurrences of 

each barcode were counted using fastx_collapser (Gordon, 2013/2010). The 328 unique 

barcodes were associated with their corresponding 83 elements using a custom python 

script. We excluded 13 barcodes that had <50 raw reads in the unsorted background 

sample in at least one biological replicate. The fraction of each barcode in each bin was 

then calculated. In parallel, the mean NMU fluorescence intensity (AF647) of each bin was 

normalized by the mean ACTB fluorescence intensity (AF488) of the same bin. The 

activity score of each barcode was then calculated using the weighted average method 

shown in Figure 2.4E. Calculated barcode activity scores are provided in Supplementary 

Table S3 (separate file). 
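The barcode counting and scoring steps above can be approximated by the Python sketch below. The slicing mirrors `fastx_trimmer -f 23 -l 30` (bases 23 to 30, 1-based and inclusive). The scoring function is only one plausible reading of a "weighted average": bin intensities weighted by the barcode's fraction in each bin. The exact formula is defined in Figure 2.4E and is not reproduced in the text, so this form, the function names, and the toy numbers are all assumptions. Quality filtering by seqtk is not reproduced here.

```python
from collections import Counter

def extract_barcode(read, first=23, last=30):
    """Keep bases first..last (1-based, inclusive), as fastx_trimmer -f/-l does."""
    return read[first - 1:last]

def barcode_fractions(reads):
    """Fraction of reads carrying each barcode within one sorted bin."""
    counts = Counter(extract_barcode(r) for r in reads)
    total = sum(counts.values())
    return {bc: n / total for bc, n in counts.items()}

def activity_score(fractions_per_bin, bin_intensities):
    """ASSUMED weighted average: sum(f_b * I_b) / sum(f_b), where f_b is the
    barcode's fraction in bin b and I_b is that bin's mean AF647/AF488 ratio."""
    weight = sum(fractions_per_bin)
    if weight == 0:
        raise ValueError("barcode absent from all bins")
    weighted = sum(f * i for f, i in zip(fractions_per_bin, bin_intensities))
    return weighted / weight

# Toy read: 22 nt of constant sequence, the 8-bp barcode, then 4 more nt.
toy_read = "A" * 22 + "ACGTACGT" + "TTTT"
barcode = extract_barcode(toy_read)

# Toy example with 4 bins instead of 8.
score = activity_score([0.0, 0.1, 0.3, 0.1], [1.0, 2.0, 4.0, 8.0])
print(barcode, score)
```

A barcode enriched in high-ratio bins receives a higher score than one concentrated in low-ratio bins, which is the qualitative behavior expected of any such weighted average.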

Multiplicative model of double mutants: To assess whether e1+e2 double 

mutations (either deletions or disruptions of the same TF motif) followed a 

multiplicative model based on single e1 and e2 mutations (inspired by Lin et al. (2022)), 

we first converted the median activity scores of both single and double mutants into 

pseudo expression values. This conversion used the linear regression equation derived 

from Figure 2.4G, which correlates FlowFISH scores with RT-qPCR measurements of 

NMU expression in a select set of mutants. The conversion was necessary because 

background fluorescence in the FlowFISH assay caused the activity scores to deviate 

from a direct representation of gene expression levels (i.e., the regression line did not 

pass through the origin). Log2 fold changes of each mutant relative to the WT_eNMU 

control were then calculated using the converted pseudo expression values. For each 



e1+e2 double mutant, the log2 fold change was plotted against the sum of the log2 fold 

changes of the corresponding single mutants, as shown in Figure 2.6C.    
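The comparison described above can be illustrated with a short Python sketch. Under a multiplicative model, fold changes multiply, so log2 fold changes add. The regression is assumed here to take the form expression = slope × score + intercept; the slope, intercept, and activity scores are placeholders and not the fitted values from Figure 2.4G.

```python
import math

def to_pseudo_expression(score, slope, intercept):
    """Convert a FlowFISH activity score to a pseudo expression value
    using a linear fit (ASSUMED direction: expression = slope*score + intercept)."""
    return slope * score + intercept

def log2_fold_change(mutant_expression, wt_expression):
    """Log2 fold change of a mutant relative to the wildtype control."""
    return math.log2(mutant_expression / wt_expression)

def expected_double_log2fc(log2fc_e1, log2fc_e2):
    """Multiplicative expectation: fold changes multiply, log2 values add."""
    return log2fc_e1 + log2fc_e2

# Placeholder regression parameters and median activity scores.
SLOPE, INTERCEPT = 2.0, -1.0
wt = to_pseudo_expression(8.5, SLOPE, INTERCEPT)
e1_mut = to_pseudo_expression(2.5, SLOPE, INTERCEPT)
e2_mut = to_pseudo_expression(4.5, SLOPE, INTERCEPT)
double_mut = to_pseudo_expression(1.5, SLOPE, INTERCEPT)

expected = expected_double_log2fc(
    log2_fold_change(e1_mut, wt), log2_fold_change(e2_mut, wt)
)
observed = log2_fold_change(double_mut, wt)
print(expected, observed)
```

In this toy case the observed and expected values coincide, which is what a purely multiplicative relationship looks like; points below the diagonal would indicate a double-mutant effect stronger than expected from the single mutants.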

 
Genomics sequencing read alignment: To enable accurate alignment of 

sequencing reads to the eNMU locus for ATAC-/ChIP-/PRO-seq data, we built a custom 

genome for each recombinant TF motif mutant using the reform command line tool 

(Khalfan, 2018/2021). Each custom genome incorporated the exact mutant sequence 

flanked by Bxb1 recombination sites. Alignment was performed using the bowtie2 

aligner (Langmead & Salzberg, 2012): for ATAC-seq and ChIP-seq, “--end-to-end --

very-sensitive” mode was used, with ATAC-seq involving a subsequent step to remove 

mitochondrial reads using samtools view (H. Li et al., 2009); Picard MarkDuplicates 

(Broad Institute, 2014/2019) was then used for deduplication; for PRO-seq, rRNA read 

removal, alignment, and deduplication (with UMI-tools (T. Smith et al., 2017)) were 

performed using the published bash script at 

https://github.com/JAJ256/PROseq_alignment.sh (Judd, 2020/2020).     

ATAC-seq and ChIP-seq data analysis: To better normalize genomics datasets 

across different clonal lines, we applied a “reads-under-peaks” approach to calculate 

scaling factors for each sample. ATAC-seq peaks were called for each sample using 

HMMRATAC (Tarbell & Liu, 2019) in the MACS3 software (Y. Zhang et al., 2008) 

with “-u 90 -l 30 -c 10” parameters. To create a unified peak set for count normalization, 

we first identified reciprocal >50% overlaps between peaks from biological replicates 

using bedtools intersect “-f 0.50 -r” (Quinlan & Hall, 2010). Consensus peak sets from 


each group of replicates were then merged to create a union peak set. ChIP-seq peaks 

were called for each biological replicate and for pooled replicates using MACS2 (Y. 

Zhang et al., 2008) with the input sample as control and the parameters “--broad --broad-

cutoff 0.05 --keep-dup all”. Consensus peak sets from each group of replicates were 

generated using a published bash script (Additional file 5 of Reske et al. (2020)), which 

identifies pooled peaks that show >50% reciprocal overlap with each biological 

replicate. These consensus peak sets were then merged to create a union peak set.  
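The reciprocal-overlap criterion used for the consensus peaks above can be expressed compactly in Python. This sketch mimics the logic of `bedtools intersect -f 0.50 -r` for intervals on a single chromosome; it is an illustration with hypothetical coordinates, not a replacement for bedtools, and it uses a "greater than or equal to" threshold as bedtools does.

```python
def reciprocal_overlap(a, b, min_fraction=0.5):
    """True if intervals a and b (start, end; half-open, same chromosome)
    overlap by at least min_fraction of BOTH interval lengths."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return False
    return (overlap / (a[1] - a[0]) >= min_fraction
            and overlap / (b[1] - b[0]) >= min_fraction)

def consensus_peaks(rep1, rep2, min_fraction=0.5):
    """Peaks of rep1 that have a reciprocal-overlap partner in rep2."""
    return [p for p in rep1
            if any(reciprocal_overlap(p, q, min_fraction) for q in rep2)]

# Hypothetical peaks from two replicates.
rep1_peaks = [(100, 200), (1000, 1100), (5000, 5050)]
rep2_peaks = [(140, 260), (1090, 1400), (5000, 5040)]
shared = consensus_peaks(rep1_peaks, rep2_peaks)
print(shared)
```

The second peak is dropped because its 10-bp overlap covers only a small part of either interval, which is exactly the kind of marginal overlap the reciprocal requirement is meant to exclude.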

To generate the read count matrix for ATAC-seq and ChIP-seq datasets, read 

pairs overlapping union peaks were quantified using featureCounts (Liao et al., 2014). 

DESeq2 (Love et al., 2014) was then used to calculate scaling factors for normalization. 

bamCoverage (Ramírez et al., 2016) was used to generate normalized bigwig files at a 

bin size of 1 bp for ATAC-seq and 50 bp for ChIP-seq. p300 ChIP-seq signal at the 

eNMU locus was quantified using bigWigAverageOverBed from the kentUtils of the 

UCSC Genome Browser (Kent et al., 2010). Merged bigwig files from two biological 

replicates were also generated using kentUtils (Kent et al., 2010). 
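For readers unfamiliar with how DESeq2 derives scaling factors from a count matrix, the following Python sketch implements the median-of-ratios idea that underlies its default size-factor estimation. It is a simplified stand-in for illustration, not the DESeq2 code itself, and the count matrix is hypothetical. Note also that bamCoverage's `--scaleFactor` multiplies coverage, so the reciprocal of a size factor would typically be supplied; whether that convention was used here is not stated in the text.

```python
import math
from statistics import median

def size_factors(count_matrix):
    """Median-of-ratios size factors.
    count_matrix: list of rows (peaks), each a list of per-sample counts.
    Rows containing a zero are skipped, as their geometric mean is zero."""
    n_samples = len(count_matrix[0])
    usable = [row for row in count_matrix if all(c > 0 for c in row)]
    # Log geometric mean of each usable row across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical reads-under-peaks matrix: sample 2 is sequenced twice as deeply.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 5],   # skipped: contains a zero
]
factors = size_factors(counts)
print([round(f, 4) for f in factors])
```

Dividing each sample's counts by its size factor puts the two samples on a common scale, which is the purpose of the reads-under-peaks normalization.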

PRO-seq data analysis: Unnormalized 3′-end bigwig files for PRO-seq were 

first generated from bam alignment files by PINTS (Yao et al., 2022) using the option 

“pints_visualizer -e R1_5 --reverse-complement”. To calculate scaling factors for PRO-

seq data, a collapsed list of all GENCODE v46 transcripts (Mudge et al., 2025) was first 

generated using the reduceByGene() function in the BRGenomics R package 

(DeBerardine, 2023). The unnormalized bigwig files were loaded into RStudio, and a 

DESeq2 object was generated for the combined list of all PRO-seq samples using the 



getDESeqDataSet() function in BRGenomics, using the collapsed transcript list to 

specify genomic regions of interest. Scaling factors calculated by DESeq2 (Love et al., 

2014) were applied to each bigwig file using the applyNFsGRanges() function in 

BRGenomics before data export. Unnormalized 5′-end bigwig files for PRO-seq were 

generated from bam alignment files by PINTS (Yao et al., 2022) using the option 

“pints_visualizer -e R2_5” and normalized using the same scaling factors calculated 

from the 3′-end bigwig files. Normalized bigwig files of two biological replicates were 

merged using kentUtils of the UCSC Genome Browser for visualization (Kent et al., 

2010).    

The pausing index of the NMU gene was calculated using the getPausingIndices() 

function in BRGenomics (DeBerardine, 2023), with the promoter region defined as TSS 

to TSS+250 bp and the gene body region as TSS+500 bp to TES–500 bp (TSS, 

transcription start site; TES, transcription end site) using the promoters() and 

genebodies() functions in BRGenomics. Pausing indices for both individual samples 

and merged biological replicates were calculated and plotted in the figures.    
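
As a worked illustration of the region definitions above, the following Python sketch computes a length-normalized pausing index (promoter read density divided by gene-body density) from a per-base signal. It is not the BRGenomics implementation; the function names, the dictionary-based signal, and the choice to return NaN for an empty gene body are assumptions made for this sketch.

```python
import math
from typing import Dict


def strand_aware_sum(signal: Dict[int, float], tss: int, strand: str,
                     d_from: int, d_to: int) -> float:
    """Sum signal at positions d_from <= d < d_to downstream of the TSS."""
    step = 1 if strand == "+" else -1
    return sum(signal.get(tss + step * d, 0.0) for d in range(d_from, d_to))


def pausing_index(signal: Dict[int, float], tss: int, tes: int, strand: str,
                  promoter_len: int = 250, body_offset: int = 500,
                  body_trim: int = 500) -> float:
    """Promoter (TSS to TSS+250) density over gene-body (TSS+500 to TES-500) density."""
    gene_len = abs(tes - tss)
    body_from, body_to = body_offset, gene_len - body_trim
    if body_to <= body_from:
        raise ValueError("gene is too short for the requested gene-body window")
    promoter_density = strand_aware_sum(signal, tss, strand, 0, promoter_len) / promoter_len
    body_density = strand_aware_sum(signal, tss, strand, body_from, body_to) / (body_to - body_from)
    if body_density == 0:
        return math.nan  # undefined when the gene body has no signal
    return promoter_density / body_density


# Toy plus-strand gene: 2 reads per base in the promoter, 1 per base in the body.
toy_signal = {1000 + d: 2.0 for d in range(250)}
toy_signal.update({1000 + d: 1.0 for d in range(500, 1500)})
toy_pi = pausing_index(toy_signal, tss=1000, tes=3000, strand="+")
```

For the toy gene the promoter density is 2 and the gene-body density is 1, so the pausing index is 2.0; mirroring the coordinates onto the minus strand gives the same value.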

Public sequencing data analysis: All publicly available datasets visualized in 

this study are listed in Supplementary Table S4 (separate file). All ChIP-seq datasets, 

except for the one targeting LDB1 (X. Guo et al., 2020) (GEO accession: GSE142227), 

were obtained from the ENCODE data portal (Luo et al., 2020) 

(https://www.encodeproject.org/). To ensure consistency, we re-analyzed the LDB1 

ChIP-seq raw FASTQ files using the standard ENCODE ChIP-seq pipeline (Hitz et al., 

2023) (https://github.com/ENCODE-DCC/chip-seq-pipeline2). For RNA-seq datasets 




from D. Li et al. (2023), raw read counts were downloaded from GEO accession 

GSE214809. For RNA-seq datasets from Schulz et al. (2019) and An et al. (2014), 

NCBI-generated RNA-seq raw read counts (Sayers et al., 2025) were downloaded from 

GEO accessions GSE128268 and GSE53983. Differential gene expression analysis was 

performed using DESeq2 (Love et al., 2014) in RStudio.         

Data visualization: Genome browser tracks were visualized using 

pyGenomeTracks (Lopez-Delisle et al., 2021). All bar plots, box plots, scatter plots, line 

plots, volcano plots, and correlation analyses were generated using the ggplot2 package 

(Wickham, n.d.) in R (version 4.2.3) and RStudio. Flow cytometry data were analyzed 

and plotted using FlowJo (version 10.10.0). Schematic illustrations were generated 

using BioRender.com under an academic license. Figures were assembled, annotated, 

and finalized using Adobe Illustrator. All plots used consistent color scales for cross-

comparison. 

Quantification and statistical analysis 

One-way ANOVA with Dunnett’s post hoc test using WT_eNMU as the control 

was applied to ChIP-qPCR analysis. The statistical details of each experiment, including 

exact sample sizes (n) and additional tests (e.g., Pearson’s correlation coefficient r for 

linear relationships), are provided in the figure legends and/or shown graphically in the 

figures. For RNA-seq analysis, DESeq2 was used to determine the significance of 

differentially expressed genes. 

2.6 Acknowledgements 

We thank all past and present members of the Lis Lab for insightful discussions 



and support throughout this work. A special thank you to Dr. Judhajeet Ray, former 

research associate in the Lis Lab and currently at the Broad Institute, for his unwavering 

mentorship and patience in guiding Zhou Zhou during the formative early years of her 

Ph.D. training. Another special thank you to Jessica West, former Ph.D. candidate in 

Dr. Andrew Grimson’s lab and currently at UCSF, for generously sharing landing pad-

related plasmids and for valuable input on the design of the recombination platform and 

library screening strategy. Thanks to former lab members Dr. Jin Liang and Nathaniel 

Tippens in Dr. Haiyuan Yu’s lab for providing eNMU deletion cell lines and 

contributing early ideas to the construction of the landing pad system. Thanks to Dr. 

Andrew Grimson and Dr. Charles Danko for their critical feedback on the project. 

Thanks to Jaret Lieberth (the Lis and Feschotte Labs), Adam He (Dr. Charles Danko’s 

lab), and Haining Chen (Dr. Franklin Pugh’s lab) for helpful discussions and expertise. 

Thanks to Xinchen Chen (Dr. Chun Han’s lab) for guidance on figure preparation using 

Adobe Illustrator. Fluorescence-activated cell sorting experiments were conducted at 

the Cornell Institute of Biotechnology’s Flow Cytometry Facility. Most next-generation 

sequencing data were generated by the Institute’s Epigenomics Core Facility, and initial 

pilot data were produced by the Institute’s Genomics Facility. This work was supported by 

the National Human Genome Research Institute (NHGRI) grant 5R01HG012970 to 

J.T.L. and H.Y.  

  

CHAPTER 3 INVESTIGATING THE NECESSITY OF ENHANCER 

TRANSCRIPTION FOR ENHANCER FUNCTION IN A HEAT-INDUCIBLE 

SYSTEM 

3.1 Abstract 

Enhancers are key cis-regulatory elements that activate transcription at target 

promoters independent of distance and orientation. While the epigenomic features of 

enhancers have been extensively characterized, the precise mechanisms by which 

enhancers stimulate promoter transcription remain unclear. Previously, our lab and 

the Yu lab developed eSTARR-seq, a massively parallel episomal assay that reliably 

quantifies enhancer activity across thousands of cloned elements. Using this method, 

we found that transcriptional activity is a stronger predictor of enhancer function than 

classical epigenomic marks such as DNase I hypersensitivity or histone modifications. 

To further explore whether enhancer transcription is required for function, I focused on 

a set of distal elements in human K562 cells that gain HSF1 binding and elevated H4 

acetylation following 30-min heat shock. Interestingly, although HSF1 is a potent 

transcriptional activator, only ~300 of the bound elements exhibited heat-induced 

transcription by PRO-seq, while ~500 showed no detectable transcription. To test 

whether both transcribed and untranscribed HSF1-bound elements can function as 

enhancers, I cloned a library comprising ~120 transcribed elements (including ~60 heat-

induced and ~60 unchanged), along with ~60 untranscribed elements. These were then 

subjected to eSTARR-seq to measure enhancer activity in K562 cells under heat shock 

(HS) and non-heat shock (NHS) conditions. Preliminary results showed that, 

unexpectedly, both the upregulated transcribed elements and the untranscribed elements 

could act as heat shock-inducible enhancers, though the overall fraction of active 

elements was small. Notably, the untranscribed distal elements exhibited lower basal 

activity but greater inducibility upon heat shock. Applying our most sensitive PRO-cap 

assay to HS and NHS cells further revealed that many of the elements previously 

identified as “untranscribed” by PRO-seq indeed exhibited induced transcription 

initiation upon HS. I further discuss the implications and caveats of this study.   



3.2 Introduction 

Since its initial discovery genome-wide (T.-K. Kim et al., 2010), enhancer 

transcription has been recognized as one of the hallmark features of active enhancers, 

yet its functional role has remained unclear. Applying the sensitive GRO-cap assay to 

human cells has revealed a unified molecular architecture of enhancers and promoters, 

where an upstream TF binding region is flanked by two divergent core promoters (Core 

et al., 2014). We previously showed that deletion of core promoter regions in enhancers 

caused reduced enhancer activity as measured by element-STARR-seq (eSTARR-seq) 

(Tippens et al., 2020), suggesting that enhancer transcription contributes to function. 

However, sequence alterations in these experiments may confound the data 

interpretation as disruption or creation of transcription factor (TF) motifs is 

unavoidable. To rigorously test whether enhancer transcription is necessary for enhancer activity, 

combining eSTARR-seq with a controlled inducible system would be an ideal approach.     

To this end, I turned to the well-studied mammalian heat shock (HS) response. 

While heat stress induces global downregulation of nascent transcription, hundreds of 

genes, including the heat shock protein (HSP) genes, are rapidly upregulated by the 

master regulator Heat Shock Factor 1 (HSF1) (Mahat, Salamanca, et al., 2016; 

Vihervaara et al., 2017). Notably, in human erythroleukemia K562 cells, HS treatment 

induced HSF1 binding not only at gene promoters but also at hundreds of gene-distal 

regions (Vihervaara et al., 2017). Since HSF1 is a potent transcriptional activator that 

causes universally elevated H4 acetylation (H4ac) upon binding, I hypothesized that 

some of these HSF1-bound distal elements may act as heat-inducible enhancers. More 

interestingly, PRO-seq analysis revealed that only a subset of these distal elements 



showed upregulated transcriptional activity upon HS, while the others remained 

transcriptionally uninduced or completely inactive. These findings made the HS system 

an appealing playground to tease apart the relationship between enhancer transcription 

and enhancer activity.   

In this study, I systematically tested the enhancer activity under both heat shock 

and non-heat shock conditions for a library of ~200 HSF1-bound elements that exhibit 

distinct transcriptional profiles as mentioned above. Preliminary results showed that, 

surprisingly, both transcriptionally upregulated and untranscribed distal elements can 

function as heat-inducible enhancers. Further examination of their transcription 

initiation profiles using PRO-cap, our most sensitive enhancer detection assay, revealed 

that the untranscribed elements are also induced at low levels upon heat stress. The 

implications and limitations of this study are further explored. 

3.3 Results 

3.3.1 Evidence of an HSF1-bound, heat-inducible enhancer 

To obtain initial evidence that HSF1-bound distal elements can function as HS-

inducible enhancers, I focused on a candidate element identified in Vihervaara et al. 

(2021), located approximately 4.5 kb upstream of the TAX1BP1 gene (Figure 3.1A). 

Following 30-min heat stress, this element—but not the TAX1BP1 promoter—exhibited 

strong HSF1 binding, and showed increased nascent transcription concurrent with 

TAX1BP1 upregulation (Figures 3.1, A and B, adapted from Vihervaara et al., 2021). 

Moreover, short hairpin RNA (shRNA)-mediated knockdown of HSF1 attenuated heat-

induced transcription at both TAX1BP1 and this upstream element (Vihervaara et al., 



2021). To directly test its function, homozygous CRISPR deletion of the element was 

performed. In two independently derived single cell clonal lines, heat-induced 

TAX1BP1 expression was completely abolished (Figure 3.1C). Together, these findings 

demonstrate that this element functions as a bona fide HSF1-bound, HS-inducible 

enhancer, hereafter referred to as eTAX1BP1. 

 
Fig. 3.1 Homozygous deletion of eTAX1BP1 abolishes heat inducibility of 
TAX1BP1 expression.  

(A and B) HSF1 and TBP ChIP-seq signal (A) and PRO-seq tracks (B) at the 
TAX1BP1 locus under HS and NHS conditions in K562 cells. The purple box and the 
green bar highlight the candidate HSF1-bound enhancer. Adapted from Vihervaara 
et al. (2021). (C) Heat-induced TAX1BP1 expression in WT and ΔeTAX1BP1 cell 
lines measured by RT-qPCR.   

 

To test whether eTAX1BP1’s function could be recapitulated in an episomal 

context using the eSTARR-seq plasmid (Tippens et al., 2020), I configured three 

reporter constructs (Figure 3.2A): a negative control where the MYC promoter (which 

lacks HS-induced HSF1 binding) is linked to a non-regulatory EGFP fragment; a 

positive control linking the HSPA1A promoter (which exhibits strong HSF1 binding 

upon HS) to EGFP; and a test construct linking the MYC promoter to the eTAX1BP1 

element. Each construct was transfected into K562 cells, followed by 30- or 60-min HS 

at either 6 or 12 hr post-transfection. Cells were then harvested for RT-qPCR 

quantification of the luciferase reporter mRNA. As expected, the negative control 

showed minimal expression change between HS and NHS conditions, whereas both the 

positive control and the eTAX1BP1 test construct exhibited clear heat-induced 

transcription (Figure 3.2B). These results validate the feasibility of eSTARR-seq to 

measure inducible enhancer activity in response to heat stress.    

 

Fig. 3.2 eTAX1BP1 activates episomal reporter gene expression in response to 
heat shock.  

(A)  Schematic of the experimental design used to test heat-inducible enhancer 
activity of eTAX1BP1 on an episomal eSTARR-seq plasmid. Abbreviations: prom, 
promoter; pA, polyadenylation signal; trfx, transfection. (B) RT-qPCR measurement 
of luciferase mRNA fold change (HS over NHS) across the four treatment conditions 
shown in (A).  

3.3.2 Systematically testing enhancer activity of a library of HSF1-bound 

candidate elements using eSTARR-seq  

Next, I set out to systematically test the heat-induced enhancer activity for a 

library of HSF1-bound elements exhibiting distinct transcriptional profiles upon 30-min 

HS treatment (Figure 3.3). These include 50–60 elements from each of the three classes: 

upregulated distal transcriptional regulatory elements (dTREs), unchanged dTREs, and 

untranscribed distal elements. Eleven upregulated HSP gene promoters were also 

included. To limit confounding factors, these elements were selected to match their 

HSF1 binding signals upon HS. Following individual element cloning from K562 

genomic DNA via the Gateway BP reaction and high-throughput sequencing 

verification of cloned sequences, I pooled and moved the element library into the 

eSTARR-seq destination vector by the Gateway LR reaction. Transfected cells were 

heat shocked for 30 min (to match the PRO-seq and ChIP-seq HS duration) at 12 hr 

post-transfection, a time point that showed slightly higher heat inducibility than the 6 hr 

time point (Figure 3.2B). The eSTARR-seq sequencing libraries were prepared according to 

Tippens et al. (2020) using the tagmentation approach. Data analysis was performed by 

Dr. Alden K. Leung in the Yu Lab.      



 
Fig. 3.3 Four classes of HSF1-bound elements for eSTARR-seq testing.  

Representative PRO-seq (Vihervaara et al., 2017) and HSF1 ChIP-seq (Vihervaara et 
al., 2013) profiles of the four classes of tested elements: upregulated dTREs (58 
cloned), unchanged dTREs (55 cloned), untranscribed distal elements (54 cloned), 
and upregulated gene promoters (11 cloned). HS (30 min) and NHS conditions are 
compared side by side using the same scales.   

[Figure 3.3 panel labels: 58 upregulated dTREs; 55 unchanged dTREs; 54 untranscribed distal elements; 11 upregulated promoters. Tracks shown in each panel: PRO-seq (NHS), PRO-seq (HS), HSF1 (30′ HS), HSF1 (NHS).]



 
Fig. 3.4 Systematically testing heat-induced enhancer activity of HSF1-bound 
candidate elements using eSTARR-seq. 

(A) Schematic of eSTARR-seq workflow in this study. Adapted from Tippens et al. 
(2020). (B) Correlation plots between measured enhancer activities under HS and 
NHS conditions, with elements placed in the forward or reverse orientation relative 
to the reporter. The top two plots are color-coded by enhancer activity level, and the 
bottom two plots are color-coded by transcriptional profile class. Data analysis 
and visualization credit: Dr. Alden K. Leung. (C) Box plot comparing the bulk 
trend of HS-mediated enhancer activity inducibility across the four classes of HSF1-
bound elements, only showing the forward orientation results as representative.   



The eSTARR-seq results showed only a small subset of elements exhibiting 

heat-induced enhancer activity in both orientations. Notably, this included the positive 

control element eTAX1BP1, thereby validating the dataset (Figure 3.4B, top). 

Consistent with prior findings (Tippens et al., 2020), transcribed elements showed 

higher basal activity than untranscribed elements under the NHS condition (Figure 

3.4C). While the unchanged dTREs exhibited slight downregulation of enhancer 

activity upon HS, both the upregulated dTREs and the untranscribed distal elements 

exhibited some level of heat inducibility (Figure 3.4C). Moreover, as shown in the bottom plots of Figure 

3.4B, the fold change for the untranscribed elements was actually higher than that of 

the upregulated dTREs. A few HSP promoters also exhibited heat-induced enhancer 

activity, demonstrating the functional flexibility between enhancers and promoters.  

3.3.3 Induced transcription initiation at “untranscribed” elements detected by 

PRO-cap  

The unexpected observation that untranscribed elements exhibited heat-induced 

enhancer activity prompted us to more rigorously examine their transcriptional status 

using PRO-cap—a more sensitive assay that selectively enriches for 5′-capped nascent 

RNAs to map genome-wide transcription initiation events. Unlike PRO-seq, whose 

reads predominantly map to gene body regions, PRO-cap offers much higher coverage 

around transcription start sites (TSSs). Applying PRO-cap to K562 cells indeed revealed 

low levels of induced transcription initiation at the previously classified “untranscribed” 

distal elements upon 30-min HS, though the degree of transcriptional induction did not 

seem to correlate with their enhancer activity induction (Figure 3.5).      



 
Fig. 3.5 PRO-cap detects induced transcription initiation at “untranscribed” 
elements upon HS.  

PRO-cap browser shots of three highly heat-induced untranscribed elements 
(DU0031, DU0033, DU0014) and the positive control element eTAX1BP1 (DT002), 
which is an upregulated transcribed dTRE. The tested regions are highlighted in 
light orange shade. “F” and “R” values indicate the log2 fold change of HS over NHS 
enhancer activity in the forward and reverse orientations, respectively. Both the 5′- 



and 3′-end (labeled as 3p) read position tracks are shown, with the 5′ ends 
representing TSSs and the 3′ ends representing paused or elongating Pol II positions.      

 
3.4 Discussion 

In this study, I set out to study the relationship between enhancer activity and 

enhancer transcription in a well-controlled HS-inducible system. However, several 

caveats in the experimental design are worth noting. First, classification of distal 

elements was based on PRO-seq profiles obtained prior to PRO-cap experiments. Given 

PRO-seq’s limited sensitivity to lowly transcribed enhancers, this may have led to 

misclassification, as illustrated in Figure 3.5. Second, unlike the approach taken by 

Tippens et al. (2020), we did not control for chromatin accessibility between different 

classes of elements due to the limited number of HSF1-bound distal sites. As a result, 

the apparent lack of transcriptional activity at “untranscribed” elements could simply 

reflect a closed chromatin state—one that may not be faithfully modeled on non-

chromatinized plasmids. In other words, enhancer transcriptional states on episomal 

plasmids might diverge significantly from PRO-seq or PRO-cap patterns observed in 

native genomic contexts. This limitation underscores the value of a chromosomal 

reporter assay for more physiologically relevant assessments. 

Nonetheless, the lower basal activity of the untranscribed elements measured by 

eSTARR-seq suggests that they possess fewer intrinsically activating features compared 

to transcribed elements. Their higher heat inducibility may arise from cooperative 

effects between HSF1 and weak pre-existing activating components, whereas 

transcribed elements—already active—derive only modest additional benefit from 



HSF1 binding. Interestingly, almost none of the unchanged dTREs exhibited heat-

induced enhancer activity. Assuming these elements reside in similarly accessible 

chromatin as the upregulated dTREs, this observation supports the idea that changes in 

enhancer transcription can serve as a predictor of functional activity shifts. 

In summary, the heat shock system, while informative, may not be optimal for 

testing this hypothesis due to the limited number of candidate elements and the modest 

activation rate. Future studies would benefit from exploring alternative inducible 

systems, such as the interferon response (Doughty et al., 2024), while carefully 

controlling for epigenomic context, leveraging the sensitivity of PRO-cap for candidate 

classification, and ideally employing chromosomal reporter assays for functional 

validation. 

 
3.5 Methods 

Cell Culture and Heat Shock Treatment  

K562 cells were cultured as described in Chapter 2 (Section 2.5). Cells were heat 

shocked in a 42°C water bath for the indicated duration plus 5 min (the time needed for 

temperature to reach 42°C), while the non-heat shock control was incubated in parallel 

in a 37°C water bath. Following the treatment, cells were promptly put on ice to stop 

any further heat shock response.  

eSTARR-seq element library cloning  

eSTARR-seq element library cloning was performed as described in Tippens et al., 



2020. Briefly, candidate HSF1-bound elements were individually PCR-amplified from 

K562 genomic DNA (primers listed in Supplementary Table S5) and cloned into the 

pDONR223 vector by Gateway BP reaction. Four colonies were picked for each 

element and grown in a 96-well deep well plate. The sequences were verified by 

Illumina sequencing following the published Clone-seq protocol (Wei et al., 2014). The 

correct clones were propagated individually in 96-well plates and pooled together for 

Midiprep to isolate the pENTR library. Next, an en masse Gateway LR reaction was 

performed to move the elements into the eSTARR-seq destination vector. The 

transformants were propagated in LB broth and the plasmid library extracted by 

Maxiprep.  

eSTARR-seq                 

eSTARR-seq was performed as described in Tippens et al., 2020, following the Tn5 

tagmentation protocol. Cells were allowed to recover for 11.5 hr post-transfection before 

being subjected to a 30-min heat shock. Forward and reverse orientation libraries 

were transfected separately to prevent interference during sequencing library 

preparation. Data analysis was performed by Dr. Alden K. Leung in the Yu lab. Raw 

and processed data are summarized in Supplementary Table S5. 

RT-qPCR  

RT-qPCR was performed as described in Chapter 2 (Section 2.5). Primers are listed 

below: 

Table 3.1 qPCR primer sequences used in this study 



Name                  Sequence
FFluc2_qPCR_For1      GTGGTGTGCAGCGAGAATAG
FFluc2_qPCR_Rev1      CGCTCGTTGTAGATGTCGTTAG
HSPA1A_qPCR_For1      AGGCCAACAAGATCACCATC
HSPA1A_qPCR_Rev1      GTCCTCCGCTTTGTACTTCTC
Bactin_QPCR_For2      CAAGCAGGAGTATGACGAGTC
Bactin_QPCR_Rev2      GCCATGCCAATCTCATCTTG
TAX1BPB1_QPCR_For1    GAGACAGAACGATGGCAGAC
TAX1BPB1_QPCR_Rev1    AGTTCGTGTTCCAGTGTATCAG

 
PRO-cap 

The PRO-cap protocol was largely similar to the PRO-seq protocol described in Chapter 2 

(Section 2.5), except for the following differences. First, the SUPERase·In RNase Inhibitor 

(Invitrogen AM2696) was switched to Protector RNase Inhibitor (Roche 3335402001), 

which was found to give a much better yield of the final library with little adaptor dimer 

formation. Second, the 5′ enzymatic modifications of RNA on beads differed from 

PRO-seq: in PRO-cap, RNA containing a 5′-monophosphate was first removed by a 

reaction with 1 μL XRN-1 (NEB) for 30 min at 37°C; uncapped RNA 

was then dephosphorylated by Quick CIP (NEB) for 30 min at 37°C; capped RNA was 

decapped by Cap-Clip Acid Pyrophosphatase (Cellscript Inc) for 1 h at 37°C before 5′ 



adaptor ligation.  

 
3.6 Acknowledgements 

I thank Dr. Alden K. Leung in Dr. Haiyuan Yu’s lab, who collaborated with me 

on this project, selected candidate elements, designed cloning primers, and 

performed eSTARR-seq data analysis. I thank Dr. Jin Liang in the Yu lab for sharing 

eSTARR-seq reagents and protocols, and Dr. Sagar Shah for helping to troubleshoot the 

PRO-cap protocol with me.   

 

CHAPTER 4 CONCLUSION AND PERSPECTIVES 

In this thesis, I explored multiple facets of enhancer function through distinct 

experimental systems and techniques. In Chapter 2, I developed a novel recombinase-

mediated cassette exchange platform, integrating pooled mutagenesis screens with 

functional genomics to dissect the sequence-to-function relationship of a long-range 

enhancer in its native chromatin context. In Chapter 3, I examined the role of enhancer 

transcription in enhancer activity using a heat shock–inducible system and a massively 

parallel episomal reporter assay. Together, these studies emphasize the power of 

studying enhancers in chromosomal contexts and highlight the utility of the divergent 

transcription model for understanding and predicting enhancer function. 

Chapter 2’s systematic analysis revealed several key insights, summarized 

below: (1) eNMU is composed of two functionally distinct sub-elements: a canonical 

autonomous enhancer e1, and an intrinsically inactive facilitator e2 that augments e1’s 

activity; (2) The facilitator e2 universally buffers the activity of a collection of tested 

enhancers and ensures enhancer robustness against disruptive mutations; (3) The 

autonomous enhancer e1 is functionally hierarchical to the NMU promoter, e2, and 

additional facilitators across the ~100-kb NMU–eNMU region, and it orchestrates the 

formation of a 3D regulatory hub with these elements; (4) e1 also harbors a bipartite 

structure: a divergently transcribed retroviral LTR enhancer that serves as the minimal 

core enhancer unit, and an adjacent unidirectionally transcribed LTR promoter that 

dampens NMU expression by competing with the NMU promoter for LTR enhancer 

activity; and (5) Coordinated action of lineage-specific transcription factors underlies 

the intricate cis-regulatory element interplay at eNMU: GATA1 and RUNX1 



cooperatively confer intrinsic activity at the core LTR enhancer; KLF/SP factors at the 

adjacent LTR promoter modulate context-specific repression of enhancer activity; and 

STAT5 binding at the facilitator e2 is essential for mediating e1–e2 crosstalk. This work 

provides both conceptual and technical advances with broad implications for enhancer 

biology and gene regulation research: (1) it presents the first in situ dissection of 

enhancer–facilitator interplay at a typical enhancer (rather than a super-enhancer), 

suggesting that many CRISPR-identified candidate enhancers may function instead as 

facilitators; (2) it uncovers a previously unrecognized role of the LTR enhancer–

promoter axis in fine-tuning gene expression, a potentially widespread mechanism 

given the abundance of LTRs in the human genome; (3) it showcases the power of the 

divergent transcription-based framework for enhancer definition, which enables high-

resolution parsing of enhancer architecture and regulatory element classification; and 

(4) it introduces a versatile recombinase-mediated genome rewriting platform for 

functional interrogation and synthetic element design within native chromatin contexts, 

with broad applications in biotechnology and therapeutic development. 

Multiple avenues for follow-up investigation emerge from this work. Pertaining 

to the facilitator mechanisms, one may ask: (1) does the facilitator e2 mediate enhancer–

promoter interactions in addition to establishing cooperative environments? This can be 

answered directly by performing region capture micro-C (RCMC) (Goel et al., 2023) or 

similar targeted 3C-based assays in the e2 deletion or e2_mSTAT5 mutants and 

comparing the contact frequencies with WT and e1 deletion cell lines; (2) do other 

STAT5 motif clusters genome-wide have similar functions as facilitator elements? This 

may be answered by bioinformatic classification of STAT5 ChIP-seq peaks into those 



containing tandem arrays of STAT5 motifs without GATA1/RUNX1 binding and those 

containing no good STAT5 motifs but well bound by GATA1 and RUNX1. If the 

former can function as facilitators and the latter as buffered bona fide enhancers, these 

two classes of STAT5 peaks should have some spatial connectivity, i.e., either residing 

adjacent to each other as in the case of e1 and e2, or colocalized in the same chromatin 

hub that can be inferred from public Hi-C datasets. These candidates can be further 

cloned and tested in the eNMU landing pad system. It will also be interesting to treat 

K562 cells with STAT5 inhibitors and perform ATAC-seq and PRO-seq analysis to 

study how STAT5 regulates the behavior of clusters of cis-regulatory elements; (3) do 

the tested heterologous enhancers communicate with the same set of additional 

facilitators (F1–F3) as e1 does? This can be answered immediately by performing 

ATAC-seq in a few examples; (4) what will the eNMU mutagenesis screen results look 

like if the additional facilitators (F1–F3) are all deleted or repressed in the genome? 

How much amplification do they collectively contribute to eNMU activity? 
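
The two-class partition of STAT5 peaks proposed in question (2) could be prototyped along the following lines. This Python sketch is only an illustration of the proposed logic: the record fields, the class labels, and the threshold of two motifs for a "tandem array" are hypothetical choices, and real input would come from motif scanning and ChIP-seq peak overlaps.

```python
from dataclasses import dataclass


@dataclass
class Stat5Peak:
    name: str
    n_stat5_motifs: int   # count of strong STAT5 motifs in the peak (hypothetical input)
    gata1_bound: bool     # overlaps a GATA1 ChIP-seq peak
    runx1_bound: bool     # overlaps a RUNX1 ChIP-seq peak


def classify_peak(peak: Stat5Peak, min_tandem_motifs: int = 2) -> str:
    """Assign a STAT5 peak to one of the two proposed classes, or neither."""
    if peak.n_stat5_motifs >= min_tandem_motifs and not (peak.gata1_bound or peak.runx1_bound):
        return "facilitator_candidate"   # tandem STAT5 motifs, no GATA1/RUNX1 binding
    if peak.n_stat5_motifs == 0 and peak.gata1_bound and peak.runx1_bound:
        return "enhancer_candidate"      # no good STAT5 motif, bound by GATA1 and RUNX1
    return "unclassified"


# Made-up example records
example_peaks = [
    Stat5Peak("peak_a", n_stat5_motifs=3, gata1_bound=False, runx1_bound=False),
    Stat5Peak("peak_b", n_stat5_motifs=0, gata1_bound=True, runx1_bound=True),
    Stat5Peak("peak_c", n_stat5_motifs=1, gata1_bound=True, runx1_bound=False),
]
labels = {p.name: classify_peak(p) for p in example_peaks}
```

Candidate pairs from the two classes could then be checked for adjacency or shared chromatin hubs before cloning into the landing pad.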

One particularly puzzling observation in the case of e2 fusion experiments was 

that direct duplication of e2 exhibited ~30% of WT_eNMU (i.e., e1 fused with e2) 

activity (data not shown). This strongly implies that once duplicated, some TFs binding 

at e2 are able to act synergistically over the ~500 bp added distance. Elucidating the 

nature of this synergy may allow us to get to the bottom of how e2 works. Since the 

eNMU mutagenesis screen revealed that e2’s STAT5 and GATA1 motifs both 

contribute to eNMU activity, it will be interesting to mutate the e2+e2 direct fusion in 

different combinations of STAT5 and GATA1 motifs in each of the e2 units (i.e., 

mGATA1/mGATA1, mSTAT5/mSTAT5, mSTAT5/mGATA1, mGATA1/mSTAT5). 



This should reveal the TFs participating in the synergistic effect. Moreover, it could be 

interesting to perform a PRO-seq or PRO-cap experiment to examine the transcription 

profile of this fusion element—since it has become an active enhancer, there should be 

a clear divergent transcription pattern that is distinct from e2 alone (no transcription at 

all) or e1+e2 (where e2 is predominantly unidirectionally transcribed).  

The interesting LTR enhancer–promoter regulatory axis in e1 also raises the 

question of its genome-wide prevalence. It may be possible to make use of the whole-

genome STARR-seq datasets and employ the genomic binning strategy (developed by 

J. Zhang et al., 2025) to look for more examples in which the introduction of a 

unidirectional TSS represses enhancer activity. Alternatively, we can screen across all 

transcribed LTRs in K562 cells in search of similar “divergent + unidirectional” TSS 

structures and experimentally test the unidirectional TSS function in our landing pad 

system or using a plasmid reporter assay. 

Similar to Martyn et al. (2025), our in situ mutagenesis study provided a gold-

standard dataset for evaluating different deep learning-based predictive models. While 

direct prediction from sequence to gene expression over ~94 kb may be inaccurate, our 

high-quality ATAC-seq profiles may offer an ideal intermediate to calibrate model 

performance. However, the limited sample size currently precludes any firm 

conclusions. 

I would also like to mention some relevant ongoing projects in the lab. Yiyang 

and Jessica are currently working on a facilitator screen, where a library of CRISPRi-

validated “enhancers” is integrated into the eNMU landing pad as standalone elements, 



or fused with e1 or e2. They will perform HCR-FlowFISH to measure enhancer activity 

for these three configurations and classify tested elements into enhancers and facilitators 

to understand their sequence and functional features in an unbiased way. Yang is 

creating an orthogonal promoter landing pad that would allow swapping of the NMU 

promoter with any other promoter of interest. Combined with the existing eNMU 

landing pad, this project holds great promise for studying long-range enhancer–promoter 

compatibility.  
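
The classification step of the facilitator screen described above can be illustrated with a small decision rule. This is a hedged sketch, not the analysis the lab will necessarily use. It assumes activities are expressed as fold change over an empty landing pad, and both thresholds are hypothetical placeholders that would need to be set from controls.

```python
def classify_element(standalone: float,
                     fused: float,
                     partner_alone: float,
                     active_threshold: float = 2.0,
                     boost_threshold: float = 1.5) -> str:
    """Classify a tested element from three activity measurements.

    standalone     activity of the element integrated on its own
    fused          activity of the element fused with e1 (or e2)
    partner_alone  activity of e1 (or e2) without the element

    Thresholds are illustrative assumptions, not measured values.
    """
    # An element that is active on its own behaves as an enhancer.
    if standalone >= active_threshold:
        return "enhancer"
    # An intrinsically inactive element that boosts its partner's
    # output behaves as a facilitator.
    if fused >= boost_threshold * partner_alone:
        return "facilitator"
    return "inactive"
```

For example, an element with no standalone activity that doubles the output of e1 when fused would be called a facilitator under these assumed thresholds.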

To address the major caveat in Chapter 3 (i.e., the episomal context), the 

eTAX1BP1 enhancer deletion lines may serve as the starting point to construct an 

eTAX1BP1 landing pad. The HSF1-bound element library can then be integrated at the 

fixed eTAX1BP1 locus to rigorously test their heat-induced enhancer activity on a heat 

shock-responsive promoter located 4.5 kb away. The short distance between this 

proposed landing pad and its target promoter may make it useful for measuring the “intrinsic” 

activity of tested elements, which can be compared with the measurements at the eNMU 

landing pad to derive insights into long-range enhancer function. However, because 

basal transcription of TAX1BP1 is not affected by eTAX1BP1 deletion, further examination 

is needed to determine whether other TAX1BP1 enhancers exist 

that may interact differentially with different candidate elements. Another possibility is 

to use a chromosomal reporter assay to assess the enhancer activity of our HSF1-bound 

distal element library on a mutant HSPA1A promoter where all the HSF1 binding sites 

have been disrupted. This design should enhance the dynamic range of activation and 

allow cleaner identification of HS-inducible enhancers. 

Taken together, I envision that future advances in enhancer biology will hinge 



on the integration of systematic genome engineering, cutting-edge functional assays—

including genomics and imaging—and powerful computational tools to decipher and 

apply the principles of enhancer logic.                



APPENDIX A  A FLEXIBLE PROTEIN-TAGGING STRATEGY FOR MAPPING 

CDK9 CHROMATIN OCCUPANCY AND NUCLEAR PROXIMITY 

INTERACTOME 

 
The release of RNA Pol II from the promoter-proximal pausing site into 

productive elongation critically depends on the positive elongation factor P-TEFb 

(Jonkers et al., 2014), a complex of Cyclin-Dependent Kinase 9 (CDK9) and Cyclin T1 

or T2 in human cells (Fujinaga et al., 2023). P-TEFb activity is tightly regulated by the 

inhibitory 7SK complex and by positive factors including BRD4 and the Super 

Elongation Complex (SEC) (reviewed in Lu et al., 2016). Despite these insights, a 

proteome-wide survey of potential P-TEFb recruitment factors—especially sequence-

specific TFs—remains lacking. To address this, I undertook a side project to map the 

proximal proteome of CDK9 by fusing it with the APEX2 peroxidase (Lam et al., 2015). 

In parallel, I also aimed to generate a CDK9-EGFP cell line to test whether this could 

improve the quality of CDK9 ChIP-seq. As CDK9 is an interesting target that could also be 

studied with other tags (e.g., HaloTag for live-cell imaging, dTAG for acute depletion), I 

first developed a flexible protein-tagging strategy using the Bxb1-mediated landing pad 

system described in Chapter 2. Specifically, I inserted the landing pad at the C-terminus 

of the endogenous CDK9 gene, between the stop codon and the 3′ UTR (Figure A.1), 

and established a homozygous HCT116 clonal cell line. This allowed me to swap in any 

desired protein tag via Bxb1 recombination.  

        

 
Figure A.1 Schematic of Bxb1-mediated flexible protein-tagging strategy for CDK9.   

 
Using the parental CDK9 landing pad cell line, I successfully established 

homozygous CDK9-EGFP clones (Figure A.2A). In contrast, only heterozygous clones 

could be obtained for the CDK9-APEX2 fusion (Figure A.2B), suggesting that 

homozygous APEX2 tagging is incompatible with cell viability. This limitation makes 

it challenging to determine whether the CDK9-APEX2 fusion protein is fully functional, 

though some interaction between CDK9-APEX2 and Cyclin T1 was observed in the co-IP 

experiment, especially for Clones A5 and A10 (Figure A.3, right panel; compare “C” 

(CDK9) and “F” (FLAG) IP).  

  
Figure A.2 Western blot analysis of tagged CDK9 clonal cell lines.  

(A) Anti-CDK9 (red) western blot for CDK9-EGFP clones with anti-TBP (green) as 



a loading control; (B) Anti-CDK9 and anti-FLAG western blots for CDK9-FLAG-APEX2 
clones.   

 
Figure A.3 Probing physical interaction between CDK9-FLAG-APEX2 and 
Cyclin T1 by Co-IP analysis.  

Lysates from the three heterozygous CDK9-APEX2 clones (A4, A5, and A10) and 
the non-tagged control clone A11 were subjected to IgG, FLAG or CDK9 
immunoprecipitation (IP). Left panel shows the immunoblot (IB) for the bait protein 
CDK9. Right panel shows the immunoblot for the interactor Cyclin T1. 

 
APEX2 catalyzes the biotinylation of proximal proteins using biotin-phenol and 

H₂O₂ as substrates. Testing the CDK9–APEX2 clones for biotin labeling revealed a 

strong smear of biotinylated proteins (Figure A.4A), confirming that the APEX2 

enzyme is functional in these cells. However, whole cell lysates showed a high 

background of endogenously biotinylated proteins, which are mainly located in 

mitochondria and cytoplasm (Niers et al., 2011). Indeed, isolating nuclei using an 

efficient Lyse-and-Wash protocol (Senichkin et al., 2021) eliminated most of these 

proteins and improved the signal-to-noise ratio of APEX2-mediated biotinylation 

(Figure A.4B). 

  

 
Figure A.4 Nuclear isolation effectively eliminates endogenously biotinylated 
proteins.  

(A) Anti-biotin western blot of whole-cell lysates (untreated, biotin-phenol only, 
and biotin-phenol + H2O2) from the three heterozygous CDK9-APEX2 clones (A4, 
A5, and A10) and the non-tagged control clone A11; (B) Streptavidin HRP blot of 
whole-cell lysates, cytoplasmic fractions, and nuclear fractions of Clones A10 and 
A11 treated with biotin-phenol ± H2O2.  

 
I pulled down biotinylated proteins from ±H2O2 nuclear lysates of Clone A10 

and sent the eluates to Dr. Jin Joo Kang in the Yu lab for a preliminary mass 

spectrometry run. However, the data were problematic: the -H2O2 control yielded very 

little pull-down material (as shown in Figure A.4B), preventing reliable identification 

of enriched targets. I conclude that a proper control cell line (i.e., one expressing free 

nuclear APEX2) is needed for future experiments. The project was subsequently taken 

over by Miliarys. 

Working together with a rotation student, Simian Cai, I also attempted to 

optimize CDK9 ChIP-seq using an anti-GFP antibody (Abcam ab290) and the 

homozygous CDK9-EGFP Clone G4. Of the three crosslinking methods tested (Figure 

A.5A), the two-step crosslinking protocol (Tian et al., 2012) appeared most promising, 

showing increased CDK9 occupancy at the heat shock gene HSPH1 upon heat shock 



treatment (Figure A.5B). Nonetheless, discernible CDK9 signal was lacking at most 

genes, indicating that further optimization of crosslinking and sonication conditions is 

required. 

 
Figure A.5 Optimizing CDK9-EGFP ChIP-seq with different crosslinking 
conditions. 

(A) Summary of experimental conditions in the CDK9-EGFP ChIP optimization test. 

(B) Example two-step crosslinking CDK9-EGFP ChIP-seq profile at the HSPH1 

gene, showing input (genomic background control), heat shock (HS) and non-heat 

shock (NHS) tracks. PRO-seq tracks were kindly provided by Jawaher.  

 
In summary, this side project proved extremely challenging due to the various 

biological and technical issues described above. Nevertheless, the Bxb1 landing pad 

remains a versatile system for endogenous protein tagging. 

 

REFERENCES 

Abdella, R., Talyzina, A., Chen, S., Inouye, C. J., Tjian, R., & He, Y. (2021). 

Structure of the human Mediator-bound transcription preinitiation 

complex. Science (New York, N.Y.), 372(6537), 52–56. 

https://doi.org/10.1126/science.abg3074 

Aboreden, N. G., Lam, J. C., Goel, V. Y., Wang, S., Wang, X., Midla, S. C., Quijano, 

A., Keller, C. A., Giardine, B. M., Hardison, R. C., Zhang, H., Hansen, A. S., 

& Blobel, G. A. (2025). LDB1 establishes multi-enhancer networks to 

regulate gene expression. Molecular Cell, 85(2), 376-393.e9. 

https://doi.org/10.1016/j.molcel.2024.11.037 

Adams, C. C., & Workman, J. L. (1995). Binding of disparate transcriptional 

activators to nucleosomal DNA is inherently cooperative. Molecular and 

Cellular Biology, 15(3), 1405–1421. https://doi.org/10.1128/MCB.15.3.1405 

Adelman, K., & Lis, J. T. (2012). Promoter-proximal pausing of RNA polymerase II: 

Emerging roles in metazoans. Nature Reviews. Genetics, 13(10), 720–731. 

https://doi.org/10.1038/nrg3293 

Alexander, J. M., Guan, J., Li, B., Maliskova, L., Song, M., Shen, Y., Huang, B., 

Lomvardas, S., & Weiner, O. D. (2019). Live-cell imaging reveals enhancer-

dependent Sox2 transcription in the absence of enhancer proximity. eLife, 

8, e41769. https://doi.org/10.7554/eLife.41769 



Alipanahi, B., Delong, A., Weirauch, M. T., & Frey, B. J. (2015). Predicting the 

sequence specificities of DNA- and RNA-binding proteins by deep learning. 

Nature Biotechnology, 33(8), 831–838. https://doi.org/10.1038/nbt.3300 

Allahyar, A., Vermeulen, C., Bouwman, B. A. M., Krijger, P. H. L., Verstegen, M. J. 

A. M., Geeven, G., van Kranenburg, M., Pieterse, M., Straver, R., Haarhuis, 

J. H. I., Jalink, K., Teunissen, H., Renkens, I. J., Kloosterman, W. P., 

Rowland, B. D., de Wit, E., de Ridder, J., & de Laat, W. (2018). Enhancer 

hubs and loop collisions identified from single-allele topologies. Nature 

Genetics, 50(8), 1151–1160. https://doi.org/10.1038/s41588-018-0161-5 

An, X., Schulz, V. P., Li, J., Wu, K., Liu, J., Xue, F., Hu, J., Mohandas, N., & 

Gallagher, P. G. (2014). Global transcriptome analyses of human and 

murine terminal erythroid differentiation. Blood, 123(22), 3466–3477. 

https://doi.org/10.1182/blood-2014-01-548305 

Andersson, R., Sandelin, A., & Danko, C. G. (2015). A unified architecture of 

transcriptional regulatory elements. Trends in Genetics: TIG, 31(8), 426–

433. https://doi.org/10.1016/j.tig.2015.05.007 

Arnold, C. D., Gerlach, D., Stelzer, C., Boryń, Ł. M., Rath, M., & Stark, A. (2013). 

Genome-wide quantitative enhancer activity maps identified by STARR-

seq. Science (New York, N.Y.), 339(6123), 1074–1077. 

https://doi.org/10.1126/science.1232542 



Arnosti, D. N., & Kulkarni, M. M. (2005). Transcriptional enhancers: Intelligent 

enhanceosomes or flexible billboards? Journal of Cellular Biochemistry, 

94(5), 890–898. https://doi.org/10.1002/jcb.20352 

Avsec, Ž., Agarwal, V., Visentin, D., Ledsam, J. R., Grabska-Barwinska, A., Taylor, 

K. R., Assael, Y., Jumper, J., Kohli, P., & Kelley, D. R. (2021). Effective gene 

expression prediction from sequence by integrating long-range interactions. 

Nature Methods, 18(10), 1196–1203. https://doi.org/10.1038/s41592-021-

01252-x 

Avsec, Ž., Weilert, M., Shrikumar, A., Krueger, S., Alexandari, A., Dalal, K., Fropf, 

R., McAnany, C., Gagneur, J., Kundaje, A., & Zeitlinger, J. (2021). Base-

resolution models of transcription-factor binding reveal soft motif syntax. 

Nature Genetics, 53(3), 354–366. https://doi.org/10.1038/s41588-021-00782-

6 

Bakshi, R., Hassan, M. Q., Pratap, J., Lian, J. B., Montecino, M. A., van Wijnen, A. 

J., Stein, J. L., Imbalzano, A. N., & Stein, G. S. (2010). The human SWI/SNF 

complex associates with RUNX1 to control transcription of hematopoietic 

target genes. Journal of Cellular Physiology, 225(2), 569–576. 

https://doi.org/10.1002/jcp.22240 

Banerji, J., Olson, L., & Schaffner, W. (1983). A lymphocyte-specific cellular 

enhancer is located downstream of the joining region in immunoglobulin 



heavy chain genes. Cell, 33(3), 729–740. https://doi.org/10.1016/0092-

8674(83)90015-6 

Banerji, J., Rusconi, S., & Schaffner, W. (1981). Expression of a β-globin gene is 

enhanced by remote SV40 DNA sequences. Cell, 27(2), 299–308. 

https://doi.org/10.1016/0092-8674(81)90413-X 

Batut, P. J., Bing, X. Y., Sisco, Z., Raimundo, J., Levo, M., & Levine, M. S. (2022). 

Genome organization controls transcriptional dynamics during 

development. Science (New York, N.Y.), 375(6580), 566–570. 

https://doi.org/10.1126/science.abi7178 

Bauer, D. E., Kamran, S. C., Lessard, S., Xu, J., Fujiwara, Y., Lin, C., Shao, Z., 

Canver, M. C., Smith, E. C., Pinello, L., Sabo, P. J., Vierstra, J., Voit, R. A., 

Yuan, G.-C., Porteus, M. H., Stamatoyannopoulos, J. A., Lettre, G., & Orkin, 

S. H. (2013). An erythroid enhancer of BCL11A subject to genetic variation 

determines fetal hemoglobin level. Science (New York, N.Y.), 342(6155), 

253–257. https://doi.org/10.1126/science.1242088 

Bell, C. C., Balic, J. J., Talarmain, L., Gillespie, A., Scolamiero, L., Lam, E. Y. N., 

Ang, C.-S., Faulkner, G. J., Gilan, O., & Dawson, M. A. (2024). Comparative 

cofactor screens show the influence of transactivation domains and core 

promoters on the mechanisms of transcription. Nature Genetics, 56(6), 

1181–1192. https://doi.org/10.1038/s41588-024-01749-z 



Berger, S. L., Cress, W. D., Cress, A., Triezenberg, S. J., & Guarente, L. (1990). 

Selective inhibition of activated but not basal transcription by the acidic 

activation domain of VP16: Evidence for transcriptional adaptors. Cell, 

61(7), 1199–1208. https://doi.org/10.1016/0092-8674(90)90684-7 

Bergman, D. T., Jones, T. R., Liu, V., Ray, J., Jagoda, E., Siraj, L., Kang, H. Y., 

Nasser, J., Kane, M., Rios, A., Nguyen, T. H., Grossman, S. R., Fulco, C. P., 

Lander, E. S., & Engreitz, J. M. (2022). Compatibility rules of human 

enhancer and promoter sequences. Nature, 607(7917), 176–184. 

https://doi.org/10.1038/s41586-022-04877-w 

Bintu, B., Mateo, L. J., Su, J.-H., Sinnott-Armstrong, N. A., Parker, M., Kinrot, S., 

Yamaya, K., Boettiger, A. N., & Zhuang, X. (2018). Super-resolution 

chromatin tracing reveals domains and cooperative interactions in single 

cells. Science (New York, N.Y.), 362(6413), eaau1783. 

https://doi.org/10.1126/science.aau1783 

Birney, E., Stamatoyannopoulos, J. A., Dutta, A., Guigó, R., Gingeras, T. R., 

Margulies, E. H., Weng, Z., Snyder, M., Dermitzakis, E. T., 

Stamatoyannopoulos, J. A., Thurman, R. E., Kuehn, M. S., Taylor, C. M., 

Neph, S., Koch, C. M., Asthana, S., Malhotra, A., Adzhubei, I., Greenbaum, 

J. A., … Transcriptional Regulatory Elements. (2007). Identification and 

analysis of functional elements in 1% of the human genome by the 



ENCODE pilot project. Nature, 447(7146), 799–816. 

https://doi.org/10.1038/nature05874 

Blayney, J. W., Francis, H., Rampasekova, A., Camellato, B., Mitchell, L., Stolper, 

R., Cornell, L., Babbs, C., Boeke, J. D., Higgs, D. R., & Kassouf, M. (2023). 

Super-enhancers include classical enhancers and facilitators to fully activate 

gene expression. Cell, 186(26), 5826-5839.e18. 

https://doi.org/10.1016/j.cell.2023.11.030 

Borrelli, E., Hen, R., & Chambon, P. (1984). Adenovirus-2 E1A products repress 

enhancer-induced stimulation of transcription. Nature, 312(5995), 608–612. 

https://doi.org/10.1038/312608a0 

Boswell, S. (2020, November 10). Home-Brew SPRI Beads. Protocols.Io. 

https://www.protocols.io/view/home-brew-spri-beads-bkppkvmn 

Bothma, J. P., Garcia, H. G., Ng, S., Perry, M. W., Gregor, T., & Levine, M. (2015). 

Enhancer additivity and non-additivity are determined by enhancer 

strength in the Drosophila embryo. eLife, 4, e07956. 

https://doi.org/10.7554/eLife.07956 

Bourbon, H.-M., Aguilera, A., Ansari, A. Z., Asturias, F. J., Berk, A. J., Bjorklund, 

S., Blackwell, T. K., Borggrefe, T., Carey, M., Carlson, M., Conaway, J. W., 

Conaway, R. C., Emmons, S. W., Fondell, J. D., Freedman, L. P., Fukasawa, 

T., Gustafsson, C. M., Han, M., He, X., … Kornberg, R. D. (2004). A unified 



nomenclature for protein subunits of mediator complexes linking 

transcriptional regulators to RNA polymerase II. Molecular Cell, 14(5), 553–

557. https://doi.org/10.1016/j.molcel.2004.05.011 

Bower, G., Hollingsworth, E. W., Jacinto, S. H., Alcantara, J. A., Clock, B., Cao, K., 

Liu, M., Dziulko, A., Alcaina-Caro, A., Xu, Q., Skowronska-Krawczyk, D., 

Lopez-Rios, J., Dickel, D. E., Bardet, A. F., Pennacchio, L. A., Visel, A., & 

Kvon, E. Z. (2025). Range extender mediates long-distance enhancer 

activity. Nature, 643(8072), 830–838. https://doi.org/10.1038/s41586-025-

09221-6 

Boyle, A. P., Davis, S., Shulha, H. P., Meltzer, P., Margulies, E. H., Weng, Z., 

Furey, T. S., & Crawford, G. E. (2008). High-Resolution Mapping 

and Characterization of Open Chromatin across the Genome. Cell, 132(2), 

311–322. https://doi.org/10.1016/j.cell.2007.12.014 

Broad Institute. (2019). Picard Toolkit. GitHub Repository. 

https://broadinstitute.github.io/picard/ (Original work published 2014) 

Brosh, R., Coelho, C., Ribeiro-Dos-Santos, A. M., Ellis, G., Hogan, M. S., Ashe, H. 

J., Somogyi, N., Ordoñez, R., Luther, R. D., Huang, E., Boeke, J. D., & 

Maurano, M. T. (2023). Synthetic regulatory genomics uncovers enhancer 

context dependence at the Sox2 locus. Molecular Cell, 83(7), 1140-1152.e7. 

https://doi.org/10.1016/j.molcel.2023.02.027 



Brosh, R., Laurent, J. M., Ordoñez, R., Huang, E., Hogan, M. S., Hitchcock, A. M., 

Mitchell, L. A., Pinglay, S., Cadley, J. A., Luther, R. D., Truong, D. M., 

Boeke, J. D., & Maurano, M. T. (2021). A versatile platform for locus-scale 

genome rewriting and verification. Proceedings of the National Academy of 

Sciences of the United States of America, 118(10), e2023952118. 

https://doi.org/10.1073/pnas.2023952118 

Buenrostro, J. D., Wu, B., Chang, H. Y., & Greenleaf, W. J. (2015). ATAC-seq: A 

Method for Assaying Chromatin Accessibility Genome-Wide. Current 

Protocols in Molecular Biology, 109, 21.29.1-21.29.9. 

https://doi.org/10.1002/0471142727.mb2129s109 

Buratowski, S., Hahn, S., Guarente, L., & Sharp, P. A. (1989). Five intermediate 

complexes in transcription initiation by RNA polymerase II. Cell, 56(4), 

549–561. https://doi.org/10.1016/0092-8674(89)90578-3 

Canver, M. C., Smith, E. C., Sher, F., Pinello, L., Sanjana, N. E., Shalem, O., Chen, 

D. D., Schupp, P. G., Vinjamur, D. S., Garcia, S. P., Luc, S., Kurita, R., 

Nakamura, Y., Fujiwara, Y., Maeda, T., Yuan, G.-C., Zhang, F., Orkin, S. H., 

& Bauer, D. E. (2015a). BCL11A enhancer dissection by Cas9-mediated in 

situ saturating mutagenesis. Nature, 527(7577), 192–197. 

https://doi.org/10.1038/nature15521 

Canver, M. C., Smith, E. C., Sher, F., Pinello, L., Sanjana, N. E., Shalem, O., Chen, 



D. D., Schupp, P. G., Vinjamur, D. S., Garcia, S. P., Luc, S., Kurita, R., 

Nakamura, Y., Fujiwara, Y., Maeda, T., Yuan, G.-C., Zhang, F., Orkin, S. H., 

& Bauer, D. E. (2015b). BCL11A enhancer dissection by Cas9-mediated in 

situ saturating mutagenesis. Nature, 527(7577), 192–197. 

https://doi.org/10.1038/nature15521 

Carleton, J. B., Berrett, K. C., & Gertz, J. (2017). Multiplex Enhancer Interference 

Reveals Collaborative Control of Gene Regulation by Estrogen Receptor α-

Bound Enhancers. Cell Systems, 5(4), 333-344.e5. 

https://doi.org/10.1016/j.cels.2017.08.011 

Castro-Mondragon, J. A., Riudavets-Puig, R., Rauluseviciute, I., Berhanu Lemma, 

R., Turchi, L., Blanc-Mathieu, R., Lucas, J., Boddie, P., Khan, A., 

Manosalva Pérez, N., Fornes, O., Leung, T. Y., Aguirre, A., Hammal, F., 

Schmelter, D., Baranasic, D., Ballester, B., Sandelin, A., Lenhard, B., … 

Mathelier, A. (2022). JASPAR 2022: The 9th release of the open-access 

database of transcription factor binding profiles. Nucleic Acids Research, 

50(D1), D165–D173. https://doi.org/10.1093/nar/gkab1113 

Chakraborty, S., Kopitchinski, N., Zuo, Z., Eraso, A., Awasthi, P., Chari, R., Mitra, 

A., Tobias, I. C., Moorthy, S. D., Dale, R. K., Mitchell, J. A., Petros, T. J., & 

Rocha, P. P. (2023). Enhancer–promoter interactions can bypass CTCF-

mediated boundaries and contribute to phenotypic robustness. Nature 



Genetics, 55(2), 280–290. https://doi.org/10.1038/s41588-022-01295-6 

Chanda, B., Ditadi, A., Iscove, N. N., & Keller, G. (2013). Retinoic acid signaling is 

essential for embryonic hematopoietic stem cell development. Cell, 155(1), 

215–227. https://doi.org/10.1016/j.cell.2013.08.055 

Chen, H., Levo, M., Barinov, L., Fujioka, M., Jaynes, J. B., & Gregor, T. (2018). 

Dynamic interplay between enhancer-promoter topology and gene activity. 

Nature Genetics, 50(9), 1296–1303. https://doi.org/10.1038/s41588-018-

0175-z 

Chen, M. J., Yokomizo, T., Zeigler, B. M., Dzierzak, E., & Speck, N. A. (2009). 

Runx1 is required for the endothelial to haematopoietic cell transition but 

not thereafter. Nature, 457(7231), 887–891. 

https://doi.org/10.1038/nature07619 

Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: An ultra-fast all-in-one FASTQ 

preprocessor. Bioinformatics (Oxford, England), 34(17), i884–i890. 

https://doi.org/10.1093/bioinformatics/bty560 

Chen, Z., Snetkova, V., Bower, G., Jacinto, S., Clock, B., Dizehchi, A., Barozzi, I., 

Mannion, B. J., Alcaina-Caro, A., Lopez-Rios, J., Dickel, D. E., Visel, A., 

Pennacchio, L. A., & Kvon, E. Z. (2024). Increased enhancer–promoter 

interactions during developmental enhancer activation in mammals. Nature 

Genetics, 56(4), 675–685. https://doi.org/10.1038/s41588-024-01681-2 



Chong, S., Graham, T. G. W., Dugast-Darzacq, C., Dailey, G. M., Darzacq, X., & 

Tjian, R. (2022). Tuning levels of low-complexity domain interactions to 

modulate endogenous oncogenic transcription. Molecular Cell, 82(11), 

2084-2097.e5. https://doi.org/10.1016/j.molcel.2022.04.007 

Chrivia, J. C., Kwok, R. P., Lamb, N., Hagiwara, M., Montminy, M. R., & 

Goodman, R. H. (1993). Phosphorylated CREB binds specifically to the 

nuclear protein CBP. Nature, 365(6449), 855–859. 

https://doi.org/10.1038/365855a0 

Church, G. M., Ephrussi, A., Gilbert, W., & Tonegawa, S. (1985). Cell-type-specific 

contacts to immunoglobulin enhancers in nuclei. Nature, 313(6005), 798–

801. https://doi.org/10.1038/313798a0 

Cirillo, L. A., Lin, F. R., Cuesta, I., Friedman, D., Jarnik, M., & Zaret, K. S. (2002). 

Opening of Compacted Chromatin by Early Developmental Transcription 

Factors HNF3 (FoxA) and GATA-4. Molecular Cell, 9(2), 279–289. 

https://doi.org/10.1016/S1097-2765(02)00459-8 

Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., 

Jiang, W., Marraffini, L. A., & Zhang, F. (2013). Multiplex genome 

engineering using CRISPR/Cas systems. Science (New York, N.Y.), 

339(6121), 819–823. https://doi.org/10.1126/science.1231143 

Corces, M. R., Trevino, A. E., Hamilton, E. G., Greenside, P. G., Sinnott-



Armstrong, N. A., Vesuna, S., Satpathy, A. T., Rubin, A. J., Montine, K. S., 

Wu, B., Kathiria, A., Cho, S. W., Mumbach, M. R., Carter, A. C., Kasowski, 

M., Orloff, L. A., Risca, V. I., Kundaje, A., Khavari, P. A., … Chang, H. Y. 

(2017). An improved ATAC-seq protocol reduces background and enables 

interrogation of frozen tissues. Nature Methods, 14(10), 959–962. 

https://doi.org/10.1038/nmeth.4396 

Core, L. J., Martins, A. L., Danko, C. G., Waters, C. T., Siepel, A., & Lis, J. T. (2014). 

Analysis of nascent RNA identifies a unified architecture of initiation 

regions at mammalian promoters and enhancers. Nature Genetics, 46(12), 

1311–1320. https://doi.org/10.1038/ng.3142 

Core, L. J., Waterfall, J. J., & Lis, J. T. (2008). Nascent RNA Sequencing Reveals 

Widespread Pausing and Divergent Initiation at Human Promoters. 

Science, 322(5909), 1845–1848. https://doi.org/10.1126/science.1162228 

Crawford, G. E., Davis, S., Scacheri, P. C., Renaud, G., Halawi, M. J., Erdos, M. R., 

Green, R., Meltzer, P. S., Wolfsberg, T. G., & Collins, F. S. (2006). DNase-

chip: A high-resolution method to identify DNase I hypersensitive sites 

using tiled microarrays. Nature Methods, 3(7), 503–509. 

https://doi.org/10.1038/nmeth888 

Creyghton, M. P., Cheng, A. W., Welstead, G. G., Kooistra, T., Carey, B. W., 

Steine, E. J., Hanna, J., Lodato, M. A., Frampton, G. M., Sharp, P. A., Boyer, 



L. A., Young, R. A., & Jaenisch, R. (2010). Histone H3K27ac separates active 

from poised enhancers and predicts developmental state. Proceedings of the 

National Academy of Sciences, 107(50), 21931–21936. 

https://doi.org/10.1073/pnas.1016071107 

Dancy, B. M., & Cole, P. A. (2015). Protein Lysine Acetylation by p300/CBP. 

Chemical Reviews, 115(6), 2419–2452. https://doi.org/10.1021/cr500452k 

Davidson, I., Fromental, C., Augereau, P., Wildeman, A., Zenke, M., & Chambon, 

P. (1986). Cell-type specific protein binding to the enhancer of simian virus 

40 in nuclear extracts. Nature, 323(6088), 544–548. 

https://doi.org/10.1038/323544a0 

de Almeida, B. P., Schaub, C., Pagani, M., Secchia, S., Furlong, E. E. M., & Stark, A. 

(2024). Targeted design of synthetic enhancers for selected tissues in the 

Drosophila embryo. Nature, 626(7997), 207–211. 

https://doi.org/10.1038/s41586-023-06905-9 

de Groot, R. P., Raaijmakers, J. A., Lammers, J. W., Jove, R., & Koenderman, L. 

(1999). STAT5 activation by BCR-Abl contributes to transformation of 

K562 leukemia cells. Blood, 94(3), 1108–1112. 

de Wit, E., Vos, E. S. M., Holwerda, S. J. B., Valdes-Quezada, C., Verstegen, M. J. 

A. M., Teunissen, H., Splinter, E., Wijchers, P. J., Krijger, P. H. L., & de 

Laat, W. (2015). CTCF Binding Polarity Determines Chromatin Looping. 



Molecular Cell, 60(4), 676–684. https://doi.org/10.1016/j.molcel.2015.09.023 

DeBerardine, M. (2023). BRGenomics for analyzing high-resolution genomics data 

in R. Bioinformatics (Oxford, England), 39(6), btad331. 

https://doi.org/10.1093/bioinformatics/btad331 

Dekker, J., Rippe, K., Dekker, M., & Kleckner, N. (2002). Capturing Chromosome 

Conformation. Science, 295(5558), 1306–1311. 

https://doi.org/10.1126/science.1067799 

DelRosso, N., Suzuki, P. H., Griffith, D., Lotthammer, J. M., Novak, B., Kocalar, S., 

Sheth, M. U., Holehouse, A. S., Bintu, L., & Fordyce, P. (2024). High-

throughput affinity measurements of direct interactions between activation 

domains and co-activators. bioRxiv: The Preprint Server for Biology, 

2024.08.19.608698. https://doi.org/10.1101/2024.08.19.608698 

Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P. D., Dean, A., & Blobel, 

G. A. (2012). Controlling long-range genomic interactions at a native locus 

by targeted tethering of a looping factor. Cell, 149(6), 1233–1244. 

https://doi.org/10.1016/j.cell.2012.03.051 

Dey, A., Chitsaz, F., Abbasi, A., Misteli, T., & Ozato, K. (2003). The double 

bromodomain protein Brd4 binds to acetylated chromatin during 

interphase and mitosis. Proceedings of the National Academy of Sciences, 

100(15), 8758–8763. https://doi.org/10.1073/pnas.1433065100 



Dickel, D. E., Ypsilanti, A. R., Pla, R., Zhu, Y., Barozzi, I., Mannion, B. J., Khin, Y. 

S., Fukuda-Yuzawa, Y., Plajzer-Frick, I., Pickle, C. S., Lee, E. A., 

Harrington, A. N., Pham, Q. T., Garvin, T. H., Kato, M., Osterwalder, M., 

Akiyama, J. A., Afzal, V., Rubenstein, J. L. R., … Visel, A. (2018). 

Ultraconserved Enhancers Are Required for Normal Development. Cell, 

172(3), 491-499.e15. https://doi.org/10.1016/j.cell.2017.12.017 

Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J. S., & Ren, 

B. (2012). Topological domains in mammalian genomes identified by 

analysis of chromatin interactions. Nature, 485(7398), 376–380. 

https://doi.org/10.1038/nature11082 

Dogan, N., Wu, W., Morrissey, C. S., Chen, K.-B., Stonestrom, A., Long, M., Keller, 

C. A., Cheng, Y., Jain, D., Visel, A., Pennacchio, L. A., Weiss, M. J., Blobel, 

G. A., & Hardison, R. C. (2015). Occupancy by key transcription factors is a 

more accurate predictor of enhancer activity than histone modifications or 

chromatin accessibility. Epigenetics & Chromatin, 8, 16. 

https://doi.org/10.1186/s13072-015-0009-5 

Dong, P., Zhang, S., Gandin, V., Xie, L., Wang, L., Lemire, A. L., Li, W., Otsuna, 

H., Kawase, T., Lander, A. D., Chang, H. Y., & Liu, Z. J. (2024). Cohesin 

prevents cross-domain gene coactivation. Nature Genetics, 56(8), 1654–

1664. https://doi.org/10.1038/s41588-024-01852-1 



Dorighi, K. M., Swigut, T., Henriques, T., Bhanu, N. V., Scruggs, B. S., Nady, N., 

Still, C. D., Garcia, B. A., Adelman, K., & Wysocka, J. (2017). Mll3 and Mll4 

Facilitate Enhancer RNA Synthesis and Transcription from Promoters 

Independently of H3K4 Monomethylation. Molecular Cell, 66(4), 568-

576.e4. https://doi.org/10.1016/j.molcel.2017.04.018 

Dostie, J., Richmond, T. A., Arnaout, R. A., Selzer, R. R., Lee, W. L., Honan, T. A., 

Rubio, E. D., Krumm, A., Lamb, J., Nusbaum, C., Green, R. D., & Dekker, J. 

(2006). Chromosome Conformation Capture Carbon Copy (5C): A massively 

parallel solution for mapping interactions between genomic elements. 

Genome Research, 16(10), 1299–1309. https://doi.org/10.1101/gr.5571506 

Doughty, B. R., Hinks, M. M., Schaepe, J. M., Marinov, G. K., Thurm, A. R., Rios-

Martinez, C., Parks, B. E., Tan, Y., Marklund, E., Dubocanin, D., Bintu, L., 

& Greenleaf, W. J. (2024). Single-molecule states link transcription factor 

binding to gene expression. Nature, 636(8043), 745–754. 

https://doi.org/10.1038/s41586-024-08219-w 

Du, A. Y., Chobirko, J. D., Zhuo, X., Feschotte, C., & Wang, T. (2024). Regulatory 

transposable elements in the encyclopedia of DNA elements. Nature 

Communications, 15(1), 7594. https://doi.org/10.1038/s41467-024-51921-6 

Du, M., Stitzinger, S. H., Spille, J.-H., Cho, W.-K., Lee, C., Hijaz, M., Quintana, A., 

& Cissé, I. I. (2024). Direct observation of a condensate effect on super-



enhancer controlled gene bursting. Cell, 187(2), 331-344.e17. 

https://doi.org/10.1016/j.cell.2023.12.005 

Duarte, F. M., Fuda, N. J., Mahat, D. B., Core, L. J., Guertin, M. J., & Lis, J. T. 

(2016). Transcription factors GAF and HSF act at distinct regulatory steps to 

modulate stress-induced gene activation. Genes & Development, 30(15), 

1731–1746. https://doi.org/10.1101/gad.284430.116 

Duttke, S. H., Guzman, C., Chang, M., Delos Santos, N. P., McDonald, B. R., Xie, J., 

Carlin, A. F., Heinz, S., & Benner, C. (2024). Position-dependent function of 

human sequence-specific transcription factors. Nature, 631(8022), 891–898. 

https://doi.org/10.1038/s41586-024-07662-z 

Eckner, R., Ewen, M. E., Newsome, D., Gerdes, M., DeCaprio, J. A., Lawrence, J. 

B., & Livingston, D. M. (1994). Molecular cloning and functional analysis of 

the adenovirus E1A-associated 300-kD protein (p300) reveals a protein with 

properties of a transcriptional adaptor. Genes & Development, 8(8), 869–

884. https://doi.org/10.1101/gad.8.8.869 

ENCODE Project Consortium. (2004). The ENCODE (ENCyclopedia Of DNA 

Elements) Project. Science (New York, N.Y.), 306(5696), 636–640. 

https://doi.org/10.1126/science.1105136 

ENCODE Project Consortium, Moore, J. E., Purcaro, M. J., Pratt, H. E., Epstein, C. 

B., Shoresh, N., Adrian, J., Kawli, T., Davis, C. A., Dobin, A., Kaul, R., 



Halow, J., Van Nostrand, E. L., Freese, P., Gorkin, D. U., Shen, Y., He, Y., 

Mackiewicz, M., Pauli-Behn, F., … Weng, Z. (2020). Expanded 

encyclopaedias of DNA elements in the human and mouse genomes. 

Nature, 583(7818), 699–710. https://doi.org/10.1038/s41586-020-2493-4 

Engreitz, J. M., Haines, J. E., Perez, E. M., Munson, G., Chen, J., Kane, M., 

McDonel, P. E., Guttman, M., & Lander, E. S. (2016). Local regulation of 

gene expression by lncRNA promoters, transcription and splicing. Nature, 

539(7629), 452–455. https://doi.org/10.1038/nature20149 

Ephrussi, A., Church, G. M., Tonegawa, S., & Gilbert, W. (1985). B lineage-specific

interactions of an immunoglobulin enhancer with cellular factors

in vivo. Science (New York, N.Y.), 227(4683), 134–140. 

https://doi.org/10.1126/science.3917574 

Farley, E. K., Olson, K. M., Zhang, W., Brandt, A. J., Rokhsar, D. S., & Levine, M. 

S. (2015). Suboptimization of developmental enhancers. Science, 350(6258), 

325–328. https://doi.org/10.1126/science.aac6948 

Fitz, J., Neumann, T., Steininger, M., Wiedemann, E.-M., Garcia, A. C., 

Athanasiadis, A., Schoeberl, U. E., & Pavri, R. (2020). Spt5-mediated 

enhancer transcription directly couples enhancer activation with physical 

promoter interaction. Nature Genetics, 52(5), 505–515. 

https://doi.org/10.1038/s41588-020-0605-6 


Flanagan, P. M., Kelleher, R. J., Sayre, M. H., Tschochner, H., & Kornberg, R. D. 

(1991). A mediator required for activation of RNA polymerase II 

transcription in vitro. Nature, 350(6317), 436–438. 

https://doi.org/10.1038/350436a0 

Frankel, N., Davis, G. K., Vargas, D., Wang, S., Payre, F., & Stern, D. L. (2010). 

Phenotypic robustness conferred by apparently redundant transcriptional 

enhancers. Nature, 466(7305), 490–493. https://doi.org/10.1038/nature09158 

Frömel, R., Rühle, J., Bernal Martinez, A., Szu-Tu, C., Pacheco Pastor, F., 

Martinez-Corral, R., & Velten, L. (2025). Design principles of cell-state-

specific enhancers in hematopoiesis. Cell, 188(12), 3202-3218.e21. 

https://doi.org/10.1016/j.cell.2025.04.017 

Fuda, N. J., Ardehali, M. B., & Lis, J. T. (2009). Defining mechanisms that regulate 

RNA polymerase II transcription in vivo. Nature, 461(7261), 186–192. 

https://doi.org/10.1038/nature08449 

Fujinaga, K., Huang, F., & Peterlin, B. M. (2023). P-TEFb: The master regulator of 

transcription elongation. Molecular Cell, 83(3), 393–403. 

https://doi.org/10.1016/j.molcel.2022.12.006 


Fulco, C. P., Munschauer, M., Anyoha, R., Munson, G., Grossman, S. R., Perez, E. 

M., Kane, M., Cleary, B., Lander, E. S., & Engreitz, J. M. (2016). Systematic 

mapping of functional enhancer–promoter connections with CRISPR 

interference. Science, 354(6313), 769–773. 

https://doi.org/10.1126/science.aag2445 

Fulco, C. P., Nasser, J., Jones, T. R., Munson, G., Bergman, D. T., Subramanian, V., 

Grossman, S. R., Anyoha, R., Doughty, B. R., Patwardhan, T. A., Nguyen, T. 

H., Kane, M., Perez, E. M., Durand, N. C., Lareau, C. A., Stamenova, E. K., 

Aiden, E. L., Lander, E. S., & Engreitz, J. M. (2019). Activity-by-Contact 

model of enhancer-promoter regulation from thousands of CRISPR 

perturbations. Nature Genetics, 51(12), 1664–1669. 

https://doi.org/10.1038/s41588-019-0538-0 

Gabriele, M., Brandão, H. B., Grosse-Holz, S., Jha, A., Dailey, G. M., Cattoglio, C., 

Hsieh, T.-H. S., Mirny, L., Zechner, C., & Hansen, A. S. (2022). Dynamics of 

CTCF- and cohesin-mediated chromatin looping revealed by live-cell 

imaging. Science, 376(6592), 496–501. 

https://doi.org/10.1126/science.abn6583 

Gambone, J. E., Dusaban, S. S., Loperena, R., Nakata, Y., & Shetzline, S. E. (2011). 

The c-Myb target gene neuromedin U functions as a novel cofactor during 

the early stages of erythropoiesis. Blood, 117(21), 5733–5743. 


https://doi.org/10.1182/blood-2009-09-242131 

Gasperini, M., Hill, A. J., McFaline-Figueroa, J. L., Martin, B., Kim, S., Zhang, M. 

D., Jackson, D., Leith, A., Schreiber, J., Noble, W. S., Trapnell, C., Ahituv, 

N., & Shendure, J. (2019). A Genome-wide Framework for Mapping Gene 

Regulation via Cellular Genetic Screens. Cell, 176(1–2), 377-390.e19. 

https://doi.org/10.1016/j.cell.2018.11.029 

Georgakopoulos-Soares, I., Deng, C., Agarwal, V., Chan, C. S. Y., Zhao, J., Inoue, 

F., & Ahituv, N. (2023). Transcription factor binding site orientation and 

order are major drivers of gene regulatory activity. Nature 

Communications, 14(1), 2333. https://doi.org/10.1038/s41467-023-37960-5 

Gill, G., & Ptashne, M. (1988). Negative effect of the transcriptional activator 

GAL4. Nature, 334(6184), 721–724. https://doi.org/10.1038/334721a0 

Gillies, S. D., Morrison, S. L., Oi, V. T., & Tonegawa, S. (1983). A tissue-specific 

transcription enhancer element is located in the major intron of a 

rearranged immunoglobulin heavy chain gene. Cell, 33(3), 717–728. 

https://doi.org/10.1016/0092-8674(83)90014-4 

Gilmour, J., Assi, S. A., Noailles, L., Lichtinger, M., Obier, N., & Bonifer, C. (2018). 

The Co-operation of RUNX1 with LDB1, CDK9 and BRD4 Drives 

Transcription Factor Complex Relocation During Haematopoietic 

Specification. Scientific Reports, 8, 10410.

https://doi.org/10.1038/s41598-018-28506-7

Goel, V. Y., Huseyin, M. K., & Hansen, A. S. (2023). Region Capture Micro-C 

reveals coalescence of enhancers and promoters into nested 

microcompartments. Nature Genetics, 55(6), 1048–1056. 

https://doi.org/10.1038/s41588-023-01391-1 

Goodman, R. H., & Smolik, S. (2000). CBP/p300 in cell growth, transformation, 

and development. Genes & Development, 14(13), 1553–1577. 

https://doi.org/10.1101/gad.14.13.1553 

Gordon, A. (2010). FASTX-Toolkit. GitHub. 

https://github.com/agordon/fastx_toolkit (Original work published 2013) 

Gosai, S. J., Castro, R. I., Fuentes, N., Butts, J. C., Mouri, K., Alasoadura, M., Kales, 

S., Nguyen, T. T. L., Noche, R. R., Rao, A. S., Joy, M. T., Sabeti, P. C., Reilly, 

S. K., & Tewhey, R. (2024). Machine-guided design of cell-type-targeting 

cis-regulatory elements. Nature, 634(8036), 1211–1220. 

https://doi.org/10.1038/s41586-024-08070-z 

Grande, A., Montanari, M., Manfredini, R., Tagliafico, E., Zanocco-Marani, T., 

Trevisan, F., Ligabue, G., Siena, M., Ferrari, S., & Ferrari, S. (2001). A 

functionally active RARalpha nuclear receptor is expressed in retinoic acid 

non responsive early myeloblastic cell lines. Cell Death and Differentiation, 

8(1), 70–82. https://doi.org/10.1038/sj.cdd.4400771 


Grebien, F., Kerenyi, M. A., Kovacic, B., Kolbe, T., Becker, V., Dolznig, H., Pfeffer, 

K., Klingmüller, U., Müller, M., Beug, H., Müllner, E. W., & Moriggl, R. 

(2008). Stat5 activation enables erythropoiesis in the absence of EpoR and 

Jak2. Blood, 111(9), 4511–4522.

https://doi.org/10.1182/blood-2007-07-102848

Gressel, S., Schwalb, B., & Cramer, P. (2019). The pause-initiation limit restricts 

transcription activation in human cells. Nature Communications, 10(1), 

3603. https://doi.org/10.1038/s41467-019-11536-8 

Gressel, S., Schwalb, B., Decker, T. M., Qin, W., Leonhardt, H., Eick, D., & 

Cramer, P. (2017). CDK9-dependent RNA polymerase II pausing controls 

transcription initiation. eLife, 6, e29736. https://doi.org/10.7554/eLife.29736 

Grossman, S. R., Zhang, X., Wang, L., Engreitz, J., Melnikov, A., Rogov, P., 

Tewhey, R., Isakova, A., Deplancke, B., Bernstein, B. E., Mikkelsen, T. S., & 

Lander, E. S. (2017). Systematic dissection of genomic features determining 

transcription factor binding and enhancer function. Proceedings of the 

National Academy of Sciences of the United States of America, 114(7), 

E1291–E1300. https://doi.org/10.1073/pnas.1621150114 

Gu, B., Swigut, T., Spencley, A., Bauer, M. R., Chung, M., Meyer, T., & Wysocka, J. 

(2018). Transcription-coupled changes in nuclear mobility of mammalian 

cis-regulatory elements. Science (New York, N.Y.), 359(6379), 1050–1055. 


https://doi.org/10.1126/science.aao3136 

Guo, C., McDowell, I. C., Nodzenski, M., Scholtens, D. M., Allen, A. S., Lowe, W. 

L., & Reddy, T. E. (2017). Transversions have larger regulatory effects than 

transitions. BMC Genomics, 18, 394.

https://doi.org/10.1186/s12864-017-3785-4

Guo, X., Plank-Bazinet, J., Krivega, I., Dale, R. K., & Dean, A. (2020). Embryonic 

erythropoiesis and hemoglobin switching require transcriptional repressor 

ETO2 to modulate chromatin organization. Nucleic Acids Research, 48(18), 

10226–10240. https://doi.org/10.1093/nar/gkaa736 

Guo, Y., Xu, Q., Canzio, D., Shou, J., Li, J., Gorkin, D. U., Jung, I., Wu, H., Zhai, Y., 

Tang, Y., Lu, Y., Wu, Y., Jia, Z., Li, W., Zhang, M. Q., Ren, B., Krainer, A. 

R., Maniatis, T., & Wu, Q. (2015). CRISPR Inversion of CTCF Sites Alters 

Genome Topology and Enhancer/Promoter Function. Cell, 162(4), 900–910. 

https://doi.org/10.1016/j.cell.2015.07.038 

Heintzman, N. D., Hon, G. C., Hawkins, R. D., Kheradpour, P., Stark, A., Harp, L. 

F., Ye, Z., Lee, L. K., Stuart, R. K., Ching, C. W., Ching, K. A., Antosiewicz-

Bourget, J. E., Liu, H., Zhang, X., Green, R. D., Lobanenkov, V. V., Stewart, 

R., Thomson, J. A., Crawford, G. E., … Ren, B. (2009). Histone 

modifications at human enhancers reflect global cell-type-specific gene 

expression. Nature, 459(7243), 108–112. 


https://doi.org/10.1038/nature07829 

Heintzman, N. D., Stuart, R. K., Hon, G., Fu, Y., Ching, C. W., Hawkins, R. D., 

Barrera, L. O., Van Calcar, S., Qu, C., Ching, K. A., Wang, W., Weng, Z., 

Green, R. D., Crawford, G. E., & Ren, B. (2007). Distinct and predictive 

chromatin signatures of transcriptional promoters and enhancers in the 

human genome. Nature Genetics, 39(3), 311–318. 

https://doi.org/10.1038/ng1966 

Hen, R., Borrelli, E., & Chambon, P. (1985). Repression of the immunoglobulin 

heavy chain enhancer by the adenovirus-2 E1A products. Science (New 

York, N.Y.), 230(4732), 1391–1394. https://doi.org/10.1126/science.2999984 

Henriques, T., Scruggs, B. S., Inouye, M. O., Muse, G. W., Williams, L. H., 

Burkholder, A. B., Lavender, C. A., Fargo, D. C., & Adelman, K. (2018). 

Widespread transcriptional pausing and elongation control at enhancers. 

Genes & Development, 32(1), 26–41. 

https://doi.org/10.1101/gad.309351.117 

Hitz, B. C., Lee, J.-W., Jolanki, O., Kagda, M. S., Graham, K., Sud, P., Gabdank, 

I., Strattan, J. S., Sloan, C. A., Dreszer, T., Rowe, L. D., Podduturi, N. R., 

Malladi, V. S., Chan, E. T., Davidson, J. M., Ho, M., Miyasato, S., Simison, 

M., Tanaka, F., … Cherry, J. M. (2023). The ENCODE Uniform Analysis 

Pipelines. bioRxiv: The Preprint Server for Biology, 2023.04.04.535623. 


https://doi.org/10.1101/2023.04.04.535623 

Hnisz, D., Abraham, B. J., Lee, T. I., Lau, A., Saint-André, V., Sigova, A. A., Hoke, 

H. A., & Young, R. A. (2013). Super-Enhancers in the Control of Cell 

Identity and Disease. Cell, 155(4), 934–947. 

https://doi.org/10.1016/j.cell.2013.09.053 

Ho, L., & Crabtree, G. R. (2010). Chromatin remodelling during development. 

Nature, 463(7280), 474–484. https://doi.org/10.1038/nature08911 

Hochheimer, A., & Tjian, R. (2003). Diversified transcription initiation complexes 

expand promoter selectivity and tissue-specific gene expression. Genes & 

Development, 17(11), 1309–1320. https://doi.org/10.1101/gad.1099903 

Hong, J.-W., Hendrix, D. A., & Levine, M. S. (2008). Shadow Enhancers as a Source 

of Evolutionary Novelty. Science, 321(5894), 1314. 

https://doi.org/10.1126/science.1160631 

Hsieh, T.-H. S., Cattoglio, C., Slobodyanyuk, E., Hansen, A. S., Darzacq, X., & 

Tjian, R. (2022). Enhancer–promoter interactions and transcription are 

largely maintained upon acute loss of CTCF, cohesin, WAPL or YY1. 

Nature Genetics, 54(12), 1919–1932.

https://doi.org/10.1038/s41588-022-01223-8

Hsieh, T.-H. S., Cattoglio, C., Slobodyanyuk, E., Hansen, A. S., Rando, O. J., Tjian, 

R., & Darzacq, X. (2020). Resolving the 3D Landscape of Transcription-


Linked Mammalian Chromatin Folding. Molecular Cell, 78(3), 539-553.e8. 

https://doi.org/10.1016/j.molcel.2020.03.002 

Hu, J., Liu, J., Xue, F., Halverson, G., Reid, M., Guo, A., Chen, L., Raza, A., Galili, 

N., Jaffray, J., Lane, J., Chasis, J. A., Taylor, N., Mohandas, N., & An, X. 

(2013). Isolation and functional characterization of human erythroblasts at 

distinct stages: Implications for understanding of normal and disordered 

erythropoiesis in vivo. Blood, 121(16), 3246–3253. 

https://doi.org/10.1182/blood-2013-01-476390 

Huang, J., Li, K., Cai, W., Liu, X., Zhang, Y., Orkin, S. H., Xu, J., & Yuan, G.-C. 

(2018). Dissecting super-enhancer hierarchy based on chromatin 

interactions. Nature Communications, 9, 943. 

https://doi.org/10.1038/s41467-018-03279-9 

International Human Genome Sequencing Consortium. (2004). Finishing the 

euchromatic sequence of the human genome. Nature, 431(7011), 931–945. 

https://doi.org/10.1038/nature03001 

Jang, M. K., Mochizuki, K., Zhou, M., Jeong, H.-S., Brady, J. N., & Ozato, K. (2005). 

The Bromodomain Protein Brd4 Is a Positive Regulatory Component of P-

TEFb and Stimulates RNA Polymerase II-Dependent Transcription. 

Molecular Cell, 19(4), 523–534. https://doi.org/10.1016/j.molcel.2005.06.027 

Janknecht, R., & Hunter, T. (1996). A growing coactivator network. Nature, 


383(6595), 22–23. https://doi.org/10.1038/383022a0 

Jin, Q., Yu, L.-R., Wang, L., Zhang, Z., Kasper, L. H., Lee, J.-E., Wang, C., Brindle, 

P. K., Dent, S. Y. R., & Ge, K. (2011). Distinct roles of GCN5/PCAF-

mediated H3K9ac and CBP/p300-mediated H3K18/27ac in nuclear receptor 

transactivation. The EMBO Journal, 30(2), 249–262. 

https://doi.org/10.1038/emboj.2010.318 

Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., & Charpentier, E. 

(2012). A programmable dual-RNA-guided DNA endonuclease in adaptive 

bacterial immunity. Science (New York, N.Y.), 337(6096), 816–821. 

https://doi.org/10.1126/science.1225829 

John, S., Vinkemeier, U., Soldaini, E., Darnell, J. E., & Leonard, W. J. (1999). The 

significance of tetramerization in promoter recruitment by Stat5. Molecular 

and Cellular Biology, 19(3), 1910–1918. 

https://doi.org/10.1128/MCB.19.3.1910 

Jonkers, I., Kwak, H., & Lis, J. T. (2014). Genome-wide dynamics of Pol II 

elongation and its interplay with promoter proximal pausing, chromatin, 

and exons. eLife, 3, e02407. https://doi.org/10.7554/eLife.02407 

Judd, J. (2020). PROseq_alignment.sh. GitHub. 

https://github.com/JAJ256/PROseq_alignment.sh (Original work published 

2020) 


Judd, J., Wojenski, L. A., Wainman, L. M., Tippens, N. D., Rice, E. J., Dziubek, A., 

Villafano, G. J., Wissink, E. M., Versluis, P., Bagepalli, L., Shah, S. R., 

Mahat, D. B., Tome, J. M., Danko, C. G., Lis, J. T., & Core, L. J. (2020). A 

rapid, sensitive, scalable method for Precision Run-On sequencing (PRO-

seq). bioRxiv, 2020.05.18.102277. https://doi.org/10.1101/2020.05.18.102277 

Junion, G., Spivakov, M., Girardot, C., Braun, M., Gustafson, E. H., Birney, E., & 

Furlong, E. E. M. (2012). A Transcription Factor Collective Defines Cardiac 

Cell Fate and Reflects Lineage History. Cell, 148(3), 473–486. 

https://doi.org/10.1016/j.cell.2012.01.030 

Kadam, S., & Emerson, B. M. (2003). Transcriptional specificity of human 

SWI/SNF BRG1 and BRM chromatin remodeling complexes. Molecular 

Cell, 11(2), 377–389. https://doi.org/10.1016/s1097-2765(03)00034-0 

Karlsson, M., Zhang, C., Méar, L., Zhong, W., Digre, A., Katona, B., Sjöstedt, E., 

Butler, L., Odeberg, J., Dusart, P., Edfors, F., Oksvold, P., von Feilitzen, K., 

Zwahlen, M., Arif, M., Altay, O., Li, X., Ozcan, M., Mardinoglu, A., … 

Lindskog, C. (2021). A single-cell type transcriptomics map of human 

tissues. Science Advances, 7(31), eabh2169. 

https://doi.org/10.1126/sciadv.abh2169 

Karolchik, D., Hinrichs, A. S., Furey, T. S., Roskin, K. M., Sugnet, C. W., Haussler, 

D., & Kent, W. J. (2004). The UCSC Table Browser data retrieval tool. 


Nucleic Acids Research, 32(Database issue), D493–D496. 

https://doi.org/10.1093/nar/gkh103 

Kelleher, R. J., Flanagan, P. M., & Kornberg, R. D. (1990). A novel mediator 

between activator proteins and the RNA polymerase II transcription 

apparatus. Cell, 61(7), 1209–1215.

https://doi.org/10.1016/0092-8674(90)90685-8

Kelley, D. R., Snoek, J., & Rinn, J. L. (2016). Basset: Learning the regulatory code of 

the accessible genome with deep convolutional neural networks. Genome 

Research, 26(7), 990–999. https://doi.org/10.1101/gr.200535.115 

Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S., & Karolchik, D. (2010). 

BigWig and BigBed: Enabling browsing of large distributed datasets. 

Bioinformatics (Oxford, England), 26(17), 2204–2207. 

https://doi.org/10.1093/bioinformatics/btq351 

Khalfan, M. (2021). reform: Modify Reference Sequence and Annotation Files 

Quickly and Reproducibly. Genomics Core at NYU CGSB. 

https://gencore.bio.nyu.edu/reform/ (Original work published 2018) 

Kim, T.-K., Hemberg, M., Gray, J. M., Costa, A. M., Bear, D. M., Wu, J., Harmin, D. 

A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., Markenscoff-

Papadimitriou, E., Kuhl, D., Bito, H., Worley, P. F., Kreiman, G., & 

Greenberg, M. E. (2010). Widespread transcription at neuronal activity-


regulated enhancers. Nature, 465(7295), 182–187. 

https://doi.org/10.1038/nature09033 

Kim, Y. J., Björklund, S., Li, Y., Sayre, M. H., & Kornberg, R. D. (1994). A 

multiprotein mediator of transcriptional activation and its interaction with 

the C-terminal repeat domain of RNA polymerase II. Cell, 77(4), 599–608. 

https://doi.org/10.1016/0092-8674(94)90221-6 

Kircher, M., Xiong, C., Martin, B., Schubach, M., Inoue, F., Bell, R. J. A., Costello, 

J. F., Shendure, J., & Ahituv, N. (2019). Saturation mutagenesis of twenty 

disease-associated regulatory elements at single base-pair resolution. Nature 

Communications, 10(1), 3583. https://doi.org/10.1038/s41467-019-11526-w 

Klein, J. C., Agarwal, V., Inoue, F., Keith, A., Martin, B., Kircher, M., Ahituv, N., & 

Shendure, J. (2020). A systematic evaluation of the design and context 

dependencies of massively parallel reporter assays. Nature Methods, 17(11), 

1083–1091. https://doi.org/10.1038/s41592-020-0965-y 

Kosicki, M., Zhang, B., Pampari, A., Akiyama, J. A., Plajzer-Frick, I., Novak, C. S., 

Tran, S., Zhu, Y., Kato, M., Hunter, R. D., von Maydell, K., Barton, S., 

Beckman, E., Kundaje, A., Dickel, D. E., Visel, A., & Pennacchio, L. A. 

(2024). Mutagenesis Sensitivity Mapping of Human Enhancers In Vivo. 

bioRxiv: The Preprint Server for Biology, 2024.09.06.611737. 

https://doi.org/10.1101/2024.09.06.611737 


Kribelbauer, J. F., Rastogi, C., Bussemaker, H. J., & Mann, R. S. (2019). Low-

Affinity Binding Sites and the Transcription Factor Specificity Paradox in 

Eukaryotes. Annual Review of Cell and Developmental Biology, 35, 357–

379. https://doi.org/10.1146/annurev-cellbio-100617-062719 

Kribelbauer-Swietek, J. F., Pushkarev, O., Gardeux, V., Faltejskova, K., Russeil, J., 

van Mierlo, G., & Deplancke, B. (2024). Context transcription factors 

establish cooperative environments and mediate enhancer communication. 

Nature Genetics, 56(10), 2199–2212.

https://doi.org/10.1038/s41588-024-01892-7

Kruesi, W. S., Core, L. J., Waters, C. T., Lis, J. T., & Meyer, B. J. (2013). Condensin 

controls recruitment of RNA polymerase II to achieve nematode X-

chromosome dosage compensation. eLife, 2, e00808. 

https://doi.org/10.7554/eLife.00808 

Kubo, N., Chen, P. B., Hu, R., Ye, Z., Sasaki, H., & Ren, B. (2024). H3K4me1 

facilitates promoter-enhancer interactions and gene activation during 

embryonic stem cell differentiation. Molecular Cell, 84(9), 1742-1752.e5. 

https://doi.org/10.1016/j.molcel.2024.02.030 

Kulkarni, M. M., & Arnosti, D. N. (2003). Information display by transcriptional 

enhancers. Development (Cambridge, England), 130(26), 6569–6575. 

https://doi.org/10.1242/dev.00890 


Kvon, E. Z., Waymack, R., Gad, M., & Wunderlich, Z. (2021). Enhancer 

redundancy in development and disease. Nature Reviews Genetics, 22(5), 

324–336. https://doi.org/10.1038/s41576-020-00311-x 

Kwak, H., Fuda, N. J., Core, L. J., & Lis, J. T. (2013). Precise maps of RNA 

polymerase reveal how promoters direct initiation and pausing. Science 

(New York, N.Y.), 339(6122), 950–953. 

https://doi.org/10.1126/science.1229386 

Labun, K., Montague, T. G., Krause, M., Torres Cleuren, Y. N., Tjeldnes, H., & 

Valen, E. (2019). CHOPCHOP v3: Expanding the CRISPR web toolbox 

beyond genome editing. Nucleic Acids Research, 47(W1), W171–W174. 

https://doi.org/10.1093/nar/gkz365 

Lam, S. S., Martell, J. D., Kamer, K. J., Deerinck, T. J., Ellisman, M. H., Mootha, V. 

K., & Ting, A. Y. (2015). Directed evolution of APEX2 for electron 

microscopy and proximity labeling. Nature Methods, 12(1), 51–54. 

https://doi.org/10.1038/nmeth.3179 

LaMar, D. (2015). FastQC. https://qubeshub.org/resources/fastqc 

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., 

Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, 

K., Heaford, A., Howland, J., Kann, L., Lehoczky, J., LeVine, R., McEwan, 

P., … International Human Genome Sequencing Consortium. (2001). Initial sequencing and analysis of the 


human genome. Nature, 409(6822), 860–921. 

https://doi.org/10.1038/35057062 

Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. 

Nature Methods, 9(4), 357–359. https://doi.org/10.1038/nmeth.1923 

Lettice, L. A., Heaney, S. J. H., Purdie, L. A., Li, L., de Beer, P., Oostra, B. A., 

Goode, D., Elgar, G., Hill, R. E., & de Graaff, E. (2003). A long-range Shh 

enhancer regulates expression in the developing limb and fin and is 

associated with preaxial polydactyly. Human Molecular Genetics, 12(14), 

1725–1735. https://doi.org/10.1093/hmg/ddg180 

Levo, M., Raimundo, J., Bing, X. Y., Sisco, Z., Batut, P. J., Ryabichko, S., Gregor, T., 

& Levine, M. S. (2022). Transcriptional coupling of distant regulatory genes 

in living embryos. Nature, 605(7911), 754–760. 

https://doi.org/10.1038/s41586-022-04680-7 

Li, D., Zhao, X.-Y., Zhou, S., Hu, Q., Wu, F., & Lee, H.-Y. (2023). 

Multidimensional profiling reveals GATA1-modulated stage-specific 

chromatin states and functional associations during human erythropoiesis. 

Nucleic Acids Research, 51(13), 6634–6653. 

https://doi.org/10.1093/nar/gkad468 

Li, H. (2023). Seqtk. GitHub. https://github.com/lh3/seqtk (Original work 

published 2012) 


Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., 

Abecasis, G., Durbin, R., & 1000 Genome Project Data Processing Subgroup. 

(2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 

(Oxford, England), 25(16), 2078–2079. 

https://doi.org/10.1093/bioinformatics/btp352 

Li, J., Hale, J., Bhagia, P., Xue, F., Chen, L., Jaffray, J., Yan, H., Lane, J., Gallagher, 

P. G., Mohandas, N., Liu, J., & An, X. (2014). Isolation and transcriptome 

analyses of human erythroid progenitors: BFU-E and CFU-E. Blood, 

124(24), 3636–3645. https://doi.org/10.1182/blood-2014-07-588806 

Li, X., Tang, X., Bing, X., Catalano, C., Li, T., Dolsten, G., Wu, C., & Levine, M. 

(2023). GAGA-associated factor fosters loop formation in the Drosophila 

genome. Molecular Cell, 83(9), 1519-1526.e4. 

https://doi.org/10.1016/j.molcel.2023.03.011 

Liao, Y., Smyth, G. K., & Shi, W. (2014). featureCounts: An efficient general 

purpose program for assigning sequence reads to genomic features. 

Bioinformatics (Oxford, England), 30(7), 923–930. 

https://doi.org/10.1093/bioinformatics/btt656 

Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., 

Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., Sandstrom, 

R., Bernstein, B., Bender, M. A., Groudine, M., Gnirke, A., 


Stamatoyannopoulos, J., Mirny, L. A., Lander, E. S., & Dekker, J. (2009). 

Comprehensive Mapping of Long-Range Interactions Reveals Folding 

Principles of the Human Genome. Science, 326(5950), 289–293. 

https://doi.org/10.1126/science.1181369 

Lim, C. P., & Cao, X. (2006). Structure, function, and regulation of STAT proteins. 

Molecular BioSystems, 2(11), 536–550. https://doi.org/10.1039/b606246f 

Lim, F., Solvason, J. J., Ryan, G. E., Le, S. H., Jindal, G. A., Steffen, P., Jandu, S. K., 

& Farley, E. K. (2024). Affinity-optimizing enhancer variants disrupt 

development. Nature, 626(7997), 151–159.

https://doi.org/10.1038/s41586-023-06922-8

Lin, X., Liu, Y., Liu, S., Zhu, X., Wu, L., Zhu, Y., Zhao, D., Xu, X., Chemparathy, 

A., Wang, H., Cao, Y., Nakamura, M., Noordermeer, J. N., La Russa, M., 

Wong, W. H., Zhao, K., & Qi, L. S. (2022). Nested epistasis enhancer 

networks for robust genome regulation. Science, 377(6610), 1077–1085. 

https://doi.org/10.1126/science.abk3512 

Lis, J. T., Mason, P., Peng, J., Price, D. H., & Werner, J. (2000). P-TEFb kinase 

recruitment and function at heat shock loci. Genes & Development, 14(7), 

792–803. 

Liu, W., Ma, Q., Wong, K., Li, W., Ohgi, K., Zhang, J., Aggarwal, A., & Rosenfeld, 

M. G. (2013). Brd4 and JMJD6-associated anti-pause enhancers in 


regulation of transcriptional pause release. Cell, 155(7), 1581–1595. 

https://doi.org/10.1016/j.cell.2013.10.056 

Livak, K. J., & Schmittgen, T. D. (2001). Analysis of relative gene expression data 

using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. 

Methods (San Diego, Calif.), 25(4), 402–408. 

https://doi.org/10.1006/meth.2001.1262 

Lopez-Delisle, L., Rabbani, L., Wolff, J., Bhardwaj, V., Backofen, R., Grüning, B., 

Ramírez, F., & Manke, T. (2021). pyGenomeTracks: Reproducible plots for 

multivariate genomic datasets. Bioinformatics (Oxford, England), 37(3), 

422–423. https://doi.org/10.1093/bioinformatics/btaa692 

Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change 

and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 

550. https://doi.org/10.1186/s13059-014-0550-8 

Lu, X., Zhu, X., Li, Y., Liu, M., Yu, B., Wang, Y., Rao, M., Yang, H., Zhou, K., 

Wang, Y., Chen, Y., Chen, M., Zhuang, S., Chen, L.-F., Liu, R., & Chen, R. 

(2016). Multiple P-TEFbs cooperatively regulate the release of promoter-

proximally paused RNA polymerase II. Nucleic Acids Research, 44(14), 

6853–6867. https://doi.org/10.1093/nar/gkw571 

Luo, Y., Hitz, B. C., Gabdank, I., Hilton, J. A., Kagda, M. S., Lam, B., Myers, Z., 

Sud, P., Jou, J., Lin, K., Baymuradov, U. K., Graham, K., Litton, C., 


Miyasato, S. R., Strattan, J. S., Jolanki, O., Lee, J.-W., Tanaka, F. Y., 

Adenekan, P., … Cherry, J. M. (2020). New developments on the 

Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids 

Research, 48(D1), D882–D889. https://doi.org/10.1093/nar/gkz1062 

Lupiáñez, D. G., Kraft, K., Heinrich, V., Krawitz, P., Brancati, F., Klopocki, E., 

Horn, D., Kayserili, H., Opitz, J. M., Laxova, R., Santos-Simarro, F., Gilbert-

Dussardier, B., Wittler, L., Borschiwer, M., Haas, S. A., Osterwalder, M., 

Franke, M., Timmermann, B., Hecht, J., … Mundlos, S. (2015). Disruptions 

of Topological Chromatin Domains Cause Pathogenic Rewiring of Gene-

Enhancer Interactions. Cell, 161(5), 1012–1025. 

https://doi.org/10.1016/j.cell.2015.04.004 

Lutfalla, G., & Uze, G. (2006). Performing quantitative reverse-transcribed 

polymerase chain reaction experiments. Methods in Enzymology, 410, 386–

400. https://doi.org/10.1016/S0076-6879(06)10019-1 

Mahat, D. B., Kwak, H., Booth, G. T., Jonkers, I. H., Danko, C. G., Patel, R. K., 

Waters, C. T., Munson, K., Core, L. J., & Lis, J. T. (2016). Base-pair-

resolution genome-wide mapping of active RNA polymerases using 

precision nuclear run-on (PRO-seq). Nature Protocols, 11(8), 1455–1476. 

https://doi.org/10.1038/nprot.2016.086 

Mahat, D. B., Salamanca, H. H., Duarte, F. M., Danko, C. G., & Lis, J. T. (2016). 


Mammalian Heat Shock Response and Mechanisms Underlying Its 

Genome-wide Transcriptional Regulation. Molecular Cell, 62(1), 63–78. 

https://doi.org/10.1016/j.molcel.2016.02.025 

Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., Norville, J. E., & 

Church, G. M. (2013). RNA-guided human genome engineering via Cas9. 

Science (New York, N.Y.), 339(6121), 823–826. 

https://doi.org/10.1126/science.1232033 

Martin, K. J., Lillie, J. W., & Green, M. R. (1990). Evidence for interaction of 

different eukaryotic transcriptional activators with distinct cellular targets. 

Nature, 346(6280), 147–152. https://doi.org/10.1038/346147a0 

Martinez-Ara, M., Comoglio, F., van Arensbergen, J., & van Steensel, B. (2022). 

Systematic analysis of intrinsic enhancer-promoter compatibility in the 

mouse genome. Molecular Cell, 82(13), 2519-2531.e6. 

https://doi.org/10.1016/j.molcel.2022.04.009 

Martyn, G. E., Montgomery, M. T., Jones, H., Guo, K., Doughty, B. R., Linder, J., 

Bisht, D., Xia, F., Cai, X. S., Chen, Z., Cochran, K., Lawrence, K. A., 

Munson, G., Pampari, A., Fulco, C. P., Sahni, N., Kelley, D. R., Lander, E. S., 

Kundaje, A., & Engreitz, J. M. (2025). Rewriting regulatory DNA to dissect 

and reprogram gene expression. Cell, S0092-8674(25)00352-6. 

https://doi.org/10.1016/j.cell.2025.03.034 


Matreyek, K. A., Stephany, J. J., Chiasson, M. A., Hasle, N., & Fowler, D. M. (2020). 

An improved platform for functional assessment of large protein libraries in 

mammalian cells. Nucleic Acids Research, 48(1), e1. 

https://doi.org/10.1093/nar/gkz910 

Medstrand, P., Landry, J. R., & Mager, D. L. (2001). Long terminal repeats are used 

as alternative promoters for the endothelin B receptor and apolipoprotein 

C-I genes in humans. The Journal of Biological Chemistry, 276(3), 1896–

1903. https://doi.org/10.1074/jbc.M006557200 

Meier, N., Krpic, S., Rodriguez, P., Strouboulis, J., Monti, M., Krijgsveld, J., Gering, 

M., Patient, R., Hostert, A., & Grosveld, F. (2006). Novel binding partners of 

Ldb1 are required for haematopoietic development. Development, 133(24), 

4913–4923. https://doi.org/10.1242/dev.02656 

Melnikov, A., Murugan, A., Zhang, X., Tesileanu, T., Wang, L., Rogov, P., Feizi, S., 

Gnirke, A., Callan, C. G., Kinney, J. B., Kellis, M., Lander, E. S., & 

Mikkelsen, T. S. (2012). Systematic dissection and optimization of inducible 

enhancers in human cells using a massively parallel reporter assay. Nature 

Biotechnology, 30(3), 271–277. https://doi.org/10.1038/nbt.2137 

Mercola, M., Goverman, J., Mirell, C., & Calame, K. (1985). Immunoglobulin 

Heavy-Chain Enhancer Requires One or More Tissue-Specific Factors. 

Science, 227(4684), 266–270. https://doi.org/10.1126/science.3917575 


Merli, C., Bergstrom, D. E., Cygan, J. A., & Blackman, R. K. (1996). Promoter 

specificity mediates the independent regulation of neighboring genes. 

Genes & Development, 10(10), 1260–1270. 

https://doi.org/10.1101/gad.10.10.1260 

Meyer, M. E., Gronemeyer, H., Turcotte, B., Bocquel, M. T., Tasset, D., & 

Chambon, P. (1989). Steroid hormone receptors compete for factors that 

mediate their enhancer function. Cell, 57(3), 433–442. 

https://doi.org/10.1016/0092-8674(89)90918-5 

Meyer, W. K., Reichenbach, P., Schindler, U., Soldaini, E., & Nabholz, M. (1997). 

Interaction of STAT5 dimers on two low affinity binding sites mediates 

interleukin 2 (IL-2) stimulation of IL-2 receptor alpha gene transcription. 

The Journal of Biological Chemistry, 272(50), 31821–31828. 

https://doi.org/10.1074/jbc.272.50.31821 

Mikhaylichenko, O., Bondarenko, V., Harnett, D., Schor, I. E., Males, M., Viales, 

R. R., & Furlong, E. E. M. (2018). The degree of enhancer or promoter 

activity is reflected by the levels and directionality of eRNA transcription. 

Genes & Development, 32(1), 42–57. 

https://doi.org/10.1101/gad.308619.117 

Miller, J. A., & Widom, J. (2003). Collaborative competition mechanism for gene 

activation in vivo. Molecular and Cellular Biology, 23(5), 1623–1632. 


https://doi.org/10.1128/MCB.23.5.1623-1632.2003 

Mirny, L. A. (2010). Nucleosome-mediated cooperativity between transcription 

factors. Proceedings of the National Academy of Sciences of the United 

States of America, 107(52), 22534–22539. 

https://doi.org/10.1073/pnas.0913805107 

Mitchell, P. J., Wang, C., & Tjian, R. (1987). Positive and negative regulation of 

transcription in vitro: Enhancer-binding protein AP-2 is inhibited by SV40 

T antigen. Cell, 50(6), 847–861. 

https://doi.org/10.1016/0092-8674(87)90512-5 

Moreau, P., Hen, R., Wasylyk, B., Everett, R., Gaub, M. P., & Chambon, P. (1981). 

The SV40 72 base repair repeat has a striking effect on gene expression both 

in SV40 and other chimeric recombinants. Nucleic Acids Research, 9(22), 

6047–6068. https://doi.org/10.1093/nar/9.22.6047 

Morgunova, E., & Taipale, J. (2017). Structural perspective of cooperative 

transcription factor binding. Current Opinion in Structural Biology, 47, 1–

8. https://doi.org/10.1016/j.sbi.2017.03.006 

Mudge, J. M., Carbonell-Sala, S., Diekhans, M., Martinez, J. G., Hunt, T., Jungreis, 

I., Loveland, J. E., Arnan, C., Barnes, I., Bennett, R., Berry, A., Bignell, A., 

Cerdán-Vélez, D., Cochran, K., Cortés, L. T., Davidson, C., Donaldson, S., 

Dursun, C., Fatima, R., … Frankish, A. (2025). GENCODE 2025: Reference 


gene annotation for human and mouse. Nucleic Acids Research, 53(D1), 

D966–D975. https://doi.org/10.1093/nar/gkae1078 

Neuberger, M. S. (1983). Expression and regulation of immunoglobulin heavy 

chain gene transfected into lymphoid cells. The EMBO Journal, 2(8), 1373–

1378. https://doi.org/10.1002/j.1460-2075.1983.tb01594.x 

Neumayr, C., Haberle, V., Serebreni, L., Karner, K., Hendy, O., Boija, A., 

Henninger, J. E., Li, C. H., Stejskal, K., Lin, G., Bergauer, K., Pagani, M., 

Rath, M., Mechtler, K., Arnold, C. D., & Stark, A. (2022). Differential 

cofactor dependencies define distinct types of human enhancers. Nature, 

606(7913), 406–413. https://doi.org/10.1038/s41586-022-04779-x 

Niers, J. M., Chen, J. W., Weissleder, R., & Tannous, B. A. (2011). Enhanced in 

vivo imaging of metabolically biotinylated cell surface reporters. Analytical 

Chemistry, 83(3), 994–999. https://doi.org/10.1021/ac102758m 

Nora, E. P., Goloborodko, A., Valton, A.-L., Gibcus, J. H., Uebersohn, A., 

Abdennur, N., Dekker, J., Mirny, L. A., & Bruneau, B. G. (2017). Targeted 

Degradation of CTCF Decouples Local Insulation of Chromosome Domains 

from Genomic Compartmentalization. Cell, 169(5), 930-944.e22. 

https://doi.org/10.1016/j.cell.2017.05.004 

Nora, E. P., Lajoie, B. R., Schulz, E. G., Giorgetti, L., Okamoto, I., Servant, N., 

Piolot, T., van Berkum, N. L., Meisig, J., Sedat, J., Gribnau, J., Barillot, E., 


Blüthgen, N., Dekker, J., & Heard, E. (2012). Spatial partitioning of the 

regulatory landscape of the X-inactivation centre. Nature, 485(7398), 381–

385. https://doi.org/10.1038/nature11049 

Nuez, B., Michalovich, D., Bygrave, A., Ploemacher, R., & Grosveld, F. (1995). 

Defective haematopoiesis in fetal liver resulting from inactivation of the 

EKLF gene. Nature, 375(6529), 316–318. https://doi.org/10.1038/375316a0 

Okuda, T., van Deursen, J., Hiebert, S. W., Grosveld, G., & Downing, J. R. (1996). 

AML1, the target of multiple chromosomal translocations in human 

leukemia, is essential for normal fetal liver hematopoiesis. Cell, 84(2), 321–

330. https://doi.org/10.1016/s0092-8674(00)80986-1 

Osterwalder, M., Barozzi, I., Tissières, V., Fukuda-Yuzawa, Y., Mannion, B. J., 

Afzal, S. Y., Lee, E. A., Zhu, Y., Plajzer-Frick, I., Pickle, C. S., Kato, M., 

Garvin, T. H., Pham, Q. T., Harrington, A. N., Akiyama, J. A., Afzal, V., 

Lopez-Rios, J., Dickel, D. E., Visel, A., & Pennacchio, L. A. (2018). 

Enhancer redundancy provides phenotypic robustness in mammalian 

development. Nature, 554(7691), 239–243. 

https://doi.org/10.1038/nature25461 

Panne, D., Maniatis, T., & Harrison, S. C. (2007). An atomic model of the 

interferon-beta enhanceosome. Cell, 129(6), 1111–1123. 

https://doi.org/10.1016/j.cell.2007.05.019 


Patwardhan, R. P., Hiatt, J. B., Witten, D. M., Kim, M. J., Smith, R. P., May, D., 

Lee, C., Andrie, J. M., Lee, S.-I., Cooper, G. M., Ahituv, N., Pennacchio, L. 

A., & Shendure, J. (2012). Massively parallel functional dissection of 

mammalian enhancers in vivo. Nature Biotechnology, 30(3), 265–270. 

https://doi.org/10.1038/nbt.2136 

Payvar, F., DeFranco, D., Firestone, G. L., Edgar, B., Wrange, O., Okret, S., 

Gustafsson, J. A., & Yamamoto, K. R. (1983). Sequence-specific binding of 

glucocorticoid receptor to MTV DNA at sites within and upstream of the 

transcribed region. Cell, 35(2 Pt 1), 381–392. 

https://doi.org/10.1016/0092-8674(83)90171-x 

Perez, G., Barber, G. P., Benet-Pages, A., Casper, J., Clawson, H., Diekhans, M., 

Fischer, C., Gonzalez, J. N., Hinrichs, A. S., Lee, C. M., Nassar, L. R., Raney, 

B. J., Speir, M. L., van Baren, M. J., Vaske, C. J., Haussler, D., Kent, W. J., & 

Haeussler, M. (2025). The UCSC Genome Browser database: 2025 update. 

Nucleic Acids Research, 53(D1), D1243–D1249. 

https://doi.org/10.1093/nar/gkae974 

Perkins, A. C., Sharpe, A. H., & Orkin, S. H. (1995). Lethal beta-thalassaemia in 

mice lacking the erythroid CACCC-transcription factor EKLF. Nature, 

375(6529), 318–322. https://doi.org/10.1038/375318a0 

Perry, M. W., Boettiger, A. N., Bothma, J. P., & Levine, M. (2010). Shadow 


Enhancers Foster Robustness of Drosophila Gastrulation. Current Biology, 

20(17), 1562–1567. https://doi.org/10.1016/j.cub.2010.07.043 

Pevny, L., Simon, M. C., Robertson, E., Klein, W. H., Tsai, S. F., D’Agati, V., Orkin, 

S. H., & Costantini, F. (1991). Erythroid differentiation in chimaeric mice 

blocked by a targeted mutation in the gene for transcription factor GATA-

1. Nature, 349(6306), 257–260. https://doi.org/10.1038/349257a0 

Phan, M. H. Q., Zehnder, T., Puntieri, F., Magg, A., Majchrzycka, B., Antonović, 

M., Wieler, H., Lo, B.-W., Baranasic, D., Lenhard, B., Müller, F., Vingron, 

M., & Ibrahim, D. M. (2025). Conservation of regulatory elements with 

highly diverged sequences across large evolutionary distances. Nature 

Genetics, 57(6), 1524–1534. https://doi.org/10.1038/s41588-025-02202-5 

Picard, D., & Schaffner, W. (1984). A lymphocyte-specific enhancer in the mouse 

immunoglobulin κ gene. Nature, 307(5946), 80–82. 

https://doi.org/10.1038/307080a0 

Ptashne, M., & Gann, A. A. (1990). Activators and targets. Nature, 346(6282), 329–

331. https://doi.org/10.1038/346329a0 

Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., 

& Lim, W. A. (2013). Repurposing CRISPR as an RNA-Guided Platform for 

Sequence-Specific Control of Gene Expression. Cell, 152(5), 1173–1183. 

https://doi.org/10.1016/j.cell.2013.02.022 


Queen, C., & Baltimore, D. (1983). Immunoglobulin gene transcription is activated 

by downstream sequence elements. Cell, 33(3), 741–748. 

https://doi.org/10.1016/0092-8674(83)90016-8 

Quinlan, A. R., & Hall, I. M. (2010). BEDTools: A flexible suite of utilities for 

comparing genomic features. Bioinformatics (Oxford, England), 26(6), 841–

842. https://doi.org/10.1093/bioinformatics/btq033 

Rahl, P. B., Lin, C. Y., Seila, A. C., Flynn, R. A., McCuine, S., Burge, C. B., Sharp, P. 

A., & Young, R. A. (2010). c-Myc regulates transcriptional pause release. 

Cell, 141(3), 432–445. https://doi.org/10.1016/j.cell.2010.03.030 

Ramírez, F., Ryan, D. P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., 

Heyne, S., Dündar, F., & Manke, T. (2016). deepTools2: A next generation 

web server for deep-sequencing data analysis. Nucleic Acids Research, 

44(W1), W160-165. https://doi.org/10.1093/nar/gkw257 

Rao, S. S. P., Huang, S.-C., Hilaire, B. G. S., Engreitz, J. M., Perez, E. M., Kieffer-

Kwon, K.-R., Sanborn, A. L., Johnstone, S. E., Bascom, G. D., Bochkov, I. 

D., Huang, X., Shamim, M. S., Shin, J., Turner, D., Ye, Z., Omer, A. D., 

Robinson, J. T., Schlick, T., Bernstein, B. E., … Aiden, E. L. (2017). Cohesin 

Loss Eliminates All Loop Domains. Cell, 171(2), 305-320.e24. 

https://doi.org/10.1016/j.cell.2017.09.026 

Rao, S. S. P., Huntley, M. H., Durand, N. C., Stamenova, E. K., Bochkov, I. D., 


Robinson, J. T., Sanborn, A. L., Machol, I., Omer, A. D., Lander, E. S., & 

Aiden, E. L. (2014). A 3D map of the human genome at kilobase resolution 

reveals principles of chromatin looping. Cell, 159(7), 1665–1680. 

https://doi.org/10.1016/j.cell.2014.11.021 

Rebouissou, C., Sallis, S., & Forné, T. (2022). Quantitative Chromosome 

Conformation Capture (3C-qPCR). Methods in Molecular Biology (Clifton, 

N.J.), 2532, 3–13. https://doi.org/10.1007/978-1-0716-2497-5_1 

Reilly, S. K., Gosai, S. J., Gutierrez, A., Mackay-Smith, A., Ulirsch, J. C., Kanai, M., 

Mouri, K., Berenzy, D., Kales, S., Butler, G. M., Gladden-Young, A., 

Bhuiyan, R. M., Stitzel, M. L., Finucane, H. K., Sabeti, P. C., & Tewhey, R. 

(2021). Direct characterization of cis-regulatory elements and functional 

dissection of complex genetic associations using HCR-FlowFISH. Nature 

Genetics, 53(8), 1166–1176. https://doi.org/10.1038/s41588-021-00900-4 

Rengachari, S., Schilbach, S., Aibara, S., Dienemann, C., & Cramer, P. (2021). 

Structure of the human Mediator–RNA polymerase II pre-initiation 

complex. Nature, 594(7861), 129–133. 

https://doi.org/10.1038/s41586-021-03555-7 

Reske, J. J., Wilson, M. R., & Chandler, R. L. (2020). ATAC-seq normalization 

method can significantly affect differential accessibility analysis and 

interpretation. Epigenetics & Chromatin, 13(1), 22. 


https://doi.org/10.1186/s13072-020-00342-y 

Roh, H., Shen, S. P., Hu, Y., Kwok, H. S., Siegenfeld, A. P., Lee, C., Zepeda, M. A., 

Guo, C.-J., Roseman, S. A., Sankaran, V. G., Buenrostro, J. D., & Liau, B. B. 

(2024). Coupling CRISPR Scanning with Targeted Chromatin Accessibility 

Profiling using a Double-Stranded DNA Deaminase. bioRxiv, 

2024.12.17.628791. https://doi.org/10.1101/2024.12.17.628791 

Sabari, B. R., Dall’Agnese, A., Boija, A., Klein, I. A., Coffey, E. L., Shrinivas, K., 

Abraham, B. J., Hannett, N. M., Zamudio, A. V., Manteiga, J. C., Li, C. H., 

Guo, Y. E., Day, D. S., Schuijers, J., Vasile, E., Malik, S., Hnisz, D., Lee, T. I., 

Cisse, I. I., … Young, R. A. (2018). Coactivator condensation at super-

enhancers links phase separation and gene control. Science, 361(6400), 

eaar3958. https://doi.org/10.1126/science.aar3958 

Sahu, B., Hartonen, T., Pihlajamaa, P., Wei, B., Dave, K., Zhu, F., Kaasinen, E., 

Lidschreiber, K., Lidschreiber, M., Daub, C. O., Cramer, P., Kivioja, T., & 

Taipale, J. (2022). Sequence determinants of human gene regulatory 

elements. Nature Genetics, 54(3), 283–294. 

https://doi.org/10.1038/s41588-021-01009-4 

Sanborn, A. L., Rao, S. S. P., Huang, S.-C., Durand, N. C., Huntley, M. H., Jewett, 

A. I., Bochkov, I. D., Chinnappan, D., Cutkosky, A., Li, J., Geeting, K. P., 

Gnirke, A., Melnikov, A., McKenna, D., Stamenova, E. K., Lander, E. S., & 


Aiden, E. L. (2015). Chromatin extrusion explains key features of loop and 

domain formation in wild-type and engineered genomes. Proceedings of 

the National Academy of Sciences, 112(47), E6456–E6465. 

https://doi.org/10.1073/pnas.1518552112 

Sartorelli, V., & Lauberth, S. M. (2020). Enhancer RNAs are an important 

regulatory layer of the epigenome. Nature Structural & Molecular Biology, 

27(6), 521–528. https://doi.org/10.1038/s41594-020-0446-0 

Sassone-Corsi, P., Wildeman, A., & Chambon, P. (1985). A trans-acting factor is 

responsible for the simian virus 40 enhancer activity in vitro. Nature, 

313(6002), 458–463. https://doi.org/10.1038/313458a0 

Sayers, E. W., Beck, J., Bolton, E. E., Brister, J. R., Chan, J., Connor, R., Feldgarden, 

M., Fine, A. M., Funk, K., Hoffman, J., Kannan, S., Kelly, C., Klimke, W., 

Kim, S., Lathrop, S., Marchler-Bauer, A., Murphy, T. D., O’Sullivan, C., 

Schmieder, E., … Pruitt, K. D. (2025). Database resources of the National 

Center for Biotechnology Information in 2025. Nucleic Acids Research, 

53(D1), D20–D29. https://doi.org/10.1093/nar/gkae979 

Schöler, H. R., & Gruss, P. (1984). Specific interaction between enhancer-

containing molecules and cellular components. Cell, 36(2), 403–411. 

https://doi.org/10.1016/0092-8674(84)90233-2 

Schöler, H. R., & Gruss, P. (1985). Cell type-specific transcriptional enhancement 


in vitro requires the presence of trans-acting factors. The EMBO Journal, 

4(11), 3005–3013. https://doi.org/10.1002/j.1460-2075.1985.tb04036.x 

Schulz, V. P., Yan, H., Lezon-Geyda, K., An, X., Hale, J., Hillyer, C. D., Mohandas, 

N., & Gallagher, P. G. (2019). A Unique Epigenomic Landscape Defines 

Human Erythropoiesis. Cell Reports, 28(11), 2996-3009.e7. 

https://doi.org/10.1016/j.celrep.2019.08.020 

Sen, R., & Baltimore, D. (1986). Multiple nuclear factors interact with the 

immunoglobulin enhancer sequences. Cell, 46(5), 705–716. 

https://doi.org/10.1016/0092-8674(86)90346-6 

Senichkin, V. V., Prokhorova, E. A., Zhivotovsky, B., & Kopeina, G. S. (2021). 

Simple and Efficient Protocol for Subcellular Fractionation of Normal and 

Apoptotic Cells. Cells, 10(4), 852. https://doi.org/10.3390/cells10040852 

Shiama, N. (1997). The p300/CBP family: Integrating signals with transcription 

factors and chromatin. Trends in Cell Biology, 7(6), 230–236. 

https://doi.org/10.1016/S0962-8924(97)01048-9 

Shin, H. Y., Willi, M., Yoo, K. H., Zeng, X., Wang, C., Metser, G., & 

Hennighausen, L. (2016). Hierarchy within the mammary STAT5-driven 

Wap super-enhancer. Nature Genetics, 48(8), 904–911. 

https://doi.org/10.1038/ng.3606 

Singh, H., Sen, R., Baltimore, D., & Sharp, P. A. (1986). A nuclear factor that binds 


to a conserved sequence motif in transcriptional control elements of 

immunoglobulin genes. Nature, 319(6049), 154–158. 

https://doi.org/10.1038/319154a0 

Smith, R. P., Taher, L., Patwardhan, R. P., Kim, M. J., Inoue, F., Shendure, J., 

Ovcharenko, I., & Ahituv, N. (2013). Massively parallel decoding of 

mammalian regulatory sequences supports a flexible organizational model. 

Nature Genetics, 45(9), 1021–1028. https://doi.org/10.1038/ng.2713 

Smith, T., Heger, A., & Sudbery, I. (2017). UMI-tools: Modeling sequencing errors 

in Unique Molecular Identifiers to improve quantification accuracy. 

Genome Research, 27(3), 491–499. https://doi.org/10.1101/gr.209601.116 

Snetkova, V., Ypsilanti, A. R., Akiyama, J. A., Mannion, B. J., Plajzer-Frick, I., 

Novak, C. S., Harrington, A. N., Pham, Q. T., Kato, M., Zhu, Y., Godoy, J., 

Meky, E., Hunter, R. D., Shi, M., Kvon, E. Z., Afzal, V., Tran, S., 

Rubenstein, J. L. R., Visel, A., … Dickel, D. E. (2021). Ultraconserved 

enhancer function does not require perfect sequence conservation. Nature 

Genetics, 53(4), 521–528. https://doi.org/10.1038/s41588-021-00812-3 

Socolovsky, M., Fallon, A. E., Wang, S., Brugnara, C., & Lodish, H. F. (1999). Fetal 

anemia and apoptosis of red cell progenitors in Stat5a-/-5b-/- mice: A direct 

role for Stat5 in Bcl-X(L) induction. Cell, 98(2), 181–191. 

https://doi.org/10.1016/s0092-8674(00)81013-2 


Soldaini, E., John, S., Moro, S., Bollenbacher, J., Schindler, U., & Leonard, W. J. 

(2000). DNA binding site selection of dimeric and tetrameric Stat5 proteins 

reveals a large repertoire of divergent tetrameric Stat5a binding sites. 

Molecular and Cellular Biology, 20(1), 389–401. 

https://doi.org/10.1128/MCB.20.1.389-401.2000 

Song, S.-H., Hou, C., & Dean, A. (2007). A positive role for NLI/Ldb1 in long range 

β-globin locus control region function. Molecular Cell, 28(5), 810–822. 

https://doi.org/10.1016/j.molcel.2007.09.025 

Spektor, R., Tippens, N. D., Mimoso, C. A., & Soloway, P. D. (2019). Methyl-

ATAC-seq measures DNA methylation at accessible chromatin. Genome 

Research, 29(6), 969–977. https://doi.org/10.1101/gr.245399.118 

Spitz, F., & Furlong, E. E. M. (2012). Transcription factors: From enhancer binding 

to developmental control. Nature Reviews. Genetics, 13(9), 613–626. 

https://doi.org/10.1038/nrg3207 

Staudt, L. M., Singh, H., Sen, R., Wirth, T., Sharp, P. A., & Baltimore, D. (1986). A 

lymphoid-specific protein binding to the octamer motif of immunoglobulin 

genes. Nature, 323(6089), 640–643. https://doi.org/10.1038/323640a0 

Stein, R. W., Corrigan, M., Yaciuk, P., Whelan, J., & Moran, E. (1990). Analysis of 

E1A-mediated growth regulation functions: Binding of the 300-kilodalton 

cellular product correlates with E1A enhancer repression function and 


DNA synthesis-inducing activity. Journal of Virology, 64(9), 4421–4427. 

https://doi.org/10.1128/jvi.64.9.4421-4427.1990 

Storer, J., Hubley, R., Rosen, J., Wheeler, T. J., & Smit, A. F. (2021). The Dfam 

community resource of transposable element families, sequence models, 

and genome annotations. Mobile DNA, 12(1), 2. 

https://doi.org/10.1186/s13100-020-00230-y 

Tarbell, E. D., & Liu, T. (2019). HMMRATAC: A Hidden Markov ModeleR for 

ATAC-seq. Nucleic Acids Research, 47(16), e91. 

https://doi.org/10.1093/nar/gkz533 

Taskiran, I. I., Spanier, K. I., Dickmänken, H., Kempynck, N., Pančíková, A., Ekşi, 

E. C., Hulselmans, G., Ismail, J. N., Theunis, K., Vandepoel, R., Christiaens, 

V., Mauduit, D., & Aerts, S. (2024). Cell-type-directed design of synthetic 

enhancers. Nature, 626(7997), 212–220. 

https://doi.org/10.1038/s41586-023-06936-2 

Thanos, D., & Maniatis, T. (1995). Virus induction of human IFN beta gene 

expression requires the assembly of an enhanceosome. Cell, 83(7), 1091–

1100. https://doi.org/10.1016/0092-8674(95)90136-1 

Thomas, H. F., Kotova, E., Jayaram, S., Pilz, A., Romeike, M., Lackner, A., Penz, T., 

Bock, C., Leeb, M., Halbritter, F., Wysocka, J., & Buecker, C. (2021). 

Temporal dissection of an enhancer cluster reveals distinct temporal and 


functional contributions of individual elements. Molecular Cell, 81(5), 969-

982.e13. https://doi.org/10.1016/j.molcel.2020.12.047 

Thurman, R. E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M. T., Haugen, E., 

Sheffield, N. C., Stergachis, A. B., Wang, H., Vernot, B., Garg, K., John, S., 

Sandstrom, R., Bates, D., Boatman, L., Canfield, T. K., Diegel, M., Dunn, D., 

Ebersol, A. K., … Stamatoyannopoulos, J. A. (2012). The accessible 

chromatin landscape of the human genome. Nature, 489(7414), 75–82. 

https://doi.org/10.1038/nature11232 

Tian, B., Yang, J., & Brasier, A. R. (2012). Two-step cross-linking for analysis of 

protein-chromatin interactions. Methods in Molecular Biology (Clifton, 

N.J.), 809, 105–120. https://doi.org/10.1007/978-1-61779-376-9_7 

Tippens, N. D., Liang, J., Leung, A. K.-Y., Wierbowski, S. D., Ozer, A., Booth, J. G., 

Lis, J. T., & Yu, H. (2020). Transcription imparts architecture, function and 

logic to enhancer units. Nature Genetics, 52(10), 1067–1075. 

https://doi.org/10.1038/s41588-020-0686-2 

Tolhuis, B., Palstra, R. J., Splinter, E., Grosveld, F., & de Laat, W. (2002). Looping 

and interaction between hypersensitive sites in the active beta-globin locus. 

Molecular Cell, 10(6), 1453–1465. 

https://doi.org/10.1016/s1097-2765(02)00781-5 

Tóthová, Z., Tomc, J., Debeljak, N., & Solár, P. (2021). STAT5 as a Key Protein of 


Erythropoietin Signalization. International Journal of Molecular Sciences, 

22(13), 7109. https://doi.org/10.3390/ijms22137109 

Treisman, R. (1985). Transient accumulation of c-fos RNA following serum 

stimulation requires a conserved 5’ element and c-fos 3’ sequences. Cell, 

42(3), 889–902. https://doi.org/10.1016/0092-8674(85)90285-5 

Trojanowski, J., Frank, L., Rademacher, A., Mücke, N., Grigaitis, P., & Rippe, K. 

(2022). Transcription activation is enhanced by multivalent interactions 

independent of phase separation. Molecular Cell, 82(10), 1878-1893.e10. 

https://doi.org/10.1016/j.molcel.2022.04.017 

Ulirsch, J. C., Nandakumar, S. K., Wang, L., Giani, F. C., Zhang, X., Rogov, P., 

Melnikov, A., McDonel, P., Do, R., Mikkelsen, T. S., & Sankaran, V. G. 

(2016). Systematic Functional Dissection of Common Genetic Variation 

Affecting Red Blood Cell Traits. Cell, 165(6), 1530–1545. 

https://doi.org/10.1016/j.cell.2016.04.048 

Vakoc, C. R., Letting, D. L., Gheldof, N., Sawado, T., Bender, M. A., Groudine, M., 

Weiss, M. J., Dekker, J., & Blobel, G. A. (2005). Proximity among distant 

regulatory elements at the beta-globin locus requires GATA-1 and FOG-1. 

Molecular Cell, 17(3), 453–462. https://doi.org/10.1016/j.molcel.2004.12.028 

Vihervaara, A., Mahat, D. B., Guertin, M. J., Chu, T., Danko, C. G., Lis, J. T., & 

Sistonen, L. (2017). Transcriptional response to stress is pre-wired by 


promoter and enhancer architecture. Nature Communications, 8(1), 255. 

https://doi.org/10.1038/s41467-017-00151-0 

Vihervaara, A., Mahat, D. B., Himanen, S. V., Blom, M. A. H., Lis, J. T., & Sistonen, 

L. (2021). Stress-induced transcriptional memory accelerates promoter-

proximal pause release and decelerates termination over mitotic divisions. 

Molecular Cell, 81(8), 1715-1731.e6. 

https://doi.org/10.1016/j.molcel.2021.03.007 

Vihervaara, A., Sergelius, C., Vasara, J., Blom, M. A. H., Elsing, A. N., Roos-

Mattjus, P., & Sistonen, L. (2013). Transcriptional response to stress in the 

dynamic chromatin environment of cycling and mitotic cells. Proceedings 

of the National Academy of Sciences, 110(36), E3388–E3397. 

https://doi.org/10.1073/pnas.1305275110 

Villar, D., Berthelot, C., Aldridge, S., Rayner, T. F., Lukk, M., Pignatelli, M., Park, 

T. J., Deaville, R., Erichsen, J. T., Jasinska, A. J., Turner, J. M. A., Bertelsen, 

M. F., Murchison, E. P., Flicek, P., & Odom, D. T. (2015). Enhancer 

Evolution across 20 Mammalian Species. Cell, 160(3), 554–566. 

https://doi.org/10.1016/j.cell.2015.01.006 

Visel, A., Blow, M. J., Li, Z., Zhang, T., Akiyama, J. A., Holt, A., Plajzer-Frick, I., 

Shoukry, M., Wright, C., Chen, F., Afzal, V., Ren, B., Rubin, E. M., & 

Pennacchio, L. A. (2009). ChIP-seq accurately predicts tissue-specific 


activity of enhancers. Nature, 457(7231), 854–858. 

https://doi.org/10.1038/nature07730 

Wadman, I. A., Osada, H., Grütz, G. G., Agulnick, A. D., Westphal, H., Forster, A., 

& Rabbitts, T. H. (1997). The LIM-only protein Lmo2 is a bridging molecule 

assembling an erythroid, DNA-binding complex which includes the TAL1, 

E47, GATA-1 and Ldb1/NLI proteins. The EMBO Journal, 16(11), 3145–

3157. https://doi.org/10.1093/emboj/16.11.3145 

Weber-Nordt, R. M., Egen, C., Wehinger, J., Ludwig, W., Gouilleux-Gruart, V., 

Mertelsmann, R., & Finke, J. (1996). Constitutive activation of STAT 

proteins in primary lymphoid and myeloid leukemia cells and in Epstein-

Barr virus (EBV)-related lymphoma cell lines. Blood, 88(3), 809–816. 

Wei, X., Das, J., Fragoza, R., Liang, J., Bastos de Oliveira, F. M., Lee, H. R., Wang, 

X., Mort, M., Stenson, P. D., Cooper, D. N., Lipkin, S. M., Smolka, M. B., & 

Yu, H. (2014). A massively parallel pipeline to clone DNA variants and 

examine molecular phenotypes of human disease mutations. PLoS Genetics, 

10(12), e1004819. https://doi.org/10.1371/journal.pgen.1004819 

Weinberger, J., Baltimore, D., & Sharp, P. A. (1986). Distinct factors bind to 

apparently homologous sequences in the immunoglobulin heavy-chain 

enhancer. Nature, 322(6082), 846–848. https://doi.org/10.1038/322846a0 

Weirauch, M. T., Yang, A., Albu, M., Cote, A. G., Montenegro-Montero, A., 


Drewe, P., Najafabadi, H. S., Lambert, S. A., Mann, I., Cook, K., Zheng, H., 

Goity, A., van Bakel, H., Lozano, J.-C., Galli, M., Lewsey, M. G., Huang, E., 

Mukherjee, T., Chen, X., … Hughes, T. R. (2014). Determination and 

inference of eukaryotic transcription factor sequence specificity. Cell, 

158(6), 1431–1443. https://doi.org/10.1016/j.cell.2014.08.009 

Whyte, P., Williamson, N. M., & Harlow, E. (1989). Cellular targets for 

transformation by the adenovirus E1A proteins. Cell, 56(1), 67–75. 

https://doi.org/10.1016/0092-8674(89)90984-7 

Whyte, W. A., Orlando, D. A., Hnisz, D., Abraham, B. J., Lin, C. Y., Kagey, M. H., 

Rahl, P. B., Lee, T. I., & Young, R. A. (2013). Master Transcription Factors 

and Mediator Establish Super-Enhancers at Key Cell Identity Genes. Cell, 

153(2), 307–319. https://doi.org/10.1016/j.cell.2013.03.035 

Wickham, H. (n.d.). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag 

New York. Retrieved June 3, 2025, from https://ggplot2.tidyverse.org 

Wildeman, A. G., Sassone-Corsi, P., Grundström, T., Zenke, M., & Chambon, P. 

(1984). Stimulation of in vitro transcription from the SV40 early promoter 

by the enhancer involves a specific trans-acting factor. The EMBO Journal, 

3(13), 3129–3133. https://doi.org/10.1002/j.1460-2075.1984.tb02269.x 

Wildeman, A. G., Zenke, M., Schatz, C., Wintzerith, M., Grundström, T., Matthes, 

H., Takahashi, K., & Chambon, P. (1986). Specific protein binding to the 


simian virus 40 enhancer in vitro. Molecular and Cellular Biology, 6(6), 

2098–2105. https://doi.org/10.1128/mcb.6.6.2098-2105.1986 

Wilson, N. K., Foster, S. D., Wang, X., Knezevic, K., Schütte, J., Kaimakis, P., 

Chilarska, P. M., Kinston, S., Ouwehand, W. H., Dzierzak, E., Pimanda, J. 

E., de Bruijn, M. F. T. R., & Göttgens, B. (2010). Combinatorial 

transcriptional control in blood stem/progenitor cells: Genome-wide 

analysis of ten major transcriptional regulators. Cell Stem Cell, 7(4), 532–

544. https://doi.org/10.1016/j.stem.2010.07.016 

Wu, X., & Sharp, P. A. (2013). Divergent transcription: A driving force for new 

gene origination? Cell, 155(5), 990–996. 

https://doi.org/10.1016/j.cell.2013.10.048 

Yang, Z., Yik, J. H. N., Chen, R., He, N., Jang, M. K., Ozato, K., & Zhou, Q. (2005). 

Recruitment of P-TEFb for Stimulation of Transcriptional Elongation by 

the Bromodomain Protein Brd4. Molecular Cell, 19(4), 535–545. 

https://doi.org/10.1016/j.molcel.2005.06.029 

Yao, L., Liang, J., Ozer, A., Leung, A. K.-Y., Lis, J. T., & Yu, H. (2022). A 

comparison of experimental assays and analytical methods for genome-wide 

identification of active enhancers. Nature Biotechnology, 40(7), 1056–1065. 

https://doi.org/10.1038/s41587-022-01211-7 

Yee, S. P., & Branton, P. E. (1985). Detection of cellular proteins associated with 


human adenovirus type 5 early region 1A polypeptides. Virology, 147(1), 

142–153. https://doi.org/10.1016/0042-6822(85)90234-x 

Zabidi, M. A., Arnold, C. D., Schernhuber, K., Pagani, M., Rath, M., Frank, O., & 

Stark, A. (2015). Enhancer–core-promoter specificity separates 

developmental and housekeeping gene regulation. Nature, 518(7540), 556–

559. https://doi.org/10.1038/nature13994 

Zenke, M., Grundström, T., Matthes, H., Wintzerith, M., Schatz, C., Wildeman, A., 

& Chambon, P. (1986). Multiple sequence motifs are involved in SV40 

enhancer function. The EMBO Journal, 5(2), 387–397. 

https://doi.org/10.1002/j.1460-2075.1986.tb04224.x 

Zhang, J., Leung, A. K.-Y., Zhu, Y., Yao, L., Willis, A., Pan, X., Ozer, A., Zhou, Z., 

Siklenka, K., Barrera, A., Liang, J., Tippens, N. D., Reddy, T. E., Lis, J. T., & 

Yu, H. (2025). Comprehensive Evaluation of Diverse Massively Parallel 

Reporter Assays to Functionally Characterize Human Enhancers Genome-

wide (p. 2025.03.25.645321). bioRxiv. 

https://doi.org/10.1101/2025.03.25.645321 

Zhang, X., Song, B., Carlino, M. J., Li, G., Ferchen, K., Chen, M., Thompson, E. N., 

Kain, B. N., Schnell, D., Thakkar, K., Kouril, M., Jin, K., Hay, S. B., Sen, S., 

Bernardicius, D., Ma, S., Bennett, S. N., Croteau, J., Salvatori, O., … Grimes, 

H. L. (2024). An immunophenotype-coupled transcriptomic atlas of human 


hematopoietic progenitors. Nature Immunology, 25(4), 703–715. 

https://doi.org/10.1038/s41590-024-01782-4 

Zhang, Y., Liu, T., Meyer, C. A., Eeckhoute, J., Johnson, D. S., Bernstein, B. E., 

Nusbaum, C., Myers, R. M., Brown, M., Li, W., & Liu, X. S. (2008). Model-

based analysis of ChIP-Seq (MACS). Genome Biology, 9(9), R137. 

https://doi.org/10.1186/gb-2008-9-9-r137 

Zhou, J., & Troyanskaya, O. G. (2015). Predicting effects of noncoding variants 

with deep learning–based sequence model. Nature Methods, 12(10), 931–

934. https://doi.org/10.1038/nmeth.3547 

Zhu, Y., Balaji, A., Han, M., Andronov, L., Roy, A. R., Wei, Z., Chen, C., Miles, L., 

Cai, S., Gu, Z., Tse, A., Yu, B. C., Uenaka, T., Lin, X., Spakowitz, A. J., 

Moerner, W. E., & Qi, L. S. (2025). High-resolution dynamic imaging of 

chromatin DNA communication using Oligo-LiveFISH. Cell, 188(12), 3310-

3328.e27. https://doi.org/10.1016/j.cell.2025.03.032