ENHANCER-CENTRIC DISSECTION OF CIS-REGULATORY LOGIC IN HUMAN CELLS

A Dissertation Presented to the Faculty of the Graduate School of Cornell University In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Zhou Zhou
December 2025

© 2025 Zhou Zhou

ENHANCER-CENTRIC DISSECTION OF CIS-REGULATORY LOGIC IN HUMAN CELLS
Zhou Zhou, Ph.D.
Cornell University 2025

Enhancers are essential cis-regulatory elements that orchestrate cell-type-specific gene expression. While conceptually simple, their mechanisms of action are remarkably complex. At the level of individual elements, active enhancers typically consist of a central transcription factor binding region flanked by a pair of divergent core promoters. On a broader genomic scale, individual enhancers frequently collaborate with other regulatory elements to achieve precise and robust transcriptional control. In this thesis, I aim to bridge these two layers of enhancer regulation through a combination of genome engineering, high-throughput screening, and functional genomics. I first dissected the sequence–function relationship of a model long-range human enhancer, “eNMU,” which drives an extraordinary ~10,000-fold activation of its target gene NMU from 94 kb away. Systematic dissection guided by the divergent transcription model revealed extensive transcription factor synergy at this enhancer and uncovered a complex interplay between the core divergently transcribed enhancer unit and surrounding cis-regulatory elements. Notably, these include intrinsically inactive facilitators that augment and buffer enhancer output, as well as an adjacent retroviral long terminal repeat (LTR) promoter that acts to repress enhancer activity. These two emerging modes of cis-acting logic may be broadly utilized across the genome, suggesting that the complexity of enhancer regulation has been significantly underestimated.
In a parallel line of investigation, I identified a heat shock–induced, HSF1-bound distal enhancer and systematically characterized ~200 HSF1-bound distal elements using a massively parallel episomal reporter assay and the sensitive PRO-cap assay. This analysis revealed that heat-induced enhancer transcription is a prerequisite for heat-induced enhancer activity. Together, these findings offer new insights into the multilayered mechanisms by which enhancers function and are themselves regulated.

BIOGRAPHICAL SKETCH

Zhou Zhou was born and raised in the city of Nanchang, People’s Republic of China. She attended the Attached Middle School of Jiangxi Normal University in her hometown. In 2018, she earned her B.S. in Cell and Molecular Biology from the Chinese University of Hong Kong in Hong Kong SAR. After obtaining her bachelor’s degree, Zhou Zhou joined the graduate program of Biochemistry, Molecular and Cell Biology at Cornell University to pursue a doctoral degree in molecular biology, where she explored the intersection of enhancer biology, transcription regulation, and genome engineering under the mentorship of Professor John T. Lis.

To my mentors and friends: thank you for being both.

ACKNOWLEDGMENTS

I am deeply grateful to my parents for their unconditional love throughout my academic journey, to my grandparents and cousins for unwavering support, and to my cats Bubble and Sushi for being a constant source of comfort. To my childhood friends and soul-sisters, Lianghui (Lily) Li, Mengyuan (Vimy) Wan, Ziya Zhong, and Ximeng (Simon) Fan: your blessings walk with me, always. Especially to Lily: thank you for being by my side, from our kindergarten days to beautiful sunsets on Libe Slope. I’ll always remember those two springs we spent together, and your unwavering belief in me. To my college friends Zheng (April) Wang, Yinuo Hu, Yue (Anna) Gao, Jinci (Tina) Liu, Jiabin (Jessica) Chen, and Yifan (Evian) Yao: though oceans apart, our bond remains unshaken.
I cherish our enduring friendship. To the wonderful friends in Ithaca: Tong, thank you for countless joyful memories and for always grounding me in reality. Elif, my brilliantly artistic roommate, thank you for the laughter, the tears, and the deep trust we shared. My exceptional cohort mates, Xinchen Chen, Fangyu Wang, and Yiwen Qin: thank you for growing alongside me. Xinchen, in particular, your support during my darkest days is something I will always carry with gratitude. Haining Chen, Junke Zhang, Kunlin Li, You Chen, Li Yao, Shuran Wang, and others: thank you for your companionship, wisdom, and genuine care for me.

To my lab peers Alex and Philip: your dedication, discipline, and kindness have been so humbling and inspiring to me. I have learned so much from you both. To Janis: thank you for putting up with my messiness and for being the lab mom we all rely on. To our brilliant past and present postdocs Anni, Erin, Eric, Takuya, Jin (Yu Lab), Jinjoo (Yu Lab), Sagar, Gopal, Yuko, and Raphael: thank you for your warm help and insightful discussions. To our current graduate students Jaret, Yiyang, Cara, James, Miliarys, Brent, Adam, Katya, and Yang, and to alumni Nate, Lina, Kara, Julius, Mike, and Jawaher: thank you for the camaraderie and shared journey. To my undergrad colleagues Zining, Kiara, Albert, Rachel, and Jessica: thank you for your enthusiasm, hard work, and the joy of working together.

I am especially grateful to my committee members, Andrew and Charles. Andrew, your encouragement during the formative years of my Ph.D. meant more than you may realize; it was a light in times of doubt. Thank you also for broadening my scientific horizons beyond the lab by involving me in the faculty search committee. Charles, thank you, along with Adam He, for sharing expertise in computational biology and for your steady support throughout my thesis journey.

To the mentors who shaped me most profoundly, Abdullah and Judhajeet: words fall short of capturing what you mean to me.
You welcomed me into the lab and became the reason I stayed. Judhajeet, thank you for teaching me to experiment with rigor, and for always listening with patience and guiding me with care. Your companionship during the pandemic was a lifeline, and your emotional support gave me hope when I needed it most. Abdullah, your bold ideas and sharp thinking have left a lasting mark on my scientific taste. Thank you for pushing me to grow, even when it brought tears, and for your unwavering care, always offered with a golden heart. I feel incredibly lucky to have had you both not only as my mentors, but also as my role models and lifelong friends.

And to John, my advisor, mentor, and comrade in science: thank you for believing in me through every struggle, detour, and moment of doubt. Thank you for nurturing in me the confidence to think independently, the courage to fight perfectionism, and the curiosity to remain an active learner. Your optimism, clarity, and commitment to mentorship have been a guiding light. You’ve shown me how science can be both serious and fun, a lesson I’ll carry forward always.

Finally, thank you to Cornell and Ithaca for the tranquil beauty and the peaceful freedom to grow. I’ve often complained about the isolation and solitude, but in truth, it gave me the space I needed to find myself. I know I’ll miss it wherever I go.

TABLE OF CONTENTS

BIOGRAPHICAL SKETCH
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
  1.1 A Historical Perspective on Enhancer Discovery and Characterization
    1.1.1 The Emergence of Enhancer Biology: When Cis-Regulatory Code Met Trans-Acting Factors (1980s–1990s)
    1.1.2 The Genomics Revolution: Epigenomic Signatures of Active Enhancers (2000s–2010s)
    1.1.3 The Power of Modern Molecular Biology and Genetics: Quantifying and Predicting Enhancer Activity (2010s–present)
  1.2 The Enigma of Enhancer–Promoter Communication
    1.2.1 Spatial Connectivity between Enhancers and Promoters
    1.2.2 Biochemical Compatibility between Enhancers and Promoters
  1.3 Revisiting the Regulatory Complexity of Enhancers
    1.3.1 Reflection on What Makes an Enhancer
    1.3.2 Transcriptional Regulatory Hubs: Interplay Among Cis-Acting Elements
Chapter 2 Robust regulatory interplay of enhancers, facilitators, and promoters in a native chromatin context
  2.1 Abstract
  2.2 Introduction
  2.3 Results
    2.3.1 eNMU landing pad as a powerful system to study enhancer function at the native locus
    2.3.2 Screening for functional units and motifs in eNMU
    2.3.3 Interplay of regulatory factor binding at eNMU
    2.3.4 TF-specific regulation of chromatin accessibility and nascent transcription
    2.3.5 Facilitator e2 universally confers enhancer robustness
    2.3.6 A 3D regulatory hub of enhancer, promoter and facilitators of NMU
    2.3.7 Dynamics of eNMU regulation during erythroid differentiation
    2.3.8 A putative LTR promoter as a built-in negative regulatory element for enhancer activity
  2.4 Discussion
  2.5 Methods
  2.6 Acknowledgements
Chapter 3 Investigating the necessity of enhancer transcription for enhancer function in a heat-inducible system
  3.1 Abstract
  3.2 Introduction
  3.3 Results
    3.3.1 Evidence of an HSF1-bound, heat-inducible enhancer
    3.3.2 Systematically testing enhancer activity of a library of HSF1-bound candidate elements using eSTARR-seq
    3.3.3 Induced transcription initiation at “untranscribed” elements detected by PRO-cap
  3.4 Discussion
  3.5 Methods
  3.6 Acknowledgements
Chapter 4 Conclusion and Perspectives
Appendix A A flexible Protein-Tagging Strategy for Mapping CDK9 Chromatin Occupancy and Nuclear Proximity Interactome

LIST OF FIGURES

Fig. 1.1 Molecular architecture (A) (adapted from Tippens et al., 2020) and epigenomic features (B) of active human enhancers.
Fig. 1.2 The transcription cycle of RNA Pol II at human genes (adapted from Fuda et al., 2009).
Fig. 2.1 eNMU is composed of an autonomous enhancer e1 and a facilitator e2.
Fig. 2.2 eNMU specifically regulates NMU gene transcription in K562.
Fig. 2.3 eNMU landing pad as a powerful system to study enhancer function at the native locus.
Fig. 2.4 The HCR-FlowFISH screen for functional sequences in eNMU.
Fig. 2.5 Key functional units and motifs in eNMU.
Fig. 2.6 Multiplicative effects of double mutants revealed by FlowFISH screen.
Fig. 2.7 Interplay of regulatory factor binding at eNMU.
Fig. 2.8 TF-specific regulation of chromatin accessibility at the enhancer and promoter of NMU.
Fig. 2.9 TF-specific regulation of chromatin accessibility at the enhancer and promoter of NMU.
Fig. 2.10 Testing CRISPR-validated heterologous K562 dTREs at the eNMU locus.
Fig. 2.11 Facilitator e2 universally confers enhancer robustness.
Fig. 2.12 A 3D regulatory hub of enhancer, promoter and facilitators of NMU.
Fig. 2.13 Additional epigenomic features of enhancer, promoter and facilitators.
Fig. 2.14 Dynamics of eNMU regulation during erythroid differentiation.
Fig. 2.15 A putative LTR promoter as a built-in negative regulatory element for enhancer activity.
Fig. 3.1 Homozygous deletion of eTAX1BP1 abolishes heat inducibility of TAX1BP1 expression.
Fig. 3.2 eTAX1BP1 activates episomal reporter gene expression in response to heat shock.
Fig. 3.3 Four classes of HSF1-bound elements for eSTARR-seq testing.
Fig. 3.4 Systematically testing heat-induced enhancer activity of HSF1-bound candidate elements using eSTARR-seq.
Fig. 3.5 PRO-cap detects induced transcription initiation at “untranscribed” elements upon HS.
Fig. A.1 Schematic of Bxb1-mediated flexible protein-tagging strategy for CDK9.
Fig. A.2 Western blot analysis of tagged CDK9 clonal cell lines.
Figure A.3 Probing physical interaction between CDK9-FLAG-APEX2 and Cyclin T1 by Co-IP analysis.
Figure A.4 Nuclear isolation effectively eliminates endogenously biotinylated proteins.
Figure A.5 Optimizing CDK9-EGFP ChIP-seq with different crosslinking conditions.

LIST OF TABLES

Table 3.1 qPCR primer sequences used in this study
Supplementary Table S1. (separate spreadsheet) Sequences of all mutants analyzed in Chapter 2.
Supplementary Table S2. (separate spreadsheet) Sequences of all oligonucleotides and dsDNA fragments synthesized in Chapter 2.
Supplementary Table S3. (separate spreadsheet) Raw read counts and calculated activity scores for individual barcodes from the HCR-FlowFISH screen in Chapter 2.
Supplementary Table S4. (separate spreadsheet) Public datasets and their sources used in Chapter 2.
Supplementary Table S5.
(separate spreadsheet) Information and eSTARR-seq-measured enhancer activity of HSF1-bound elements tested in Chapter 3.

CHAPTER 1
INTRODUCTION

In the human genome, RNA polymerase II (Pol II) transcribes all ~20,000 protein-coding genes along with several important classes of non-coding RNAs, including long non-coding RNAs (lncRNAs) and microRNA (miRNA) precursors. A central question in biology is how different cell types, responding to diverse signaling cues, establish and maintain their unique Pol II transcriptional programs, a process fundamental to development and disease. While gene promoters provide the core sequence architecture required to initiate Pol II transcription, many genes rely on an additional class of cis-regulatory elements, enhancers, to achieve cell-type-specific transcription activation. This thesis focuses on enhancers and aims to offer new conceptual insights into enhancer mechanisms in human cells.

In this introductory chapter, I begin by reviewing the historical progression of enhancer identification and characterization. I then explore the longstanding enigma of enhancer–promoter communication, highlighting the field’s continuously evolving understanding. Finally, I discuss several key questions in enhancer biology, including the functional features of enhancers and the complex interplay among multiple cis-regulatory elements.

1.1 A Historical Perspective on Enhancer Discovery and Characterization

1.1.1 The Emergence of Enhancer Biology: When Cis-Regulatory Code Met Trans-Acting Factors (1980s–1990s)

The story of enhancer biology is deeply intertwined with the broader advances in molecular biology and genomics technologies, whose impact has extended across numerous areas of research. What makes enhancer biology uniquely fascinating is its paradoxical nature (simple in concept, yet complex in action), a theme this mini-review seeks to illustrate.
The term “enhancer” was first introduced in 1981 by Walter Schaffner’s group (Banerji et al., 1981), who demonstrated using transfected plasmids in HeLa cells that a 72-bp simian virus 40 (SV40) DNA element could markedly enhance the expression of a rabbit β-globin reporter gene when inserted several kilobases upstream or downstream in cis, in either orientation relative to the promoter. This seminal finding led to the classical definition of enhancers as DNA elements that can activate transcription from a promoter in a position- and orientation-independent manner. Shortly thereafter, the first mammalian enhancers were discovered in rearranged mouse immunoglobulin genes by several independent groups (Banerji et al., 1983; Gillies et al., 1983; Neuberger, 1983; Queen & Baltimore, 1983; Picard & Schaffner, 1984). Importantly, the immunoglobulin enhancers exhibited activity exclusively in lymphoid cells but not in other cell types, suggesting that enhancer function is governed by the interplay between DNA sequence and the cell-type-specific repertoire of trans-acting factors. In the ensuing years, a growing body of evidence substantiated this idea: in vivo (Schöler & Gruss, 1984; Mercola et al., 1985) and in vitro (Wildeman et al., 1984; Sassone-Corsi et al., 1985; Schöler & Gruss, 1985) competition assays showed that enhancer-driven reporter activity could be titrated by introducing excess enhancer DNA in trans, indicating that enhancer function depends on a limited pool of trans-acting factors; in vivo dimethyl sulfate (DMS) protection assays (Ephrussi et al., 1985; Church et al., 1985), together with systematic mutagenesis and in vitro DNase I footprinting experiments (Zenke et al., 1986; Wildeman et al., 1986; Davidson et al., 1986), further mapped functional sequence motifs corresponding to the DNA contact sites of regulatory proteins.
Around the same time, the labs of Philip Sharp and David Baltimore employed a combination of heparin-Sepharose chromatography, electrophoretic mobility shift assays, and methylation interference analysis to achieve the first identification of nuclear factors interacting with the immunoglobulin enhancers, including NF-κB, a transcription factor later found to be a central regulator of immune response (Singh et al., 1986; Weinberger et al., 1986; Sen & Baltimore, 1986; Staudt et al., 1986).

Although constrained by the experimental tools and the limited number of characterized enhancers at the time, the complexity of enhancer regulation quickly became evident. First, enhancers were found to harbor multiple classes of sequence motifs (Zenke et al., 1986), some exhibiting cell-type-specific protein occupancy, others more ubiquitous (Davidson et al., 1986; Sen & Baltimore, 1986). Second, closely homologous motifs within an enhancer could be bound by distinct factors in a given cell type (Weinberger et al., 1986; Sen & Baltimore, 1986). Third, a single motif could be occupied by different factors in different cell types (Staudt et al., 1986). Fourth, enhancer activity could be negatively regulated (Borrelli et al., 1984; Hen et al., 1985; Mitchell et al., 1987) or stimulated by inducible factor binding in response to signaling cues (Payvar et al., 1983; Treisman, 1985; Staudt et al., 1986). Fifth, the apparent discrepancy between in vitro and in vivo binding data suggested that chromatin accessibility may influence transcription factor occupancy in living cells (Sen & Baltimore, 1986). Together, these findings strongly implied that enhancers play a critical role in orchestrating complex, tissue-specific gene expression programs.

By 1990, it was well established that sequence-specific transcription factors (TFs) and general transcription factors (TFIIA, B, D, E, and F) (Buratowski et al., 1989) represent two distinct classes of transcriptional regulators.
The general transcription factors are essential for basal transcription in vitro, whereas sequence-specific TFs further augment transcriptional output, especially in response to signaling cues. Most TFs share a basic structure consisting of a DNA binding domain and an acidic activating domain. Interestingly, experiments showed that introducing an artificially high amount of an activator B (or its activating domain) in vitro could suppress the transcriptional stimulation of a heterologous gene driven by activator A (Gill & Ptashne, 1988; M. E. Meyer et al., 1989; Martin et al., 1990), but not the basal transcription mediated by TFIID (Berger et al., 1990). This “squelching” effect implied the existence of an intermediary “adaptor” protein shared between the two activators (reviewed in Ptashne & Gann, 1990). A major milestone in the search for these “adaptors” was the discovery of the Mediator complex in yeast by the Kornberg group (Kelleher et al., 1990; Flanagan et al., 1991; Y. J. Kim et al., 1994), which was later shown to be conserved and essential in metazoans (reviewed in Bourbon et al., 2004). We now know that Mediator is a large multiprotein complex that interacts with a broad range of TFs via distinct interfaces (reviewed in Abdella et al., 2021). Once recruited by TFs, it forms a stable complex with the Pol II preinitiation complex (PIC) and facilitates TFIIH-mediated phosphorylation of Ser5 on the Pol II C-terminal domain (CTD), a critical step in promoter escape (detailed in recent cryo-EM studies by Abdella et al., 2021 and Rengachari et al., 2021).

Another central coactivator in the context of enhancers is p300/CBP. These two closely related paralogs were originally identified through their interactions with the adenovirus E1A oncoprotein (Yee & Branton, 1985; P. Whyte et al., 1989; Stein et al., 1990; Eckner et al., 1994) and the transcription factor CREB (Chrivia et al., 1993), respectively.
Unlike the multi-subunit Mediator complex, p300/CBP is a single polypeptide containing multiple functional domains that interact with a wide array of TFs, and is thus capable of integrating diverse signaling pathways at the chromatin level (reviewed in Janknecht & Hunter, 1996; Shiama, 1997; Goodman & Smolik, 2000). Importantly, p300/CBP also harbors a histone acetyltransferase (HAT) domain, which acetylates lysine residues on histone tails of H3 (Jin et al., 2011) and H4, as well as many non-histone transcriptional regulators (reviewed in Dancy & Cole, 2015). This acetylation function is a key mechanism by which p300/CBP promotes transcriptional activation.

1.1.2 The Genomics Revolution: Epigenomic Signatures of Active Enhancers (2000s–2010s)

The completion of the Human Genome Project (Lander et al., 2001; International Human Genome Sequencing Consortium, 2004), along with the launch of the ENCODE project in 2003 (ENCODE Project Consortium, 2004), ushered in a new era of human functional genomics. Thanks to the pilot phase of ENCODE, which focused on characterizing functional sequences within 1% of the human genome (Birney et al., 2007), a number of classical biochemical and molecular biology assays were adapted into high-throughput, sequencing-based formats to study chromatin structure and transcription regulation. These included DNase-chip (Crawford et al., 2006) and ChIP-chip (Heintzman et al., 2007), which later evolved into DNase-seq (Boyle et al., 2008) and ChIP-seq (Visel et al., 2009; Creyghton et al., 2010). These studies collectively identified several hallmark chromatin signatures of active enhancers, such as DNase I hypersensitivity (DHS), binding of the coactivator p300, enrichment of histone modifications H3K27ac and H3K4me1, and relative depletion of H3K4me3 (Heintzman et al., 2007; Visel et al., 2009; Creyghton et al., 2010).
In contrast, gene promoters were shown to display the opposite pattern of H3 methylation: enriched for H3K4me3 but depleted for H3K4me1 (Heintzman et al., 2007). However, subsequent studies challenged the functional importance of H3K4me1, showing that it is dispensable for enhancer function (Dorighi et al., 2017), and that strong enhancers can also exhibit H3K4me3 enrichment instead of H3K4me1 (Core et al., 2014; Henriques et al., 2018), thus questioning the utility of H3K4me1 as a predictive enhancer mark.

Another hallmark of active enhancers is bidirectional transcription, first discovered in mouse neuronal cells using total RNA-seq (T.-K. Kim et al., 2010). Due to the limited abundance and rapid turnover of enhancer RNAs (eRNAs), detection of enhancer transcription was significantly improved by nascent RNA sequencing approaches, particularly nuclear run-on–based assays GRO-cap and PRO-cap (Kruesi et al., 2013; Kwak et al., 2013; reviewed in Yao et al., 2022), which selectively enrich for 5′-capped nascent transcripts to map genome-wide transcription initiation events. Importantly, GRO-cap analysis in human cells revealed a strikingly similar molecular architecture shared by enhancers and promoters: an upstream nucleosome-depleted TF binding region flanked by two divergent core promoters that initiate bidirectional Pol II transcription (Core et al., 2014). This unified architectural model (Figure 1.1A) laid the groundwork for precise enhancer unit annotation in later studies (Tippens et al., 2020; Yao et al., 2022).

We now arrive at a (relatively) complete picture of the epigenomic landscape of active enhancers (Figure 1.1B). Mechanistically, these features are highly interconnected and collectively contribute to enhancer function.
For instance, in addition to recruiting coactivators like Mediator and p300, TFs also establish or maintain DHS by working in concert with ATP-dependent chromatin remodeling complexes (Ho & Crabtree, 2010) and/or acting as pioneer factors that bind condensed nucleosomes and open up chromatin, sometimes even independently of ATP-dependent enzymes (Cirillo et al., 2002). Histone acetylation mediated by p300 and other HATs loosens compacted chromatin structure and also recruits bromodomain-containing epigenetic readers like BRD4 (Dey et al., 2003), which in turn promotes transcriptional activation by recruiting the pause release factor P-TEFb (Positive Transcription Elongation Factor b) (Jang et al., 2005; Yang et al., 2005). Due to the complex interplay among these epigenomic features, no single mark is sufficient to quantitatively predict enhancer strength. Instead, they typically offer qualitative correlations that imply enhancer function in their native chromatin contexts.

Fig. 1.1 Molecular architecture (A) (adapted from Tippens et al., 2020) and epigenomic features (B) of active human enhancers.

1.1.3 The Power of Modern Molecular Biology and Genetics: Quantifying and Predicting Enhancer Activity (2010s–present)

Quantitative assessment of enhancer activity is fundamental to deciphering the cis-regulatory code. In 2012, massively parallel reporter assays (MPRAs) (Melnikov et al., 2012; Patwardhan et al., 2012) were developed as high-throughput extensions of the classical enhancer identification method (Banerji et al., 1981). In a typical MPRA, a library of synthesized candidate enhancers (each associated with multiple unique barcodes) is cloned upstream or downstream of a promoter-driven reporter gene, and enhancer activity is calculated as the RNA/DNA barcode ratio, either in an episomal or chromosomal setting (reviewed in Klein et al., 2020).
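
The RNA/DNA barcode ratio described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline of any cited study: the function name `mpra_activity`, the normalization by total library depth, and the use of the median across barcodes are all assumptions made for illustration, and the barcode sequences and counts are invented toy values.

```python
from collections import defaultdict
from statistics import median


def mpra_activity(rna_counts, dna_counts, barcode_to_element):
    """Return {element: median depth-normalized RNA/DNA ratio across its barcodes}.

    Hypothetical helper for illustration only. Barcodes with zero DNA counts
    are skipped because their ratio is undefined.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("both libraries need at least one read")

    ratios = defaultdict(list)
    for barcode, element in barcode_to_element.items():
        dna = dna_counts.get(barcode, 0)
        if dna == 0:
            continue  # barcode not recovered from the DNA library
        rna = rna_counts.get(barcode, 0)
        # Normalize each count by its library depth before taking the ratio,
        # so that sequencing depth differences do not masquerade as activity.
        ratios[element].append((rna / rna_total) / (dna / dna_total))

    # Summarize barcodes per element with the median to damp outlier barcodes.
    return {element: median(values) for element, values in ratios.items()}


# Toy data (assumed values): two barcodes per element.
barcode_to_element = {"AAAA": "enhA", "CCCC": "enhA", "GGGG": "ctrl", "TTTT": "ctrl"}
dna = {"AAAA": 100, "CCCC": 100, "GGGG": 100, "TTTT": 100}
rna = {"AAAA": 400, "CCCC": 400, "GGGG": 100, "TTTT": 100}

activity = mpra_activity(rna, dna, barcode_to_element)
fold_over_control = activity["enhA"] / activity["ctrl"]  # about 4-fold here
```

In this toy example the candidate element is roughly four times as active as the control. Real analyses typically add replicate handling, minimum-count filters, and statistical testing, none of which are shown here.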
A major innovation followed with the introduction of STARR-seq (self-transcribing active regulatory region sequencing) (Arnold et al., 2013), which enables genome-wide profiling of enhancer activity by inserting randomly fragmented genomic DNA into the 3′ UTR of a reporter gene. This design allows active elements to self-transcribe and get sequenced as part of the mRNA output. Advances in DNA synthesis technologies now allow these assays to scale up dramatically, testing up to billions of synthetic sequences to probe the regulatory syntax of enhancers (discussed in Section 1.3.1), though ultra-high library complexity poses challenges for quantitative accuracy (Sahu et al., 2022). It is important to note that MPRAs and STARR-seq measure intrinsic enhancer potential in artificial, heterologous contexts, which may not fully reflect an element’s behavior in its native chromatin environment (Arnold et al., 2013).

The discovery of the CRISPR-Cas9 system (Jinek et al., 2012) and its rapid adaptation for (epi)genome editing in mammalian cells (Cong et al., 2013; Mali et al., 2013; Qi et al., 2013) have revolutionized modern genetics and enabled high-throughput functional interrogation of candidate regulatory elements in their native genomic contexts. One landmark example is the Cas9-mediated saturating mutagenesis of the human BCL11A erythroid enhancer (Bauer et al., 2013; Canver et al., 2015a), which pinpointed its critical functional sequences and directly informed the development of Casgevy, the first FDA-approved CRISPR-based gene therapy, which targets this enhancer to reactivate fetal hemoglobin production in sickle cell disease patients. Beyond Cas9-mediated mutagenesis, dCas9-KRAB–mediated CRISPR interference (CRISPRi) approaches have also been combined with various readouts to map enhancer–gene relationships in a high-throughput manner.
These include growth-based dropout screens (Fulco et al., 2016), FlowFISH-based measurements (Fulco et al., 2019; Reilly et al., 2021), and single-cell RNA-seq (Gasperini et al., 2019), resulting in the identification of hundreds of functional enhancer–gene pairs in the human erythroleukemia cell line K562 (an ENCODE Tier 1 cell line model). However, it is important to note that CRISPRi screens primarily assess the necessity of individual enhancers in their native contexts and may overlook redundant or compensatory regulatory elements, potentially underestimating the full scope of cis-regulatory networks.

The explosion of high-throughput sequencing data in the genomics era has catalyzed the development of numerous machine learning models for sequence-based prediction of TF binding (Alipanahi et al., 2015; Avsec, Weilert, et al., 2021) and enhancer function (Zhou & Troyanskaya, 2015; Kelley et al., 2016; Avsec, Agarwal, et al., 2021). More recently, deep learning approaches have enabled the rational design of synthetic enhancers with cell-type- or cell-state-specific regulatory activity (de Almeida et al., 2024; Gosai et al., 2024; Taskiran et al., 2024; Frömel, Rühle, Bernal Martinez, et al., 2025), holding great promise for future therapeutic applications. However, a recent in situ mutagenesis study generated quantitative gold-standard datasets to evaluate the performance of various deep learning models and revealed substantial variability in their predictive accuracy—particularly poor performance in modeling distal enhancer activity (Martyn et al., 2025). These findings underscore the need for a deeper mechanistic understanding of how sequence and epigenomic features contribute to enhancer function.

1.2 The Enigma of Enhancer–Promoter Communication

Much like a romantic relationship, enhancer–promoter communication depends on two key aspects: physical proximity and (bio)chemical compatibility.
And just as with human relationships, we still do not fully understand what makes an enhancer and a promoter ‘click’. This section reviews both established knowledge and unresolved questions, highlighting ongoing debates and areas of active investigation.

1.2.1 Spatial Connectivity between Enhancers and Promoters

CTCF/Cohesin-Insulated Topologically Associating Domains

Enhancers can regulate gene expression over large genomic distances—sometimes spanning megabases from their target promoters. For example, the limb-specific ZRS enhancer lies ~1 Mb away from its cognate gene Shh in both mouse and human genomes (Lettice et al., 2003), and two CRISPRi-identified MYC enhancers reside ~1.8 Mb from the MYC promoter in K562 cells (Fulco et al., 2016). Despite this huge linear distance, enhancer–promoter (E–P) communication requires physical proximity in 3D nuclear space, implying that chromatin must adopt specific folding patterns to bring regulatory elements together. The development of chromosome conformation capture (3C) technologies—first 3C (Dekker et al., 2002), then its high-throughput versions 5C (Dostie et al., 2006) and Hi-C (Lieberman-Aiden et al., 2009)—revolutionized the study of genome architecture. Applying 5C and Hi-C to human and mouse cells revealed that mammalian genomes are partitioned into megabase-scale topologically associating domains (TADs), typically anchored by the insulator protein CTCF and largely conserved between cell types and species (Dixon et al., 2012; Nora et al., 2012; Rao et al., 2014). This has been further supported by orthogonal imaging experiments using DNA FISH and super-resolution microscopy (Nora et al., 2012; Bintu et al., 2018). Disruption of TAD boundaries—such as by genomic rearrangements—can lead to miswiring of enhancer–gene interactions and ectopic gene activation, contributing to developmental disorders and disease (Lupiáñez et al., 2015).
Furthermore, incorporating Hi-C contact frequencies markedly enhanced model accuracy in predicting CRISPRi-validated enhancer–gene links, arguing for an important role of 3D genome architecture in E–P communication (Fulco et al., 2019).

Mechanistically, it has been proposed that CTCF and cohesin localize to convergently oriented CTCF-binding sites (CBSs) to mediate chromatin loop formation through a process called loop extrusion (Sanborn et al., 2015). In line with this model, deletion or inversion of CBSs can reconfigure E–P looping and dysregulate gene expression (de Wit et al., 2015; Y. Guo et al., 2015). However, surprisingly, acute depletion of CTCF or cohesin eliminates TADs globally, yet only induces modest immediate transcriptional changes, as detected at both the stable mRNA level (RNA-seq) and the nascent RNA level (PRO-seq) (Nora et al., 2017; Rao et al., 2017). This discrepancy between the individual mutational analysis and global factor loss has sparked intense investigation.

Several recent studies have provided insights into this conundrum. First, nucleosome-resolution Micro-C mapping in mammalian cells revealed fine-scale spatial organizations among enhancers and promoters within TADs, which remain largely intact upon CTCF or cohesin loss but are sensitive to acute transcriptional inhibition (Hsieh et al., 2020, 2022). Interestingly, cohesin, but not CTCF, plays an additional role in facilitating target search and chromatin binding of TFs (Hsieh et al., 2022). Second, live-cell imaging experiments showed that fully extruded chromatin loops are rare and transient—present only ~2–3% of the time, with a median lifespan of ~10–30 min—while the partially extruded loop state dominates (~92% of the time) (Gabriele et al., 2022). This is consistent with the FISH-based chromatin tracing analysis, which revealed that TAD boundaries are highly variable at the single-cell level, with CBSs serving as preferential anchoring sites.
Upon cohesin loss, TAD structures persist in single cells, but the preferential positioning of TAD boundaries at CBSs is erased at the population level (Bintu et al., 2018). Third, single-cell omics and imaging analyses demonstrated that cohesin loss leads to widespread, stochastic co-activation of genes across TADs (Dong et al., 2024). Together, these findings reveal the dynamic and probabilistic nature of CTCF/cohesin-mediated chromatin loops and underscore the insulating function of TADs. However, the persistence of E–P contacts following rapid CTCF/cohesin depletion suggests the existence of additional mechanisms underlying E–P communication. This is further supported by the finding that the mouse Sox2 super-enhancer can bypass artificially introduced, CTCF-mediated insulation boundaries to activate its target gene across a distance of ~100 kb (Chakraborty et al., 2023).

Transcription Factor-Mediated Chromatin Looping

Beyond CTCF/cohesin, can TFs themselves directly determine E–P interactions? A classical and well-characterized example is the chromatin looping between DNase I hypersensitive sites in the active β-globin locus in mouse erythroid cells (Tolhuis et al., 2002). The discovery that this looping is mediated by the lineage-specifying TF GATA1 introduced a new paradigm of TF-driven E–P contacts (Vakoc et al., 2005). Subsequent forced-looping experiments revealed convincingly that GATA1-mediated looping requires the cofactor LDB1, which harbors a self-association domain and promotes chromatin interactions through dimerization (Deng et al., 2012). Notably, rapid depletion of LDB1 disrupts hundreds of long-range loops between regulatory elements, many of which form independently of CTCF and cohesin (Aboreden et al., 2025).
In addition to erythroid-specific genome organization, LDB1 has also been implicated in a novel class of cis-acting elements termed Range EXtenders (REXs), which are critical for the extreme long-range function of limb enhancers (Bower et al., 2025). These elements are enriched for motifs recognized by LIM-homeodomain TFs, which are known interactors of LDB1. Another illustrative case comes from developing Drosophila embryos, where the pioneering GAGA factor (GAF) binds to a subset of tethering elements (Batut et al., 2022; Levo et al., 2022) and mediates regulatory element proximity through its N-terminal oligomerization domain (X. Li et al., 2023). These findings underscore a growing recognition that lineage-specific TFs and their cofactors can orchestrate E–P communication independently of classical architectural proteins, warranting further investigation into the diversity, mechanisms, and cell-type specificity of TF-mediated chromatin looping.

Dynamics of Enhancer–Promoter Interactions

Live-cell imaging of E–P interactions has emerged as a powerful approach in enhancer biology, offering direct insights into the dynamic relationship between spatial proximity and transcriptional output. However, visualizing specific enhancers and promoters in living cells remains technically challenging, often requiring genome engineering or tiling of fluorescently labeled dCas9-sgRNA ribonucleoprotein complexes. Interestingly, findings from these studies have been somewhat conflicting. Chen et al. (2018), examining an E–P pair separated by 150 kb in live Drosophila embryos, and Zhu et al. (2025), examining an E–P pair separated by 195 kb in human U2OS cells, both observed increased E–P proximity during transcriptional activation, along with enhanced spatial confinement and temporal stability of interactions. In contrast, Alexander et al.
(2019) found no evidence of increased spatial proximity during enhancer-driven Sox2 activation in mouse embryonic stem cells (mESCs), aligning with the “stirring” model proposed by Gu et al. (2018), which described transcription-coupled increases in both the average E–P distance and the mobility of the mouse Fgf5 enhancer. Together, these conflicting findings may reflect the complexity and context-dependence of E–P dynamics, pointing to the need for improved live-cell imaging strategies and broader sampling across loci and cell types.

1.2.2 Biochemical Compatibility between Enhancers and Promoters

Physical proximity between enhancers and promoters is not always sufficient to ensure cell-type-specific E–P communication. It has long been recognized that enhancers can bypass nearby genes to selectively regulate distant genes. A classical example is the Drosophila dpp enhancers that regulate the dpp gene residing 20–35 kb away while skipping over the immediately adjacent genes Slh and oaf (Merli et al., 1996). Recent enhancer interactome data in mouse embryonic tissues also reveal that ~61% of enhancers bypass their adjacent promoters; notably, about half of these skipped promoters are inactive and marked by CpG methylation, while the other half remain accessible and transcriptionally active (Z. Chen et al., 2024). These findings strongly suggest that E–P communication involves some biochemical selectivity beyond mere spatial proximity.

Using STARR-seq, Zabidi et al. (2015) demonstrated that distinct sets of enhancers preferentially activate housekeeping or developmental core promoters in Drosophila cells, with each enhancer class relying on different TFs. In mammalian cells, however, such selectivity has been more difficult to discern.
On one hand, studies have shown that both human enhancers and core promoters exhibit distinct cofactor preferences (Neumayr et al., 2022; Bell et al., 2024), and that different coactivators vary in their binding specificity for different activation domains of TFs (DelRosso et al., 2024). On the other hand, large-scale MPRA testing of E–P combinations in human and mouse cells has struggled to identify clear rules of biochemical compatibility at the DNA sequence level. The mouse study reported a broad spectrum of E–P compatibility, from striking specificity to broad promiscuity (Martinez-Ara et al., 2022), whereas the human study found largely promiscuous E–P compatibility (Bergman et al., 2022). Nevertheless, subtle differences in enhancer responsiveness between housekeeping and developmental human promoters have been observed, which seem to be mediated by several specific TFs such as GABPA and YY1 (Bergman et al., 2022).

Since both active enhancers and promoters are transcribed, their functional compatibility may also depend on coordination during key rate-limiting steps of transcription. In mammalian cells, transcription by Pol II is a multi-step process orchestrated by numerous factors. Here, we broadly divide the early stages of the transcription cycle into two major steps (Figure 1.2). Step 1 involves chromatin opening, PIC assembly, initiation, and promoter escape to the proximal pause site. Step 2 involves release of paused Pol II into productive elongation—a distinct rate-limiting step with important functional implications (reviewed in Adelman & Lis, 2012). E–P compatibility may influence either or both of these steps. For example, promoters may differ in their reliance on specific transcription initiation complexes, such as those incorporating TBP-related factors (TRFs) or tissue-specific TBP-associated factors (TAFs) (Hochheimer & Tjian, 2003), potentially constraining their responsiveness to enhancers that recruit distinct TRFs or TAFs.
Additionally, recent work has shown that different human core promoters are differentially constrained at Step 1 or Step 2, with each step preferentially activated by distinct sets of cofactors (Bell et al., 2024). Thus, optimal E–P communication may require complementation, whereby enhancers deliver the cofactors necessary to relieve the promoter’s rate-limiting step. Supporting this, “anti-pause” enhancers associated with BRD4 and JMJD6 have been identified (Liu et al., 2013), and certain TFs, such as c-Myc (Rahl et al., 2010) and HSF1 (Lis et al., 2000; Duarte et al., 2016; Mahat, Salamanca, et al., 2016), promote pause release by recruiting P-TEFb either directly or indirectly. However, a comprehensive and systematic evaluation of the Step 1/Step 2 complementation hypothesis across diverse E–P pairs is still lacking, highlighting a promising direction for future investigation.

Fig. 1.2 The transcription cycle of RNA Pol II at human genes (adapted from Fuda et al., 2009).

1.3 Revisiting the Regulatory Complexity of Enhancers

In this section, I discuss current perspectives and key unresolved questions in enhancer biology. Topics include the sequence and functional features of enhancers and the concept of transcriptional regulatory hubs.

1.3.1 Reflection on What Makes an Enhancer

Motif Syntax of Enhancers

Although TF binding sites act as the atomic units for enhancer function, enhancers generally exhibit flexible motif grammar without strict requirements for specific motif combinations, spacing, or orientation—except in the cases of homo- or hetero-dimers (Sahu et al., 2022). This flexibility challenges the classical enhanceosome model exemplified by the human interferon-β enhancer, which requires the formation of a higher-order TF–DNA complex with rigid motif composition and positioning (Thanos & Maniatis, 1995; Panne et al., 2007). Alternative models have since been proposed to better explain enhancer function (reviewed in Spitz & Furlong, 2012).
The “billboard” model posits that enhancers act as modular information platforms with fixed TF composition but flexible motif arrangement (Kulkarni & Arnosti, 2003; Arnosti & Kulkarni, 2005). Going further, the “TF collective” model allows flexibility in both motif combination and organization, proposing that protein–protein interactions among TFs enable their collective recruitment to enhancers that may contain only a subset of their individual motifs (Junion et al., 2012).

MPRA studies have revealed a spectrum of combinatorial TF interactions—ranging from sub-additive to super-additive transcriptional outputs (Grossman et al., 2017; Sahu et al., 2022). However, these studies often focus on strong consensus motifs. It is important to keep in mind that developmental enhancers frequently rely on suboptimal TF binding sites to help ensure cell-type-specific function while avoiding ectopic gene activation (Farley et al., 2015; Kribelbauer et al., 2019; F. Lim et al., 2024). This trade-off between potency and specificity suggests that weak multivalent interactions among TFs may be central to enhancer logic in development. Future efforts should aim to systematically characterize a broader range of low-affinity motif combinations.

In line with this flexible and degenerate motif syntax, it is not surprising to find that mammalian enhancers evolve rapidly, much faster than promoters (Villar et al., 2015), although some enhancers remain ultraconserved across species (Dickel et al., 2018; Snetkova et al., 2021). Recent work suggests that even highly diverged enhancer sequences between mouse and chicken can remain “indirectly conserved”—preserving functional output through positional conservation and shuffling of TF binding sites (Phan et al., 2025), reinforcing the concept of soft syntax in enhancer architecture. Furthermore, the pervasive contribution of transposable elements—particularly long terminal repeats (LTRs)—to the human enhancer repertoire (A. Y.
Du et al., 2024; Thurman et al., 2012) offers a rich substrate for studying enhancer evolution and sequence features, despite technical challenges posed by their repetitive nature.

Function and Regulation of Enhancer Transcription

Another well-established feature of enhancers is their divergent transcriptional activity, although its precise functional role remains incompletely understood. First, enhancer transcription reflects sufficient local activator concentration and successful assembly and engagement of the transcriptional machinery—both essential prerequisites for promoter activation. Additionally, divergent transcription may (1) generate negative supercoiling that facilitates DNA unwinding (Wu & Sharp, 2013), (2) maintain an open chromatin environment through nucleosome eviction, (3) alter chromatin dynamics to favor enhancer–promoter communication (H. Chen et al., 2018; Gu et al., 2018; Zhu et al., 2025), and (4) promote transcriptional hub formation via the intrinsically disordered Pol II C-terminal domain and other coactivators (discussed in Section 1.3.2), which may be further reinforced by eRNAs themselves (Sartorelli & Lauberth, 2020).

Is enhancer transcription regulated similarly to promoter transcription, or is it simply a byproduct of chromatin opening and (co)activator recruitment? Enhancers contain core promoter elements (CPEs) (Tippens et al., 2020), and also go through Pol II pausing and release steps, although the transcription quickly terminates due to enriched polyadenylation signals and depleted splice sites (Henriques et al., 2018; Fitz et al., 2020; Core et al., 2014). Notably, using a combination of nascent transcriptome sequencing assays mNET-seq and TT-seq, the Cramer group introduced the concept of “pause-initiation limit” (Gressel et al., 2017) and showed that enhancer transcription is generally not constrained by promoter-proximal pausing and can be activated without the P-TEFb kinase CDK9 activity (Gressel et al., 2019).
Future efforts to classify enhancers based on their pause release constraints may shed light on the Step 1/Step 2 complementation model discussed in Section 1.2.2.

The unified divergent transcription architecture of enhancers and promoters (Core et al., 2014; Tippens et al., 2020) raises an intriguing question: can enhancers act as promoters, and vice versa? Mikhaylichenko et al. (2018) developed a dual transgenic assay in Drosophila embryos and found that transcription directionality correlates with regulatory function—bidirectional enhancers can act as weak promoters, and bidirectional promoters (but not unidirectional ones) often exhibit strong enhancer activity. Supporting this fluidity, some lncRNA promoters have also been shown to act as enhancers for their neighboring genes (Engreitz et al., 2016). Together, these findings underscore the flexibility and complexity of cis-regulatory element function.

1.3.2 Transcriptional Regulatory Hubs: Interplay Among Cis-Acting Elements

Beyond the combinatorial action of TFs at individual enhancers, gene activation is often orchestrated by multiple enhancers working in concert. Enhancers can act redundantly to ensure robustness of gene expression, a widespread phenomenon observed in both Drosophila—where such elements are known as shadow enhancers (Hong et al., 2008; Perry et al., 2010; Frankel et al., 2010)—and during mouse development (Osterwalder et al., 2018). Enhancers may also function additively or super-additively, such as in the nested MYC enhancer network, where closely residing enhancers drive high expression additively, while distantly residing enhancers synergize to reinforce transcriptional robustness (Lin et al., 2022). A particularly notable concept is that of super-enhancers—large clusters of enhancers densely occupied by lineage-determining transcription factors and the Mediator complex (W. A. Whyte et al., 2013).
These regulatory regions drive cell identity programs and are enriched for disease-associated genetic variants (Hnisz et al., 2013). It has been well established that by recruiting high concentrations of activators, coactivators, and Pol II—many containing intrinsically disordered, low-complexity domains—super-enhancers can drive the formation of phase-separated transcriptional condensates (Sabari et al., 2018; M. Du et al., 2024). Phase separation was once considered a key mechanism of enhancer function, enabling compartmentalization of transcriptional machinery. However, recent findings reveal that it is the optimum levels of multivalent interactions among low-complexity domains, rather than phase separation per se, that drive full gene activation—and in some cases, condensate formation may even impede this process (Chong et al., 2022; Trojanowski et al., 2022).

Emerging evidence also reveals functional hierarchies within super-enhancers or functionally linked cis-regulatory elements. That is, despite their similar epigenomic signatures, certain constituent enhancers act as the “seed” elements nucleating regulatory activity, whereas others play accessory roles to amplify the seed activity (Shin et al., 2016; Huang et al., 2018; Thomas et al., 2021; Brosh et al., 2023; Blayney et al., 2023). Notably, Blayney et al. (2023) and Brosh et al. (2023) identified the extreme cases of facilitators within super-enhancers, which lack intrinsic enhancer activity themselves yet potentiate the function of classical autonomous enhancers. These findings support the idea of transcriptional regulatory hubs—higher-order assemblies instructed by “seed” elements, with accessory elements contributing molecular “stickiness” via multivalent interactions or reinforcing spatial connectivity. Consistent with this model, multiway chromatin interactions have been observed through both 3C-based methods (Allahyar et al., 2018) and imaging analysis (Bintu et al., 2018).
As genetic, molecular, and imaging tools continue to evolve, future studies should aim to functionally classify seed and accessory elements and dissect their mechanistic interplay within complex enhancer networks.

CHAPTER 2

ROBUST REGULATORY INTERPLAY OF ENHANCERS, FACILITATORS, AND PROMOTERS IN A NATIVE CHROMATIN CONTEXT

2.1 Abstract

Enhancers are gene-distal cis-regulatory elements that drive cell-type-specific gene expression. While significant progress has been made in identifying enhancers and characterizing their epigenomic features, much less effort has been devoted to elucidating mechanistic interactions among clusters of functionally linked regulatory elements within their endogenous chromatin contexts. Here, we developed a novel recombinase-mediated genome rewriting platform and applied our divergent transcription architectural model to understand how a long-range human enhancer confers a remarkable 10,000-fold activation to its target gene, NMU, at its native locus. Our systematic dissection reveals transcription factor synergy at this enhancer and highlights the interplay between a divergently transcribed core enhancer unit and emerging new types of cis-regulatory elements—notably, intrinsically inactive facilitators that augment and buffer core enhancer activity, and an adjacent retroviral long terminal repeat promoter that represses enhancer activity. We discuss the broader implications of our focused study on enhancer mechanisms and regulation genome-wide.

2.2 Introduction

Since their discovery over four decades ago (Banerji et al., 1981; Moreau et al., 1981), enhancers have been recognized as abundant and essential cis-regulatory elements that recruit transcription factors (TFs) to activate target gene promoters from a distance, often in a cell-type-specific manner.
Owing to their pivotal roles in development and disease, numerous individual laboratories and major consortia like ENCODE (ENCODE Project Consortium et al., 2020) have made extensive efforts to identify and characterize enhancers across diverse cell types and tissues. Traditional hallmarks of active enhancers include TF and coactivator binding, DNase I hypersensitivity, and histone modifications such as H3K27ac, H3K4me1 and H3K4me3 (Heintzman et al., 2009). Later, widespread RNA Polymerase II (Pol II) transcription emerged as another key indicator of enhancer activity (T.-K. Kim et al., 2010).

We previously developed the nuclear run-on–based assays, GRO-cap and PRO-cap (Kruesi et al., 2013; Kwak et al., 2013), which selectively enrich for 5′-capped nascent RNAs to map genome-wide transcription initiation events with high sensitivity and specificity. Applying GRO-cap to human cells revealed a unified molecular architecture shared by enhancers and promoters, featuring a central nucleosome-depleted TF binding region flanked by two divergent core promoters that initiate bidirectional Pol II transcription (Core et al., 2014). This unit definition enables precise delineation of enhancer boundaries and offers a robust framework for accurate enhancer annotation (Tippens et al., 2020; Yao et al., 2022). However, functional dissection of enhancers within the paradigm of divergent transcription architecture remains limited.

Beyond the correlative features, recent technological advancements have further established two powerful types of high-throughput screening methods to directly quantify enhancer activity. Gain-of-function assays, such as massively parallel reporter assays (MPRAs) (Melnikov et al., 2012; Patwardhan et al., 2012) and self-transcribing active regulatory region sequencing (STARR-seq) (Arnold et al., 2013), measure elements’ intrinsic enhancer potential based on their ability to drive reporter gene transcription.
While these assays allow impressive genome-wide scalability, they rely on assaying DNA sequences outside of their endogenous chromatin contexts, which may compromise physiological relevance and introduce false positives (Arnold et al., 2013). In contrast, CRISPR-based loss-of-function screens (Fulco et al., 2016, 2019; Gasperini et al., 2019) assess enhancer necessity and preserve the native spatial relationships between enhancers and promoters, but they are complicated by variable perturbation efficiency, potential off-target effects, and imprecise definition of element boundaries. Moreover, human genes are frequently regulated by ensembles of enhancers and related elements that can act redundantly (Kvon et al., 2021), additively, or synergistically (Bothma et al., 2015; Carleton et al., 2017; Lin et al., 2022; Thomas et al., 2021), representing additional layers of complexity. Therefore, despite the exponential rise in the number of experimentally nominated cis-regulatory elements, the mechanisms governing their functional logic are still poorly understood.

To address these limitations and provide orthogonal insights, recombinase-mediated genome rewriting (Blayney et al., 2023; Brosh et al., 2021, 2023) has emerged as a powerful strategy. Through precise replacement of genomic regions and targeted manipulation of individual or combinatorial elements, these approaches allow comprehensive interrogation of entire loci of interest and enable functional analysis in their native genomic contexts, uncovering hierarchical relationships among a cluster of elements. Furthermore, they embrace serendipitous discovery of previously unrecognized cis-regulatory behavior. For instance, by engineering the α-globin super-enhancer in mouse erythroid cells, Blayney et al. (2023) recently identified a novel class of distal regulatory elements, termed facilitators, which lack intrinsic enhancer activity but potentiate the function of autonomous enhancers.
While conceptually intriguing, the broader prevalence of facilitators beyond super-enhancers and the molecular underpinnings of enhancer–facilitator interactions remain largely unexplored.

In this study, we developed a novel recombinase-mediated platform to systematically dissect a potent distal human enhancer at its native locus, guided by our architectural model of enhancer organization (Core et al., 2014; Tippens et al., 2020). Through detailed TF motif mutagenesis and integrative genomics analysis, we uncovered intricate crosstalk at both the trans-acting factor and the cis-acting element levels. We demonstrate that a core enhancer region, precisely demarcated by a divergent transcription pattern, acts as the intrinsic activating unit for target gene expression. This core enhancer activity is further modulated by surrounding facilitators and a promoter-like element, which display distinct molecular signatures and exert positive and negative influences, respectively. We propose that such highly interconnected regulatory networks are broadly utilized across the genome to ensure precise and robust control of transcriptional output.

2.3 Results

2.3.1 eNMU landing pad as a powerful system to study enhancer function at the native locus

Neuromedin U (NMU) is a neuropeptide that has been implicated in various physiological processes including erythropoiesis (Gambone et al., 2011). In the triploid human erythroleukemia cell line K562, Gasperini et al. (2019) identified a critical enhancer of NMU (hereafter eNMU) in a CRISPR interference (CRISPRi) screen. This enhancer is located ~94 kb away from the NMU gene promoter, and its homozygous deletion, without negatively affecting cell growth, led to a remarkable 100% reduction in NMU expression by RNA-seq (Gasperini et al., 2019). Tippens et al.
(2020) further refined the boundary of eNMU based on our unified molecular architecture model of transcriptional regulatory elements (Andersson et al., 2015; Core et al., 2014) and divided eNMU into two divergently transcribed sub-elements e1 and e2 (Figure 2.1A). Homozygous CRISPR knockouts showed that deleting e1, the 453-bp sub-element with higher DNase I hypersensitivity (DHS), reduced gene expression by 10,000-fold (0.01% of WT) by quantitative reverse transcription PCR (RT-qPCR)—the same level as deleting full eNMU (Figure 2.1B). Precision Run-On and Sequencing (PRO-seq) confirmed that ΔeNMU and Δe1 abolished nascent transcription at both the enhancer and the target gene without affecting other genes nearby (Figures 2.1C, 2.1D, and 2.2). Hence, e1 is essential for transcription initiation while e2 alone in the genome is completely inactive. Surprisingly, deletion of this intrinsically inactive 503-bp e2 element reduced NMU mRNA to only ~5% of the WT level (Figure 2.1B), with decreased PRO-seq signal at e1 and the NMU gene, highlighting that e1 acts as a canonical autonomous enhancer for NMU but requires the facilitator element e2 (Blayney et al., 2023) to achieve maximal activation.

Fig. 2.1 eNMU is composed of an autonomous enhancer e1 and a facilitator e2. (A) Epigenomic landscape of eNMU and its sub-elements e1 and e2 in K562 cells, showing DNase I hypersensitivity (DHS) and H3K27ac (ENCODE Project Consortium et al., 2020), and GRO-cap–defined transcription start sites (TSSs) (Core et al., 2014). (B) NMU mRNA levels (RT-qPCR) in independent cultures of WT, ΔeNMU, Δe1 and Δe2 cell lines. Black dots = data from Tippens et al., 2020 (GAPDH normalized); red dots = data from this study (ACTB normalized). (C) PRO-seq signal at the NMU–eNMU locus in the same cell lines as (B). Tracks represent merged biological replicates (n = 2). (D) Relative NMU gene body read counts from PRO-seq in (C).

Fig. 2.2 eNMU specifically regulates NMU gene transcription in K562.
PRO-seq (3′-end) tracks of WT, ΔeNMU, Δe1, Δe2 and eNMU LP cell lines across a 1-Mb region around NMU; insets show the full NMU gene and eNMU loci. Highlighted regions indicate NMU promoter (P) and eNMU (E). Note that at the eNMU locus in LP Clone E5, the prominent PRO-seq signal downstream (to the right) of the Bxb1 LP cassette originates from readthrough transcription driven by the strong EF1α promoter within the selection cassette. However, transcriptional activity of the LP did not exhibit any enhancer function to activate the distal NMU gene. Tracks represent merged biological replicates (n = 2 independent cultures). WT K562 GRO-cap data (Core et al., 2014) shown as the TSS reference.

The huge dynamic range of eNMU regulation from a distal site and the intriguing cooperativity between its sub-elements e1 and e2 warrant a comprehensive interrogation of its sequence features and molecular mechanisms. To this end, we used CRISPR to knock in a landing pad at a single allele of the native eNMU locus in the K562 ΔeNMU cell line (Figure 2.3, A and B). This landing pad, modified from Matreyek et al. (2020), harbors a Bxb1 recombinase-mediated exchange cassette containing a battery of selection markers, including a constitutively expressed Blue Fluorescent Protein (BFP). Co-transfection of a Bxb1-expressing plasmid and a barcoded payload plasmid library of eNMU mutants leads to the loss of selection markers and a stable, irreversible integration of individual elements at the landing pad locus. The resulting BFP− recombinant population is then subjected to NMU hybridization chain reaction fluorescence in situ hybridization coupled with flow cytometry (HCR-FlowFISH) (Reilly et al., 2021) to resolve the effects of individual mutants on NMU expression as a measure of enhancer activity (Figure 2.3A).
As an initial test of the functionality of our system, we integrated the full-length eNMU sequence into two independently isolated landing pad (LP) clones and observed ~5% recombination efficiency in both cases as measured by BFP loss (Figure 2.3C). Importantly, for both LP clones, NMU expression was rescued by ~3,000-fold in BFP− recombinant cells compared to the parental, NMU-inactive LP cells (Figures 2.3D and 2.2), consistent with a single allele contributing roughly one third of the 10,000-fold activation conferred by three alleles of eNMU (Figure 2.1B). Subsequent testing of HCR-FlowFISH on eNMU- and e2-recombinant populations showed a clear separation of their NMU RNA FISH signals (Figure 2.3E). Therefore, we have successfully established an efficient LP-based workflow that allows us to characterize eNMU in its native chromatin context.

Fig. 2.3 eNMU landing pad as a powerful system to study enhancer function at the native locus. (A) Workflow of the eNMU landing pad system to measure enhancer activity of a barcoded element library. (B) LP copy number analysis by quantitative PCR on genomic DNA (BFP DNA vs. 3-allele control locus). (C) Bxb1 recombination efficiency measured by BFP loss in two independent LP clones using flow cytometry. (D) Rescue of NMU expression by inserting eNMU into the landing pad (LP); n = 2 independent LP clones. (E) Validation of HCR-FISH in the same LP clones as in (C–D); ACTB served as the housekeeping gene control.

2.3.2 Screening for functional units and motifs in eNMU

We next set out to design a systematic mutagenesis scheme for eNMU using a combination of unbiased and targeted perturbations that complement each other. Building on our previous finding that enhancer transcription contributes to activity (Tippens et al., 2020), we generated tiling deletions for e1 and e2 to remove clusters of transcription start sites (TSSs) (Figure 2.4A).
In parallel, considering the central role of TFs in shaping enhancer function, we curated a list of TF motifs for targeted mutagenesis by (1) intersecting all possible TF binding sites (TFBSs) from the JASPAR database (Castro-Mondragon et al., 2022) with K562 ChIP-seq peaks from the UCSC Genome Browser (Perez et al., 2025) overlapping eNMU, and (2) filtering this candidate set to retain only those motifs corresponding to K562-expressed TFs and located within regions of enriched ChIP-seq signals (Methods). This led to a total of 95 motifs corresponding to 26 TFs (Figure 2.4A). We then merged highly similar motifs such as AP-1 and NFE2 and mutated the two most conserved bases in each motif by transversion (C. Guo et al., 2017; Kircher et al., 2019; Kosicki et al., 2024) (A↔C, T↔G, Supplementary Table S1) while ensuring minimal interference with overlapping motifs to the best of our ability. Finally, we devised a “mix-and-match” cloning strategy (Figure 2.4B; Methods) to construct a barcoded mutant library where all the motif occurrences for a given TF were altered only in e1, e2, or both. This approach allowed for maximal disruption of TF binding within the distinct contexts of the two sub-elements. Taken together, our library contains 83 elements (77 eNMU-derived sequences and 6 exogenous controls) associated with 328 unique barcodes, on average 4 barcodes per element. Following library integration and recombinant cell selection, we performed FlowFISH and sorted cells into 8 bins based on NMU signal intensity, using ACTB as an internal control (Figure 2.4C). We then sequenced enhancer barcodes in each bin and calculated an activity score for each barcode using a weighted average approach (Figures 2.4, D and E).
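The weighted-average scoring described above can be illustrated with a minimal sketch. The exact weighting used in this study is given in the Methods and is not reproduced here; the sketch assumes, purely for illustration, that each of the 8 sorting bins is weighted by its index (1 = lowest NMU signal, 8 = highest) and that a barcode's score is the read-count-weighted mean of those bin weights. The function name `activity_score`, the bin weights, and the read counts are all hypothetical.

```python
from typing import Sequence


def activity_score(counts: Sequence[float], weights: Sequence[float]) -> float:
    """Return the mean bin weight, weighted by a barcode's read count per bin."""
    if len(counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    total = sum(counts)
    if total <= 0:
        raise ValueError("barcode has no reads in any bin")
    return sum(c * w for c, w in zip(counts, weights)) / total


# Assumed weights: bin index 1 (lowest NMU signal) through 8 (highest).
BIN_WEIGHTS = list(range(1, 9))

# Made-up read counts for two illustrative barcodes (not data from this study).
high_activity_barcode = [0, 0, 0, 1, 4, 20, 50, 25]
low_activity_barcode = [40, 35, 15, 6, 3, 1, 0, 0]

high = activity_score(high_activity_barcode, BIN_WEIGHTS)
low = activity_score(low_activity_barcode, BIN_WEIGHTS)
```

Under these assumptions, a barcode whose reads concentrate in the high-signal bins receives a higher score than one concentrated in the low-signal bins. In practice, read counts would likely need to be normalized for sequencing depth and the number of cells sorted into each bin before scoring.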
Activity scores from biological replicates showed a strong correlation (Pearson’s r = 0.91, Figure 2.4F) and aligned closely with RT-qPCR measurements for a select set of mutants (Pearson’s r = 0.97, Figure 2.4G), confirming that our assay faithfully captured mutant activities.

Fig. 2.4 The HCR-FlowFISH screen for functional sequences in eNMU. (A) Full eNMU mutant design separating overlapping motifs of different TFs. (B) Gibson assembly workflow for constructing the barcoded eNMU mutant library. (C) Flow cytometry binning strategy using ACTB as an internal control for cell size, transcription level, and staining efficiency. (D) Barcode distribution across 8 sorting bins for two example elements in the mutant library: WT_eNMU and e2_mSTAT5. (E) Calculation of activity scores using a weighted average of barcode distributions. (F) Correlation of median barcode activity scores between biological replicates (n = 2 independent LP clones subjected to recombination and FlowFISH). Pearson’s correlation coefficient (r) is shown. (G) Correlation between FlowFISH-measured activity scores and RT-qPCR quantifications of select mutants, based on median activity values from each assay. Pearson’s correlation coefficient (r) is shown.

Our analysis of tiling deletions in e1 showed that a divergently transcribed region e1.1–e1.3 delineated an activating unit, where e1.2, encompassing the DHS signal summit, marked the core of e1 activity (Figure 2.5B, Δe1.2 versus Δe1). Unexpectedly, the well-transcribed e1.4 acted as a repressing unit, as its deletion led to an increase in NMU expression. This observation was bolstered by finer deletions Δe1.5 to Δe1.8, which revealed that the transition between activation and repression lay between e1.6 and e1.7. In fact, the first 201 bp of e1 (Δe1.9) was sufficient to capture all of its activity, likely due to the loss of both positive and negative elements in e1.9.
In contrast, tiling deletions in the facilitator e2 identified a simpler functional core e2.3 (Figure 2.5B, Δe2.3 versus Δe2), with the other segments showing modest effects. Overall, fold changes in double deletions of an e1 segment and an e2 segment were multiplicative (i.e., log-additive) of single deletions, except for Δe1.2+Δe2.3, which fell below the dynamic range of FlowFISH (Figures 2.6, A and C). These findings highlight a modular nature of eNMU’s molecular architecture with largely independent activating and repressing features.

Examination of the motif mutagenesis results in e1 or e2 showed <50% reduction of enhancer activity in most cases (Figures 2.5 and 2.6B). Consistent with the deletion results, the key TF motifs for e1, namely GATA1 and RUNX1 motifs, are located within or near the core region e1.2, while the essential motifs for e2 (the STAT5 motifs) are clustered in the core e2.3, accounting for nearly all its function. TF contributions were context-specific, as exemplified by GATA1 being much more critical for e1 than for e2. Similarly, double mutants where the same TF binding was disrupted in both e1 and e2 exhibited multiplicative effects (Figures 2.6, B and C), suggesting that the cooperativity between e1 and e2 is not driven by a single TF type but likely involves multiple different TFs.

Fig. 2.5 Key functional units and motifs in eNMU. (A) Overview of deletions and targeted TF binding sites (TFBS) in the eNMU mutagenesis screen. (B) Activity scores of select mutants measured by HCR-FlowFISH. Each dot represents the score of an element-specific barcode from either of the two biological replicates. Dashed lines indicate median scores of control elements WT_eNMU, Δe1, and Δe2.

Fig. 2.6 Multiplicative effects of double mutants revealed by FlowFISH screen. (A) FlowFISH-measured activity scores of single and double deletions, together with additional exogenous control elements.
(B) FlowFISH-measured activity scores of all TF motif mutations in e1, e2, or both. Note that the e1_e2_mCEBPB mutant exhibited higher enhancer activity than WT_eNMU, which may be attributed to an LTR promoter-mediated mechanism as discussed later. (C) Linear regression of observed log2 fold changes in double mutants vs. expected additive effects (sum of log2 fold changes from corresponding single mutants). R² from linear regression is shown. Dashed 1:1 line indicates perfect additivity. Highlighted outlier: expected effect of the Δe1.2+Δe2.3 mutant fell below FlowFISH’s detection range, preventing assessment of additivity.

2.3.3 Interplay of regulatory factor binding at eNMU

To validate the findings on the key motifs and enable clean functional analysis downstream, we generated single cell recombinant clones harboring WT_eNMU, e1_mRUNX1, e1_mGATA1, and e2_mSTAT5. RT-qPCR quantifications of NMU expression showed marked reductions in the mutant clonal lines, corroborating our FlowFISH results (Figure 2.7A). Individual mutation of the left and right RUNX1 motifs revealed that the two sites acted additively, with a stronger contribution from the left site, consistent with its higher motif score (JASPAR scores 632 versus 335, Supplementary Table S1). Substituting the GATA1 motif with a RUNX1 motif (e1_G-to-R) failed to rescue e1_mGATA1’s phenotype, suggesting GATA1 as an indispensable factor for eNMU function. Conversely, replacing the RUNX1 motifs with GATA1 (e1_R-to-G) only partially rescued e1_mRUNX1’s phenotype, suggesting that the combination of GATA1 and RUNX1 is particularly potent in the context of e1. We also noticed that the modest effect of mutating the right RUNX1 motif, located in the segment e1.5, was not sufficient to explain the substantial decrease in enhancer activity in the Δe1.5 mutant (Figure 2.5).
A closer examination of this region uncovered two strong Retinoic Acid Receptor Alpha (RARA)/Retinoid X Receptor Alpha (RXRA) motifs (Figure 2.5A, orange hatched boxes), which overlapped the right RUNX1 motif and were initially overlooked due to the absence of publicly available ChIP-seq data. Mutating both RARA/RXRA motifs without disrupting the RUNX1 site validated their crucial function (Figure 2.7A, e1_mRARA). Therefore, we have established that four distinct sets of TF motifs, namely e1’s RUNX1, GATA1, and RARA/RXRA motifs and e2’s STAT5 motifs, are pivotal for eNMU activity.

Fig. 2.7 Interplay of regulatory factor binding at eNMU. (A) NMU mRNA levels measured by RT-qPCR in single cell-derived recombinant clones; bars = median. Exact mutant sequences are listed in Supplementary Table S1 (separate file). (B–D) ChIP-qPCR of TF binding in select mutants: GATA1 (B) and RUNX1 (C) at e1; STAT5 (D) at e1 and e2 (n = 2 independent single cell clones per mutant). Statistical significance assessed using one-way ANOVA with Dunnett’s post hoc test vs. WT_eNMU (**, p < 0.01; ***, p < 0.001). Public ENCODE ChIP-seq tracks (fold change over control) (ENCODE Project Consortium et al., 2020) shown as references: GATA1, ENCFF334KVR; RUNX1, ENCFF654QOE; STAT5A, ENCFF171KLX. (E) Left: p300 ChIP-seq profiles at the eNMU locus in the indicated mutants from (B–D); tracks represent merged biological replicates (n = 2). Track colors indicate specific motif disruptions: grey = WT_eNMU, green = e1_mRUNX1, red = e1_mGATA1, blue = e2_mSTAT5. Colored boxes below tracks indicate locations of disrupted TF motifs. Right: p300 signal at eNMU vs. NMU mRNA in matched single cell clones. (F) Schematic of regulatory factor interplay at eNMU.
Note that, unlike normal physiological conditions where STAT5 proteins are activated in response to cytokine signaling (Tóthová et al., 2021), K562 cells express the constitutively active oncogenic BCR-ABL fusion protein that drives persistent STAT5 phosphorylation, dimerization and activation (de Groot et al., 1999; Weber-Nordt et al., 1996).

We next investigated how the motif alterations functionally affected TF occupancy. ChIP-qPCR assays revealed that GATA1 and RUNX1 binding at e1 was significantly impaired not only by disruption of their own motifs, but also by mutations in each other’s motifs (Figures 2.7, B and C), indicating cooperative binding of these two factors. STAT5 binding at e2 was largely self-driven, as demonstrated by its significant reduction in the e2_mSTAT5 mutant but minor changes in the e1 mutants (Figure 2.7D, right panel). Interestingly, STAT5 binding at e1 was also affected by e2’s STAT5 mutations (Figure 2.7D, left panel), suggesting that the facilitator element e2 might boost e1’s activity by promoting STAT5 binding at e1. Collectively, these results demonstrate extensive synergy in TF occupancy at eNMU (Figure 2.7F).

Since p300 is a known coactivator for GATA1 (Boyes et al., 1998), RUNX1 (Kitabayashi et al., 1998), and STAT5 (Pfitzner et al., 1998), we also performed p300 ChIP-seq to examine its recruitment in the mutant clones. The RUNX1 and GATA1 mutants exhibited prominent reductions in p300 binding at eNMU, which correlated with NMU downregulation (Figure 2.7E). In contrast, e2’s STAT5 mutations only led to a mild decrease in p300 occupancy despite marked reduction in NMU expression, suggesting that p300 recruitment is not the primary mechanism for STAT5-mediated facilitator function in eNMU.

2.3.4 TF-specific regulation of chromatin accessibility and nascent transcription

To gain a deeper understanding of TF-specific regulation of chromatin structure, we performed ATAC-seq on a select set of critical mutants.
Focusing on the eNMU locus first, we found distinct changes in chromatin accessibility pattern that seem to be related to the positional context of the disrupted motifs (Figure 2.8A). Disruption of the GATA1 motif (e1_mGATA1) and the stronger RUNX1 motif (e1_mRUNX1-L), both situated near the DHS summit, resulted in a broad reduction in ATAC-seq peak height across e1, suggesting that these motifs act as the nucleation sites for chromatin opening. This aligns with the previous reports linking GATA1 and RUNX1 to the recruitment of the SWI/SNF chromatin remodeling complex (Bakshi et al., 2010; Kadam & Emerson, 2003). Mutating both RUNX1 motifs (e1_mRUNX1) reduced both peak height and peak width at e1, indicating a more severe defect in chromatin decompaction and consistent with its greatest loss in enhancer activity (Figure 2.7A). In contrast, crippling the more distal RARA/RXRA motifs led to an asymmetrical loss of accessibility on the right flank, while the open chromatin state to their left was likely maintained by GATA1 and RUNX1. Mutations in e2’s STAT5 motif cluster, which resides even further from the DHS center, mainly decreased e2’s accessibility with subtle shrinkage in e1’s peak, suggesting that chromatin opening is not the primary mechanism by which STAT5 facilitates e1’s enhancer activity.

At the NMU promoter, all mutants exhibited reduced chromatin accessibility (Figure 2.8B), consistent with the observed decreases in gene expression. In contrast, the control GAPDH gene showed nearly identical accessibility across mutants, demonstrating the reproducibility of our data. Notably, RUNX1 seemed to play an additional role in enhancer–promoter communication, as its motif disruptions specifically affected another NMU promoter-proximal ATAC-seq peak (Figure 2.8B, black arrowheads).
Overall, ATAC-seq pattern changes were highly consistent across biological replicates (independent single cell clones) (Figure 2.8C) and highlight the unique contributions of individual TFs to the chromatin landscape at the enhancer and promoter of NMU.

Fig. 2.8 TF-specific regulation of chromatin accessibility at the enhancer and promoter of NMU. (A–B) ATAC-seq signal at eNMU (A), NMU promoter, and GAPDH control locus (B) in select eNMU mutants. In (B), black arrows highlight the proximal ATAC-seq peak only affected in the RUNX1 motif mutants. Tracks represent merged biological replicates (n = 2 independent single cell clones). (C) ATAC-seq signal at eNMU, NMU promoter and GAPDH control locus for two independent single cell-derived clones (Rep1 and Rep2) of the eNMU mutants in (A–B). Colored boxes below tracks indicate locations of disrupted TF motifs. Fine vertical lines indicate positions of GRO-cap–defined TSSs (WT K562) (Core et al., 2014).

To study TF-specific regulation of nascent transcription at base-pair resolution, we performed PRO-seq on the same set of clones and plotted 5′ positions of PRO-seq reads at the eNMU locus to estimate its TSS usage (Figure 2.9A). WT_eNMU integration recapitulated the TSS pattern observed in our published K562 GRO-cap data (Core et al., 2014), validating our methodology. Across the mutants, we found varying degrees of signal reduction, yet the patterns of divergent transcription at e1 and predominantly unidirectional transcription at e2 were largely preserved. Notably, e2’s transcription was diminished not only by its own STAT5 mutations, but also in the e1 mutants where e2’s STAT5 binding was only mildly affected (Figure 2.7D, right panel). Such decoupling of transcriptional activity from STAT5 occupancy suggests that STAT5 alone is insufficient to drive Pol II initiation at e2. Instead, STAT5 appears to act as an effector that mediates e2’s dependence on e1.
Together, these findings depict a highly interconnected transcriptional landscape at the eNMU locus.

Finally, we examined nascent transcription changes at the NMU gene by plotting the conventional 3′ ends of PRO-seq reads to represent the locations of paused and elongating Pol II (Figures 2.9, B and C). All the mutants showed pronounced signal reductions in both the NMU promoter pause region (TSS to TSS+250 bp, Figure 2.9B, dashed box) and further downstream into the gene body, consistent with their steady-state mRNA levels (Figure 2.7A). Importantly, the pausing index (PI), defined as the ratio between the pause region and gene body read densities (Core et al., 2008) (Methods), increased 2- to 3-fold in all the mutants compared to WT_eNMU. This points to a defective pause release mechanism at the NMU promoter, regardless of specific TF binding at eNMU. Therefore, in addition to its essential role in transcription initiation as demonstrated in the ΔeNMU cell line (Figure 2.1C), eNMU also regulates Pol II pause release at its target promoter, likely through cofactors shared among its critical TFs.

Fig. 2.9 TF-specific regulation of nascent transcription at the enhancer and promoter of NMU. (A–B) PRO-seq signal at eNMU (A) and NMU promoter (B) in select eNMU mutants. In (B), bar = NMU pausing index of merged replicates, dots = pausing index of individual replicates. (C) PRO-seq tracks at the full NMU and GAPDH (control) genes in the same mutants as in (A–B). Tracks represent merged biological replicates (n = 2 independent single cell clones). Colored boxes below tracks indicate locations of disrupted TF motifs. Fine vertical lines indicate positions of GRO-cap–defined TSSs (WT K562) (Core et al., 2014).

2.3.5 Facilitator e2 universally confers enhancer robustness

The extensive crosstalk between e1 and e2 revealed by our functional analysis raised the question of the generality of e2’s facilitator function.
To investigate this, we selected eight divergently transcribed, CRISPR-validated distal transcriptional regulatory elements (dTREs) in K562, which showed large effect sizes in the original perturbation studies (Figures 2.10, A and B). We assessed their ability to drive NMU expression by recombining each dTRE into the eNMU landing pad, either as standalone elements or fused with e2 (Figure 2.11A). When integrated alone, all the dTREs elevated NMU mRNA levels above the baseline of e2 only, although the magnitude of their effects varied drastically (Figure 2.11B). A closer examination revealed a reasonably good correlation between the dTREs’ intrinsic activities at the eNMU locus and the total number of GRO-cap reads at their endogenous loci (Pearson’s r = 0.85, Figure 2.10C), suggesting nascent transcription as a reliable indicator of enhancer function. Notably, fusing e2 to the dTREs amplified their activities in every case, with weak elements experiencing greater boosts than strong ones, thereby reducing the variation in their effects. Quantitatively, the intrinsic activities of the dTREs and the amplifications rendered by e2 closely followed a linear log-log distribution (i.e., power-law relationship) (Figure 2.11C). These observations illustrate a universal buffering function of the facilitator e2 in preventing ultra-low gene expression levels.

Fig. 2.10 Testing CRISPR-validated heterologous K562 dTREs at the eNMU locus. (A) Summarized information of selected K562 dTREs from previous studies. Effect sizes obtained from Fulco et al. (2019) (except for NMU e1, which is from Tippens et al. (2020)). (B) Native genomic contexts of each dTRE; tested regions highlighted in light blue. Track scales are consistent across dTRE regions, except for GRO-cap (Core et al., 2014), which uses an individually indicated scale. Detailed sources and accession information are provided in Supplementary Table S4.
(C) Correlation between GRO-cap read counts at dTREs and their intrinsic enhancer activity (−e2) at the eNMU locus. Pearson’s correlation coefficient (r) and corresponding p-value are shown.

Fig. 2.11 Facilitator e2 universally confers enhancer robustness. (A) Workflow to test e2’s facilitator function on heterologous K562 dTREs using the eNMU landing pad. (B) Enhancer activity of dTREs in the absence or presence of e2, measured by RT-qPCR; e2 only serves as the baseline. n = 2 independent recombination experiments. (C) Correlation between intrinsic activity of elements (−e2) and the fold change with e2 fusion. Pearson’s correlation coefficient (r) and corresponding p-value are shown. (D) NMU mRNA levels measured by RT-qPCR in single cell-derived recombinant clones (n ≥ 4) of e1_WT vs. e1_mGATA1 in the absence or presence of e2; e2 only serves as the baseline. Error bars = ± SEM. (E) PRO-seq signal at e1, NMU promoter and GAPDH control locus in e1_WT and e1_mGATA1 clones lacking e2. Tracks represent merged biological replicates (n = 2 independent single cell clones). Fine vertical lines indicate positions of GRO-cap–defined TSSs (WT K562) (Core et al., 2014).

We next asked whether e2’s buffering effect could still apply to a mutated e1 element. Given the critical role of e1’s GATA1 motif in TF cooperativity (Figure 2.7), we compared the enhancer activities of e1_WT and e1_mGATA1 with or without e2, again in single cell-derived clones. In the absence of e2, disruption of the GATA1 motif completely abolished e1 activity, as reflected by the baseline mRNA level (Figure 2.11D) and undetectable nascent transcription at both e1 and the NMU promoter (Figure 2.11E). Notably, the presence of e2 restored nascent transcription (Figure 2.9, e1_mGATA1) and rescued the mRNA level by a striking 2,000-fold (Figure 2.11D), in stark contrast to the 15-fold increase observed for the active enhancer e1_WT.
This aligns with the power-law behavior of the heterologous dTREs and highlights the importance of facilitators in safeguarding enhancer robustness against disruptive mutations.

2.3.6 A 3D regulatory hub of enhancer, promoter and facilitators of NMU

In addition to the eNMU region located 94 kb upstream of NMU, CRISPRi screens by Gasperini et al. (2019) and Reilly et al. (2021) identified four additional candidate NMU “enhancers” at 30.5, 35, 87, and 97.6 kb upstream with varying effect sizes (Figure 2.12A, purple highlights). We noted that these elements essentially function as facilitators, similar to e2, rather than autonomous enhancers, as they failed to activate NMU transcription in the absence of e1 (Figures 2.1, B and C, Δe1). In line with this notion, ATAC-seq analysis of the CRISPR deletion lines showed that Δe1, but not Δe2, substantially reduced chromatin accessibility across all the facilitators and the NMU promoter to levels comparable to ΔeNMU (Figure 2.12A, insets). This underscores the hierarchical relationship between the core enhancer e1 and other regulatory elements. Furthermore, public high-resolution intact Hi-C data (ENCODE Project Consortium et al., 2020) shows a distinct stripe pattern anchored at eNMU extending towards the NMU promoter (Figure 2.12B, E–P stripe), suggesting that eNMU actively scans across the 94 kb region and makes widespread contacts. Strong focal interactions, indicated by dot-like patterns, are observed between the NMU promoter and F1, eNMU, F3, as well as a distal CTCF/cohesin peak, and also between F1′ and eNMU (Figure 2.12B, black arrowheads). Consistently, an independent lower-resolution Hi-C study (Rao et al., 2014) reveals elevated contact frequencies between almost every pair of the regulatory elements (Figure 2.13A). These findings together hint at the presence of a spatial regulatory hub for the NMU gene (Figure 2.12D).

Fig. 2.12 A 3D regulatory hub of enhancer, promoter and facilitators of NMU.
(A) ATAC-seq signal at the NMU–eNMU locus in WT, ΔeNMU, Δe1 and Δe2 cell lines, highlighting NMU promoter, eNMU, and facilitators (F1, F2, and F3 from Gasperini et al. (2019); F1 and F1′ from Reilly et al. (2021)). Tracks represent merged biological replicates (n = 2 independent cultures). (B–C) Public intact Hi-C (B) and ChIP-seq (B and C) (ENCODE Project Consortium et al., 2020; X. Guo et al., 2020) at the NMU–eNMU locus in K562. Intact Hi-C is shown at 300-bp resolution, with black arrows indicating pairwise contacts between NMU promoter, facilitators, and eNMU; ChIP-seq tracks display signal p-values. Detailed sources and accession information are provided in Supplementary Table S4. (D) Schematic model illustrating a 3D regulatory hub of enhancer–promoter–facilitator interactions at the NMU–eNMU locus.

Fig. 2.13 Additional epigenomic features of enhancer, promoter and facilitators. (A) Public Hi-C (Rao et al., 2014) and ChIP-seq tracks of CTCF and RAD21 (ENCODE Project Consortium et al., 2020) at the NMU–eNMU locus in K562. Hi-C is shown at 5-kb resolution, with dashed lines and open circles marking pairwise contacts between NMU promoter, facilitators, and eNMU. Note that some contact anchors may not align perfectly with the regulatory elements, possibly due to the limited resolution of this Hi-C dataset. (B) Expanded ChIP-seq tracks (ENCODE Project Consortium et al., 2020; X. Guo et al., 2020) displaying signal p-values. Grey box highlights the NMU promoter and its proximal region, with a zoomed-in view shown on the top right. A separate inset on the right zooms in at the eNMU region, shown at the same scale as the full locus, except where otherwise indicated. Detailed sources and accession information are provided in Supplementary Table S4.
To explore potential mechanisms underlying the 3D hub formation and facilitator function, we analyzed the epigenomic landscape across the entire NMU–eNMU locus, leveraging the vast amount of experimental data available for K562 (Figures 2.12C and 2.13B) (Core et al., 2014; ENCODE Project Consortium et al., 2020; X. Guo et al., 2020). The paucity of the structural proteins CTCF/cohesin at eNMU and its facilitators prompted us to examine the binding of another independent looping factor, the LDB1 complex (Aboreden et al., 2025; Song et al., 2007), at these loci. In erythroid cells, the non-DNA-binding transcription cofactor LDB1 forms a stable complex with the GATA1, TAL1, and E2A/TCF3 transcription factors (Wadman et al., 1997), which drives chromatin looping via dimerization of LDB1’s self-association domain (Deng et al., 2012). Indeed, eNMU and two of the facilitators, F1′ and F2, are well occupied by the LDB1 complex (Figures 2.12C and 2.13B). Although the NMU promoter itself is not bound by LDB1, its downstream proximal DHS peak exhibits low levels of TAL1/TCF3/LDB1 binding (Figure 2.13B, grey box and inset for NMU promoter). Interestingly, instead of GATA1, these factors seem to complex with RUNX1 at this site, which has been reported as an alternative binding partner of TAL1 (Wilson et al., 2010) and LDB1 (Gilmour et al., 2018; Meier et al., 2006). Of note, we observed decreased accessibility at this promoter-proximal peak exclusively in the e1_mRUNX1 mutants in Figure 2.8B (black arrowheads), suggesting that RUNX1 binding at eNMU communicates with the RUNX1-containing LDB1 complex near the NMU promoter. Beyond the LDB1 complex, F1′ and F2 also show modest enrichment for STAT5 binding (Figure 2.12C), which may contribute to their crosstalk with e1, as observed for the facilitator e2. In contrast, the strongest facilitator F3 is predominantly occupied by AP-1 factors along with appreciable binding of the SWI/SNF subunit SMARCA4 (Figure 2.13B).
Nevertheless, signals for other coactivators (p300, BRD4, NCOA1), as well as DHS and H3K27ac, are evidently lower at all four facilitators compared to eNMU. While H3K4me3 is primarily enriched at the NMU promoter, all the regulatory loci display comparable levels of H3K4me1, an enhancer mark that has been shown to facilitate enhancer–promoter interactions (Kubo et al., 2024). Finally, the minimal GRO-cap signals detected at F1–F3 (Figure 2.12C), together with the dispensability of TSSs in e2 (Figure 2.5, Δe2.2, Δe2.4), support the notion that active transcription is not a defining feature of facilitators, thereby solidifying our divergent transcription model for canonical autonomous enhancers. Taken together, our integrative analysis of the constellation of cis-regulatory elements at the NMU–eNMU locus highlights their spatial connectivity and distinctive epigenomic signatures, providing mechanistic insights into their action.

2.3.7 Dynamics of eNMU regulation during erythroid differentiation

The remarkable regulatory network of eNMU in K562 led us to explore its function under normal physiological conditions. Given the transcriptomic similarity between K562 cells and early erythroid precursors (Ulirsch et al., 2016), the documented role of NMU peptide in early erythropoiesis (Gambone et al., 2011), and the known hematopoietic functions of key TFs acting at eNMU (Chanda et al., 2013; M. J. Chen et al., 2009; Grebien et al., 2008; Nuez et al., 1995; Okuda et al., 1996; Perkins et al., 1995; Pevny et al., 1991; Socolovsky et al., 1999; Tóthová et al., 2021), we examined the well-established ex vivo erythroid differentiation model of human hematopoietic stem and progenitor cells (HSPCs) (Hu et al., 2013; J. Li et al., 2014) (Figure 2.14A). Reanalysis of published RNA-seq datasets (An et al., 2014; D.
Li et al., 2023; Schulz et al., 2019) shows that NMU is among the most significantly upregulated genes during the differentiation of HSPCs into erythroid precursors (proerythroblasts) (Figure 2.14B), with expression increasing by over two orders of magnitude (Figures 2.14, C and G). This aligns with a recent single-cell multiomics study that reported NMU induction during early erythropoiesis of human hematopoietic progenitors (X. Zhang et al., 2024). Importantly, NMU induction is accompanied by a progressive increase in the signals of H3K27ac, GATA1, and RUNX1 (D. Li et al., 2023) and in chromatin accessibility (Schulz et al., 2019) at the eNMU locus (Figures 2.14, D and F), supporting eNMU as a developmental enhancer of NMU. Furthermore, small interfering RNA (siRNA) knockdown of GATA1 significantly reduces NMU expression (D. Li et al., 2023) (Figure 2.14E), mirroring the e1_mGATA1 mutant effect in K562 (Figure 2.7A). Together, these findings highlight the physiological relevance of our results obtained from the immortalized erythroid cell line model K562.

Fig. 2.14 Dynamics of eNMU regulation during erythroid differentiation. (A) Stages of HSPC erythroid differentiation analyzed in D. Li et al. (2023) (red) and Schulz et al. (2019) (blue). (B) Volcano plot showing genome-wide expression changes between HSPC and Ery-Pre stages, reanalyzed from D. Li et al. (2023) RNA-seq data. Horizontal and vertical blue lines mark adjusted p = 0.05 and log₂ fold changes of ±1, respectively. Red dots highlight the top 10 most significantly upregulated genes, including the key erythroid markers HBG1 and HBG2 (β-like globin genes). (C) NMU expression changes during early erythropoiesis, reanalyzed from D. Li et al. (2023) RNA-seq data (n = 3). (D) CUT&RUN signal of H3K27ac, GATA1 and RUNX1 at the NMU–eNMU locus during early erythropoiesis. Tracks show one representative biological replicate from D. Li et al. (2023).
(E) GATA1 and NMU expression changes following 24-hr siRNA knockdown of GATA1 in Ery-Pro cells, reanalyzed from D. Li et al. (2023) RNA-seq data (n = 2). (F) ATAC-seq signal at the same locus as in (D) throughout the full HSPC differentiation time course. Tracks show merged biological replicates (n = 2) from Schulz et al. (2019). (G) NMU expression changes during the same stages as in (F), reanalyzed from An et al. (2014) and Schulz et al. (2019) RNA-seq data (n = 3).

Further examination of the facilitator loci throughout the full differentiation time course (Schulz et al., 2019) reveals that facilitators F2 and F3 acquire discernible ATAC-seq signals when eNMU accessibility surges (Figure 2.14F), consistent with their eNMU-dependent behavior in K562 (Figure 2.12A). Interestingly, the strong facilitator F3 becomes even more accessible during the final stages of erythropoiesis, despite a sharp decline in eNMU accessibility and NMU expression. This decoupling of the enhancer–facilitator hierarchy suggests that facilitators work in concert with enhancers in a stage-specific manner and are insufficient to substitute for enhancers in the temporal control of gene expression.

2.3.8 A putative LTR promoter as a built-in negative regulatory element for enhancer activity

After scrutinizing the activating motifs in eNMU, we shifted our focus to the repressing segment e1.4 (Figure 2.5) to investigate its functional characteristics. The predominantly unidirectional transcription pattern immediately caught our attention, as opposed to the balanced divergent transcription in the positive regulatory region e1.1–e1.3 (Figure 2.15A). Interestingly, the entire e1 element corresponds to a MER72 Long Terminal Repeat (LTR) of the ERV1 endogenous retrovirus family.
Sequence alignment with the MER72 consensus from the Dfam database (Storer et al., 2021) revealed a conserved, weak TATA box variant (CATAA) located 31 bp upstream of the TSS in e1.4, along with a conserved polyadenylation (poly(A)) signal downstream, matching the typical architecture of an LTR promoter (Medstrand et al., 2001) (Figure 2.15B). It is thus likely that e1.4 serves as a putative LTR promoter, while e1.1–e1.3 functions as its corresponding LTR enhancer. The LTR promoter may compete with the NMU promoter for the LTR enhancer activity, which could explain the de-repression of the NMU gene observed in Δe1.4.

To test this hypothesis, we mutated the KLF/SP motifs in either the LTR promoter or the LTR enhancer and assessed their effects in single cell clones (Figure 2.15C). We chose KLF/SP motifs because (1) disrupting all of them in e1 greatly reduced NMU expression in our FlowFISH screen (Figure 2.5B), and (2) several of them are located closely upstream of TSSs in e1 (Figure 2.15A), a position generally associated with transcription activation (Duttke et al., 2024). Indeed, KLF/SP mutations in the LTR promoter caused a 1.5-fold increase in NMU expression (Figure 2.15C, LTRpr_mKLF), mirroring the effect of Δe1.4. By comparison, KLF/SP mutations in the LTR enhancer or across the entire e1 region dramatically decreased gene expression. We also attempted to strengthen the LTR promoter by optimizing its core promoter elements (CPEs), specifically the TATA box and the Initiator (Inr) motif. As predicted by the competition model, this mutant caused a slight downregulation of NMU expression (Figure 2.15C, mCPE_up). However, attempts to weaken these CPEs had only a neutral effect, likely because the CPEs are inherently weak drivers of transcription initiation, supporting a dominant role of the KLF/SP motifs in LTR promoter function.

Fig. 2.15 A putative LTR promoter as a built-in negative regulatory element for enhancer activity.
(A) Transcription-related sequence features of e1. (B) Sequence alignment between e1 and the MER72 LTR consensus. (C) NMU mRNA levels measured by RT-qPCR in single cell-derived recombinant clones; bars = median. (D–G) ATAC-seq signal at eNMU (D), NMU promoter, and GAPDH control locus (E); PRO-seq signal at eNMU (F) and NMU promoter (G) in select eNMU mutants. In (G), bar = NMU pausing index of merged replicates, dots = pausing index of individual replicates. Tracks represent merged biological replicates (n = 2 independent single cell clones). Colored boxes below tracks indicate locations of disrupted TF motifs. Fine vertical lines indicate positions of GRO-cap–defined TSSs (WT K562) (Core et al., 2014). (H) Proposed competition model between the LTR promoter and the NMU promoter.

To exclude the possibility that the KLF/SP sites in e1.4 act as repressor motifs, we performed ATAC-seq in both the LTR promoter and the LTR enhancer mutants. This revealed highly localized accessibility reductions confined to the respective mutated regions (Figure 2.15D), confirming the activating function of KLF/SP in both contexts. However, accessibility at the NMU promoter changed in opposite directions (Figure 2.15E), supporting the idea that the putative LTR promoter and enhancer operate as distinct regulatory elements for NMU, likely through a promoter competition mechanism (Figure 2.15H).

Finally, we examined the nascent transcription profiles of the KLF/SP mutants by PRO-seq. At the eNMU locus, the LTR promoter mutant showed a nearly identical pattern to WT_eNMU, suggesting unperturbed Pol II recruitment (Figure 2.15F). Furthermore, the pausing index of NMU remained unaltered because reads in the pause region and in the gene body increased proportionally (Figure 2.15G). Conversely, the LTR enhancer mutant resembled other e1 mutants studied in Figure 2.9, exhibiting reduced eNMU and NMU signals, along with a doubled pausing index.
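The pausing-index argument above can be made concrete with a short sketch. It assumes the common definition of the pausing index as the read density in a promoter-proximal pause window divided by the read density in the gene body; the window lengths and read counts below are hypothetical and are not taken from the PRO-seq data.

```python
def pausing_index(pause_reads, pause_len, body_reads, body_len):
    """Read density in the pause window divided by read density in the gene body."""
    if pause_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    if body_reads <= 0:
        raise ValueError("gene-body reads must be positive to form a ratio")
    pause_density = pause_reads / pause_len
    body_density = body_reads / body_len
    return pause_density / body_density


# Hypothetical counts and window sizes, for illustration only.
wt = pausing_index(pause_reads=200, pause_len=100, body_reads=4000, body_len=10_000)

# Both windows gain reads by the same factor (1.5x): the index does not move.
scaled = pausing_index(pause_reads=300, pause_len=100, body_reads=6000, body_len=10_000)

# Gene-body reads fall twice as much as pause reads: the index doubles.
shifted = pausing_index(pause_reads=100, pause_len=100, body_reads=1000, body_len=10_000)
```

In this toy case the proportional change leaves the index untouched, which is the behavior described for the LTR promoter mutant, whereas an unequal loss of signal in the two windows changes the index, as described for the LTR enhancer mutant.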
These observations thereby raise an interesting possibility: promoter competition could provide a unique advantage in modulating target gene transcription while maintaining normal Pol II pause–release dynamics.

2.4 Discussion

In this study, we established a novel landing pad platform to systematically interrogate the molecular architecture of the potent long-range enhancer eNMU at its native locus. Through detailed functional dissections of key mutants and integrative mining of public datasets, we uncovered several recurring themes supported by multiple lines of evidence: (1) TFs exert unique and cooperative functions in a context-specific manner; (2) facilitators depend on the core enhancers while ensuring robustness of their enhancer partners; (3) divergent transcription accurately demarcates active enhancer units. Collectively, our findings illuminate an intricate and coordinated interplay among distinct classes of cis-regulatory elements—enhancers, facilitators, and promoters—that underpins precise transcriptional regulation.

Previous enhancer studies have primarily employed approaches such as random mutagenesis (Canver et al., 2015b; Kircher et al., 2019; Melnikov et al., 2012; Patwardhan et al., 2012), tiling disruptions (Kosicki et al., 2024; Martyn et al., 2025; Roh et al., 2024), and specific motif manipulations (Frömel, Rühle, Martinez, et al., 2025; Georgakopoulos-Soares et al., 2023; Grossman et al., 2017; R. P. Smith et al., 2013) to identify functional features within regulatory elements. In contrast, our study applied a distinct framework to dissect eNMU, grounded in our divergent transcription-based unit definition of active human enhancers (Core et al., 2014). Building on the foundational work of Gasperini et al. (2019) and Tippens et al.
(2020), this approach enabled us to progressively refine the bona fide NMU enhancer unit from the full eNMU region to its sub-element e1, and ultimately to a minimal, divergently transcribed LTR enhancer core (Figure 2.15). Importantly, the core enhancer activity is modulated by the surrounding sequence features within eNMU—specifically, augmented by the intrinsically inactive facilitator element e2 and repressed by the adjacent unidirectionally transcribed LTR promoter. These findings reveal the regulatory complexity of the eNMU locus and highlight the strength of our divergent transcription model in precisely delineating functional enhancer units, thereby guiding the future classification of diverse distal regulatory elements.

Facilitators have so far been described in the context of hyper-active super-enhancers (Blayney et al., 2023; Brosh et al., 2023). The presence of multiple facilitators associated with a typical enhancer across the ~100-kb NMU–eNMU region therefore raises the possibility that many CRISPRi-identified “enhancers” may in fact function as facilitators. Notably, the eNMU-associated facilitators exhibited virtually no intrinsic activity even when present all together in the genome (0.01% of WT expression, Figure 2.1B, Δe1), consistent with their inherently weak, enhancer-dependent accessibility patterns (Figure 2.12A). This stands in contrast to the facilitators within super-enhancers, which tend to display strong signals of open chromatin, TF/coactivator binding, and Pol II recruitment—features likely contributing to their residual intrinsic enhancer activity (Blayney et al., 2023). We speculate that a continuum of enhancer potential exists along the enhancer–facilitator spectrum, with the eNMU-associated facilitators situated at the extreme low-activity end.
The lack of intrinsic activity is crucial in confining facilitator function to potentiating and buffering pre-established enhancers (Figure 2.11) while preventing ectopic gene activation. With the continuing advances in genome engineering technologies, it will be increasingly important to interrogate all candidate regulatory elements simultaneously within their native chromatin hub environments, to distinguish autonomous enhancers from affiliated facilitators and to better understand their mechanistic interplay.

In addition to the enhancer–facilitator axis, the LTR enhancer–promoter axis within eNMU represents another potentially widespread and underappreciated mode of cis-regulatory behavior, especially considering the high abundance of LTRs in the human genome and their nearly 10% representation of all ENCODE candidate cis-regulatory elements (cCREs) (A. Y. Du et al., 2024). By leaving the enhancer intact, the LTR promoter can compete with the gene promoter without disrupting normal Pol II pause–release dynamics, instead simply siphoning transcriptional activity toward itself. This regulatory strategy enables fine-tuning of target gene expression, particularly when the LTR promoter harbors motifs for developmental stage-specific TFs. Moving forward, a genome-wide search for functional unidirectional TSSs, including but not limited to those derived from LTR promoters, will be crucial for constructing a more comprehensive map of transcriptional networks.

How do these functionally distinct classes of cis-regulatory elements coordinate to achieve precise and robust transcriptional regulation? We propose that the answer lies in the combinatorial action and synergy of a repertoire of trans-acting TFs and cofactors. At the core LTR enhancer unit, we observed strong cooperative binding of key TFs GATA1 and RUNX1 (Figures 2.7, B and C), despite their motifs being separated by ~40 bp—a spacing that likely limits direct protein–protein interactions.
This suggests that indirect mechanisms (Morgunova & Taipale, 2017; Spitz & Furlong, 2012) may underlie their cooperation, including DNA conformational changes (Panne et al., 2007), co-binding to a shared cofactor or a multiprotein complex (Spitz & Furlong, 2012), and nucleosome-mediated collaborative competition (Adams & Workman, 1995; Doughty et al., 2024; Miller & Widom, 2003; Mirny, 2010). At the adjacent LTR promoter, activating KLF/SP family TFs played a context-specific role to compete for enhancer activity (Figure 2.15). At the facilitator e2, STAT5 binding to a tandem array of five motifs critically amplified e1’s enhancer activity, despite modest impact on e1’s chromatin accessibility, transcription initiation, and p300 recruitment (Figures 2.7–2.9). It is worth noting that e2’s own accessibility and transcription depended not only on its own STAT5 binding, but also on the integrity of e1’s key TF binding (Figures 2.8 and 2.9). This peculiar behavior of STAT5 echoes the recently proposed concept of “context-only” TFs (Kribelbauer-Swietek et al., 2024), which do not provide DNA access themselves but instead amplify the activity of “context-initiator” TFs by establishing cooperative environments. These two classes of TFs partner promiscuously without requiring close motif proximity, consistent with e2’s universal buffering effect on various heterologous enhancers (Figure 2.11). We speculate that STAT5 binding at e2 fosters multivalent interactions (Chong et al., 2022; Trojanowski et al., 2022) with e1-bound TFs via its intrinsically disordered C-terminal transactivation domain (C. P. Lim & Cao, 2006), thereby enhancing the “stickiness” of the regulatory hub. Nonetheless, we cannot exclude the possibility that STAT5 engages some unique coactivators which have yet to be identified.
In addition to the disordered C-terminal transactivation domain, STAT5 contains an N-terminal oligomerization domain that allows tetramerization of active STAT5 dimers on tandemly linked motifs (John et al., 1999; W. K. Meyer et al., 1997). This oligomerization extends STAT5’s DNA binding specificity to low-affinity sites (Soldaini et al., 2000), which may explain the ~30% residual binding observed upon mutating two conserved bases in all five STAT5 motifs at e2 (Figure 2.7D, right panel). Such oligomerization might also facilitate spatial connectivity between eNMU and the STAT5-bound facilitators F1′ and F2—reminiscent of GAGA-associated factor (GAF) oligomerization at a subset of tethering elements in developing Drosophila embryos (X. Li et al., 2023), which, despite lacking intrinsic enhancer activity, are essential for long-range enhancer–promoter communication (Batut et al., 2022). In parallel, dimerization of the LDB1 complex bound at F1′ and F2, the enhancer e1, and the NMU promoter (Figures 2.12 and 2.13) may further promote chromatin contacts between these loci. Together, these potential mechanisms suggest a broader architectural role for facilitators in organizing 3D regulatory hubs independent of CTCF or cohesin, underscoring an exciting avenue for future investigation.

A final noteworthy observation from our functional analysis is that, despite substantial variation in chromatin accessibility pattern at eNMU across different motif mutants (Figure 2.8A), the transcriptional output and Pol II pause–release dynamics (pausing index) of NMU were altered to similar extents (Figures 2.7A, 2.9B, and 2.9C). Such decoupling between enhancer accessibility and gene activation is consistent with prior findings (Dogan et al., 2015; Doughty et al., 2024) and highlights the importance of specific TF inputs and their associated cofactors in driving functionally productive enhancer–promoter communication.
Future work should aim to define the full repertoire of these regulatory components and the steps of transcription that they influence (such as chromatin opening, Pol II initiation, and pause release).

In summary, we conducted a rigorous in situ dissection of a robust long-range enhancer at unprecedented architectural resolution, providing experimental evidence that resonates with and extends current models of enhancer function. The intricate crosstalk among spatially and functionally linked cis-regulatory elements—including enhancers, facilitators, and promoters—underscores the importance of a holistic framework to decode their mechanistic interplay. We anticipate that our efficient and versatile recombinase-mediated genome rewriting platform will serve as a powerful tool to drive these efforts forward.

Limitations of the study

Our motif mutagenesis approach could not definitively identify the functional TFs acting at eNMU. For instance, both GATA1 and GATA2 are well expressed in K562 cells and bind similar/identical motifs, making it difficult to distinguish their individual contributions. We attributed the observed effects to GATA1 in our study, because it is the most highly expressed GATA family factor in K562 (Karlsson et al., 2021) and the master regulator of erythropoiesis. Similarly, we did not determine the exact TFs binding the critical RARA/RXRA motifs, given the low abundance of RARA and RXRA proteins (Grande et al., 2001; Karlsson et al., 2021) and the presumed absence of retinoic acid signaling in K562 under standard culture conditions. It is possible that other nuclear receptors recognize and bind these motifs. In addition, our mutagenesis screen may have missed some functional motifs, as it relied on the availability of ChIP-seq data to confirm TF binding.
Furthermore, while we made every effort to avoid disrupting overlapping motifs, some degree of interference was unavoidable—for example, between the right RUNX1 motif and the adjacent RARA/RXRA motif. As we introduced only a single version of the transversion mutations, we also cannot completely exclude the possibility of inadvertently creating novel TF binding sites, despite efforts to minimize matches to known motifs. Finally, although e2 displayed power-law buffering behavior across eight heterologous enhancers, larger-scale studies are warranted to fully capture the complexity of enhancer–facilitator synergism.

2.5 Methods

Cell lines and culture

Parental wildtype K562 cells, an immortalized erythroleukemia cell line isolated from the bone marrow of a 53-year-old female patient with chronic myelogenous leukemia (CML), were obtained from the American Type Culture Collection (ATCC) (ATCC Number CCL-243) by the Yu lab and generously provided to us. Genetically modified, homozygous eNMU deletion lines (ΔeNMU, Δe1, and Δe2) were also kind gifts of the Yu lab. All the other engineered K562 cell lines, including the eNMU landing pad lines and single cell-derived recombinant clones, were generated in this study (see below). All the K562 lines were cultured in RPMI 1640 media supplemented with GlutaMAX (Gibco) and 10% heat-inactivated FBS (Avantor) at 37°C with 5% CO2 in a humidified sterile incubator. Cell density was maintained between 0.1 and 1 × 10⁶ cells/mL, and mycoplasma testing was performed routinely.

Transfection and cell sorting

All transfection experiments in K562 cells were carried out using Lonza’s Nucleofector 2b device and the Nucleofection Kit V, following the manufacturer’s instructions. Specifically, a single cuvette was used to transfect 1 million cells with a total of 5 µg plasmid DNA; for co-transfection of two plasmids, 2.5 µg of each plasmid was used.
All cell sorting experiments were performed on the Sony MA900 Multi-Application Cell Sorter using a 100-µm chip (catalog no. LE-C3210; Sony).

Single-copy eNMU landing pad cell line construction

To CRISPR knock in the Bxb1 landing pad at the eNMU locus in an eNMU-null background, we first amplified the genomic region surrounding the eNMU locus in the ΔeNMU cell line, inserted it into a HindIII-linearized pEGFP-N1 vector via Gibson assembly, and Sanger sequenced individual colonies to determine the exact allelic sequences. Based on the obtained sequences, we designed four sgRNAs using CHOPCHOP (Labun et al., 2019) and cloned each sgRNA into the pX330 vector (Addgene plasmid # 42230) following its standard protocol, i.e., restriction-ligation cloning of annealed sgRNA oligos into BbsI-linearized pX330 backbone. Left and right homology arms, each 1 kb in size, were also designed and PCR amplified from K562 genomic DNA. Two intermediate plasmids were constructed prior to assembling the homology-directed repair (HDR) donor plasmid: first, the attP1 and attP2 gBlocks (IDT, Supplementary Table S2) were inserted into a vector backbone; second, an EF1a promoter fragment and the BFP-2A-iCasp9-2A-BlastR cassette, PCR amplified from the pFL7_pLenti-pTet-Bxb1-BFP-2A-iCasp9-2A-BlastR_pCMV-rtTA3 plasmid (a kind gift from the Grimson lab) (Matreyek et al., 2020), were inserted between the attP1 and attP2 sites; third, the left and right homology arms, along with the entire attP1-EF1a-BFP-2A-iCasp9-2A-BlastR-attP2 cassette, were inserted into a pUC19 vector backbone to generate the final donor plasmid. All three cloning steps were performed using Gibson assembly. The second intermediate plasmid was constructed to enable preliminary testing of Bxb1 recombination in a plasmid context prior to chromosomal integration (data not shown). Each pX330-sgRNA plasmid was then co-transfected with the donor plasmid into the ΔeNMU K562 cell line.
After the episomal BFP signal had decayed, CRISPR knock-in efficiency was assessed by gain of stable BFP expression. Three out of four sgRNAs produced a significant BFP+ population compared to the donor-only negative control. Single cells from these three populations were sorted into 96-well plates to derive clonal cell lines. Outgrown single cell clones were first screened by genotyping PCR to identify those with heterozygous landing pad (LP) integration. Genomic DNA from a subset of candidate clones was purified by phenol-chloroform extraction, and a qPCR-based copy number analysis was performed by comparing BFP DNA Ct values to those of a control locus known to exist in three alleles in K562. Confirmed single-copy LP clones were further evaluated based on the percentage of BFP+ cells and Bxb1 recombination efficiency (see below). Two clonal lines, E5 and D17, were selected for subsequent experiments.

Bxb1 recombination efficiency and eNMU rescue experiment

To construct the attB-containing payload plasmid, the attB1 and attB2 gBlocks (IDT, Supplementary Table S2) were first inserted into a vector backbone to generate an intermediate plasmid. An EF1a promoter fragment and an EGFP or mCherry fragment were then introduced into the linearized intermediate plasmid. For all subsequent individual element cloning (i.e., excluding eNMU mutant library cloning), this parental attB1-EF1a-EGFP/mCherry-attB2 plasmid was digested with BmtI and BspEI (NEB) and the EF1a-EGFP/mCherry cassette was replaced with intended elements. All cloning steps were performed using Gibson assembly.

To evaluate Bxb1 recombination efficiency and test the functionality of the eNMU landing pad, we co-transfected the pFL9_pCAG-NLS-HA-Bxb1 plasmid (Addgene # 51271, a kind gift from the Grimson lab, transiently expressing Bxb1 recombinase) and the attB1-eNMU-attB2 payload plasmid into the LP cell lines.
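The qPCR-based copy number check described above for the landing pad clones can be illustrated with the standard delta-Ct relation. This is a minimal sketch, not the analysis script used in this study: it assumes that the BFP and control amplicons amplify with the same per-cycle efficiency (perfect doubling by default), and the Ct values shown are hypothetical.

```python
def estimate_copy_number(ct_target, ct_reference, reference_copies=3, efficiency=2.0):
    """Estimate target copies per cell from Ct values via the delta-Ct relation.

    Assumes both amplicons share the same per-cycle amplification factor
    (2.0 means perfect doubling); this is an assumption, not a measured value.
    reference_copies is the known copy number of the control locus
    (three alleles in K562, as stated in the text).
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    delta_ct = ct_target - ct_reference
    return reference_copies * efficiency ** (-delta_ct)


# Hypothetical Ct values: the target crosses threshold ~1.585 cycles
# (log2 of 3) later than the three-copy control locus.
single_copy = estimate_copy_number(ct_target=25.585, ct_reference=24.0)

# Equal Ct values would imply the same copy number as the control locus.
same_as_control = estimate_copy_number(ct_target=24.0, ct_reference=24.0)
```

Under these assumptions, a single-copy integration is expected to lag the three-copy control by about log₂(3) ≈ 1.6 cycles, which is the kind of difference such a comparison looks for.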
About 7 days post-transfection, the percentage of BFP− cells stabilized and was measured on the Sony MA900 cell sorter relative to a no-payload negative control. Both E5 and D17 LP clones consistently exhibited 4–10% BFP loss across independent experiments, with Clone E5 showing slightly higher recombination efficiency. The recombinant BFP− cells were further sorted as bulk populations and propagated for another 10–14 days to allow stable NMU reactivation. Cells were then harvested for RNA extraction and RT-qPCR analysis (see below) to confirm the rescue of NMU gene expression.

eNMU mutant library design, cloning, and integration

Given the central role of TFs in enhancer function, we first sought to dissect how specific TF binding events contribute to eNMU activity by maximizing both the extent and specificity of TF binding disruption. To this end, we aimed to (1) curate a list of motifs for TFs that are expressed in K562 cells and exhibit motif-specific binding supported by public ChIP-seq data, and (2) introduce point mutations across all motif occurrences of each selected TF. Specifically, we retrieved all available K562 ChIP-seq peaks that overlap the eNMU region (hg38 coordinate = chr4:55729891–55730846) using the UCSC Table Browser tool (Karolchik et al., 2004). We removed entries corresponding to non-sequence-specific cofactors and TFs not expressed in K562, based on ENCODE (ENCODE Project Consortium et al., 2020) polyA plus RNA-seq data (accession: ENCSR000CPH) using a TPM > 1 threshold. Binding motifs for the remaining TFs were then obtained from the JASPAR 2022 database (Castro-Mondragon et al., 2022), with a few taken from the cis-BP database (Weirauch et al., 2014) (see Supplementary Table S1). These motifs were further manually reviewed and filtered to retain only those located under a ChIP-seq peak. For TFs with motifs that perfectly overlap at least once—such as AP-1/NFE2 and KLF1/SP1—we grouped and treated them as a single TF.
To maximize disruption of TF binding while minimizing unintended effects on adjacent motifs, we identified the two most conserved bases in each motif using position frequency matrices (PFMs) from the JASPAR 2022 database (Castro-Mondragon et al., 2022) and introduced transversion mutations (A↔C, T↔G). We chose this transversion scheme because it has been shown to be more effective than alternative mutagenesis schemes (Kircher et al., 2019; Kosicki et al., 2024).

To complement the targeted motif mutagenesis, we designed tiling deletions across the 956-bp eNMU region, each spanning ~100-bp intervals within the sub-elements e1 (first 453 bp) and e2 (last 503 bp). These deletions were intended to encompass GRO-cap–defined TSSs and TF motif clusters, resulting in segments e1.1–e1.4 and e2.1–e2.4. Additional segments e1.5–e1.9 were included to help resolve critical sequence features within e1.3 and e1.4. Detailed information on all mutated motifs and deleted segments is listed in Supplementary Table S1 (separate file).

Given the functional distinction between e1 (enhancer) and e2 (facilitator), we aimed to introduce mutations in either e1, e2, or both to dissect their individual contributions and cooperative interactions. To achieve this, we employed a “mix-and-match” cloning strategy (Figure 2.4B). For TF motif mutagenesis, mutant versions of e1 and e2 were synthesized separately by Twist Bioscience as dsDNA fragments, with all occurrences of a given TF’s motif mutated simultaneously. Each mutated e1 element was paired with either a wildtype e2 or a mutated e2 of the same TF type, and vice versa. Each pair was Gibson assembled with two half-backbone fragments: a fixed attB1-containing fragment and an attB2-containing fragment carrying a unique 8-bp random barcode generated by PCR.
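The motif mutagenesis scheme described above (A↔C and T↔G transversions at the two most conserved positions of each motif) can be sketched as follows. How conservation was scored is not stated in the text, so the per-position information content used here is an assumption, and the toy PFM and site are hypothetical rather than taken from JASPAR. The sketch also assumes the site is already in the same orientation as the PFM.

```python
import math

# Transversion scheme stated in the text: A<->C and T<->G.
TRANSVERSION = {"A": "C", "C": "A", "T": "G", "G": "T"}


def information_content(column):
    """Information content (bits) of one PFM column given as {base: count}."""
    total = sum(column.values())
    ic = 2.0
    for count in column.values():
        if count > 0:
            p = count / total
            ic += p * math.log2(p)
    return ic


def mutate_motif(site, pfm, n_positions=2):
    """Apply transversions at the n most conserved positions of a motif site.

    site: motif occurrence, same length and orientation as the PFM.
    pfm: list of {base: count} dicts, one per motif position.
    """
    if len(site) != len(pfm):
        raise ValueError("site and PFM must have the same length")
    ranked = sorted(range(len(pfm)),
                    key=lambda i: information_content(pfm[i]),
                    reverse=True)
    chosen = set(ranked[:n_positions])
    return "".join(TRANSVERSION[base] if i in chosen else base
                   for i, base in enumerate(site))


# Toy GATA-like PFM (hypothetical counts, for illustration only).
toy_pfm = [
    {"A": 40, "C": 20, "G": 20, "T": 20},
    {"A": 0, "C": 0, "G": 100, "T": 0},
    {"A": 100, "C": 0, "G": 0, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 90},
    {"A": 80, "C": 5, "G": 5, "T": 10},
    {"A": 60, "C": 10, "G": 20, "T": 10},
]
mutant = mutate_motif("AGATAA", toy_pfm)
```

In this toy case the two invariant positions (G and A of the GATA core) are the ones changed. A real design would additionally need to check that the mutated sequence does not disturb overlapping motifs or create new ones, as noted in the limitations.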
Tiling deletion constructs were built using the same cloning strategy, except that e1 deletion fragments were PCR amplified from pre-existing mutant plasmids created using the Q5 site-directed mutagenesis kit (NEB) in earlier experiments, rather than synthesized. Wildtype eNMU, Δe1, and Δe2 constructs were included as controls with known enhancer activities. Additionally, six exogenous sequences from a published STARR-seq library (Tippens et al., 2020), kindly provided by the Yu lab, were also cloned as controls. These included the 584-bp CMV enhancer (CMV584), commonly used as a positive control in episomal enhancer reporter assays, and several non-regulatory open reading frames (ORFs), including EGFP and four human ORFs (ORF56714, ORF52920, ORF54588, and ORF55756). In total, 83 individual Gibson assembly reactions were performed and transformed into NEB Stable competent E. coli cells (prepared using the Mix & Go! E. coli transformation kit from Zymo Research).

For each Gibson assembly transformation, 8 colonies were picked and cultured overnight in deep-well 96-well plates. Colony PCR was performed on 1:20 water-diluted liquid cultures to screen for positive insertions using Q5 High-Fidelity 2X Master Mix (NEB) with primers ZZ041 and ZZ044 (Supplementary Table S2), which amplify the insertion from regions flanking the Bxb1 recombination sites. The PCR program was: initial denaturation 98°C for 5 min; 30 cycles of 98°C for 10 s, 61°C for 30 s, 72°C for 39 s; and final extension 72°C for 5 min. Positive PCR amplicons (~1.3 kb) were then purified using homebrew SPRI beads (Boswell, 2020) (0.7× bead ratio) and subjected to Sanger sequencing to verify element sequences and determine element-barcode associations. In total, we identified 328 unique barcodes corresponding to the 83 elements. These confirmed liquid cultures were pooled together for Maxiprep (Zymo Research) to extract the plasmid library.
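Because each of the 328 barcodes has to stay well represented through every pooled step, it can help to convert the fold-coverage targets used in this protocol (200× and 500×) into cell numbers. The sketch below is simple arithmetic under stated assumptions: the function names are hypothetical, and the recombinant estimate uses the lower end of the 4–10% BFP loss reported above while ignoring cell death after nucleofection, so it should be read as a rough upper bound.

```python
def cells_for_coverage(n_barcodes, fold_coverage):
    """Minimum number of cells needed to cover each barcode fold_coverage times."""
    return n_barcodes * fold_coverage


def fold_coverage(n_cells, n_barcodes):
    """Average number of cells per barcode, assuming even representation."""
    return n_cells / n_barcodes


N_BARCODES = 328  # unique barcodes in the plasmid library (from the text)

# 200x coverage of the recombinant pool: 65,600 cells.
recombinant_pool = cells_for_coverage(N_BARCODES, 200)

# 500x coverage within one sorted bin: 164,000 cells.
per_bin = cells_for_coverage(N_BARCODES, 500)

# Rough upper bound on coverage if 4% of 4 million transfected cells
# recombine and all of them survive (an idealized assumption).
expected_recombinants = 4_000_000 * 0.04
upper_bound_coverage = fold_coverage(expected_recombinants, N_BARCODES)
```

Under these idealized assumptions the transfection scale leaves roughly a 2.4-fold margin over the 200× target, but uneven barcode representation in the plasmid pool would narrow that margin for the rarest barcodes.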
Four million LP cells of Clone E5 or D17 (biological replicates) were transfected with the plasmid library and the pFL9_pCAG-NLS-HA-Bxb1 plasmid to achieve a minimum of 200× coverage of each unique barcode in the recombinant population. On Day 7 post-transfection, BFP− cells were sorted at a minimum coverage of 200× and expanded for another 14 days to allow full activation of NMU. The recombinant cells were then subjected to HCR-FlowFISH.

HCR-FlowFISH and sequencing library preparation

To measure enhancer activity of individual elements within the pooled recombinant population, HCR-FlowFISH was performed according to the published protocol (Reilly et al., 2021) with minor modifications. We first obtained HCR probe sets and fluorescent hairpins from Molecular Instruments for the target gene NMU (B1 hairpin, Alexa Fluor 647 or AF647) and the internal control gene ACTB (B2 hairpin, Alexa Fluor 488 or AF488). Note that the NMU probes were custom-designed in the published study (Reilly et al., 2021) while the ACTB probes were pre-designed and optimized by Molecular Instruments. FISH probing was performed in strict accordance with the published protocol (Reilly et al., 2021), including all solution volumes and centrifugation parameters. Briefly, 20 million recombinant cells of each biological replicate were fixed with 4% formaldehyde in PBST (1× PBS, 0.1% Tween 20) at room temperature for 1 h and washed four times with PBST. Following incubation in cold 70% ethanol at 4°C for 10 min to 24 h, cells were washed twice with PBST and incubated with the pre-warmed Probe Hybridization Buffer at 37°C for 30 min. HCR probes for NMU and ACTB were added together to cells to reach a final concentration of 4 nM per probe. The samples were then incubated overnight with agitation in a 37°C hybridization oven.
On the next day, cells were washed five times with the Probe Wash Buffer and once with 5× SSCT (5× SSC, 0.1% Tween 20), and pre-amplified in the Amplification Buffer for 30 min at room temperature. Snap-cooled hairpins were diluted in the Amplification Buffer and then added to the pre-amplified samples to reach a final hairpin concentration of 60 nM. Samples were incubated with rotation in a dark room overnight at room temperature. On the next day, 5 volumes of 5× SSCT were added to the samples before centrifugation and removal of the hairpin amplification solution. Cells were then washed six times with 5× SSCT before final resuspension in PBS at a density of 10 million cells/mL. The samples were filtered through a 35 µm Cell Strainer cap into a 5 mL polystyrene tube (Corning) before sorting.

Cells were sorted into 8 bins (2 rounds of 4-way sorting) based on the AF647/AF488 ratio (Figure 2.4C) at a minimum of 500× barcode coverage per bin to ensure robust representation in the sequencing library. Sorted cells, together with the unsorted background sample, were pelleted and resuspended in 400 µL of ChIP lysis buffer (50 mM Tris-HCl, pH 8, 10 mM EDTA, 1% SDS), and de-crosslinked overnight at 65°C with shaking at 1000 rpm. Samples were then treated with RNase A (Thermo Scientific) and Proteinase K (Invitrogen) before phenol-chloroform extraction of genomic DNA (gDNA).

Sequencing libraries were prepared by two rounds of PCR using Q5 High-Fidelity 2X Master Mix (NEB). The 1st round PCR was performed on the recovered gDNA corresponding to a minimum of 200× barcode coverage, using primers ZZ145 and ZZ146 (Supplementary Table S2) to specifically amplify the 8-bp barcodes from the eNMU genomic locus. A maximum of 500 ng gDNA was used as input in a 50 µL PCR reaction. The PCR program was: initial denaturation 98°C for 3 min; 11 cycles of 98°C for 10 s, 65°C for 30 s, 72°C for 1 min; and final extension 72°C for 5 min.
The PCR products were then purified using homebrew SPRI beads (1.5× bead ratio) to remove unused primers, followed by the 2nd round PCR with standard Illumina Nextera primers to append sequencing library indices and flow cell adaptors to the amplicons. The PCR program was: initial denaturation 98°C for 30 s; 11 cycles of 98°C for 10 s, 67°C for 30 s, 72°C for 20 s; and final extension 72°C for 5 min. Final PCR products were purified using the MinElute PCR purification kit (Qiagen) and DNA concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI platform (2 × 80 bp paired-end sequencing).

Individual element testing at the eNMU landing pad

For downstream functional analysis, critical eNMU mutants identified in the FlowFISH screen were cloned into the attB1-attB2 plasmid backbone without any element barcode. Several additional related mutants were designed and generated; their sequences are listed in Supplementary Table S1 (separate file). These elements were integrated individually into the eNMU landing pad, and the BFP− recombinants were sorted as single cells into 96-well plates to establish clonal cell lines as independent biological replicates. Three to four weeks after sorting, cells had expanded to sufficient numbers for crude gDNA extraction (Gasperini et al., 2019) and genotyping PCR to confirm element insertion. Briefly, ~0.2 million cells were pelleted and concentrated in 20 µL of culture media in a 0.5-mL PCR tube, mixed with 40 µL of Quick Extract buffer (10 mM Tris-HCl, pH 8.5, 0.45% Tween-20, 4 mg/mL proteinase K), and incubated at 65°C for 6 min and 98°C for 2 min; 1 µL of the crude gDNA extract was used as input in the genotyping PCR following the standard Phusion polymerase protocol (NEB) with homemade Phusion polymerase and primers ZZ104 and ZZ105, which amplify the insertion from regions flanking the Bxb1 recombination sites.
The PCR program was: initial denaturation 98°C for 3 min; 30 cycles of 98°C for 10 s, 58°C for 30 s, 72°C for 30 s; and final extension 72°C for 5 min. Positive amplicons (1132 bp) were purified using homebrew SPRI beads (0.7× bead ratio) and verified by Sanger sequencing to confirm sequence integrity. Of note, no mutations were observed in any of the single cell-derived clones, demonstrating the genomic stability of K562 cells. The verified clonal lines were subjected to RT-qPCR analysis to measure their NMU expression levels.

For heterologous enhancer testing at the eNMU locus, we selected candidate elements from a previously curated list of CRISPR-validated distal regulatory elements in K562 cells (Supplementary Table 6a of Fulco et al. (2019)), prioritizing those with large effect sizes. The selected elements were PCR amplified from K562 gDNA and inserted with or without the e2 element into the attB1-attB2 plasmid backbone. Recombinant cells were sorted as bulk populations to measure enhancer activity by RT-qPCR. As a side note, the parental LP cell lines exhibited a basal BFP− fraction (0.6–2%), meaning the sorted BFP− population included some non-recombinant LP cells. To estimate the true recombinant fraction, we subtracted the %BFP− in the no-payload control from that in the Bxb1+payload transfection and divided this number by the total %BFP− in the Bxb1+payload transfection. NMU expression measured by RT-qPCR was corrected based on this estimated true recombinant fraction for each element integration.

Quantitative reverse transcription polymerase chain reaction (RT-qPCR)

K562 cells were lysed with the TRIzol Reagent (Invitrogen), and RNA was isolated using the Direct-zol RNA miniprep kit (Zymo Research) with 15 min on-column DNase treatment. Reverse transcription was performed using M-MuLV RT (NEB M0253L) and Random Primer Mix (NEB S1330S) following the manufacturer’s instructions.
Real-time quantitative PCR (qPCR) was carried out with a custom protocol: 1/10 volume of cDNA, 1× Phusion HF Buffer (NEB), 500 nM of each primer, 200 μM dNTPs (Thermo Fisher), 0.7× SYBR Green I (Invitrogen), and 1/100 dilution of homemade Phusion polymerase. All qPCR reactions were run in technical triplicates in 10 μL volumes in 384-well plates on a Roche LightCycler 480 Instrument II with the following program setting: initial denaturation 98°C for 2 min; 45 cycles of 98°C for 10 s, 58°C for 20 s and 72°C for 30 s; melt curve 98°C for 5 s, 55°C for 1 min, ramp to 98°C at 0.11°C/s; and cool down to 40°C. NMU expression was normalized to the housekeeping gene ACTB using the 2^(−ΔΔCT) method (Livak & Schmittgen, 2001). Primers used for RT-qPCR are listed in Supplementary Table S2 (separate file).

Chromatin Immunoprecipitation (ChIP)

ChIP experiments were conducted using two independently derived single cell clones as biological replicates for each of the four genotypes of interest: WT_eNMU, e1_mGATA1, e1_mRUNX1, e2_mSTAT5. For GATA1 and RUNX1 ChIP, cells were washed twice with ice-cold PBS and crosslinked with 1% formaldehyde (Electron Microscopy Sciences) at room temperature for 10 min before quenching with 200 mM glycine at room temperature for 5 min. For STAT5 and p300 ChIP, cells were first crosslinked with 2 mM disuccinimidyl glutarate (Santa Cruz) at room temperature for 30 min, washed 3 times with PBS and then crosslinked with 1% formaldehyde at room temperature for 5 min before quenching with 200 mM glycine at room temperature for 5 min. Two additional PBS washes were performed, and cell pellets were lysed with Farnham Lysis Buffer (5 mM PIPES, pH 8, 85 mM KCl, 0.5% NP40, 10 mM glycine, 1× Thermo Scientific Pierce Protease Inhibitor) on ice for 20 min.
After centrifugation and supernatant removal, the nuclear pellet was resuspended in RIPA Lysis Buffer (10 mM Tris-HCl, pH 8, 150 mM NaCl, 1 mM EDTA, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, 1× Thermo Scientific Pierce Protease Inhibitor) and incubated on ice for 10 min. Sonication was carried out using a Diagenode Bioruptor device at High Setting, 30 s on/30 s off for three rounds of 10-min cycles, to shear chromatin to a size of 100–300 bp. The lysate was then clarified by centrifugation at 20,000 r.c.f., 4°C for 15 min, of which 2% was kept as ChIP input.

The following antibodies were used for IP: GATA1, Abcam ab11852; RUNX1, Abcam ab23980; STAT5, R&D Systems AF2168; p300, Abcam ab14984; normal rabbit IgG control, Cell Signaling Technology 2729S; normal mouse IgG1 control, Santa Cruz sc-3877. Each IP used 4 million cells/4 µg antibody/40 µL Dynabeads Protein A (for rabbit IgG) or Protein G (for mouse IgG1) (Thermo Scientific). Beads were washed three times with 5 mg/mL BSA in PBS and incubated with corresponding antibodies at 4°C for 6 h to overnight with rotation. Another three BSA/PBS washes were performed to remove unbound antibodies, and the clarified chromatin lysate was added to the beads and incubated overnight at 4°C with rotation. Beads were then washed three times each with the following buffers: Low Salt Wash Buffer (20 mM Tris-HCl, pH 8, 2 mM EDTA, 150 mM NaCl, 1% Triton X-100, 0.1% SDS), High Salt Wash Buffer (20 mM Tris-HCl, pH 8, 2 mM EDTA, 500 mM NaCl, 1% Triton X-100, 0.1% SDS), and LiCl Wash Buffer (10 mM Tris-HCl, pH 8, 1 mM EDTA, 250 mM LiCl, 1% NP-40, 1% sodium deoxycholate). After one final wash with TE Buffer (10 mM Tris-HCl, pH 8, 1 mM EDTA), chromatin was eluted from beads by two rounds of incubation with ChIP elution buffer (1% SDS, 0.1 M sodium bicarbonate). Each incubation involved 15 min shaking at 1200 rpm at 65°C, followed by 15 min rotation at room temperature.
The eluates and input samples were treated with RNase A at 37°C for 30 min, de-crosslinked at 65°C overnight with 900 rpm shaking, followed by Proteinase K treatment at 45°C for 2 h with shaking. DNA was purified using MinElute PCR Purification Kit (Qiagen).

For GATA1, RUNX1 and STAT5 ChIP, qPCR was performed on the purified input and eluate samples to measure enrichment of TF binding at the eNMU locus. A negative control locus was also probed to estimate background signal of non-specific pull-down. Primers used for ChIP-qPCR are listed in Supplementary Table S2 (separate file). Since ChIP eluates were in low abundance and could be difficult to quantitate, qPCR was carried out with a custom 10× reaction mix from previous studies (Lutfalla & Uze, 2006; Rebouissou et al., 2022) that provided high sensitivity and specificity. The 10× reaction mix composition was: 400 mM 2-amino-2-methyl-1,3-propanediol (pH adjusted to 8.3 using HCl), 50 mM KCl, 30 mM MgCl2, 0.09% Brij C10, 0.15% Brij 58, 500 μg/mL BSA, 300 μM dNTPs, 16.24% glycerol (v/v), 1/3000 SYBR Green I (10,000× stock), and 0.4 U/μL Platinum Taq DNA polymerase (Invitrogen). The final optimized ChIP-qPCR condition was: 1/10 volume of ChIP material, 1× custom reaction mix, 500 nM of each primer, 170 μM dNTPs, 0.35× SYBR Green I. All qPCR reactions were run in technical triplicates in 10 μL volumes in 384-well plates on a Roche LightCycler 480 Instrument II with the following program setting: initial denaturation 95°C for 10 min; 45 cycles of 95°C for 10 s, 60°C for 8 s and 72°C for 14 s; melt curve 95°C for 5 s, 45°C for 30 s, ramp to 95°C at 0.11°C/s; and cool down to 40°C. Serial dilutions of one input sample were included in each qPCR run to generate standard curves for each primer set, from which the amplification efficiency (E) was calculated.
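As an illustration of the standard-curve step, the following minimal Python sketch estimates E from an input dilution series and then plugs it into the percent-input equation used in this section. It assumes the conventional relation E = 10^(−1/slope), where the slope comes from a least-squares fit of Ct against log10 of relative input; the function names and the dilution-series values are hypothetical and are not taken from the thesis.

```python
def amplification_efficiency(log10_dilutions, ct_values):
    """Fit Ct vs log10(relative input) by least squares; return E = 10^(-1/slope)."""
    n = len(ct_values)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_dilutions, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    slope = sxy / sxx  # negative: less template gives a later Ct
    return 10 ** (-1.0 / slope)


def percent_input(efficiency, ct_input, ct_chip, input_fraction):
    """%Input = 100 x E^(Ct_input - Ct_ChIP) x input fraction."""
    return 100.0 * efficiency ** (ct_input - ct_chip) * input_fraction


# Hypothetical 10-fold dilution series of one input sample (illustrative values only).
log10_dilutions = [0.0, -1.0, -2.0, -3.0]
ct_values = [20.0, 23.4, 26.8, 30.2]
efficiency = amplification_efficiency(log10_dilutions, ct_values)

# Hypothetical Ct values; 0.02 is the 2% input fraction stated in the text.
example_percent_input = percent_input(2.0, 25.0, 28.0, 0.02)
```

With these illustrative numbers the fitted slope is −3.4 cycles per 10-fold dilution, which corresponds to an efficiency of roughly 1.97, within the 1.8 to 2.0 range reported for these experiments.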
ChIP enrichment was then determined as percent input using the following equation:

%Input = 100% × E^(Ct_input − Ct_ChIP) × Input Fraction

where E represents the qPCR amplification efficiency (ranging from 1.8 to 2.0 in our experiments), and Input Fraction refers to the proportion of total chromatin lysate used for the input (2%, or 0.02, in our case).

For p300 ChIP, sequencing libraries were prepared using a Tn5 tagmentation-based protocol. Briefly, 20 μL tagmentation reactions were set up with 0.35 ng of ChIP DNA or 1 ng of input DNA, 1× TAPS-DMF Buffer (10 mM TAPS-NaOH, pH 8.5, 5 mM MgCl2, 10% DMF), and 1 μL of 1:15 diluted homemade Tn5 transposase (Spektor et al., 2019) (a kind gift from Dr. Roman Spektor). The reaction was incubated at 55°C for 10 min, and 2 μL of 1% SDS was added immediately, followed by another 55°C incubation for 7 min to strip Tn5 from the DNA. Post-tagmentation PCR was carried out in 100 μL volume with 10 μL of the tagmentation reaction, 1× Phusion HF Buffer, 200 μM dNTPs, 400 nM of each Illumina Nextera index primer, and 1 μL of homemade Phusion polymerase. The PCR program was: initial extension 72°C for 3 min; initial denaturation 98°C for 30 s; 13 cycles of 98°C for 10 s, 63°C for 30 s, 72°C for 3 min; and final extension 72°C for 5 min. PCR products were purified first using the MinElute PCR purification kit (Qiagen), followed by an additional cleanup with homemade SPRI beads (1.5× bead ratio) to ensure complete removal of unused primers. DNA concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI platform (2 × 80 bp paired-end sequencing).

ATAC-seq

For TF motif mutants, ATAC-seq was performed on two independently derived recombinant single cell clones as biological replicates.
For WT K562 and CRISPR deletion cell lines (ΔeNMU, Δe1, and Δe2) obtained from the Yu lab (Tippens et al., 2020), ATAC-seq was conducted on two independent cultures from the same clonal source, as only one deletion clone was available for the genotypes ΔeNMU and Δe1. ATAC-seq was performed on 50,000 K562 cells with all buffer compositions and reaction conditions following the published Omni-ATAC protocol (Corces et al., 2017) unless otherwise specified. Briefly, cells were pelleted, washed once with ice-cold PBS, resuspended in ice-cold Lysis Buffer, and incubated on ice for 3 min. Upon addition of Wash Buffer and gentle inversion, nuclei were pelleted, and supernatant (cytoplasm) was discarded. Nuclei were then resuspended gently in 50 μL transposition reaction mix containing 1 μL homemade Tn5 transposase and incubated at 37°C for 30 min with 1000 rpm shaking. DNA was purified using the MinElute PCR purification kit (Qiagen) and eluted in 21 μL volume. The entire product (~20 μL) was mixed with 2.5 μL of each Nextera index primer (25 μM) and 25 μL NEBNext High-Fidelity 2X PCR Master Mix and subjected to a first round of PCR: initial extension 72°C for 5 min; initial denaturation 98°C for 30 s; 5 cycles of 98°C for 10 s, 63°C for 30 s, 72°C for 1 min. To determine the number of additional cycles needed while avoiding over-amplification, a qPCR analysis was performed using 5 μL of the first-round PCR reaction (Buenrostro et al., 2015). For all ATAC-seq libraries, 3–4 additional cycles of amplification were performed on the remaining 45 μL PCR reaction, followed by sequential cleanup using the MinElute PCR purification kit (Qiagen) and homebrew SPRI beads (1.5× bead ratio). DNA concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI platform (2 × 80 bp paired-end sequencing) or Illumina NovaSeq X Plus platform (2 × 150 bp paired-end sequencing).
PRO-seq

PRO-seq was performed on the same clonal lines used for ATAC-seq, as well as on two independently derived recombinant single cell clones harboring e1_WT or e1_mGATA1 integration without e2. All buffer compositions and reaction conditions followed the published protocol of Mahat et al. (2016) unless otherwise specified. Briefly, 5 million K562 cells were mixed with 250,000 Drosophila S2 cells (5% spike-in), pelleted at 1000 r.c.f. for 5 min at 4°C, washed once with ice-cold PBS, resuspended with ice-cold permeabilization buffer at a density of 1 million cells/mL, and incubated on ice for 5 min. Cells were washed twice with the same volume of ice-cold permeabilization buffer, and resuspended in 100 μL storage buffer before immediate nuclear run-on or flash-freezing in liquid nitrogen for long-term storage at –80°C.

Prior to the run-on reaction, 40 μL Dynabeads MyOne Streptavidin C1 beads (Thermo Fisher) per sample were pre-washed sequentially with Hydrolysis Buffer (0.1 N NaOH + 50 mM NaCl), High Salt Wash Buffer, and Binding Buffer. Pre-washed beads were resuspended in 60 μL Binding Buffer per sample. The nuclear run-on reaction was performed at 37°C for 5 min with a final concentration of 20 µM each of Biotin-11-CTP, Biotin-11-UTP, ATP, and GTP. Following RNA extraction by TRIzol LS (Invitrogen) and RNA fragmentation by base hydrolysis, 30 μL pre-washed C1 beads and 30 μL Binding Buffer were added to the ~60 μL RNA sample, and bead binding was performed at room temperature for 20 min on a rotational device. Beads were then washed twice using 500 μL High Salt Wash Buffer and once using 500 μL Low Salt Wash Buffer, with tube swap after each wash. Biotinylated RNA was eluted from beads by TRIzol extraction, and 3′ RNA adaptor ligation was performed in a total volume of 20 μL with 5 µM final adaptor concentration and 2 μL T4 RNA Ligase I, High Concentration (NEB). The reaction was incubated at 20°C for 4 h and held at 4°C overnight.
On the next day, 50 μL Binding Buffer and 30 μL pre-washed C1 beads were added to the reaction, and another round of bead binding and bead washing was performed as described above. Subsequent 5′ enzymatic modifications of RNA were performed on beads with a reaction volume of 20 μL assuming 1 μL bead volume: 5′ decapping reaction involved 1 μL RppH (NEB) and 1-h incubation at 37°C; 5′ hydroxyl repair involved 1 μL T4 PNK (NEB) and 1-h incubation at 37°C. Beads were then washed once with 300 μL Binding Buffer, and 5′ RNA adaptor ligation was performed on beads in a total volume of 20 μL with 5 µM final adaptor concentration and 2 μL T4 RNA Ligase I, High Concentration (NEB), incubated at room temperature for 1 h on a rotational device. Beads were washed twice using 500 μL High Salt Wash Buffer and once using 500 μL Low Salt Wash Buffer, with tube swap after each wash. RNA was eluted from beads by TRIzol extraction and resuspended in 13 μL RT resuspension mix (8 μL DEPC H2O, 4 μL of 10 μM Illumina RP1 primer, 1 μL of 10 mM dNTPs). RNA was denatured at 65°C for 5 min and snap cooled on ice, and 7 μL RT master mix was added to the sample (4 μL of 5× RT Buffer, 1 μL of 100 mM DTT, 1 μL Invitrogen SUPERase·In RNase Inhibitor, 1 μL Thermo Scientific Maxima H Minus Reverse Transcriptase). The RT reaction program was: 50°C for 30 min, 65°C for 15 min, 85°C for 5 min, hold at 4°C. The resulting cDNA was diluted with an equal volume of DEPC H2O. A test amplification was performed on 1:4 serial dilutions of 2 μL cDNA sample and run on 6% native PAGE TBE gel to determine the optimal PCR cycle number (N). Final amplification was performed in 100 μL volume (32.5 μL DEPC H2O, 20 μL of 5× Phusion HF Buffer, 20 μL of 5 M betaine, 2.5 μL of 10 µM Illumina RP1 primer, 2.5 μL of 10 µM Illumina indexing RPI-n primer, 2.5 μL of 10 mM dNTPs, 1 μL homemade Phusion polymerase, 19 μL cDNA sample).
The PCR program was: initial denaturation 95°C for 2 min; 5 cycles of 95°C for 30 s, 56°C for 30 s, 72°C for 30 s; N cycles of 95°C for 30 s, 65°C for 30 s, 72°C for 30 s; final extension 72°C for 5 min. PCR products were sequentially purified using the MinElute PCR purification kit (Qiagen) and homebrew SPRI beads (1.5× bead ratio) to remove all unused primers. DNA concentration was measured by the Qubit dsDNA High Sensitivity assay (Thermo Fisher). The libraries were pooled for sequencing on the Element Biosciences AVITI (2 × 80 bp paired-end sequencing) or Illumina NovaSeq 6000/X Plus platform (2 × 150 bp paired-end sequencing). Note that the 3′ and 5′ RNA adaptors contain a 6-nt unique molecular identifier (UMI) to enable accurate identification of PCR duplicates in downstream bioinformatic analysis. The complete sequences of the adaptors can be found in Judd et al. (2020).

Sequencing data analysis

Next-Generation Sequencing (NGS) data preprocessing: For all NGS data, the quality of FASTQ files was first assessed using FastQC (LaMar, 2015), and the Illumina sequencing adaptors were trimmed using fastp (S. Chen et al., 2018). Note that one of the ATAC-seq samples was sequenced at 2 × 150 bp instead of 2 × 80 bp. To ensure consistency across samples, the raw FASTQ files for this sample were trimmed to 80 bp prior to any analysis.

Activity score calculation of HCR-FlowFISH: First, 8-bp element barcodes were extracted from the trimmed FASTQ Read 1 using fastx_trimmer (Gordon, 2013/2010) with flags -Q33 -f 23 -l 30. The FASTQ format was converted into FASTA format using seqtk (H. Li, 2012/2023) with the command “seqtk seq -a -q20 -n N”. Occurrences of each barcode were counted using fastx_collapser (Gordon, 2013/2010). The 328 unique barcodes were associated with their corresponding 83 elements using a custom Python script. We excluded 13 barcodes that had <50 raw reads in the unsorted background sample in at least one biological replicate.
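The barcode extraction, counting, and coverage filter described above can be sketched in plain Python. This is a stand-in written for illustration, not the custom script used in the study: it mirrors the fastx_trimmer window (bases 23 to 30 of Read 1) and the <50-read exclusion rule, but does not reproduce the seqtk quality masking. All function names and the toy reads are hypothetical.

```python
from collections import Counter

BARCODE_FIRST_BASE = 23  # 1-based first base kept, as in "fastx_trimmer -f 23"
BARCODE_LAST_BASE = 30   # 1-based last base kept, as in "fastx_trimmer -l 30"


def count_barcodes(read1_sequences):
    """Count the 8-bp barcode found at bases 23-30 of each Read 1 sequence."""
    counts = Counter()
    for read in read1_sequences:
        if len(read) < BARCODE_LAST_BASE:
            continue  # read too short to contain the full barcode window
        counts[read[BARCODE_FIRST_BASE - 1:BARCODE_LAST_BASE]] += 1
    return counts


def low_coverage_barcodes(background_counts_by_replicate, known_barcodes, min_reads=50):
    """Barcodes with fewer than min_reads in the unsorted background
    sample of at least one biological replicate."""
    return {
        barcode
        for barcode in known_barcodes
        if any(counts.get(barcode, 0) < min_reads
               for counts in background_counts_by_replicate)
    }


# Toy reads: 22 bases of upstream sequence, the 8-bp barcode, then trailing sequence.
toy_reads = ["A" * 22 + "ACGTACGT" + "TT"] * 3 + ["A" * 22 + "GGGGCCCC" + "TT"]
toy_counts = count_barcodes(toy_reads)

# Toy background counts for two replicates (illustrative values only).
replicate_1 = Counter({"ACGTACGT": 120, "GGGGCCCC": 75})
replicate_2 = Counter({"ACGTACGT": 90, "GGGGCCCC": 12})
excluded = low_coverage_barcodes([replicate_1, replicate_2], ["ACGTACGT", "GGGGCCCC"])
```

In this toy example only GGGGCCCC is excluded, because it falls below 50 reads in one of the two replicates.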
The fraction of each barcode in each bin was calculated. In parallel, the mean NMU fluorescence intensity (AF647) of each bin was normalized by the mean ACTB fluorescence intensity (AF488) of the same bin. The activity score of each barcode was then calculated using the weighted average method shown in Figure 2.4E. Calculated barcode activity scores are provided in Supplementary Table S3 (separate file).

Multiplicative model of double mutants: To assess whether e1+e2 double mutations (either deletions or disruptions of the same TF motif) followed a multiplicative model based on single e1 and e2 mutations (inspired by Lin et al. (2022)), we first converted the median activity scores of both single and double mutants into pseudo expression values. This conversion used the linear regression equation derived from Figure 2.4G, which correlates FlowFISH scores with RT-qPCR measurements of NMU expression in a select set of mutants. The conversion was necessary because background fluorescence in the FlowFISH assay caused the activity scores to deviate from a direct representation of gene expression levels (i.e., the regression line did not pass through the origin). Log2 fold changes of each mutant relative to the WT_eNMU control were then calculated using the converted pseudo expression values. For each e1+e2 double mutant, the log2 fold change was plotted against the sum of the log2 fold changes of the corresponding single mutants, as shown in Figure 2.6C.

Genomics sequencing read alignment: To enable accurate alignment of sequencing reads to the eNMU locus for ATAC-/ChIP-/PRO-seq data, we built a custom genome for each recombinant TF motif mutant using the reform command line tool (Khalfan, 2018/2021). Each custom genome incorporated the exact mutant sequence flanked by Bxb1 recombination sites.
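Returning briefly to the multiplicative model of double mutants described in this section, the comparison can be sketched as follows. The sketch assumes that the regression maps an activity score to pseudo expression as slope × score + intercept; the coefficients, scores, and function names below are hypothetical placeholders, not values from Figure 2.4G. Under a multiplicative model, fold changes multiply, so log2 fold changes add.

```python
import math


def pseudo_expression(score, slope, intercept):
    """Convert a FlowFISH activity score to pseudo expression (assumed linear form)."""
    return slope * score + intercept


def log2_fold_change(score, wt_score, slope, intercept):
    """log2 fold change of a mutant relative to the wildtype control."""
    mutant = pseudo_expression(score, slope, intercept)
    wildtype = pseudo_expression(wt_score, slope, intercept)
    return math.log2(mutant / wildtype)


def multiplicative_expectation(lfc_e1_single, lfc_e2_single):
    """Expected double-mutant log2 fold change if the single effects multiply."""
    return lfc_e1_single + lfc_e2_single


# Hypothetical regression coefficients and median activity scores.
SLOPE, INTERCEPT = 2.0, -1.0
wt_score, e1_score, e2_score, double_score = 8.5, 4.5, 2.5, 1.5

lfc_e1 = log2_fold_change(e1_score, wt_score, SLOPE, INTERCEPT)
lfc_e2 = log2_fold_change(e2_score, wt_score, SLOPE, INTERCEPT)
lfc_double = log2_fold_change(double_score, wt_score, SLOPE, INTERCEPT)
expected_double = multiplicative_expectation(lfc_e1, lfc_e2)
```

A double mutant that falls on the diagonal (observed equal to expected) is consistent with independent, multiplicative contributions of e1 and e2; points off the diagonal indicate a departure from that model.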
Alignment was performed using the bowtie2 aligner (Langmead & Salzberg, 2012). For ATAC-seq and ChIP-seq, the “--end-to-end --very-sensitive” mode was used, with ATAC-seq involving a subsequent step to remove mitochondrial reads using samtools view (H. Li et al., 2009); Picard MarkDuplicates (Broad Institute, 2014/2019) was then used for deduplication. For PRO-seq, rRNA read removal, alignment, and deduplication (with UMI-tools (T. Smith et al., 2017)) were performed using the published bash script at https://github.com/JAJ256/PROseq_alignment.sh (Judd, 2020/2020).

ATAC-seq and ChIP-seq data analysis: To better normalize genomics datasets across different clonal lines, we applied a “reads-under-peaks” approach to calculate scaling factors for each sample. ATAC-seq peaks were called for each sample using HMMRATAC (Tarbell & Liu, 2019) in the MACS3 software (Y. Zhang et al., 2008) with “-u 90 -l 30 -c 10” parameters. To create a unified peak set for count normalization, we first identified reciprocal >50% overlaps between peaks from biological replicates using bedtools intersect “-f 0.50 -r” (Quinlan & Hall, 2010). Consensus peak sets from each group of replicates were then merged to create a union peak set. ChIP-seq peaks were called for each biological replicate and for pooled replicates using MACS2 (Y. Zhang et al., 2008) with the input sample as control and the parameters “--broad --broad-cutoff 0.05 --keep-dup all”. Consensus peak sets from each group of replicates were generated using a published bash script (Additional file 5 of Reske et al. (2020)), which identifies pooled peaks that show >50% reciprocal overlap with each biological replicate. These consensus peak sets were then merged to create a union peak set. To generate the read count matrix for ATAC-seq and ChIP-seq datasets, read pairs overlapping union peaks were quantified using featureCounts (Liao et al., 2014).
DESeq2 (Love et al., 2014) was then used to calculate scaling factors for normalization. bamCoverage (Ramírez et al., 2016) was used to generate normalized bigwig files at a bin size of 1 bp for ATAC-seq and 50 bp for ChIP-seq. p300 ChIP-seq signal at the eNMU locus was quantified using bigWigAverageOverBed from the kentUtils of the UCSC Genome Browser (Kent et al., 2010). Merged bigwig files from two biological replicates were also generated using kentUtils (Kent et al., 2010).

PRO-seq data analysis: Unnormalized 3′-end bigwig files for PRO-seq were first generated from bam alignment files by PINTS (Yao et al., 2022) using the option “pints_visualizer -e R1_5 --reverse-complement”. To calculate scaling factors for PRO-seq data, a collapsed list of all GENCODE v46 transcripts (Mudge et al., 2025) was first generated using the reduceByGene() function in the BRGenomics R package (DeBerardine, 2023). The unnormalized bigwig files were loaded into RStudio, and a DESeq2 object was generated for the combined list of all PRO-seq samples using the getDESeqDataSet() function in BRGenomics, using the collapsed transcript list to specify genomic regions of interest. Scaling factors calculated by DESeq2 (Love et al., 2014) were applied to each bigwig file using the applyNFsGRanges() function in BRGenomics before data export. Unnormalized 5′-end bigwig files for PRO-seq were generated from bam alignment files by PINTS (Yao et al., 2022) using the option “pints_visualizer -e R2_5” and normalized using the same scaling factors calculated from the 3′-end bigwig files. Normalized bigwig files of two biological replicates were merged using kentUtils of the UCSC Genome Browser for visualization (Kent et al., 2010).
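For readers unfamiliar with how DESeq2 derives per-sample scaling factors from a count matrix (reads under union peaks for ATAC-seq and ChIP-seq, reads over collapsed transcripts for PRO-seq), the following Python sketch illustrates the median-of-ratios idea behind its default size-factor estimation. It is a simplified illustration and not the DESeq2 or BRGenomics code path used here; the function name and toy counts are hypothetical.

```python
import math
from statistics import median


def median_of_ratios_size_factors(count_matrix):
    """count_matrix: list of rows (one per peak or gene), each a list of
    raw counts with one entry per sample. Returns one size factor per sample."""
    n_samples = len(count_matrix[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in count_matrix:
        if any(count <= 0 for count in row):
            continue  # the geometric mean is zero here, so the row is skipped
        log_geometric_mean = sum(math.log(count) for count in row) / n_samples
        for sample_index, count in enumerate(row):
            log_ratios[sample_index].append(math.log(count) - log_geometric_mean)
    return [math.exp(median(ratios)) for ratios in log_ratios]


# Toy matrix: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],  # contains a zero, ignored when estimating size factors
]
size_factors = median_of_ratios_size_factors(toy_counts)
normalized_first_row = [c / s for c, s in zip(toy_counts[0], size_factors)]
```

Dividing raw counts by the size factor puts samples on a common scale; in the toy matrix both samples end up with the same normalized value in every retained row.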
The pausing index of the NMU gene was calculated using the getPausingIndices() function in BRGenomics (DeBerardine, 2023), with the promoter region defined as TSS to TSS+250 bp and the gene body region as TSS+500 bp to TES–500 bp (TSS, transcription start site; TES, transcription end site) using the promoters() and genebodies() functions in BRGenomics. Pausing indices for both individual samples and merged biological replicates were calculated and plotted in the figures.

Public sequencing data analysis: All publicly available datasets visualized in this study are listed in Supplementary Table S4 (separate file). All ChIP-seq datasets, except for the one targeting LDB1 (X. Guo et al., 2020) (GEO accession: GSE142227), were obtained from the ENCODE data portal (Luo et al., 2020) (https://www.encodeproject.org/). To ensure consistency, we re-analyzed the LDB1 ChIP-seq raw FASTQ files using the standard ENCODE ChIP-seq pipeline (Hitz et al., 2023) (https://github.com/ENCODE-DCC/chip-seq-pipeline2). For RNA-seq datasets from D. Li et al. (2023), raw read counts were downloaded from GEO accession GSE214809. For RNA-seq datasets from Schulz et al. (2019) and An et al. (2014), NCBI-generated RNA-seq raw read counts (Sayers et al., 2025) were downloaded from GEO accessions GSE128268 and GSE53983. Differential gene expression analysis was performed using DESeq2 (Love et al., 2014) in RStudio.

Data visualization: Genome browser tracks were visualized using pyGenomeTracks (Lopez-Delisle et al., 2021). All bar plots, box plots, scatter plots, line plots, volcano plots, and correlation analyses were generated using the ggplot2 package (Wickham, n.d.) in R (version 4.2.3) and RStudio. Flow cytometry data were analyzed and plotted using FlowJo (version 10.10.0). Schematic illustrations were generated using BioRender.com under an academic license.
Figures were assembled, annotated, and finalized using Adobe Illustrator. All plots used consistent color scales for cross-comparison.

Quantification and statistical analysis

One-way ANOVA with Dunnett’s post hoc test using WT_eNMU as the control was applied to ChIP-qPCR analysis. The statistical details of each experiment, including exact sample sizes (n) and additional tests (e.g., Pearson’s correlation coefficient r for linear relationships), are provided in the figure legends and/or shown graphically in the figures. For RNA-seq analysis, DESeq2 was used to determine the significance of differentially expressed genes.

2.6 Acknowledgements

We thank all past and present members of the Lis Lab for insightful discussions and support throughout this work. A special thank you to Dr. Judhajeet Ray, former research associate in the Lis Lab and currently at the Broad Institute, for his unwavering mentorship and patience in guiding Zhou Zhou during the formative early years of her Ph.D. training. Another special thank you to Jessica West, former Ph.D. candidate in Dr. Andrew Grimson’s lab and currently at UCSF, for generously sharing landing pad-related plasmids and for valuable input on the design of the recombination platform and library screening strategy. Thanks to former lab members Dr. Jin Liang and Nathaniel Tippens in Dr. Haiyuan Yu’s lab for providing eNMU deletion cell lines and contributing early ideas to the construction of the landing pad system. Thanks to Dr. Andrew Grimson and Dr. Charles Danko for their critical feedback on the project. Thanks to Jaret Lieberth (the Lis and Feschotte Labs), Adam He (Dr. Charles Danko’s lab), and Haining Chen (Dr. Franklin Pugh’s lab) for helpful discussions and expertise. Thanks to Xinchen Chen (Dr. Chun Han’s lab) for guidance on figure preparation using Adobe Illustrator. Fluorescence-activated cell sorting experiments were conducted at the Cornell Institute of Biotechnology’s Flow Cytometry Facility.
Most next-generation sequencing data were generated by the Institute’s Epigenomics Core Facility, and initial pilot data were produced by the Institute’s Genomics Facility. This work was supported by the National Human Genome Research Institute (NHGRI) grant 5R01HG012970 to J.T.L. and H.Y.

CHAPTER 3

INVESTIGATING THE NECESSITY OF ENHANCER TRANSCRIPTION FOR ENHANCER FUNCTION IN A HEAT-INDUCIBLE SYSTEM

3.1 Abstract

Enhancers are key cis-regulatory elements that activate transcription at target promoters independent of distance and orientation. While the epigenomic features of enhancers have been extensively characterized, the precise mechanisms by which enhancers stimulate promoter transcription remain unclear. Previously, our lab and the Yu lab developed eSTARR-seq, a massively parallel episomal assay that reliably quantifies enhancer activity across thousands of cloned elements. Using this method, we found that transcriptional activity is a stronger predictor of enhancer function than classical epigenomic marks such as DNase I hypersensitivity or histone modifications. To further explore whether enhancer transcription is required for function, I focused on a set of distal elements in human K562 cells that gain HSF1 binding and elevated H4 acetylation following 30-min heat shock. Interestingly, although HSF1 is a potent transcriptional activator, only ~300 of the bound elements exhibited heat-induced transcription by PRO-seq, while ~500 showed no detectable transcription. To test whether both transcribed and untranscribed HSF1-bound elements can function as enhancers, I cloned a library comprising ~120 transcribed elements (including ~60 heat-induced and ~60 unchanged), along with ~60 untranscribed elements. These were then subjected to eSTARR-seq to measure enhancer activity in K562 cells under heat shock (HS) and non-heat shock (NHS) conditions.
Preliminary results showed that, unexpectedly, both the upregulated transcribed elements and the untranscribed elements could act as heat shock-inducible enhancers, though the overall fraction of active elements was small. Notably, the untranscribed distal elements exhibited lower basal activity but greater inducibility upon heat shock. Applying our most sensitive PRO-cap assay to HS and NHS cells further revealed that many of the elements previously identified as “untranscribed” by PRO-seq did in fact exhibit induced transcription initiation upon HS. I further discuss the implications and caveats of this study.

3.2 Introduction

Since its initial discovery genome-wide (T.-K. Kim et al., 2010), enhancer transcription has been recognized as one of the hallmark features of active enhancers, yet its functional role has remained unclear. Applying the sensitive GRO-cap assay to human cells has revealed a unified molecular architecture of enhancers and promoters, where an upstream TF binding region is flanked by two divergent core promoters (Core et al., 2014). We previously showed that deletion of core promoter regions in enhancers caused reduced enhancer activity as measured by element-STARR-seq (eSTARR-seq) (Tippens et al., 2020), suggesting that enhancer transcription contributes to function. However, sequence alterations in these experiments may confound data interpretation, as disruption or creation of transcription factor (TF) motifs is unavoidable. To rigorously test the necessity of enhancer transcription for enhancer activity, combining eSTARR-seq with a controlled inducible system would be an ideal approach.

To this end, I turned to the well-studied mammalian heat shock (HS) response. While heat stress induces global downregulation of nascent transcription, hundreds of genes, including the heat shock protein (HSP) genes, are rapidly upregulated by the master regulator Heat Shock Factor 1 (HSF1) (Mahat, Salamanca, et al., 2016; Vihervaara et al., 2017).
Notably, in human erythroleukemia K562 cells, HS treatment induced HSF1 binding not only at gene promoters but also at hundreds of gene-distal regions (Vihervaara et al., 2017). Since HSF1 is a potent transcriptional activator that causes universally elevated H4 acetylation (H4ac) upon binding, I hypothesized that some of these HSF1-bound distal elements may act as heat-inducible enhancers. More interestingly, PRO-seq analysis revealed that only a subset of these distal elements showed upregulated transcriptional activity upon HS, while the others remained transcriptionally uninduced or completely inactive. These findings made the HS system an appealing playground to tease apart the relationship between enhancer transcription and enhancer activity.

In this study, I systematically tested the enhancer activity under both heat shock and non-heat shock conditions for a library of ~200 HSF1-bound elements that exhibit the distinct transcriptional profiles described above. Preliminary results showed that, surprisingly, both transcriptionally upregulated and untranscribed distal elements can function as heat-inducible enhancers. Further examination of their transcription initiation profiles using PRO-cap, our most sensitive enhancer detection assay, revealed that the untranscribed elements are also induced at low levels upon heat stress. The implications and limitations of this study are further explored.

3.3 Results

3.3.1 Evidence of an HSF1-bound, heat-inducible enhancer

To obtain initial evidence that HSF1-bound distal elements can function as HS-inducible enhancers, I focused on a candidate element identified in Vihervaara et al. (2021), located approximately 4.5 kb upstream of the TAX1BP1 gene (Figure 3.1A). Following 30-min heat stress, this element—but not the TAX1BP1 promoter—exhibited strong HSF1 binding, and showed increased nascent transcription concurrent with TAX1BP1 upregulation (Figures 3.1, A and B, adapted from Vihervaara et al., 2021).
Moreover, short hairpin RNA (shRNA)-mediated knockdown of HSF1 attenuated heat-induced transcription at both TAX1BP1 and this upstream element (Vihervaara et al., 2021). To directly test its function, homozygous CRISPR deletion of the element was performed. In two independently derived single-cell clonal lines, heat-induced TAX1BP1 expression was completely abolished (Figure 3.1C). Together, these findings demonstrate that this element functions as a bona fide HSF1-bound, HS-inducible enhancer, hereafter referred to as eTAX1BP1.

Fig. 3.1 Homozygous deletion of eTAX1BP1 abolishes heat inducibility of TAX1BP1 expression. (A and B) HSF1 and TBP ChIP-seq signal (A) and PRO-seq tracks (B) at the TAX1BP1 locus under HS and NHS conditions in K562 cells. The purple box and the green bar highlight the candidate HSF1-bound enhancer. Adapted from Vihervaara et al. (2021). (C) Heat-induced TAX1BP1 expression in WT and ΔeTAX1BP1 cell lines measured by RT-qPCR.

To test whether eTAX1BP1’s function could be recapitulated in an episomal context using the eSTARR-seq plasmid (Tippens et al., 2020), I configured three reporter constructs (Figure 3.2A): a negative control where the MYC promoter (which lacks HS-induced HSF1 binding) is linked to a non-regulatory EGFP fragment; a positive control linking the HSPA1A promoter (which exhibits strong HSF1 binding upon HS) to EGFP; and a test construct linking the MYC promoter to the eTAX1BP1 element. Each construct was transfected into K562 cells, followed by 30- or 60-min HS at either 6 or 12 hr post-transfection. Cells were then harvested for RT-qPCR quantification of the luciferase reporter mRNA. As expected, the negative control showed minimal expression change between HS and NHS conditions, whereas both the positive control and the eTAX1BP1 test construct exhibited clear heat-induced transcription (Figure 3.2B). These results validate the feasibility of eSTARR-seq to measure inducible enhancer activity in response to heat stress.
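For readers less familiar with how an HS-over-NHS fold change is typically derived from RT-qPCR data, the following is a minimal sketch of the comparative Ct (ΔΔCt) calculation. It is an illustration only: the RT-qPCR analysis used in this work is described in Chapter 2 (Section 2.5), and the function name, the use of a single reference gene, the example Ct values, and the assumption of perfect (2-fold per cycle) amplification efficiency are all hypothetical.

```python
def fold_change_ddct(ct_target_hs, ct_ref_hs, ct_target_nhs, ct_ref_nhs):
    """Return the HS/NHS fold change of a target transcript by the ddCt method.

    Assumes (hypothetically) 100% amplification efficiency, i.e. the amount
    of product doubles every cycle, and a reference gene whose expression
    is unaffected by heat shock.
    """
    # Normalize the target Ct to the reference gene within each condition.
    delta_ct_hs = ct_target_hs - ct_ref_hs
    delta_ct_nhs = ct_target_nhs - ct_ref_nhs
    # Compare the two conditions; a lower Ct means more template.
    delta_delta_ct = delta_ct_hs - delta_ct_nhs
    return 2.0 ** (-delta_delta_ct)


# Placeholder Ct values, not measurements from this study.
example = fold_change_ddct(
    ct_target_hs=22.0, ct_ref_hs=18.0,
    ct_target_nhs=25.0, ct_ref_nhs=18.0,
)
print(example)
```

In this made-up example the reporter crosses threshold three cycles earlier after heat shock while the reference is unchanged, which corresponds to a 2^3 fold increase. If the reference gene itself responds to heat shock, the estimate would be biased, so the choice of reference matters.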
Fig. 3.2 eTAX1BP1 activates episomal reporter gene expression in response to heat shock. (A) Schematic of the experimental design used to test heat-inducible enhancer activity of eTAX1BP1 on an episomal eSTARR-seq plasmid. Abbreviations: prom, promoter; pA, polyadenylation signal; trfx, transfection. (B) RT-qPCR measurement of luciferase mRNA fold change (HS over NHS) across the four treatment conditions shown in (A).

3.3.2 Systematically testing enhancer activity of a library of HSF1-bound candidate elements using eSTARR-seq

Next, I set out to systematically test the heat-induced enhancer activity for a library of HSF1-bound elements exhibiting distinct transcriptional profiles upon 30-min HS treatment (Figure 3.3). These include 50–60 elements from each of three classes: upregulated distal transcriptional regulatory elements (dTREs), unchanged dTREs, and untranscribed distal elements. Eleven upregulated HSP gene promoters were also included. To limit confounding factors, these elements were selected to match their HSF1 binding signals upon HS. Following individual element cloning from K562 genomic DNA via the Gateway BP reaction and high-throughput sequencing verification of cloned sequences, I pooled the element library and moved it into the eSTARR-seq destination vector by the Gateway LR reaction. Transfected cells were heat shocked for 30 min (to match the PRO-seq and ChIP-seq HS duration) at 12 hr post-transfection, a time point that showed slightly higher heat inducibility than the 6 hr time point (Figure 3.2B). The eSTARR-seq sequencing libraries were prepared according to Tippens et al. (2020) using the tagmentation approach. Data analysis was performed by Dr. Alden K. Leung in the Yu Lab.

Fig. 3.3 Four classes of HSF1-bound elements for eSTARR-seq testing.
Representative PRO-seq (Vihervaara et al., 2017) and HSF1 ChIP-seq (Vihervaara et al., 2013) profiles of the four classes of tested elements: upregulated dTREs (58 cloned), unchanged dTREs (55 cloned), untranscribed distal elements (54 cloned), and upregulated gene promoters (11 cloned). HS (30 min) and NHS conditions are compared side by side using the same scales.

[Figure 3.3 panels: PRO-seq (NHS, HS) and HSF1 ChIP-seq (NHS, 30-min HS) tracks for 58 upregulated dTREs, 55 unchanged dTREs, 54 untranscribed distal elements, and 11 upregulated promoters.]

Fig. 3.4 Systematically testing heat-induced enhancer activity of HSF1-bound candidate elements using eSTARR-seq. (A) Schematic of eSTARR-seq workflow in this study. Adapted from Tippens et al. (2020). (B) Correlation plots between measured enhancer activities under HS and NHS conditions, with elements placed in the forward or reverse orientation relative to the reporter. The top two plots are color coded by enhancer activity levels, and the bottom two plots are color coded by transcriptional profile class. Data analysis and visualization credit to Dr. Alden K. Leung. (C) Box plot comparing the bulk trend of HS-mediated enhancer activity inducibility across the four classes of HSF1-bound elements, showing only the forward orientation results as representative.

The eSTARR-seq results showed only a small subset of elements exhibiting heat-induced enhancer activity in both orientations. Notably, this included the positive control element eTAX1BP1, thereby validating the dataset (Figure 3.4B, top). Consistent with prior findings (Tippens et al., 2020), transcribed elements showed higher basal activity than untranscribed elements under the NHS condition (Figure 3.4C).
While the unchanged dTREs exhibited slight downregulation of enhancer activity upon HS, both the upregulated dTREs and the untranscribed distal elements exhibited some level of heat inducibility (Figure 3.4C). Moreover, as shown in the bottom plots of Figure 3.4B, the fold change for the untranscribed elements was actually higher than that for the upregulated dTREs. A few HSP promoters also exhibited heat-induced enhancer activity, demonstrating the functional flexibility between enhancers and promoters.

3.3.3 Induced transcription initiation at “untranscribed” elements detected by PRO-cap

The unexpected observation that untranscribed elements exhibited heat-induced enhancer activity prompted us to more rigorously examine their transcriptional status using PRO-cap—a more sensitive assay that selectively enriches for 5′-capped nascent RNAs to map genome-wide transcription initiation events. Unlike PRO-seq, whose reads predominantly map to gene body regions, PRO-cap offers much higher coverage around transcription start sites (TSSs). Applying PRO-cap to K562 cells indeed revealed low levels of induced transcription initiation at the previously classified “untranscribed” distal elements upon 30-min HS, though the degree of transcriptional induction did not seem to correlate with their enhancer activity induction (Figure 3.5).

Fig. 3.5 PRO-cap detects induced transcription initiation at “untranscribed” elements upon HS. PRO-cap browser shots of three highly heat-induced untranscribed elements (DU0031, DU0033, DU0014) and the positive control element eTAX1BP1 (DT002), which is an upregulated transcribed dTRE. The tested regions are highlighted in light orange shade. “F” and “R” values indicate the log2 fold change of HS over NHS enhancer activity in the forward and reverse orientations, respectively. Both the 5′- and 3′-end (labeled as 3p) read position tracks are shown, with the 5′ ends representing TSSs and the 3′ ends representing paused or elongating Pol II positions.
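The “F” and “R” values above are log2 fold changes of HS over NHS enhancer activity. As a rough illustration of what such a value represents, the sketch below computes it from hypothetical read counts, assuming that activity is a depth-normalized ratio of reporter RNA to plasmid DNA with a pseudocount. This definition is an assumption made here for illustration; the eSTARR-seq analysis in this chapter was performed by Dr. Alden K. Leung following Tippens et al. (2020) and may differ in its normalization and statistics. All function names and numbers are placeholders.

```python
import math


def enhancer_activity(rna_count, dna_count, rna_total, dna_total, pseudocount=1.0):
    """Depth-normalized RNA/DNA ratio for one element (hypothetical definition)."""
    # Counts per million, with a pseudocount to avoid division by zero.
    rna_cpm = (rna_count + pseudocount) / rna_total * 1e6
    dna_cpm = (dna_count + pseudocount) / dna_total * 1e6
    return rna_cpm / dna_cpm


def log2_fold_change(activity_hs, activity_nhs):
    """log2 of HS activity over NHS activity; positive means heat-induced."""
    return math.log2(activity_hs / activity_nhs)


# Placeholder counts for one element in one orientation.
act_nhs = enhancer_activity(rna_count=99, dna_count=99, rna_total=1_000_000, dna_total=1_000_000)
act_hs = enhancer_activity(rna_count=399, dna_count=99, rna_total=1_000_000, dna_total=1_000_000)
print(log2_fold_change(act_hs, act_nhs))
```

Computing the value separately for forward- and reverse-orientation libraries would give one number per orientation, analogous to the “F” and “R” labels in the figure.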
3.4 Discussion

In this study, I set out to study the relationship between enhancer activity and enhancer transcription in a well-controlled HS-inducible system. However, several caveats in the experimental design are worth noting. First, classification of distal elements was based on PRO-seq profiles obtained prior to PRO-cap experiments. Given PRO-seq’s limited sensitivity to lowly transcribed enhancers, this may have led to misclassification, as illustrated in Figure 3.5. Second, unlike the approach taken by Tippens et al. (2020), we did not control for chromatin accessibility between different classes of elements due to the limited number of HSF1-bound distal sites. As a result, the apparent lack of transcriptional activity at “untranscribed” elements could simply reflect a closed chromatin state—one that may not be faithfully modeled on non-chromatinized plasmids. In other words, enhancer transcriptional states on episomal plasmids might diverge significantly from PRO-seq or PRO-cap patterns observed in native genomic contexts. This limitation underscores the value of a chromosomal reporter assay for more physiologically relevant assessments.

Nonetheless, the lower basal activity of the untranscribed elements measured by STARR-seq suggests that they possess fewer intrinsically activating features compared to transcribed elements. Their higher heat inducibility may arise from cooperative effects between HSF1 and weak pre-existing activating components, whereas transcribed elements—already active—derive only modest additional benefit from HSF1 binding. Interestingly, almost none of the unchanged dTREs exhibited heat-induced enhancer activity. Assuming these elements reside in similarly accessible chromatin as the upregulated dTREs, this observation supports the idea that changes in enhancer transcription can serve as a predictor of functional activity shifts.
In summary, the heat shock system, while informative, may not be optimal for testing this hypothesis due to the limited number of candidate elements and the modest activation rate. Future studies would benefit from exploring alternative inducible systems, such as the interferon response (Doughty et al., 2024), while carefully controlling for epigenomic context, leveraging the sensitivity of PRO-cap for candidate classification, and ideally employing chromosomal reporter assays for functional validation.

3.5 Methods

Cell Culture and Heat Shock Treatment

K562 cells were cultured as described in Chapter 2 (Section 2.5). Cells were heat shocked in a 42°C water bath for the indicated duration plus 5 min (the time needed for the temperature to reach 42°C), while the non-heat shock control was incubated in parallel in a 37°C water bath. Following the treatment, cells were promptly put on ice to stop any further heat shock response.

eSTARR-seq element library cloning

eSTARR-seq element library cloning was performed as described in Tippens et al. (2020). Briefly, candidate HSF1-bound elements were individually PCR-amplified from K562 genomic DNA (primers listed in Supplementary Table S5) and cloned into the pDONR223 vector by Gateway BP reaction. Four colonies were picked for each element and grown in a 96-well deep well plate. The sequences were verified by Illumina sequencing following the published Clone-seq protocol (Wei et al., 2014). The correct clones were propagated individually in 96-well plates and pooled together for Midiprep to isolate the pENTR library. Next, an en masse Gateway LR reaction was performed to move the elements into the eSTARR-seq destination vector. The transformants were propagated in LB broth and the plasmid library was extracted by Maxiprep.

eSTARR-seq

eSTARR-seq was performed as described in Tippens et al. (2020), following the Tn5 tagmentation protocol.
Cells were recovered for 11.5 hr post-transfection before being subjected to 30-min heat shock treatment. Forward and reverse orientation libraries were transfected separately to prevent interference during sequencing library preparation. Data analysis was performed by Dr. Alden K. Leung in the Yu lab. Raw and processed data are summarized in Supplementary Table S5.

RT-qPCR

RT-qPCR was performed as described in Chapter 2 (Section 2.5). Primers are listed below:

Table 3.1 qPCR primer sequences used in this study

Name                  Sequence
FFluc2_qPCR_For1      GTGGTGTGCAGCGAGAATAG
FFluc2_qPCR_Rev1      CGCTCGTTGTAGATGTCGTTAG
HSPA1A_qPCR_For1      AGGCCAACAAGATCACCATC
HSPA1A_qPCR_Rev1      GTCCTCCGCTTTGTACTTCTC
Bactin_QPCR_For2      CAAGCAGGAGTATGACGAGTC
Bactin_QPCR_Rev2      GCCATGCCAATCTCATCTTG
TAX1BPB1_QPCR_For1    GAGACAGAACGATGGCAGAC
TAX1BPB1_QPCR_Rev1    AGTTCGTGTTCCAGTGTATCAG

PRO-cap

The PRO-cap protocol was largely similar to the PRO-seq protocol described in Chapter 2 (Section 2.5), with the following differences. First, the SUPERase·In RNase Inhibitor (Invitrogen AM2696) was switched to Protector RNase Inhibitor (Roche 3335402001), which was found to give a much better yield of the final library with little adaptor dimer formation. Second, the 5′ enzymatic modifications of RNA on beads differed from PRO-seq: in PRO-cap, RNA containing a 5′-monophosphate was first removed by treatment with 1 μL XRN-1 (NEB) for 30 min at 37°C; uncapped RNA was then dephosphorylated by Quick CIP (NEB) for 30 min at 37°C; capped RNA was decapped by Cap-Clip Acid Pyrophosphatase (Cellscript Inc) for 1 h at 37°C before 5′ adaptor ligation.

3.6 Acknowledgements

I thank Dr. Alden K. Leung in Dr. Haiyuan Yu’s lab, who selected candidate elements, designed cloning primers, and performed eSTARR-seq data analysis, for collaborating with me on this project. I thank Dr. Jin Liang in the Yu lab for sharing eSTARR-seq reagents and protocols, and Dr. Sagar Shah for helping to troubleshoot the PRO-cap protocol with me.
CHAPTER 4

CONCLUSION AND PERSPECTIVES

In this thesis, I explored multiple facets of enhancer function through distinct experimental systems and techniques. In Chapter 2, I developed a novel recombinase-mediated cassette exchange platform, integrating pooled mutagenesis screens with functional genomics to dissect the sequence-to-function relationship of a long-range enhancer in its native chromatin context. In Chapter 3, I examined the role of enhancer transcription in enhancer activity using a heat shock–inducible system and a massively parallel episomal reporter assay. Together, these studies emphasize the power of studying enhancers in chromosomal contexts and highlight the utility of the divergent transcription model for understanding and predicting enhancer function.

Chapter 2’s systematic analysis revealed several key insights, summarized below: (1) eNMU is composed of two functionally distinct sub-elements: a canonical autonomous enhancer e1, and an intrinsically inactive facilitator e2 that augments e1’s activity; (2) The facilitator e2 universally buffers the activity of a collection of tested enhancers and ensures enhancer robustness against disruptive mutations; (3) The autonomous enhancer e1 is functionally hierarchical to the NMU promoter, e2, and additional facilitators across the ~100-kb NMU–eNMU region, and it orchestrates the formation of a 3D regulatory hub with these elements; (4) e1 also harbors a bipartite structure: a divergently transcribed retroviral LTR enhancer that serves as the minimal core enhancer unit, and an adjacent unidirectionally transcribed LTR promoter that dampens NMU expression by competing with the NMU promoter for LTR enhancer activity; and (5) Coordinated action of lineage-specific transcription factors underlies the intricate cis-regulatory element interplay at eNMU: GATA1 and RUNX1 cooperatively confer intrinsic activity at the core LTR enhancer; KLF/SP factors at the adjacent LTR promoter modulate
context-specific repression of enhancer activity; and STAT5 binding at the facilitator e2 is essential for mediating e1–e2 crosstalk.

This work provides both conceptual and technical advances with broad implications for enhancer biology and gene regulation research: (1) it presents the first in situ dissection of enhancer–facilitator interplay at a typical enhancer (rather than a super-enhancer), suggesting that many CRISPR-identified candidate enhancers may function instead as facilitators; (2) it uncovers a previously unrecognized role of the LTR enhancer–promoter axis in fine-tuning gene expression, a potentially widespread mechanism given the abundance of LTRs in the human genome; (3) it showcases the power of the divergent transcription-based framework for enhancer definition, which enables high-resolution parsing of enhancer architecture and regulatory element classification; and (4) it introduces a versatile recombinase-mediated genome rewriting platform for functional interrogation and synthetic element design within native chromatin contexts, with broad applications in biotechnology and therapeutic development.

Multiple avenues for follow-up investigation emerge from this work. Pertaining to the facilitator mechanisms, one may ask: (1) does the facilitator e2 mediate enhancer–promoter interactions in addition to establishing cooperative environments? This can be answered directly by performing region capture micro-C (RCMC) (Goel et al., 2023) or similar targeted 3C-based assays in the e2 deletion or e2_mSTAT5 mutants and comparing the contact frequencies with WT and e1 deletion cell lines; (2) do other STAT5 motif clusters genome-wide have similar functions as facilitator elements? This may be answered by bioinformatic classification of STAT5 ChIP-seq peaks into those containing tandem arrays of STAT5 motifs without GATA1/RUNX1 binding and those containing no good STAT5 motifs but well bound by GATA1 and RUNX1.
If the former can function as facilitators and the latter as buffered bona fide enhancers, these two classes of STAT5 peaks should have some spatial connectivity, i.e., either residing adjacent to each other as in the case of e1 and e2, or colocalized in the same chromatin hub that can be inferred from public Hi-C datasets. These candidates can be further cloned and tested in the eNMU landing pad system. It will also be interesting to treat K562 cells with STAT5 inhibitors and perform ATAC-seq and PRO-seq analysis to study how STAT5 regulates the behavior of clusters of cis-regulatory elements; (3) do the tested heterologous enhancers communicate with the same set of additional facilitators (F1–F3) as e1 does? This can be answered immediately by performing ATAC-seq in a few examples; (4) what will the eNMU mutagenesis screen results look like if the additional facilitators (F1–F3) are all deleted or repressed in the genome? How much amplification do they collectively contribute to eNMU activity?

One particularly puzzling observation in the case of e2 fusion experiments was that direct duplication of e2 exhibited ~30% of WT_eNMU (i.e., e1 fused with e2) activity (data not shown). This strongly implies that once duplicated, some TFs binding at e2 are able to act synergistically over the ~500 bp added distance. Elucidating the nature of this synergy may allow us to get to the bottom of how e2 works. Since the eNMU mutagenesis screen revealed that e2’s STAT5 and GATA1 motifs both contribute to eNMU activity, it will be interesting to mutate the e2+e2 direct fusion in different combinations of STAT5 and GATA1 motifs in each of the e2 units (i.e., mGATA1/mGATA1, mSTAT5/mSTAT5, mSTAT5/mGATA1, mGATA1/mSTAT5). This should reveal the TFs participating in the synergistic effect.
Moreover, it could be interesting to perform a PRO-seq or PRO-cap experiment to examine the transcription profile of this fusion element—since it has become an active enhancer, there should be a clear divergent transcription pattern that is distinct from e2 alone (no transcription at all) or e1+e2 (where e2 is predominantly unidirectionally transcribed).

The interesting LTR enhancer–promoter regulatory axis in e1 also raises the question of its genome-wide prevalence. It may be possible to make use of the whole-genome STARR-seq datasets and employ the genomic binning strategy (developed by J. Zhang et al., 2025) to look for more examples where the introduction of a unidirectional TSS represses enhancer activity. Alternatively, we can screen across all transcribed LTRs in K562 cells in search of similar “divergent + unidirectional” TSS structures and experimentally test the unidirectional TSS function in our landing pad system or using a plasmid reporter assay.

Similar to Martyn et al. (2025), our in situ mutagenesis study provided a gold-standard dataset for evaluating different deep learning-based predictive models. While direct prediction from sequence to gene expression over ~94 kb may be inaccurate, our high-quality ATAC-seq profiles may offer an ideal intermediate to calibrate model performance. However, our sample size is too limited to reach any solid conclusions.

I would also like to mention some relevant ongoing projects in the lab. Yiyang and Jessica are currently working on a facilitator screen, where a library of CRISPRi-validated “enhancers” is integrated into the eNMU landing pad as standalone elements, or fused with e1 or e2. They will perform HCR-FlowFISH to measure enhancer activity for these three configurations and classify tested elements into enhancers and facilitators to understand their sequence and functional features in an unbiased way.
Yang is creating an orthogonal promoter landing pad that would allow swapping of the NMU promoter with any other promoters of interest. Combined with the existing eNMU landing pad, this project holds tremendous promise for studying long-range enhancer–promoter compatibility.

To address the major caveat in Chapter 3 (i.e., the episomal context), the eTAX1BP1 enhancer deletion lines may serve as the starting point to construct an eTAX1BP1 landing pad. The HSF1-bound element library can then be integrated at the fixed eTAX1BP1 locus to rigorously test their heat-induced enhancer activity on a heat shock-responsive promoter located 4.5 kb away. The short distance between this proposed landing pad and its target promoter may make it useful for measuring the “intrinsic” activity of tested elements, which can be compared with the measurements at the eNMU landing pad to derive insights into long-range enhancer function. However, considering that basal transcription of TAX1BP1 is not affected by eTAX1BP1 deletion, more detailed examination would be required to check whether other enhancer elements for TAX1BP1 exist that may interact differentially with different candidate elements. Another possibility is to use a chromosomal reporter assay to assess the enhancer activity of our HSF1-bound distal element library on a mutant HSPA1A promoter where all the HSF1 binding sites have been disrupted. This design should enhance the dynamic range of activation and allow cleaner identification of HS-inducible enhancers.

Taken together, I envision that future advances in enhancer biology will hinge on the integration of systematic genome engineering, cutting-edge functional assays—including genomics and imaging—and powerful computational tools to decipher and apply the principles of enhancer logic.
APPENDIX A

A FLEXIBLE PROTEIN-TAGGING STRATEGY FOR MAPPING CDK9 CHROMATIN OCCUPANCY AND NUCLEAR PROXIMITY INTERACTOME

The release of RNA Pol II from the promoter-proximal pausing site into productive elongation critically depends on the positive elongation factor P-TEFb (Jonkers et al., 2014), a complex of Cyclin-Dependent Kinase 9 (CDK9) and Cyclin T1 or T2 in human cells (Fujinaga et al., 2023). P-TEFb activity is tightly regulated by the inhibitory 7SK complex and by positive factors including BRD4 and the Super Elongation Complex (SEC) (reviewed in Lu et al., 2016). Despite these insights, a proteome-wide survey of potential P-TEFb recruitment factors—especially sequence-specific TFs—remains lacking. To address this, I undertook a side project to map the proximal proteome of CDK9 by fusing it with the APEX2 peroxidase (Lam et al., 2015). In parallel, I also aimed to generate a CDK9-EGFP cell line to test whether this could improve the quality of CDK9 ChIP-seq.

As CDK9 is an interesting target that could be studied with other tags (e.g., HaloTag for live-cell imaging, dTAG for acute depletion), I first developed a flexible protein-tagging strategy using the Bxb1-mediated landing pad system described in Chapter 2. Specifically, I inserted the landing pad at the C-terminus of the endogenous CDK9 gene, between the stop codon and the 3′ UTR (Figure A.1), and established a homozygous HCT116 clonal cell line. This allowed me to swap in any desired protein tag via Bxb1 recombination.

Fig. 4.1 Schematic of Bxb1-mediated flexible protein-tagging strategy for CDK9.

Using the parental CDK9 landing pad cell line, I successfully established homozygous CDK9-EGFP clones (Figure A.2A). In contrast, only heterozygous clones could be obtained for the CDK9-APEX2 fusion (Figure A.2B), suggesting that homozygous APEX2 tagging is incompatible with cell viability.
This limitation makes it challenging to determine whether the CDK9-APEX2 fusion protein is fully functional, though some interaction between CDK9-APEX2 and Cyclin T1 was observed in the co-IP experiment, especially for Clones A5 and A10 (Figure A.3, right panel, compare “C” (CDK9) and “F” (FLAG) IP).

Fig. 4.2 Western blot analysis of tagged CDK9 clonal cell lines. (A) Anti-CDK9 (red) western blot for CDK9-EGFP clones with anti-TBP (green) as a loading control; (B) Anti-CDK9 and anti-FLAG western blots for CDK9-FLAG-APEX2 clones.

Figure 4.3 Probing physical interaction between CDK9-FLAG-APEX2 and Cyclin T1 by Co-IP analysis. Lysates from the three heterozygous CDK9-APEX2 clones (A4, A5, and A10) and the non-tagged control clone A11 were subjected to IgG, FLAG or CDK9 immunoprecipitation (IP). Left panel shows the immunoblot (IB) for the bait protein CDK9. Right panel shows the immunoblot for the interactor Cyclin T1.

APEX2 catalyzes the biotinylation of proximal proteins using biotin-phenol and H₂O₂ as substrates. Testing the CDK9-APEX2 clones for biotin labeling revealed a strong smear of biotinylated proteins (Figure A.4A), confirming that the APEX2 enzyme is functional in these cells. However, whole cell lysates showed a high background of endogenously biotinylated proteins, which are mainly located in mitochondria and cytoplasm (Niers et al., 2011). Indeed, isolating nuclei using an efficient Lyse-and-Wash protocol (Senichkin et al., 2021) eliminated most of these proteins and improved the signal-to-noise ratio of APEX2-mediated biotinylation (Figure A.4B).

Figure 4.4 Nuclear isolation effectively eliminates endogenously biotinylated proteins.
(A) Anti-biotin western blot of whole-cell lysates (untreated, biotin-phenol only, and biotin-phenol + H2O2) from the three heterozygous CDK9-APEX2 clones (A4, A5, and A10) and the non-tagged control clone A11; (B) Streptavidin HRP blot of whole-cell lysates, cytoplasmic fractions, and nuclear fractions of Clones A10 and A11 treated with biotin-phenol ± H2O2.

I pulled down biotinylated proteins from ±H2O2 nuclear lysates of Clone A10 and sent the eluates to Dr. Jin Joo Kang in the Yu lab for a preliminary mass spectrometry run. However, the data were problematic: the -H2O2 control yielded very little pull-down material (as shown in Figure A.4B), preventing reliable identification of enriched targets. I conclude that a proper control cell line (i.e., one expressing free nuclear APEX2) is needed for future experiments. The project was subsequently taken over by Miliarys.

Working together with a rotation student, Simian Cai, I also attempted to optimize CDK9 ChIP-seq using the anti-GFP antibody (Abcam ab290) and the homozygous CDK9-EGFP Clone G4. Of the three crosslinking methods tested (Figure A.5A), the two-step crosslinking protocol (Tian et al., 2012) appeared most promising, showing increased CDK9 occupancy at the heat shock gene HSPH1 upon heat shock treatment (Figure A.5B). Nonetheless, discernible CDK9 signal was lacking at most genes, indicating that further optimization of crosslinking and sonication conditions is required.

Figure 4.5 Optimizing CDK9-EGFP ChIP-seq with different crosslinking conditions. (A) Summary of experimental conditions in the CDK9-EGFP ChIP optimization test. (B) Example two-step crosslinking CDK9-EGFP ChIP-seq profile at the HSPH1 gene, showing input (genomic background control), heat shock (HS) and non-heat shock (NHS) tracks. PRO-seq tracks were kindly provided by Jawaher.

In summary, this side project proved extremely challenging due to the various biological and technical issues described above.
Nevertheless, the Bxb1 landing pad remains a versatile system for endogenous protein tagging.

REFERENCES

Abdella, R., Talyzina, A., Chen, S., Inouye, C. J., Tjian, R., & He, Y. (2021). Structure of the human Mediator-bound transcription preinitiation complex. Science (New York, N.Y.), 372(6537), 52–56. https://doi.org/10.1126/science.abg3074
Aboreden, N. G., Lam, J. C., Goel, V. Y., Wang, S., Wang, X., Midla, S. C., Quijano, A., Keller, C. A., Giardine, B. M., Hardison, R. C., Zhang, H., Hansen, A. S., & Blobel, G. A. (2025). LDB1 establishes multi-enhancer networks to regulate gene expression. Molecular Cell, 85(2), 376-393.e9. https://doi.org/10.1016/j.molcel.2024.11.037
Adams, C. C., & Workman, J. L. (1995). Binding of disparate transcriptional activators to nucleosomal DNA is inherently cooperative. Molecular and Cellular Biology, 15(3), 1405–1421. https://doi.org/10.1128/MCB.15.3.1405
Adelman, K., & Lis, J. T. (2012). Promoter-proximal pausing of RNA polymerase II: Emerging roles in metazoans. Nature Reviews. Genetics, 13(10), 720–731. https://doi.org/10.1038/nrg3293
Alexander, J. M., Guan, J., Li, B., Maliskova, L., Song, M., Shen, Y., Huang, B., Lomvardas, S., & Weiner, O. D. (2019). Live-cell imaging reveals enhancer-dependent Sox2 transcription in the absence of enhancer proximity. eLife, 8, e41769. https://doi.org/10.7554/eLife.41769
Alipanahi, B., Delong, A., Weirauch, M. T., & Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology, 33(8), 831–838. https://doi.org/10.1038/nbt.3300
Allahyar, A., Vermeulen, C., Bouwman, B. A. M., Krijger, P. H. L., Verstegen, M. J. A. M., Geeven, G., van Kranenburg, M., Pieterse, M., Straver, R., Haarhuis, J. H. I., Jalink, K., Teunissen, H., Renkens, I. J., Kloosterman, W. P., Rowland, B. D., de Wit, E., de Ridder, J., & de Laat, W. (2018). Enhancer hubs and loop collisions identified from single-allele topologies.
Nature Genetics, 50(8), 1151–1160. https://doi.org/10.1038/s41588-018-0161-5
An, X., Schulz, V. P., Li, J., Wu, K., Liu, J., Xue, F., Hu, J., Mohandas, N., & Gallagher, P. G. (2014). Global transcriptome analyses of human and murine terminal erythroid differentiation. Blood, 123(22), 3466–3477. https://doi.org/10.1182/blood-2014-01-548305
Andersson, R., Sandelin, A., & Danko, C. G. (2015). A unified architecture of transcriptional regulatory elements. Trends in Genetics: TIG, 31(8), 426–433. https://doi.org/10.1016/j.tig.2015.05.007
Arnold, C. D., Gerlach, D., Stelzer, C., Boryń, Ł. M., Rath, M., & Stark, A. (2013). Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science (New York, N.Y.), 339(6123), 1074–1077. https://doi.org/10.1126/science.1232542
Arnosti, D. N., & Kulkarni, M. M. (2005). Transcriptional enhancers: Intelligent enhanceosomes or flexible billboards? Journal of Cellular Biochemistry, 94(5), 890–898. https://doi.org/10.1002/jcb.20352
Avsec, Ž., Agarwal, V., Visentin, D., Ledsam, J. R., Grabska-Barwinska, A., Taylor, K. R., Assael, Y., Jumper, J., Kohli, P., & Kelley, D. R. (2021). Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods, 18(10), 1196–1203. https://doi.org/10.1038/s41592-021-01252-x
Avsec, Ž., Weilert, M., Shrikumar, A., Krueger, S., Alexandari, A., Dalal, K., Fropf, R., McAnany, C., Gagneur, J., Kundaje, A., & Zeitlinger, J. (2021). Base-resolution models of transcription-factor binding reveal soft motif syntax. Nature Genetics, 53(3), 354–366. https://doi.org/10.1038/s41588-021-00782-6
Bakshi, R., Hassan, M. Q., Pratap, J., Lian, J. B., Montecino, M. A., van Wijnen, A. J., Stein, J. L., Imbalzano, A. N., & Stein, G. S. (2010). The human SWI/SNF complex associates with RUNX1 to control transcription of hematopoietic target genes. Journal of Cellular Physiology, 225(2), 569–576. https://doi.org/10.1002/jcp.22240
Banerji, J., Olson, L., & Schaffner, W.
(1983). A lymphocyte-specific cellular enhancer is located downstream of the joining region in immunoglobulin heavy chain genes. Cell, 33(3), 729–740. https://doi.org/10.1016/0092-8674(83)90015-6
Banerji, J., Rusconi, S., & Schaffner, W. (1981). Expression of a β-globin gene is enhanced by remote SV40 DNA sequences. Cell, 27(2), 299–308. https://doi.org/10.1016/0092-8674(81)90413-X
Batut, P. J., Bing, X. Y., Sisco, Z., Raimundo, J., Levo, M., & Levine, M. S. (2022). Genome organization controls transcriptional dynamics during development. Science (New York, N.Y.), 375(6580), 566–570. https://doi.org/10.1126/science.abi7178
Bauer, D. E., Kamran, S. C., Lessard, S., Xu, J., Fujiwara, Y., Lin, C., Shao, Z., Canver, M. C., Smith, E. C., Pinello, L., Sabo, P. J., Vierstra, J., Voit, R. A., Yuan, G.-C., Porteus, M. H., Stamatoyannopoulos, J. A., Lettre, G., & Orkin, S. H. (2013). An erythroid enhancer of BCL11A subject to genetic variation determines fetal hemoglobin level. Science (New York, N.Y.), 342(6155), 253–257. https://doi.org/10.1126/science.1242088
Bell, C. C., Balic, J. J., Talarmain, L., Gillespie, A., Scolamiero, L., Lam, E. Y. N., Ang, C.-S., Faulkner, G. J., Gilan, O., & Dawson, M. A. (2024). Comparative cofactor screens show the influence of transactivation domains and core promoters on the mechanisms of transcription. Nature Genetics, 56(6), 1181–1192. https://doi.org/10.1038/s41588-024-01749-z
Berger, S. L., Cress, W. D., Cress, A., Triezenberg, S. J., & Guarente, L. (1990). Selective inhibition of activated but not basal transcription by the acidic activation domain of VP16: Evidence for transcriptional adaptors. Cell, 61(7), 1199–1208. https://doi.org/10.1016/0092-8674(90)90684-7
Bergman, D. T., Jones, T. R., Liu, V., Ray, J., Jagoda, E., Siraj, L., Kang, H. Y., Nasser, J., Kane, M., Rios, A., Nguyen, T. H., Grossman, S. R., Fulco, C. P., Lander, E. S., & Engreitz, J. M. (2022). Compatibility rules of human enhancer and promoter sequences.
Nature, 607(7917), 176–184. https://doi.org/10.1038/s41586-022-04877-w
Bintu, B., Mateo, L. J., Su, J.-H., Sinnott-Armstrong, N. A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A. N., & Zhuang, X. (2018). Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science (New York, N.Y.), 362(6413), eaau1783. https://doi.org/10.1126/science.aau1783
Birney, E., Stamatoyannopoulos, J. A., Dutta, A., Guigó, R., Gingeras, T. R., Margulies, E. H., Weng, Z., Snyder, M., Dermitzakis, E. T., Stamatoyannopoulos, J. A., Thurman, R. E., Kuehn, M. S., Taylor, C. M., Neph, S., Koch, C. M., Asthana, S., Malhotra, A., Adzhubei, I., Greenbaum, J. A., … Transcriptional Regulatory Elements. (2007). Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature, 447(7146), 799–816. https://doi.org/10.1038/nature05874
Blayney, J. W., Francis, H., Rampasekova, A., Camellato, B., Mitchell, L., Stolper, R., Cornell, L., Babbs, C., Boeke, J. D., Higgs, D. R., & Kassouf, M. (2023). Super-enhancers include classical enhancers and facilitators to fully activate gene expression. Cell, 186(26), 5826-5839.e18. https://doi.org/10.1016/j.cell.2023.11.030
Borrelli, E., Hen, R., & Chambon, P. (1984). Adenovirus-2 E1A products repress enhancer-induced stimulation of transcription. Nature, 312(5995), 608–612. https://doi.org/10.1038/312608a0
Boswell, S. (2020, November 10). Home-Brew SPRI Beads. Protocols.Io. https://www.protocols.io/view/home-brew-spri-beads-bkppkvmn
Bothma, J. P., Garcia, H. G., Ng, S., Perry, M. W., Gregor, T., & Levine, M. (2015). Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo. eLife, 4, e07956. https://doi.org/10.7554/eLife.07956
Bourbon, H.-M., Aguilera, A., Ansari, A. Z., Asturias, F. J., Berk, A. J., Bjorklund, S., Blackwell, T. K., Borggrefe, T., Carey, M., Carlson, M., Conaway, J. W., Conaway, R. C., Emmons, S. W., Fondell, J.
D., Freedman, L. P., Fukasawa, T., Gustafsson, C. M., Han, M., He, X., … Kornberg, R. D. (2004). A unified nomenclature for protein subunits of mediator complexes linking transcriptional regulators to RNA polymerase II. Molecular Cell, 14(5), 553–557. https://doi.org/10.1016/j.molcel.2004.05.011
Bower, G., Hollingsworth, E. W., Jacinto, S. H., Alcantara, J. A., Clock, B., Cao, K., Liu, M., Dziulko, A., Alcaina-Caro, A., Xu, Q., Skowronska-Krawczyk, D., Lopez-Rios, J., Dickel, D. E., Bardet, A. F., Pennacchio, L. A., Visel, A., & Kvon, E. Z. (2025). Range extender mediates long-distance enhancer activity. Nature, 643(8072), 830–838. https://doi.org/10.1038/s41586-025-09221-6
Boyle, A. P., Davis, S., Shulha, H. P., Meltzer, P., Margulies, E. H., Weng, Z., Furey, T. S., & Crawford, G. E. (2008). High-Resolution Mapping and Characterization of Open Chromatin across the Genome. Cell, 132(2), 311–322. https://doi.org/10.1016/j.cell.2007.12.014
Broad Institute. (2019). Picard Toolkit. GitHub Repository. https://broadinstitute.github.io/picard/ (Original work published 2014)
Brosh, R., Coelho, C., Ribeiro-Dos-Santos, A. M., Ellis, G., Hogan, M. S., Ashe, H. J., Somogyi, N., Ordoñez, R., Luther, R. D., Huang, E., Boeke, J. D., & Maurano, M. T. (2023). Synthetic regulatory genomics uncovers enhancer context dependence at the Sox2 locus. Molecular Cell, 83(7), 1140-1152.e7. https://doi.org/10.1016/j.molcel.2023.02.027
Brosh, R., Laurent, J. M., Ordoñez, R., Huang, E., Hogan, M. S., Hitchcock, A. M., Mitchell, L. A., Pinglay, S., Cadley, J. A., Luther, R. D., Truong, D. M., Boeke, J. D., & Maurano, M. T. (2021). A versatile platform for locus-scale genome rewriting and verification. Proceedings of the National Academy of Sciences of the United States of America, 118(10), e2023952118. https://doi.org/10.1073/pnas.2023952118
Buenrostro, J. D., Wu, B., Chang, H. Y., & Greenleaf, W. J. (2015). ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide.
Current Protocols in Molecular Biology, 109, 21.29.1-21.29.9. https://doi.org/10.1002/0471142727.mb2129s109
Buratowski, S., Hahn, S., Guarente, L., & Sharp, P. A. (1989). Five intermediate complexes in transcription initiation by RNA polymerase II. Cell, 56(4), 549–561. https://doi.org/10.1016/0092-8674(89)90578-3
Canver, M. C., Smith, E. C., Sher, F., Pinello, L., Sanjana, N. E., Shalem, O., Chen, D. D., Schupp, P. G., Vinjamur, D. S., Garcia, S. P., Luc, S., Kurita, R., Nakamura, Y., Fujiwara, Y., Maeda, T., Yuan, G.-C., Zhang, F., Orkin, S. H., & Bauer, D. E. (2015a). BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature, 527(7577), 192–197. https://doi.org/10.1038/nature15521
Canver, M. C., Smith, E. C., Sher, F., Pinello, L., Sanjana, N. E., Shalem, O., Chen, D. D., Schupp, P. G., Vinjamur, D. S., Garcia, S. P., Luc, S., Kurita, R., Nakamura, Y., Fujiwara, Y., Maeda, T., Yuan, G.-C., Zhang, F., Orkin, S. H., & Bauer, D. E. (2015b). BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature, 527(7577), 192–197. https://doi.org/10.1038/nature15521
Carleton, J. B., Berrett, K. C., & Gertz, J. (2017). Multiplex Enhancer Interference Reveals Collaborative Control of Gene Regulation by Estrogen Receptor α-Bound Enhancers. Cell Systems, 5(4), 333-344.e5. https://doi.org/10.1016/j.cels.2017.08.011
Castro-Mondragon, J. A., Riudavets-Puig, R., Rauluseviciute, I., Berhanu Lemma, R., Turchi, L., Blanc-Mathieu, R., Lucas, J., Boddie, P., Khan, A., Manosalva Pérez, N., Fornes, O., Leung, T. Y., Aguirre, A., Hammal, F., Schmelter, D., Baranasic, D., Ballester, B., Sandelin, A., Lenhard, B., … Mathelier, A. (2022). JASPAR 2022: The 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Research, 50(D1), D165–D173. https://doi.org/10.1093/nar/gkab1113
Chakraborty, S., Kopitchinski, N., Zuo, Z., Eraso, A., Awasthi, P., Chari, R., Mitra, A., Tobias, I. C., Moorthy, S. D., Dale, R.
K., Mitchell, J. A., Petros, T. J., & Rocha, P. P. (2023). Enhancer–promoter interactions can bypass CTCF-mediated boundaries and contribute to phenotypic robustness. Nature Genetics, 55(2), 280–290. https://doi.org/10.1038/s41588-022-01295-6
Chanda, B., Ditadi, A., Iscove, N. N., & Keller, G. (2013). Retinoic acid signaling is essential for embryonic hematopoietic stem cell development. Cell, 155(1), 215–227. https://doi.org/10.1016/j.cell.2013.08.055
Chen, H., Levo, M., Barinov, L., Fujioka, M., Jaynes, J. B., & Gregor, T. (2018). Dynamic interplay between enhancer-promoter topology and gene activity. Nature Genetics, 50(9), 1296–1303. https://doi.org/10.1038/s41588-018-0175-z
Chen, M. J., Yokomizo, T., Zeigler, B. M., Dzierzak, E., & Speck, N. A. (2009). Runx1 is required for the endothelial to haematopoietic cell transition but not thereafter. Nature, 457(7231), 887–891. https://doi.org/10.1038/nature07619
Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics (Oxford, England), 34(17), i884–i890. https://doi.org/10.1093/bioinformatics/bty560
Chen, Z., Snetkova, V., Bower, G., Jacinto, S., Clock, B., Dizehchi, A., Barozzi, I., Mannion, B. J., Alcaina-Caro, A., Lopez-Rios, J., Dickel, D. E., Visel, A., Pennacchio, L. A., & Kvon, E. Z. (2024). Increased enhancer–promoter interactions during developmental enhancer activation in mammals. Nature Genetics, 56(4), 675–685. https://doi.org/10.1038/s41588-024-01681-2
Chong, S., Graham, T. G. W., Dugast-Darzacq, C., Dailey, G. M., Darzacq, X., & Tjian, R. (2022). Tuning levels of low-complexity domain interactions to modulate endogenous oncogenic transcription. Molecular Cell, 82(11), 2084-2097.e5. https://doi.org/10.1016/j.molcel.2022.04.007
Chrivia, J. C., Kwok, R. P., Lamb, N., Hagiwara, M., Montminy, M. R., & Goodman, R. H. (1993). Phosphorylated CREB binds specifically to the nuclear protein CBP. Nature, 365(6449), 855–859.
https://doi.org/10.1038/365855a0
Church, G. M., Ephrussi, A., Gilbert, W., & Tonegawa, S. (1985). Cell-type-specific contacts to immunoglobulin enhancers in nuclei. Nature, 313(6005), 798–801. https://doi.org/10.1038/313798a0
Cirillo, L. A., Lin, F. R., Cuesta, I., Friedman, D., Jarnik, M., & Zaret, K. S. (2002). Opening of Compacted Chromatin by Early Developmental Transcription Factors HNF3 (FoxA) and GATA-4. Molecular Cell, 9(2), 279–289. https://doi.org/10.1016/S1097-2765(02)00459-8
Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., Jiang, W., Marraffini, L. A., & Zhang, F. (2013). Multiplex genome engineering using CRISPR/Cas systems. Science (New York, N.Y.), 339(6121), 819–823. https://doi.org/10.1126/science.1231143
Corces, M. R., Trevino, A. E., Hamilton, E. G., Greenside, P. G., Sinnott-Armstrong, N. A., Vesuna, S., Satpathy, A. T., Rubin, A. J., Montine, K. S., Wu, B., Kathiria, A., Cho, S. W., Mumbach, M. R., Carter, A. C., Kasowski, M., Orloff, L. A., Risca, V. I., Kundaje, A., Khavari, P. A., … Chang, H. Y. (2017). An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nature Methods, 14(10), 959–962. https://doi.org/10.1038/nmeth.4396
Core, L. J., Martins, A. L., Danko, C. G., Waters, C. T., Siepel, A., & Lis, J. T. (2014). Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nature Genetics, 46(12), 1311–1320. https://doi.org/10.1038/ng.3142
Core, L. J., Waterfall, J. J., & Lis, J. T. (2008). Nascent RNA Sequencing Reveals Widespread Pausing and Divergent Initiation at Human Promoters. Science, 322(5909), 1845–1848. https://doi.org/10.1126/science.1162228
Crawford, G. E., Davis, S., Scacheri, P. C., Renaud, G., Halawi, M. J., Erdos, M. R., Green, R., Meltzer, P. S., Wolfsberg, T. G., & Collins, F. S. (2006).
DNase-chip: A high-resolution method to identify DNase I hypersensitive sites using tiled microarrays. Nature Methods, 3(7), 503–509. https://doi.org/10.1038/nmeth888
Creyghton, M. P., Cheng, A. W., Welstead, G. G., Kooistra, T., Carey, B. W., Steine, E. J., Hanna, J., Lodato, M. A., Frampton, G. M., Sharp, P. A., Boyer, L. A., Young, R. A., & Jaenisch, R. (2010). Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proceedings of the National Academy of Sciences, 107(50), 21931–21936. https://doi.org/10.1073/pnas.1016071107
Dancy, B. M., & Cole, P. A. (2015). Protein Lysine Acetylation by p300/CBP. Chemical Reviews, 115(6), 2419–2452. https://doi.org/10.1021/cr500452k
Davidson, I., Fromental, C., Augereau, P., Wildeman, A., Zenke, M., & Chambon, P. (1986). Cell-type specific protein binding to the enhancer of simian virus 40 in nuclear extracts. Nature, 323(6088), 544–548. https://doi.org/10.1038/323544a0
de Almeida, B. P., Schaub, C., Pagani, M., Secchia, S., Furlong, E. E. M., & Stark, A. (2024). Targeted design of synthetic enhancers for selected tissues in the Drosophila embryo. Nature, 626(7997), 207–211. https://doi.org/10.1038/s41586-023-06905-9
de Groot, R. P., Raaijmakers, J. A., Lammers, J. W., Jove, R., & Koenderman, L. (1999). STAT5 activation by BCR-Abl contributes to transformation of K562 leukemia cells. Blood, 94(3), 1108–1112.
de Wit, E., Vos, E. S. M., Holwerda, S. J. B., Valdes-Quezada, C., Verstegen, M. J. A. M., Teunissen, H., Splinter, E., Wijchers, P. J., Krijger, P. H. L., & de Laat, W. (2015). CTCF Binding Polarity Determines Chromatin Looping. Molecular Cell, 60(4), 676–684. https://doi.org/10.1016/j.molcel.2015.09.023
DeBerardine, M. (2023). BRGenomics for analyzing high-resolution genomics data in R. Bioinformatics (Oxford, England), 39(6), btad331. https://doi.org/10.1093/bioinformatics/btad331
Dekker, J., Rippe, K., Dekker, M., & Kleckner, N. (2002). Capturing Chromosome Conformation.
Science, 295(5558), 1306–1311. https://doi.org/10.1126/science.1067799
DelRosso, N., Suzuki, P. H., Griffith, D., Lotthammer, J. M., Novak, B., Kocalar, S., Sheth, M. U., Holehouse, A. S., Bintu, L., & Fordyce, P. (2024). High-throughput affinity measurements of direct interactions between activation domains and co-activators. bioRxiv: The Preprint Server for Biology, 2024.08.19.608698. https://doi.org/10.1101/2024.08.19.608698
Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P. D., Dean, A., & Blobel, G. A. (2012). Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell, 149(6), 1233–1244. https://doi.org/10.1016/j.cell.2012.03.051
Dey, A., Chitsaz, F., Abbasi, A., Misteli, T., & Ozato, K. (2003). The double bromodomain protein Brd4 binds to acetylated chromatin during interphase and mitosis. Proceedings of the National Academy of Sciences, 100(15), 8758–8763. https://doi.org/10.1073/pnas.1433065100
Dickel, D. E., Ypsilanti, A. R., Pla, R., Zhu, Y., Barozzi, I., Mannion, B. J., Khin, Y. S., Fukuda-Yuzawa, Y., Plajzer-Frick, I., Pickle, C. S., Lee, E. A., Harrington, A. N., Pham, Q. T., Garvin, T. H., Kato, M., Osterwalder, M., Akiyama, J. A., Afzal, V., Rubenstein, J. L. R., … Visel, A. (2018). Ultraconserved Enhancers Are Required for Normal Development. Cell, 172(3), 491-499.e15. https://doi.org/10.1016/j.cell.2017.12.017
Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J. S., & Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485(7398), 376–380. https://doi.org/10.1038/nature11082
Dogan, N., Wu, W., Morrissey, C. S., Chen, K.-B., Stonestrom, A., Long, M., Keller, C. A., Cheng, Y., Jain, D., Visel, A., Pennacchio, L. A., Weiss, M. J., Blobel, G. A., & Hardison, R. C. (2015).
Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility. Epigenetics & Chromatin, 8, 16. https://doi.org/10.1186/s13072-015-0009-5
Dong, P., Zhang, S., Gandin, V., Xie, L., Wang, L., Lemire, A. L., Li, W., Otsuna, H., Kawase, T., Lander, A. D., Chang, H. Y., & Liu, Z. J. (2024). Cohesin prevents cross-domain gene coactivation. Nature Genetics, 56(8), 1654–1664. https://doi.org/10.1038/s41588-024-01852-1
Dorighi, K. M., Swigut, T., Henriques, T., Bhanu, N. V., Scruggs, B. S., Nady, N., Still, C. D., Garcia, B. A., Adelman, K., & Wysocka, J. (2017). Mll3 and Mll4 Facilitate Enhancer RNA Synthesis and Transcription from Promoters Independently of H3K4 Monomethylation. Molecular Cell, 66(4), 568-576.e4. https://doi.org/10.1016/j.molcel.2017.04.018
Dostie, J., Richmond, T. A., Arnaout, R. A., Selzer, R. R., Lee, W. L., Honan, T. A., Rubio, E. D., Krumm, A., Lamb, J., Nusbaum, C., Green, R. D., & Dekker, J. (2006). Chromosome Conformation Capture Carbon Copy (5C): A massively parallel solution for mapping interactions between genomic elements. Genome Research, 16(10), 1299–1309. https://doi.org/10.1101/gr.5571506
Doughty, B. R., Hinks, M. M., Schaepe, J. M., Marinov, G. K., Thurm, A. R., Rios-Martinez, C., Parks, B. E., Tan, Y., Marklund, E., Dubocanin, D., Bintu, L., & Greenleaf, W. J. (2024). Single-molecule states link transcription factor binding to gene expression. Nature, 636(8043), 745–754. https://doi.org/10.1038/s41586-024-08219-w
Du, A. Y., Chobirko, J. D., Zhuo, X., Feschotte, C., & Wang, T. (2024). Regulatory transposable elements in the encyclopedia of DNA elements. Nature Communications, 15(1), 7594. https://doi.org/10.1038/s41467-024-51921-6
Du, M., Stitzinger, S. H., Spille, J.-H., Cho, W.-K., Lee, C., Hijaz, M., Quintana, A., & Cissé, I. I. (2024). Direct observation of a condensate effect on super-enhancer controlled gene bursting.
Cell, 187(2), 331-344.e17. https://doi.org/10.1016/j.cell.2023.12.005
Duarte, F. M., Fuda, N. J., Mahat, D. B., Core, L. J., Guertin, M. J., & Lis, J. T. (2016). Transcription factors GAF and HSF act at distinct regulatory steps to modulate stress-induced gene activation. Genes & Development, 30(15), 1731–1746. https://doi.org/10.1101/gad.284430.116
Duttke, S. H., Guzman, C., Chang, M., Delos Santos, N. P., McDonald, B. R., Xie, J., Carlin, A. F., Heinz, S., & Benner, C. (2024). Position-dependent function of human sequence-specific transcription factors. Nature, 631(8022), 891–898. https://doi.org/10.1038/s41586-024-07662-z
Eckner, R., Ewen, M. E., Newsome, D., Gerdes, M., DeCaprio, J. A., Lawrence, J. B., & Livingston, D. M. (1994). Molecular cloning and functional analysis of the adenovirus E1A-associated 300-kD protein (p300) reveals a protein with properties of a transcriptional adaptor. Genes & Development, 8(8), 869–884. https://doi.org/10.1101/gad.8.8.869
ENCODE Project Consortium. (2004). The ENCODE (ENCyclopedia Of DNA Elements) Project. Science (New York, N.Y.), 306(5696), 636–640. https://doi.org/10.1126/science.1105136
ENCODE Project Consortium, Moore, J. E., Purcaro, M. J., Pratt, H. E., Epstein, C. B., Shoresh, N., Adrian, J., Kawli, T., Davis, C. A., Dobin, A., Kaul, R., Halow, J., Van Nostrand, E. L., Freese, P., Gorkin, D. U., Shen, Y., He, Y., Mackiewicz, M., Pauli-Behn, F., … Weng, Z. (2020). Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583(7818), 699–710. https://doi.org/10.1038/s41586-020-2493-4
Engreitz, J. M., Haines, J. E., Perez, E. M., Munson, G., Chen, J., Kane, M., McDonel, P. E., Guttman, M., & Lander, E. S. (2016). Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature, 539(7629), 452–455. https://doi.org/10.1038/nature20149
Ephrussi, A., Church, G. M., Tonegawa, S., & Gilbert, W. (1985).
B lineage-specific interactions of an immunoglobulin enhancer with cellular factors in vivo. Science (New York, N.Y.), 227(4683), 134–140. https://doi.org/10.1126/science.3917574
Farley, E. K., Olson, K. M., Zhang, W., Brandt, A. J., Rokhsar, D. S., & Levine, M. S. (2015). Suboptimization of developmental enhancers. Science, 350(6258), 325–328. https://doi.org/10.1126/science.aac6948
Fitz, J., Neumann, T., Steininger, M., Wiedemann, E.-M., Garcia, A. C., Athanasiadis, A., Schoeberl, U. E., & Pavri, R. (2020). Spt5-mediated enhancer transcription directly couples enhancer activation with physical promoter interaction. Nature Genetics, 52(5), 505–515. https://doi.org/10.1038/s41588-020-0605-6
Flanagan, P. M., Kelleher, R. J., Sayre, M. H., Tschochner, H., & Kornberg, R. D. (1991). A mediator required for activation of RNA polymerase II transcription in vitro. Nature, 350(6317), 436–438. https://doi.org/10.1038/350436a0
Frankel, N., Davis, G. K., Vargas, D., Wang, S., Payre, F., & Stern, D. L. (2010). Phenotypic robustness conferred by apparently redundant transcriptional enhancers. Nature, 466(7305), 490–493. https://doi.org/10.1038/nature09158
Frömel, R., Rühle, J., Bernal Martinez, A., Szu-Tu, C., Pacheco Pastor, F., Martinez-Corral, R., & Velten, L. (2025). Design principles of cell-state-specific enhancers in hematopoiesis. Cell, 188(12), 3202-3218.e21. https://doi.org/10.1016/j.cell.2025.04.017
Fuda, N. J., Ardehali, M. B., & Lis, J. T. (2009). Defining mechanisms that regulate RNA polymerase II transcription in vivo. Nature, 461(7261), 186–192. https://doi.org/10.1038/nature08449
Fujinaga, K., Huang, F., & Peterlin, B. M. (2023). P-TEFb: The master regulator of transcription elongation. Molecular Cell, 83(3), 393–403.
https://doi.org/10.1016/j.molcel.2022.12.006
Fulco, C. P., Munschauer, M., Anyoha, R., Munson, G., Grossman, S. R., Perez, E. M., Kane, M., Cleary, B., Lander, E. S., & Engreitz, J. M. (2016). Systematic mapping of functional enhancer–promoter connections with CRISPR interference. Science, 354(6313), 769–773. https://doi.org/10.1126/science.aag2445
Fulco, C. P., Nasser, J., Jones, T. R., Munson, G., Bergman, D. T., Subramanian, V., Grossman, S. R., Anyoha, R., Doughty, B. R., Patwardhan, T. A., Nguyen, T. H., Kane, M., Perez, E. M., Durand, N. C., Lareau, C. A., Stamenova, E. K., Aiden, E. L., Lander, E. S., & Engreitz, J. M. (2019). Activity-by-Contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nature Genetics, 51(12), 1664–1669. https://doi.org/10.1038/s41588-019-0538-0
Gabriele, M., Brandão, H. B., Grosse-Holz, S., Jha, A., Dailey, G. M., Cattoglio, C., Hsieh, T.-H. S., Mirny, L., Zechner, C., & Hansen, A. S. (2022). Dynamics of CTCF- and cohesin-mediated chromatin looping revealed by live-cell imaging. Science, 376(6592), 496–501. https://doi.org/10.1126/science.abn6583
Gambone, J. E., Dusaban, S. S., Loperena, R., Nakata, Y., & Shetzline, S. E. (2011). The c-Myb target gene neuromedin U functions as a novel cofactor during the early stages of erythropoiesis. Blood, 117(21), 5733–5743. https://doi.org/10.1182/blood-2009-09-242131
Gasperini, M., Hill, A. J., McFaline-Figueroa, J. L., Martin, B., Kim, S., Zhang, M. D., Jackson, D., Leith, A., Schreiber, J., Noble, W. S., Trapnell, C., Ahituv, N., & Shendure, J. (2019). A Genome-wide Framework for Mapping Gene Regulation via Cellular Genetic Screens. Cell, 176(1–2), 377-390.e19. https://doi.org/10.1016/j.cell.2018.11.029
Georgakopoulos-Soares, I., Deng, C., Agarwal, V., Chan, C. S. Y., Zhao, J., Inoue, F., & Ahituv, N. (2023). Transcription factor binding site orientation and order are major drivers of gene regulatory activity. Nature Communications, 14(1), 2333.
https://doi.org/10.1038/s41467-023-37960-5
Gill, G., & Ptashne, M. (1988). Negative effect of the transcriptional activator GAL4. Nature, 334(6184), 721–724. https://doi.org/10.1038/334721a0
Gillies, S. D., Morrison, S. L., Oi, V. T., & Tonegawa, S. (1983). A tissue-specific transcription enhancer element is located in the major intron of a rearranged immunoglobulin heavy chain gene. Cell, 33(3), 717–728. https://doi.org/10.1016/0092-8674(83)90014-4
Gilmour, J., Assi, S. A., Noailles, L., Lichtinger, M., Obier, N., & Bonifer, C. (2018). The Co-operation of RUNX1 with LDB1, CDK9 and BRD4 Drives Transcription Factor Complex Relocation During Haematopoietic Specification. Scientific Reports, 8, 10410. https://doi.org/10.1038/s41598-018-28506-7
Goel, V. Y., Huseyin, M. K., & Hansen, A. S. (2023). Region Capture Micro-C reveals coalescence of enhancers and promoters into nested microcompartments. Nature Genetics, 55(6), 1048–1056. https://doi.org/10.1038/s41588-023-01391-1
Goodman, R. H., & Smolik, S. (2000). CBP/p300 in cell growth, transformation, and development. Genes & Development, 14(13), 1553–1577. https://doi.org/10.1101/gad.14.13.1553
Gordon, A. (2010). FASTX-Toolkit. GitHub. https://github.com/agordon/fastx_toolkit (Original work published 2013)
Gosai, S. J., Castro, R. I., Fuentes, N., Butts, J. C., Mouri, K., Alasoadura, M., Kales, S., Nguyen, T. T. L., Noche, R. R., Rao, A. S., Joy, M. T., Sabeti, P. C., Reilly, S. K., & Tewhey, R. (2024). Machine-guided design of cell-type-targeting cis-regulatory elements. Nature, 634(8036), 1211–1220. https://doi.org/10.1038/s41586-024-08070-z
Grande, A., Montanari, M., Manfredini, R., Tagliafico, E., Zanocco-Marani, T., Trevisan, F., Ligabue, G., Siena, M., Ferrari, S., & Ferrari, S. (2001). A functionally active RARalpha nuclear receptor is expressed in retinoic acid non responsive early myeloblastic cell lines. Cell Death and Differentiation, 8(1), 70–82.
https://doi.org/10.1038/sj.cdd.4400771
Grebien, F., Kerenyi, M. A., Kovacic, B., Kolbe, T., Becker, V., Dolznig, H., Pfeffer, K., Klingmüller, U., Müller, M., Beug, H., Müllner, E. W., & Moriggl, R. (2008). Stat5 activation enables erythropoiesis in the absence of EpoR and Jak2. Blood, 111(9), 4511–4522. https://doi.org/10.1182/blood-2007-07-102848
Gressel, S., Schwalb, B., & Cramer, P. (2019). The pause-initiation limit restricts transcription activation in human cells. Nature Communications, 10(1), 3603. https://doi.org/10.1038/s41467-019-11536-8
Gressel, S., Schwalb, B., Decker, T. M., Qin, W., Leonhardt, H., Eick, D., & Cramer, P. (2017). CDK9-dependent RNA polymerase II pausing controls transcription initiation. eLife, 6, e29736. https://doi.org/10.7554/eLife.29736
Grossman, S. R., Zhang, X., Wang, L., Engreitz, J., Melnikov, A., Rogov, P., Tewhey, R., Isakova, A., Deplancke, B., Bernstein, B. E., Mikkelsen, T. S., & Lander, E. S. (2017). Systematic dissection of genomic features determining transcription factor binding and enhancer function. Proceedings of the National Academy of Sciences of the United States of America, 114(7), E1291–E1300. https://doi.org/10.1073/pnas.1621150114
Gu, B., Swigut, T., Spencley, A., Bauer, M. R., Chung, M., Meyer, T., & Wysocka, J. (2018). Transcription-coupled changes in nuclear mobility of mammalian cis-regulatory elements. Science (New York, N.Y.), 359(6379), 1050–1055. https://doi.org/10.1126/science.aao3136
Guo, C., McDowell, I. C., Nodzenski, M., Scholtens, D. M., Allen, A. S., Lowe, W. L., & Reddy, T. E. (2017). Transversions have larger regulatory effects than transitions. BMC Genomics, 18, 394. https://doi.org/10.1186/s12864-017-3785-4
Guo, X., Plank-Bazinet, J., Krivega, I., Dale, R. K., & Dean, A. (2020). Embryonic erythropoiesis and hemoglobin switching require transcriptional repressor ETO2 to modulate chromatin organization. Nucleic Acids Research, 48(18), 10226–10240.
https://doi.org/10.1093/nar/gkaa736
Guo, Y., Xu, Q., Canzio, D., Shou, J., Li, J., Gorkin, D. U., Jung, I., Wu, H., Zhai, Y., Tang, Y., Lu, Y., Wu, Y., Jia, Z., Li, W., Zhang, M. Q., Ren, B., Krainer, A. R., Maniatis, T., & Wu, Q. (2015). CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function. Cell, 162(4), 900–910. https://doi.org/10.1016/j.cell.2015.07.038
Heintzman, N. D., Hon, G. C., Hawkins, R. D., Kheradpour, P., Stark, A., Harp, L. F., Ye, Z., Lee, L. K., Stuart, R. K., Ching, C. W., Ching, K. A., Antosiewicz-Bourget, J. E., Liu, H., Zhang, X., Green, R. D., Lobanenkov, V. V., Stewart, R., Thomson, J. A., Crawford, G. E., … Ren, B. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature, 459(7243), 108–112. https://doi.org/10.1038/nature07829
Heintzman, N. D., Stuart, R. K., Hon, G., Fu, Y., Ching, C. W., Hawkins, R. D., Barrera, L. O., Van Calcar, S., Qu, C., Ching, K. A., Wang, W., Weng, Z., Green, R. D., Crawford, G. E., & Ren, B. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nature Genetics, 39(3), 311–318. https://doi.org/10.1038/ng1966
Hen, R., Borrelli, E., & Chambon, P. (1985). Repression of the immunoglobulin heavy chain enhancer by the adenovirus-2 E1A products. Science (New York, N.Y.), 230(4732), 1391–1394. https://doi.org/10.1126/science.2999984
Henriques, T., Scruggs, B. S., Inouye, M. O., Muse, G. W., Williams, L. H., Burkholder, A. B., Lavender, C. A., Fargo, D. C., & Adelman, K. (2018). Widespread transcriptional pausing and elongation control at enhancers. Genes & Development, 32(1), 26–41. https://doi.org/10.1101/gad.309351.117
Hitz, B. C., Jin-Wook, L., Jolanki, O., Kagda, M. S., Graham, K., Sud, P., Gabdank, I., Strattan, J. S., Sloan, C. A., Dreszer, T., Rowe, L. D., Podduturi, N. R., Malladi, V. S., Chan, E. T., Davidson, J.
M., Ho, M., Miyasato, S., Simison, M., Tanaka, F., … Cherry, J. M. (2023). The ENCODE Uniform Analysis Pipelines. bioRxiv: The Preprint Server for Biology, 2023.04.04.535623. 144 https://doi.org/10.1101/2023.04.04.535623 Hnisz, D., Abraham, B. J., Lee, T. I., Lau, A., Saint-André, V., Sigova, A. A., Hoke, H. A., & Young, R. A. (2013). Super-Enhancers in the Control of Cell Identity and Disease. Cell, 155(4), 934–947. https://doi.org/10.1016/j.cell.2013.09.053 Ho, L., & Crabtree, G. R. (2010). Chromatin remodelling during development. Nature, 463(7280), 474–484. https://doi.org/10.1038/nature08911 Hochheimer, A., & Tjian, R. (2003). Diversified transcription initiation complexes expand promoter selectivity and tissue-specific gene expression. Genes & Development, 17(11), 1309–1320. https://doi.org/10.1101/gad.1099903 Hong, J.-W., Hendrix, D. A., & Levine, M. S. (2008). Shadow Enhancers as a Source of Evolutionary Novelty. Science, 321(5894), 1314–1314. https://doi.org/10.1126/science.1160631 Hsieh, T.-H. S., Cattoglio, C., Slobodyanyuk, E., Hansen, A. S., Darzacq, X., & Tjian, R. (2022). Enhancer–promoter interactions and transcription are largely maintained upon acute loss of CTCF, cohesin, WAPL or YY1. Nature Genetics, 54(12), 1919–1932. https://doi.org/10.1038/s41588-022- 01223-8 Hsieh, T.-H. S., Cattoglio, C., Slobodyanyuk, E., Hansen, A. S., Rando, O. J., Tjian, R., & Darzacq, X. (2020). Resolving the 3D Landscape of Transcription- 145 Linked Mammalian Chromatin Folding. Molecular Cell, 78(3), 539-553.e8. https://doi.org/10.1016/j.molcel.2020.03.002 Hu, J., Liu, J., Xue, F., Halverson, G., Reid, M., Guo, A., Chen, L., Raza, A., Galili, N., Jaffray, J., Lane, J., Chasis, J. A., Taylor, N., Mohandas, N., & An, X. (2013). Isolation and functional characterization of human erythroblasts at distinct stages: Implications for understanding of normal and disordered erythropoiesis in vivo. Blood, 121(16), 3246–3253. 
https://doi.org/10.1182/blood-2013-01-476390 Huang, J., Li, K., Cai, W., Liu, X., Zhang, Y., Orkin, S. H., Xu, J., & Yuan, G.-C. (2018). Dissecting super-enhancer hierarchy based on chromatin interactions. Nature Communications, 9, 943. https://doi.org/10.1038/s41467-018-03279-9 International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature, 431(7011), 931–945. https://doi.org/10.1038/nature03001 Jang, M. K., Mochizuki, K., Zhou, M., Jeong, H.-S., Brady, J. N., & Ozato, K. (2005). The Bromodomain Protein Brd4 Is a Positive Regulatory Component of P- TEFb and Stimulates RNA Polymerase II-Dependent Transcription. Molecular Cell, 19(4), 523–534. https://doi.org/10.1016/j.molcel.2005.06.027 Janknecht, R., & Hunter, T. (1996). A growing coactivator network. Nature, 146 383(6595), 22–23. https://doi.org/10.1038/383022a0 Jin, Q., Yu, L.-R., Wang, L., Zhang, Z., Kasper, L. H., Lee, J.-E., Wang, C., Brindle, P. K., Dent, S. Y. R., & Ge, K. (2011). Distinct roles of GCN5/PCAF- mediated H3K9ac and CBP/p300-mediated H3K18/27ac in nuclear receptor transactivation. The EMBO Journal, 30(2), 249–262. https://doi.org/10.1038/emboj.2010.318 Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., & Charpentier, E. (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science (New York, N.Y.), 337(6096), 816–821. https://doi.org/10.1126/science.1225829 John, S., Vinkemeier, U., Soldaini, E., Darnell, J. E., & Leonard, W. J. (1999). The significance of tetramerization in promoter recruitment by Stat5. Molecular and Cellular Biology, 19(3), 1910–1918. https://doi.org/10.1128/MCB.19.3.1910 Jonkers, I., Kwak, H., & Lis, J. T. (2014). Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. eLife, 3, e02407. https://doi.org/10.7554/eLife.02407 Judd, J. (2020). PROseq_alignment.sh. GitHub. 
https://github.com/JAJ256/PROseq_alignment.sh (Original work published 2020) 147 Judd, J., Wojenski, L. A., Wainman, L. M., Tippens, N. D., Rice, E. J., Dziubek, A., Villafano, G. J., Wissink, E. M., Versluis, P., Bagepalli, L., Shah, S. R., Mahat, D. B., Tome, J. M., Danko, C. G., Lis, J. T., & Core, L. J. (2020). A rapid, sensitive, scalable method for Precision Run-On sequencing (PRO- seq). bioRxiv, 2020.05.18.102277. https://doi.org/10.1101/2020.05.18.102277 Junion, G., Spivakov, M., Girardot, C., Braun, M., Gustafson, E. H., Birney, E., & Furlong, E. E. M. (2012). A Transcription Factor Collective Defines Cardiac Cell Fate and Reflects Lineage History. Cell, 148(3), 473–486. https://doi.org/10.1016/j.cell.2012.01.030 Kadam, S., & Emerson, B. M. (2003). Transcriptional specificity of human SWI/SNF BRG1 and BRM chromatin remodeling complexes. Molecular Cell, 11(2), 377–389. https://doi.org/10.1016/s1097-2765(03)00034-0 Karlsson, M., Zhang, C., Méar, L., Zhong, W., Digre, A., Katona, B., Sjöstedt, E., Butler, L., Odeberg, J., Dusart, P., Edfors, F., Oksvold, P., von Feilitzen, K., Zwahlen, M., Arif, M., Altay, O., Li, X., Ozcan, M., Mardinoglu, A., … Lindskog, C. (2021). A single-cell type transcriptomics map of human tissues. Science Advances, 7(31), eabh2169. https://doi.org/10.1126/sciadv.abh2169 Karolchik, D., Hinrichs, A. S., Furey, T. S., Roskin, K. M., Sugnet, C. W., Haussler, D., & Kent, W. J. (2004). The UCSC Table Browser data retrieval tool. 148 Nucleic Acids Research, 32(Database issue), D493-496. https://doi.org/10.1093/nar/gkh103 Kelleher, R. J., Flanagan, P. M., & Kornberg, R. D. (1990). A novel mediator between activator proteins and the RNA polymerase II transcription apparatus. Cell, 61(7), 1209–1215. https://doi.org/10.1016/0092- 8674(90)90685-8 Kelley, D. R., Snoek, J., & Rinn, J. L. (2016). Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Research, 26(7), 990–999. 
https://doi.org/10.1101/gr.200535.115 Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S., & Karolchik, D. (2010). BigWig and BigBed: Enabling browsing of large distributed datasets. Bioinformatics (Oxford, England), 26(17), 2204–2207. https://doi.org/10.1093/bioinformatics/btq351 Khalfan, M. (2021). reform: Modify Reference Sequence and Annotation Files Quickly and Reproducibly. Genomics Core at NYU CGSB. https://gencore.bio.nyu.edu/reform/ (Original work published 2018) Kim, T.-K., Hemberg, M., Gray, J. M., Costa, A. M., Bear, D. M., Wu, J., Harmin, D. A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., Markenscoff- Papadimitriou, E., Kuhl, D., Bito, H., Worley, P. F., Kreiman, G., & Greenberg, M. E. (2010). Widespread transcription at neuronal activity- 149 regulated enhancers. Nature, 465(7295), 182–187. https://doi.org/10.1038/nature09033 Kim, Y. J., Björklund, S., Li, Y., Sayre, M. H., & Kornberg, R. D. (1994). A multiprotein mediator of transcriptional activation and its interaction with the C-terminal repeat domain of RNA polymerase II. Cell, 77(4), 599–608. https://doi.org/10.1016/0092-8674(94)90221-6 Kircher, M., Xiong, C., Martin, B., Schubach, M., Inoue, F., Bell, R. J. A., Costello, J. F., Shendure, J., & Ahituv, N. (2019). Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nature Communications, 10(1), 3583. https://doi.org/10.1038/s41467-019-11526-w Klein, J. C., Agarwal, V., Inoue, F., Keith, A., Martin, B., Kircher, M., Ahituv, N., & Shendure, J. (2020). A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nature Methods, 17(11), 1083–1091. https://doi.org/10.1038/s41592-020-0965-y Kosicki, M., Zhang, B., Pampari, A., Akiyama, J. A., Plajzer-Frick, I., Novak, C. S., Tran, S., Zhu, Y., Kato, M., Hunter, R. D., von Maydell, K., Barton, S., Beckman, E., Kundaje, A., Dickel, D. E., Visel, A., & Pennacchio, L. A. (2024). 
Mutagenesis Sensitivity Mapping of Human Enhancers In Vivo. bioRxiv: The Preprint Server for Biology, 2024.09.06.611737. https://doi.org/10.1101/2024.09.06.611737 150 Kribelbauer, J. F., Rastogi, C., Bussemaker, H. J., & Mann, R. S. (2019). Low- Affinity Binding Sites and the Transcription Factor Specificity Paradox in Eukaryotes. Annual Review of Cell and Developmental Biology, 35, 357– 379. https://doi.org/10.1146/annurev-cellbio-100617-062719 Kribelbauer-Swietek, J. F., Pushkarev, O., Gardeux, V., Faltejskova, K., Russeil, J., van Mierlo, G., & Deplancke, B. (2024). Context transcription factors establish cooperative environments and mediate enhancer communication. Nature Genetics, 56(10), 2199–2212. https://doi.org/10.1038/s41588-024- 01892-7 Kruesi, W. S., Core, L. J., Waters, C. T., Lis, J. T., & Meyer, B. J. (2013). Condensin controls recruitment of RNA polymerase II to achieve nematode X- chromosome dosage compensation. eLife, 2, e00808. https://doi.org/10.7554/eLife.00808 Kubo, N., Chen, P. B., Hu, R., Ye, Z., Sasaki, H., & Ren, B. (2024). H3K4me1 facilitates promoter-enhancer interactions and gene activation during embryonic stem cell differentiation. Molecular Cell, 84(9), 1742-1752.e5. https://doi.org/10.1016/j.molcel.2024.02.030 Kulkarni, M. M., & Arnosti, D. N. (2003). Information display by transcriptional enhancers. Development (Cambridge, England), 130(26), 6569–6575. https://doi.org/10.1242/dev.00890 151 Kvon, E. Z., Waymack, R., Gad, M., & Wunderlich, Z. (2021). Enhancer redundancy in development and disease. Nature Reviews Genetics, 22(5), 324–336. https://doi.org/10.1038/s41576-020-00311-x Kwak, H., Fuda, N. J., Core, L. J., & Lis, J. T. (2013). Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science (New York, N.Y.), 339(6122), 950–953. https://doi.org/10.1126/science.1229386 Labun, K., Montague, T. G., Krause, M., Torres Cleuren, Y. N., Tjeldnes, H., & Valen, E. (2019). 
CHOPCHOP v3: Expanding the CRISPR web toolbox beyond genome editing. Nucleic Acids Research, 47(W1), W171–W174. https://doi.org/10.1093/nar/gkz365 Lam, S. S., Martell, J. D., Kamer, K. J., Deerinck, T. J., Ellisman, M. H., Mootha, V. K., & Ting, A. Y. (2015). Directed evolution of APEX2 for electron microscopy and proximity labeling. Nature Methods, 12(1), 51–54. https://doi.org/10.1038/nmeth.3179 LaMar, D. (2015). FastQC. https://qubeshub.org/resources/fastqc Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., Heaford, A., Howland, J., Kann, L., Lehoczky, J., LeVine, R., McEwan, P., … The Wellcome Trust: (2001). Initial sequencing and analysis of the 152 human genome. Nature, 409(6822), 860–921. https://doi.org/10.1038/35057062 Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357–359. https://doi.org/10.1038/nmeth.1923 Lettice, L. A., Heaney, S. J. H., Purdie, L. A., Li, L., de Beer, P., Oostra, B. A., Goode, D., Elgar, G., Hill, R. E., & de Graaff, E. (2003). A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Human Molecular Genetics, 12(14), 1725–1735. https://doi.org/10.1093/hmg/ddg180 Levo, M., Raimundo, J., Bing, X. Y., Sisco, Z., Batut, P. J., Ryabichko, S., Gregor, T., & Levine, M. S. (2022). Transcriptional coupling of distant regulatory genes in living embryos. Nature, 605(7911), 754–760. https://doi.org/10.1038/s41586-022-04680-7 Li, D., Zhao, X.-Y., Zhou, S., Hu, Q., Wu, F., & Lee, H.-Y. (2023). Multidimensional profiling reveals GATA1-modulated stage-specific chromatin states and functional associations during human erythropoiesis. Nucleic Acids Research, 51(13), 6634–6653. https://doi.org/10.1093/nar/gkad468 Li, H. (2023). Seqtk. GitHub. 
https://github.com/lh3/seqtk (Original work published 2012) 153 Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., & 1000 Genome Project Data Processing Subgroup. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics (Oxford, England), 25(16), 2078–2079. https://doi.org/10.1093/bioinformatics/btp352 Li, J., Hale, J., Bhagia, P., Xue, F., Chen, L., Jaffray, J., Yan, H., Lane, J., Gallagher, P. G., Mohandas, N., Liu, J., & An, X. (2014). Isolation and transcriptome analyses of human erythroid progenitors: BFU-E and CFU-E. Blood, 124(24), 3636–3645. https://doi.org/10.1182/blood-2014-07-588806 Li, X., Tang, X., Bing, X., Catalano, C., Li, T., Dolsten, G., Wu, C., & Levine, M. (2023). GAGA-associated factor fosters loop formation in the Drosophila genome. Molecular Cell, 83(9), 1519-1526.e4. https://doi.org/10.1016/j.molcel.2023.03.011 Liao, Y., Smyth, G. K., & Shi, W. (2014). featureCounts: An efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics (Oxford, England), 30(7), 923–930. https://doi.org/10.1093/bioinformatics/btt656 Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., Sandstrom, R., Bernstein, B., Bender, M. A., Groudine, M., Gnirke, A., 154 Stamatoyannopoulos, J., Mirny, L. A., Lander, E. S., & Dekker, J. (2009). Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science, 326(5950), 289–293. https://doi.org/10.1126/science.1181369 Lim, C. P., & Cao, X. (2006). Structure, function, and regulation of STAT proteins. Molecular bioSystems, 2(11), 536–550. https://doi.org/10.1039/b606246f Lim, F., Solvason, J. J., Ryan, G. E., Le, S. H., Jindal, G. A., Steffen, P., Jandu, S. K., & Farley, E. K. (2024). Affinity-optimizing enhancer variants disrupt development. Nature, 626(7997), 151–159. 
https://doi.org/10.1038/s41586- 023-06922-8 Lin, X., Liu, Y., Liu, S., Zhu, X., Wu, L., Zhu, Y., Zhao, D., Xu, X., Chemparathy, A., Wang, H., Cao, Y., Nakamura, M., Noordermeer, J. N., La Russa, M., Wong, W. H., Zhao, K., & Qi, L. S. (2022). Nested epistasis enhancer networks for robust genome regulation. Science, 377(6610), 1077–1085. https://doi.org/10.1126/science.abk3512 Lis, J. T., Mason, P., Peng, J., Price, D. H., & Werner, J. (2000). P-TEFb kinase recruitment and function at heat shock loci. Genes & Development, 14(7), 792–803. Liu, W., Ma, Q., Wong, K., Li, W., Ohgi, K., Zhang, J., Aggarwal, A., & Rosenfeld, M. G. (2013). Brd4 and JMJD6-associated anti-pause enhancers in 155 regulation of transcriptional pause release. Cell, 155(7), 1581–1595. https://doi.org/10.1016/j.cell.2013.10.056 Livak, K. J., & Schmittgen, T. D. (2001). Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods (San Diego, Calif.), 25(4), 402–408. https://doi.org/10.1006/meth.2001.1262 Lopez-Delisle, L., Rabbani, L., Wolff, J., Bhardwaj, V., Backofen, R., Grüning, B., Ramírez, F., & Manke, T. (2021). pyGenomeTracks: Reproducible plots for multivariate genomic datasets. Bioinformatics (Oxford, England), 37(3), 422–423. https://doi.org/10.1093/bioinformatics/btaa692 Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 550. https://doi.org/10.1186/s13059-014-0550-8 Lu, X., Zhu, X., Li, Y., Liu, M., Yu, B., Wang, Y., Rao, M., Yang, H., Zhou, K., Wang, Y., Chen, Y., Chen, M., Zhuang, S., Chen, L.-F., Liu, R., & Chen, R. (2016). Multiple P-TEFbs cooperatively regulate the release of promoter- proximally paused RNA polymerase II. Nucleic Acids Research, 44(14), 6853–6867. https://doi.org/10.1093/nar/gkw571 Luo, Y., Hitz, B. C., Gabdank, I., Hilton, J. A., Kagda, M. S., Lam, B., Myers, Z., Sud, P., Jou, J., Lin, K., Baymuradov, U. 
K., Graham, K., Litton, C., 156 Miyasato, S. R., Strattan, J. S., Jolanki, O., Lee, J.-W., Tanaka, F. Y., Adenekan, P., … Cherry, J. M. (2020). New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Research, 48(D1), D882–D889. https://doi.org/10.1093/nar/gkz1062 Lupiáñez, D. G., Kraft, K., Heinrich, V., Krawitz, P., Brancati, F., Klopocki, E., Horn, D., Kayserili, H., Opitz, J. M., Laxova, R., Santos-Simarro, F., Gilbert- Dussardier, B., Wittler, L., Borschiwer, M., Haas, S. A., Osterwalder, M., Franke, M., Timmermann, B., Hecht, J., … Mundlos, S. (2015). Disruptions of Topological Chromatin Domains Cause Pathogenic Rewiring of Gene- Enhancer Interactions. Cell, 161(5), 1012–1025. https://doi.org/10.1016/j.cell.2015.04.004 Lutfalla, G., & Uze, G. (2006). Performing quantitative reverse-transcribed polymerase chain reaction experiments. Methods in Enzymology, 410, 386– 400. https://doi.org/10.1016/S0076-6879(06)10019-1 Mahat, D. B., Kwak, H., Booth, G. T., Jonkers, I. H., Danko, C. G., Patel, R. K., Waters, C. T., Munson, K., Core, L. J., & Lis, J. T. (2016). Base-pair- resolution genome-wide mapping of active RNA polymerases using precision nuclear run-on (PRO-seq). Nature Protocols, 11(8), 1455–1476. https://doi.org/10.1038/nprot.2016.086 Mahat, D. B., Salamanca, H. H., Duarte, F. M., Danko, C. G., & Lis, J. T. (2016). 157 Mammalian Heat Shock Response and Mechanisms Underlying Its Genome-wide Transcriptional Regulation. Molecular Cell, 62(1), 63–78. https://doi.org/10.1016/j.molcel.2016.02.025 Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., Norville, J. E., & Church, G. M. (2013). RNA-guided human genome engineering via Cas9. Science (New York, N.Y.), 339(6121), 823–826. https://doi.org/10.1126/science.1232033 Martin, K. J., Lillie, J. W., & Green, M. R. (1990). Evidence for interaction of different eukaryotic transcriptional activators with distinct cellular targets. Nature, 346(6280), 147–152. 
https://doi.org/10.1038/346147a0 Martinez-Ara, M., Comoglio, F., Arensbergen, J. van, & Steensel, B. van. (2022). Systematic analysis of intrinsic enhancer-promoter compatibility in the mouse genome. Molecular Cell, 82(13), 2519-2531.e6. https://doi.org/10.1016/j.molcel.2022.04.009 Martyn, G. E., Montgomery, M. T., Jones, H., Guo, K., Doughty, B. R., Linder, J., Bisht, D., Xia, F., Cai, X. S., Chen, Z., Cochran, K., Lawrence, K. A., Munson, G., Pampari, A., Fulco, C. P., Sahni, N., Kelley, D. R., Lander, E. S., Kundaje, A., & Engreitz, J. M. (2025). Rewriting regulatory DNA to dissect and reprogram gene expression. Cell, S0092-8674(25)00352-6. https://doi.org/10.1016/j.cell.2025.03.034 158 Matreyek, K. A., Stephany, J. J., Chiasson, M. A., Hasle, N., & Fowler, D. M. (2020). An improved platform for functional assessment of large protein libraries in mammalian cells. Nucleic Acids Research, 48(1), e1. https://doi.org/10.1093/nar/gkz910 Medstrand, P., Landry, J. R., & Mager, D. L. (2001). Long terminal repeats are used as alternative promoters for the endothelin B receptor and apolipoprotein C-I genes in humans. The Journal of Biological Chemistry, 276(3), 1896– 1903. https://doi.org/10.1074/jbc.M006557200 Meier, N., Krpic, S., Rodriguez, P., Strouboulis, J., Monti, M., Krijgsveld, J., Gering, M., Patient, R., Hostert, A., & Grosveld, F. (2006). Novel binding partners of Ldb1 are required for haematopoietic development. Development, 133(24), 4913–4923. https://doi.org/10.1242/dev.02656 Melnikov, A., Murugan, A., Zhang, X., Tesileanu, T., Wang, L., Rogov, P., Feizi, S., Gnirke, A., Callan, C. G., Kinney, J. B., Kellis, M., Lander, E. S., & Mikkelsen, T. S. (2012). Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nature Biotechnology, 30(3), 271–277. https://doi.org/10.1038/nbt.2137 Mercola, M., Goverman, J., Mirell, C., & Calame, K. (1985). 
Immunoglobulin Heavy-Chain Enhancer Requires One or More Tissue-Specific Factors. Science, 227(4684), 266–270. https://doi.org/10.1126/science.3917575 159 Merli, C., Bergstrom, D. E., Cygan, J. A., & Blackman, R. K. (1996). Promoter specificity mediates the independent regulation of neighboring genes. Genes & Development, 10(10), 1260–1270. https://doi.org/10.1101/gad.10.10.1260 Meyer, M. E., Gronemeyer, H., Turcotte, B., Bocquel, M. T., Tasset, D., & Chambon, P. (1989). Steroid hormone receptors compete for factors that mediate their enhancer function. Cell, 57(3), 433–442. https://doi.org/10.1016/0092-8674(89)90918-5 Meyer, W. K., Reichenbach, P., Schindler, U., Soldaini, E., & Nabholz, M. (1997). Interaction of STAT5 dimers on two low affinity binding sites mediates interleukin 2 (IL-2) stimulation of IL-2 receptor alpha gene transcription. The Journal of Biological Chemistry, 272(50), 31821–31828. https://doi.org/10.1074/jbc.272.50.31821 Mikhaylichenko, O., Bondarenko, V., Harnett, D., Schor, I. E., Males, M., Viales, R. R., & Furlong, E. E. M. (2018). The degree of enhancer or promoter activity is reflected by the levels and directionality of eRNA transcription. Genes & Development, 32(1), 42–57. https://doi.org/10.1101/gad.308619.117 Miller, J. A., & Widom, J. (2003). Collaborative competition mechanism for gene activation in vivo. Molecular and Cellular Biology, 23(5), 1623–1632. 160 https://doi.org/10.1128/MCB.23.5.1623-1632.2003 Mirny, L. A. (2010). Nucleosome-mediated cooperativity between transcription factors. Proceedings of the National Academy of Sciences of the United States of America, 107(52), 22534–22539. https://doi.org/10.1073/pnas.0913805107 Mitchell, P. J., Wang, C., & Tjian, R. (1987). Positive and negative regulation of transcription in vitro: Enhancer-binding protein AP-2 is inhibited by SV40 T antigen. Cell, 50(6), 847–861. https://doi.org/10.1016/0092- 8674(87)90512-5 Moreau, P., Hen, R., Wasylyk, B., Everett, R., Gaub, M. 
P., & Chambon, P. (1981). The SV40 72 base repair repeat has a striking effect on gene expression both in SV40 and other chimeric recombinants. Nucleic Acids Research, 9(22), 6047–6068. https://doi.org/10.1093/nar/9.22.6047 Morgunova, E., & Taipale, J. (2017). Structural perspective of cooperative transcription factor binding. Current Opinion in Structural Biology, 47, 1– 8. https://doi.org/10.1016/j.sbi.2017.03.006 Mudge, J. M., Carbonell-Sala, S., Diekhans, M., Martinez, J. G., Hunt, T., Jungreis, I., Loveland, J. E., Arnan, C., Barnes, I., Bennett, R., Berry, A., Bignell, A., Cerdán-Vélez, D., Cochran, K., Cortés, L. T., Davidson, C., Donaldson, S., Dursun, C., Fatima, R., … Frankish, A. (2025). GENCODE 2025: Reference 161 gene annotation for human and mouse. Nucleic Acids Research, 53(D1), D966–D975. https://doi.org/10.1093/nar/gkae1078 Neuberger, M. S. (1983). Expression and regulation of immunoglobulin heavy chain gene transfected into lymphoid cells. The EMBO Journal, 2(8), 1373– 1378. https://doi.org/10.1002/j.1460-2075.1983.tb01594.x Neumayr, C., Haberle, V., Serebreni, L., Karner, K., Hendy, O., Boija, A., Henninger, J. E., Li, C. H., Stejskal, K., Lin, G., Bergauer, K., Pagani, M., Rath, M., Mechtler, K., Arnold, C. D., & Stark, A. (2022). Differential cofactor dependencies define distinct types of human enhancers. Nature, 606(7913), 406–413. https://doi.org/10.1038/s41586-022-04779-x Niers, J. M., Chen, J. W., Weissleder, R., & Tannous, B. A. (2011). Enhanced in vivo imaging of metabolically biotinylated cell surface reporters. Analytical Chemistry, 83(3), 994–999. https://doi.org/10.1021/ac102758m Nora, E. P., Goloborodko, A., Valton, A.-L., Gibcus, J. H., Uebersohn, A., Abdennur, N., Dekker, J., Mirny, L. A., & Bruneau, B. G. (2017). Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization. Cell, 169(5), 930-944.e22. https://doi.org/10.1016/j.cell.2017.05.004 Nora, E. P., Lajoie, B. R., Schulz, E. 
G., Giorgetti, L., Okamoto, I., Servant, N., Piolot, T., van Berkum, N. L., Meisig, J., Sedat, J., Gribnau, J., Barillot, E., 162 Blüthgen, N., Dekker, J., & Heard, E. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature, 485(7398), 381– 385. https://doi.org/10.1038/nature11049 Nuez, B., Michalovich, D., Bygrave, A., Ploemacher, R., & Grosveld, F. (1995). Defective haematopoiesis in fetal liver resulting from inactivation of the EKLF gene. Nature, 375(6529), 316–318. https://doi.org/10.1038/375316a0 Okuda, T., van Deursen, J., Hiebert, S. W., Grosveld, G., & Downing, J. R. (1996). AML1, the target of multiple chromosomal translocations in human leukemia, is essential for normal fetal liver hematopoiesis. Cell, 84(2), 321– 330. https://doi.org/10.1016/s0092-8674(00)80986-1 Osterwalder, M., Barozzi, I., Tissières, V., Fukuda-Yuzawa, Y., Mannion, B. J., Afzal, S. Y., Lee, E. A., Zhu, Y., Plajzer-Frick, I., Pickle, C. S., Kato, M., Garvin, T. H., Pham, Q. T., Harrington, A. N., Akiyama, J. A., Afzal, V., Lopez-Rios, J., Dickel, D. E., Visel, A., & Pennacchio, L. A. (2018). Enhancer redundancy provides phenotypic robustness in mammalian development. Nature, 554(7691), 239–243. https://doi.org/10.1038/nature25461 Panne, D., Maniatis, T., & Harrison, S. C. (2007). An atomic model of the interferon-beta enhanceosome. Cell, 129(6), 1111–1123. https://doi.org/10.1016/j.cell.2007.05.019 163 Patwardhan, R. P., Hiatt, J. B., Witten, D. M., Kim, M. J., Smith, R. P., May, D., Lee, C., Andrie, J. M., Lee, S.-I., Cooper, G. M., Ahituv, N., Pennacchio, L. A., & Shendure, J. (2012). Massively parallel functional dissection of mammalian enhancers in vivo. Nature Biotechnology, 30(3), 265–270. https://doi.org/10.1038/nbt.2136 Payvar, F., DeFranco, D., Firestone, G. L., Edgar, B., Wrange, O., Okret, S., Gustafsson, J. A., & Yamamoto, K. R. (1983). 
Sequence-specific binding of glucocorticoid receptor to MTV DNA at sites within and upstream of the transcribed region. Cell, 35(2 Pt 1), 381–392. https://doi.org/10.1016/0092- 8674(83)90171-x Perez, G., Barber, G. P., Benet-Pages, A., Casper, J., Clawson, H., Diekhans, M., Fischer, C., Gonzalez, J. N., Hinrichs, A. S., Lee, C. M., Nassar, L. R., Raney, B. J., Speir, M. L., van Baren, M. J., Vaske, C. J., Haussler, D., Kent, W. J., & Haeussler, M. (2025). The UCSC Genome Browser database: 2025 update. Nucleic Acids Research, 53(D1), D1243–D1249. https://doi.org/10.1093/nar/gkae974 Perkins, A. C., Sharpe, A. H., & Orkin, S. H. (1995). Lethal beta-thalassaemia in mice lacking the erythroid CACCC-transcription factor EKLF. Nature, 375(6529), 318–322. https://doi.org/10.1038/375318a0 Perry, M. W., Boettiger, A. N., Bothma, J. P., & Levine, M. (2010). Shadow 164 Enhancers Foster Robustness of Drosophila Gastrulation. Current Biology, 20(17), 1562–1567. https://doi.org/10.1016/j.cub.2010.07.043 Pevny, L., Simon, M. C., Robertson, E., Klein, W. H., Tsai, S. F., D’Agati, V., Orkin, S. H., & Costantini, F. (1991). Erythroid differentiation in chimaeric mice blocked by a targeted mutation in the gene for transcription factor GATA- 1. Nature, 349(6306), 257–260. https://doi.org/10.1038/349257a0 Phan, M. H. Q., Zehnder, T., Puntieri, F., Magg, A., Majchrzycka, B., Antonović, M., Wieler, H., Lo, B.-W., Baranasic, D., Lenhard, B., Müller, F., Vingron, M., & Ibrahim, D. M. (2025). Conservation of regulatory elements with highly diverged sequences across large evolutionary distances. Nature Genetics, 57(6), 1524–1534. https://doi.org/10.1038/s41588-025-02202-5 Picard, D., & Schaffner, W. (1984). A lymphocyte-specific enhancer in the mouse immunoglobulin κ gene. Nature, 307(5946), 80–82. https://doi.org/10.1038/307080a0 Ptashne, M., & Gann, A. A. (1990). Activators and targets. Nature, 346(6282), 329– 331. https://doi.org/10.1038/346329a0 Qi, L. S., Larson, M. H., Gilbert, L. 
A., Doudna, J. A., Weissman, J. S., Arkin, A. P., & Lim, W. A. (2013). Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression. Cell, 152(5), 1173–1183. https://doi.org/10.1016/j.cell.2013.02.022 165 Queen, C., & Baltimore, D. (1983). Immunoglobulin gene transcription is activated by downstream sequence elements. Cell, 33(3), 741–748. https://doi.org/10.1016/0092-8674(83)90016-8 Quinlan, A. R., & Hall, I. M. (2010). BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics (Oxford, England), 26(6), 841– 842. https://doi.org/10.1093/bioinformatics/btq033 Rahl, P. B., Lin, C. Y., Seila, A. C., Flynn, R. A., McCuine, S., Burge, C. B., Sharp, P. A., & Young, R. A. (2010). C-Myc regulates transcriptional pause release. Cell, 141(3), 432–445. https://doi.org/10.1016/j.cell.2010.03.030 Ramírez, F., Ryan, D. P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., Heyne, S., Dündar, F., & Manke, T. (2016). deepTools2: A next generation web server for deep-sequencing data analysis. Nucleic Acids Research, 44(W1), W160-165. https://doi.org/10.1093/nar/gkw257 Rao, S. S. P., Huang, S.-C., Hilaire, B. G. S., Engreitz, J. M., Perez, E. M., Kieffer- Kwon, K.-R., Sanborn, A. L., Johnstone, S. E., Bascom, G. D., Bochkov, I. D., Huang, X., Shamim, M. S., Shin, J., Turner, D., Ye, Z., Omer, A. D., Robinson, J. T., Schlick, T., Bernstein, B. E., … Aiden, E. L. (2017). Cohesin Loss Eliminates All Loop Domains. Cell, 171(2), 305-320.e24. https://doi.org/10.1016/j.cell.2017.09.026 Rao, S. S. P., Huntley, M. H., Durand, N. C., Stamenova, E. K., Bochkov, I. D., 166 Robinson, J. T., Sanborn, A. L., Machol, I., Omer, A. D., Lander, E. S., & Aiden, E. L. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell, 159(7), 1665–1680. https://doi.org/10.1016/j.cell.2014.11.021 Rebouissou, C., Sallis, S., & Forné, T. (2022). Quantitative Chromosome Conformation Capture (3C-qPCR). 
Methods in Molecular Biology (Clifton, N.J.), 2532, 3–13. https://doi.org/10.1007/978-1-0716-2497-5_1 Reilly, S. K., Gosai, S. J., Gutierrez, A., Mackay-Smith, A., Ulirsch, J. C., Kanai, M., Mouri, K., Berenzy, D., Kales, S., Butler, G. M., Gladden-Young, A., Bhuiyan, R. M., Stitzel, M. L., Finucane, H. K., Sabeti, P. C., & Tewhey, R. (2021). Direct characterization of cis-regulatory elements and functional dissection of complex genetic associations using HCR-FlowFISH. Nature Genetics, 53(8), 1166–1176. https://doi.org/10.1038/s41588-021-00900-4 Rengachari, S., Schilbach, S., Aibara, S., Dienemann, C., & Cramer, P. (2021). Structure of the human Mediator–RNA polymerase II pre-initiation complex. Nature, 594(7861), 129–133. https://doi.org/10.1038/s41586-021- 03555-7 Reske, J. J., Wilson, M. R., & Chandler, R. L. (2020). ATAC-seq normalization method can significantly affect differential accessibility analysis and interpretation. Epigenetics & Chromatin, 13(1), 22. 167 https://doi.org/10.1186/s13072-020-00342-y Roh, H., Shen, S. P., Hu, Y., Kwok, H. S., Siegenfeld, A. P., Lee, C., Zepeda, M. A., Guo, C.-J., Roseman, S. A., Sankaran, V. G., Buenrostro, J. D., & Liau, B. B. (2024). Coupling CRISPR Scanning with Targeted Chromatin Accessibility Profiling using a Double-Stranded DNA Deaminase. bioRxiv, 2024.12.17.628791. https://doi.org/10.1101/2024.12.17.628791 Sabari, B. R., Dall’Agnese, A., Boija, A., Klein, I. A., Coffey, E. L., Shrinivas, K., Abraham, B. J., Hannett, N. M., Zamudio, A. V., Manteiga, J. C., Li, C. H., Guo, Y. E., Day, D. S., Schuijers, J., Vasile, E., Malik, S., Hnisz, D., Lee, T. I., Cisse, I. I., … Young, R. A. (2018). Coactivator condensation at super- enhancers links phase separation and gene control. Science, 361(6400), eaar3958. https://doi.org/10.1126/science.aar3958 Sahu, B., Hartonen, T., Pihlajamaa, P., Wei, B., Dave, K., Zhu, F., Kaasinen, E., Lidschreiber, K., Lidschreiber, M., Daub, C. O., Cramer, P., Kivioja, T., & Taipale, J. (2022). 
Sequence determinants of human gene regulatory elements. Nature Genetics, 54(3), 283–294. https://doi.org/10.1038/s41588- 021-01009-4 Sanborn, A. L., Rao, S. S. P., Huang, S.-C., Durand, N. C., Huntley, M. H., Jewett, A. I., Bochkov, I. D., Chinnappan, D., Cutkosky, A., Li, J., Geeting, K. P., Gnirke, A., Melnikov, A., McKenna, D., Stamenova, E. K., Lander, E. S., & 168 Aiden, E. L. (2015). Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proceedings of the National Academy of Sciences, 112(47), E6456–E6465. https://doi.org/10.1073/pnas.1518552112 Sartorelli, V., & Lauberth, S. M. (2020). Enhancer RNAs are an important regulatory layer of the epigenome. Nature Structural & Molecular Biology, 27(6), 521–528. https://doi.org/10.1038/s41594-020-0446-0 Sassone-Corsi, P., Wildeman, A., & Chambon, P. (1985). A trans-acting factor is responsible for the simian virus 40 enhancer activity in vitro. Nature, 313(6002), 458–463. https://doi.org/10.1038/313458a0 Sayers, E. W., Beck, J., Bolton, E. E., Brister, J. R., Chan, J., Connor, R., Feldgarden, M., Fine, A. M., Funk, K., Hoffman, J., Kannan, S., Kelly, C., Klimke, W., Kim, S., Lathrop, S., Marchler-Bauer, A., Murphy, T. D., O’Sullivan, C., Schmieder, E., … Pruitt, K. D. (2025). Database resources of the National Center for Biotechnology Information in 2025. Nucleic Acids Research, 53(D1), D20–D29. https://doi.org/10.1093/nar/gkae979 Schöler, H. R., & Gruss, P. (1984). Specific interaction between enhancer- containing molecules and cellular components. Cell, 36(2), 403–411. https://doi.org/10.1016/0092-8674(84)90233-2 Schöler, H. R., & Gruss, P. (1985). Cell type-specific transcriptional enhancement 169 in vitro requires the presence of trans-acting factors. The EMBO Journal, 4(11), 3005–3013. https://doi.org/10.1002/j.1460-2075.1985.tb04036.x Schulz, V. P., Yan, H., Lezon-Geyda, K., An, X., Hale, J., Hillyer, C. D., Mohandas, N., & Gallagher, P. G. (2019). 
A Unique Epigenomic Landscape Defines Human Erythropoiesis. Cell Reports, 28(11), 2996-3009.e7. https://doi.org/10.1016/j.celrep.2019.08.020
Sen, R., & Baltimore, D. (1986). Multiple nuclear factors interact with the immunoglobulin enhancer sequences. Cell, 46(5), 705–716. https://doi.org/10.1016/0092-8674(86)90346-6
Senichkin, V. V., Prokhorova, E. A., Zhivotovsky, B., & Kopeina, G. S. (2021). Simple and Efficient Protocol for Subcellular Fractionation of Normal and Apoptotic Cells. Cells, 10(4), 852. https://doi.org/10.3390/cells10040852
Shiama, N. (1997). The p300/CBP family: Integrating signals with transcription factors and chromatin. Trends in Cell Biology, 7(6), 230–236. https://doi.org/10.1016/S0962-8924(97)01048-9
Shin, H. Y., Willi, M., Yoo, K. H., Zeng, X., Wang, C., Metser, G., & Hennighausen, L. (2016). Hierarchy within the mammary STAT5-driven Wap super-enhancer. Nature Genetics, 48(8), 904–911. https://doi.org/10.1038/ng.3606
Singh, H., Sen, R., Baltimore, D., & Sharp, P. A. (1986). A nuclear factor that binds to a conserved sequence motif in transcriptional control elements of immunoglobulin genes. Nature, 319(6049), 154–158. https://doi.org/10.1038/319154a0
Smith, R. P., Taher, L., Patwardhan, R. P., Kim, M. J., Inoue, F., Shendure, J., Ovcharenko, I., & Ahituv, N. (2013). Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model. Nature Genetics, 45(9), 1021–1028. https://doi.org/10.1038/ng.2713
Smith, T., Heger, A., & Sudbery, I. (2017). UMI-tools: Modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Research, 27(3), 491–499. https://doi.org/10.1101/gr.209601.116
Snetkova, V., Ypsilanti, A. R., Akiyama, J. A., Mannion, B. J., Plajzer-Frick, I., Novak, C. S., Harrington, A. N., Pham, Q. T., Kato, M., Zhu, Y., Godoy, J., Meky, E., Hunter, R. D., Shi, M., Kvon, E. Z., Afzal, V., Tran, S., Rubenstein, J. L. R., Visel, A., … Dickel, D. E. (2021).
Ultraconserved enhancer function does not require perfect sequence conservation. Nature Genetics, 53(4), 521–528. https://doi.org/10.1038/s41588-021-00812-3
Socolovsky, M., Fallon, A. E., Wang, S., Brugnara, C., & Lodish, H. F. (1999). Fetal anemia and apoptosis of red cell progenitors in Stat5a-/-5b-/- mice: A direct role for Stat5 in Bcl-X(L) induction. Cell, 98(2), 181–191. https://doi.org/10.1016/s0092-8674(00)81013-2
Soldaini, E., John, S., Moro, S., Bollenbacher, J., Schindler, U., & Leonard, W. J. (2000). DNA binding site selection of dimeric and tetrameric Stat5 proteins reveals a large repertoire of divergent tetrameric Stat5a binding sites. Molecular and Cellular Biology, 20(1), 389–401. https://doi.org/10.1128/MCB.20.1.389-401.2000
Song, S.-H., Hou, C., & Dean, A. (2007). A positive role for NLI/Ldb1 in long range β-globin locus control region function. Molecular Cell, 28(5), 810–822. https://doi.org/10.1016/j.molcel.2007.09.025
Spektor, R., Tippens, N. D., Mimoso, C. A., & Soloway, P. D. (2019). Methyl-ATAC-seq measures DNA methylation at accessible chromatin. Genome Research, 29(6), 969–977. https://doi.org/10.1101/gr.245399.118
Spitz, F., & Furlong, E. E. M. (2012). Transcription factors: From enhancer binding to developmental control. Nature Reviews. Genetics, 13(9), 613–626. https://doi.org/10.1038/nrg3207
Staudt, L. M., Singh, H., Sen, R., Wirth, T., Sharp, P. A., & Baltimore, D. (1986). A lymphoid-specific protein binding to the octamer motif of immunoglobulin genes. Nature, 323(6089), 640–643. https://doi.org/10.1038/323640a0
Stein, R. W., Corrigan, M., Yaciuk, P., Whelan, J., & Moran, E. (1990). Analysis of E1A-mediated growth regulation functions: Binding of the 300-kilodalton cellular product correlates with E1A enhancer repression function and DNA synthesis-inducing activity. Journal of Virology, 64(9), 4421–4427. https://doi.org/10.1128/jvi.64.9.4421-4427.1990
Storer, J., Hubley, R., Rosen, J., Wheeler, T. J., & Smit, A. F. (2021).
The Dfam community resource of transposable element families, sequence models, and genome annotations. Mobile DNA, 12(1), 2. https://doi.org/10.1186/s13100-020-00230-y
Tarbell, E. D., & Liu, T. (2019). HMMRATAC: A Hidden Markov ModeleR for ATAC-seq. Nucleic Acids Research, 47(16), e91. https://doi.org/10.1093/nar/gkz533
Taskiran, I. I., Spanier, K. I., Dickmänken, H., Kempynck, N., Pančíková, A., Ekşi, E. C., Hulselmans, G., Ismail, J. N., Theunis, K., Vandepoel, R., Christiaens, V., Mauduit, D., & Aerts, S. (2024). Cell-type-directed design of synthetic enhancers. Nature, 626(7997), 212–220. https://doi.org/10.1038/s41586-023-06936-2
Thanos, D., & Maniatis, T. (1995). Virus induction of human IFN beta gene expression requires the assembly of an enhanceosome. Cell, 83(7), 1091–1100. https://doi.org/10.1016/0092-8674(95)90136-1
Thomas, H. F., Kotova, E., Jayaram, S., Pilz, A., Romeike, M., Lackner, A., Penz, T., Bock, C., Leeb, M., Halbritter, F., Wysocka, J., & Buecker, C. (2021). Temporal dissection of an enhancer cluster reveals distinct temporal and functional contributions of individual elements. Molecular Cell, 81(5), 969-982.e13. https://doi.org/10.1016/j.molcel.2020.12.047
Thurman, R. E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M. T., Haugen, E., Sheffield, N. C., Stergachis, A. B., Wang, H., Vernot, B., Garg, K., John, S., Sandstrom, R., Bates, D., Boatman, L., Canfield, T. K., Diegel, M., Dunn, D., Ebersol, A. K., … Stamatoyannopoulos, J. A. (2012). The accessible chromatin landscape of the human genome. Nature, 489(7414), 75–82. https://doi.org/10.1038/nature11232
Tian, B., Yang, J., & Brasier, A. R. (2012). Two-step cross-linking for analysis of protein-chromatin interactions. Methods in Molecular Biology (Clifton, N.J.), 809, 105–120. https://doi.org/10.1007/978-1-61779-376-9_7
Tippens, N. D., Liang, J., Leung, A. K.-Y., Wierbowski, S. D., Ozer, A., Booth, J. G., Lis, J. T., & Yu, H. (2020).
Transcription imparts architecture, function and logic to enhancer units. Nature Genetics, 52(10), 1067–1075. https://doi.org/10.1038/s41588-020-0686-2
Tolhuis, B., Palstra, R. J., Splinter, E., Grosveld, F., & de Laat, W. (2002). Looping and interaction between hypersensitive sites in the active beta-globin locus. Molecular Cell, 10(6), 1453–1465. https://doi.org/10.1016/s1097-2765(02)00781-5
Tóthová, Z., Tomc, J., Debeljak, N., & Solár, P. (2021). STAT5 as a Key Protein of Erythropoietin Signalization. International Journal of Molecular Sciences, 22(13), 7109. https://doi.org/10.3390/ijms22137109
Treisman, R. (1985). Transient accumulation of c-fos RNA following serum stimulation requires a conserved 5’ element and c-fos 3’ sequences. Cell, 42(3), 889–902. https://doi.org/10.1016/0092-8674(85)90285-5
Trojanowski, J., Frank, L., Rademacher, A., Mücke, N., Grigaitis, P., & Rippe, K. (2022). Transcription activation is enhanced by multivalent interactions independent of phase separation. Molecular Cell, 82(10), 1878-1893.e10. https://doi.org/10.1016/j.molcel.2022.04.017
Ulirsch, J. C., Nandakumar, S. K., Wang, L., Giani, F. C., Zhang, X., Rogov, P., Melnikov, A., McDonel, P., Do, R., Mikkelsen, T. S., & Sankaran, V. G. (2016). Systematic Functional Dissection of Common Genetic Variation Affecting Red Blood Cell Traits. Cell, 165(6), 1530–1545. https://doi.org/10.1016/j.cell.2016.04.048
Vakoc, C. R., Letting, D. L., Gheldof, N., Sawado, T., Bender, M. A., Groudine, M., Weiss, M. J., Dekker, J., & Blobel, G. A. (2005). Proximity among distant regulatory elements at the beta-globin locus requires GATA-1 and FOG-1. Molecular Cell, 17(3), 453–462. https://doi.org/10.1016/j.molcel.2004.12.028
Vihervaara, A., Mahat, D. B., Guertin, M. J., Chu, T., Danko, C. G., Lis, J. T., & Sistonen, L. (2017). Transcriptional response to stress is pre-wired by promoter and enhancer architecture. Nature Communications, 8(1), 255.
https://doi.org/10.1038/s41467-017-00151-0
Vihervaara, A., Mahat, D. B., Himanen, S. V., Blom, M. A. H., Lis, J. T., & Sistonen, L. (2021). Stress-induced transcriptional memory accelerates promoter-proximal pause release and decelerates termination over mitotic divisions. Molecular Cell, 81(8), 1715-1731.e6. https://doi.org/10.1016/j.molcel.2021.03.007
Vihervaara, A., Sergelius, C., Vasara, J., Blom, M. A. H., Elsing, A. N., Roos-Mattjus, P., & Sistonen, L. (2013). Transcriptional response to stress in the dynamic chromatin environment of cycling and mitotic cells. Proceedings of the National Academy of Sciences, 110(36), E3388–E3397. https://doi.org/10.1073/pnas.1305275110
Villar, D., Berthelot, C., Aldridge, S., Rayner, T. F., Lukk, M., Pignatelli, M., Park, T. J., Deaville, R., Erichsen, J. T., Jasinska, A. J., Turner, J. M. A., Bertelsen, M. F., Murchison, E. P., Flicek, P., & Odom, D. T. (2015). Enhancer Evolution across 20 Mammalian Species. Cell, 160(3), 554–566. https://doi.org/10.1016/j.cell.2015.01.006
Visel, A., Blow, M. J., Li, Z., Zhang, T., Akiyama, J. A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., Afzal, V., Ren, B., Rubin, E. M., & Pennacchio, L. A. (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature, 457(7231), 854–858. https://doi.org/10.1038/nature07730
Wadman, I. A., Osada, H., Grütz, G. G., Agulnick, A. D., Westphal, H., Forster, A., & Rabbitts, T. H. (1997). The LIM-only protein Lmo2 is a bridging molecule assembling an erythroid, DNA-binding complex which includes the TAL1, E47, GATA-1 and Ldb1/NLI proteins. The EMBO Journal, 16(11), 3145–3157. https://doi.org/10.1093/emboj/16.11.3145
Weber-Nordt, R. M., Egen, C., Wehinger, J., Ludwig, W., Gouilleux-Gruart, V., Mertelsmann, R., & Finke, J. (1996). Constitutive activation of STAT proteins in primary lymphoid and myeloid leukemia cells and in Epstein-Barr virus (EBV)-related lymphoma cell lines. Blood, 88(3), 809–816.
Wei, X., Das, J., Fragoza, R., Liang, J., Bastos de Oliveira, F. M., Lee, H. R., Wang, X., Mort, M., Stenson, P. D., Cooper, D. N., Lipkin, S. M., Smolka, M. B., & Yu, H. (2014). A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations. PLoS Genetics, 10(12), e1004819. https://doi.org/10.1371/journal.pgen.1004819
Weinberger, J., Baltimore, D., & Sharp, P. A. (1986). Distinct factors bind to apparently homologous sequences in the immunoglobulin heavy-chain enhancer. Nature, 322(6082), 846–848. https://doi.org/10.1038/322846a0
Weirauch, M. T., Yang, A., Albu, M., Cote, A. G., Montenegro-Montero, A., Drewe, P., Najafabadi, H. S., Lambert, S. A., Mann, I., Cook, K., Zheng, H., Goity, A., van Bakel, H., Lozano, J.-C., Galli, M., Lewsey, M. G., Huang, E., Mukherjee, T., Chen, X., … Hughes, T. R. (2014). Determination and inference of eukaryotic transcription factor sequence specificity. Cell, 158(6), 1431–1443. https://doi.org/10.1016/j.cell.2014.08.009
Whyte, P., Williamson, N. M., & Harlow, E. (1989). Cellular targets for transformation by the adenovirus E1A proteins. Cell, 56(1), 67–75. https://doi.org/10.1016/0092-8674(89)90984-7
Whyte, W. A., Orlando, D. A., Hnisz, D., Abraham, B. J., Lin, C. Y., Kagey, M. H., Rahl, P. B., Lee, T. I., & Young, R. A. (2013). Master Transcription Factors and Mediator Establish Super-Enhancers at Key Cell Identity Genes. Cell, 153(2), 307–319. https://doi.org/10.1016/j.cell.2013.03.035
Wickham, H. (n.d.). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. Retrieved June 3, 2025, from https://ggplot2.tidyverse.org
Wildeman, A. G., Sassone-Corsi, P., Grundström, T., Zenke, M., & Chambon, P. (1984). Stimulation of in vitro transcription from the SV40 early promoter by the enhancer involves a specific trans-acting factor. The EMBO Journal, 3(13), 3129–3133. https://doi.org/10.1002/j.1460-2075.1984.tb02269.x
Wildeman, A.
G., Zenke, M., Schatz, C., Wintzerith, M., Grundström, T., Matthes, H., Takahashi, K., & Chambon, P. (1986). Specific protein binding to the simian virus 40 enhancer in vitro. Molecular and Cellular Biology, 6(6), 2098–2105. https://doi.org/10.1128/mcb.6.6.2098-2105.1986
Wilson, N. K., Foster, S. D., Wang, X., Knezevic, K., Schütte, J., Kaimakis, P., Chilarska, P. M., Kinston, S., Ouwehand, W. H., Dzierzak, E., Pimanda, J. E., de Bruijn, M. F. T. R., & Göttgens, B. (2010). Combinatorial transcriptional control in blood stem/progenitor cells: Genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell, 7(4), 532–544. https://doi.org/10.1016/j.stem.2010.07.016
Wu, X., & Sharp, P. A. (2013). Divergent transcription: A driving force for new gene origination? Cell, 155(5), 990–996. https://doi.org/10.1016/j.cell.2013.10.048
Yang, Z., Yik, J. H. N., Chen, R., He, N., Jang, M. K., Ozato, K., & Zhou, Q. (2005). Recruitment of P-TEFb for Stimulation of Transcriptional Elongation by the Bromodomain Protein Brd4. Molecular Cell, 19(4), 535–545. https://doi.org/10.1016/j.molcel.2005.06.029
Yao, L., Liang, J., Ozer, A., Leung, A. K.-Y., Lis, J. T., & Yu, H. (2022). A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nature Biotechnology, 40(7), 1056–1065. https://doi.org/10.1038/s41587-022-01211-7
Yee, S. P., & Branton, P. E. (1985). Detection of cellular proteins associated with human adenovirus type 5 early region 1A polypeptides. Virology, 147(1), 142–153. https://doi.org/10.1016/0042-6822(85)90234-x
Zabidi, M. A., Arnold, C. D., Schernhuber, K., Pagani, M., Rath, M., Frank, O., & Stark, A. (2015). Enhancer–core-promoter specificity separates developmental and housekeeping gene regulation. Nature, 518(7540), 556–559. https://doi.org/10.1038/nature13994
Zenke, M., Grundström, T., Matthes, H., Wintzerith, M., Schatz, C., Wildeman, A., & Chambon, P. (1986).
Multiple sequence motifs are involved in SV40 enhancer function. The EMBO Journal, 5(2), 387–397. https://doi.org/10.1002/j.1460-2075.1986.tb04224.x
Zhang, J., Leung, A. K.-Y., Zhu, Y., Yao, L., Willis, A., Pan, X., Ozer, A., Zhou, Z., Siklenka, K., Barrera, A., Liang, J., Tippens, N. D., Reddy, T. E., Lis, J. T., & Yu, H. (2025). Comprehensive Evaluation of Diverse Massively Parallel Reporter Assays to Functionally Characterize Human Enhancers Genome-wide (p. 2025.03.25.645321). bioRxiv. https://doi.org/10.1101/2025.03.25.645321
Zhang, X., Song, B., Carlino, M. J., Li, G., Ferchen, K., Chen, M., Thompson, E. N., Kain, B. N., Schnell, D., Thakkar, K., Kouril, M., Jin, K., Hay, S. B., Sen, S., Bernardicius, D., Ma, S., Bennett, S. N., Croteau, J., Salvatori, O., … Grimes, H. L. (2024). An immunophenotype-coupled transcriptomic atlas of human hematopoietic progenitors. Nature Immunology, 25(4), 703–715. https://doi.org/10.1038/s41590-024-01782-4
Zhang, Y., Liu, T., Meyer, C. A., Eeckhoute, J., Johnson, D. S., Bernstein, B. E., Nusbaum, C., Myers, R. M., Brown, M., Li, W., & Liu, X. S. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biology, 9(9), R137. https://doi.org/10.1186/gb-2008-9-9-r137
Zhou, J., & Troyanskaya, O. G. (2015). Predicting effects of noncoding variants with deep learning–based sequence model. Nature Methods, 12(10), 931–934. https://doi.org/10.1038/nmeth.3547
Zhu, Y., Balaji, A., Han, M., Andronov, L., Roy, A. R., Wei, Z., Chen, C., Miles, L., Cai, S., Gu, Z., Tse, A., Yu, B. C., Uenaka, T., Lin, X., Spakowitz, A. J., Moerner, W. E., & Qi, L. S. (2025). High-resolution dynamic imaging of chromatin DNA communication using Oligo-LiveFISH. Cell, 188(12), 3310-3328.e27. https://doi.org/10.1016/j.cell.2025.03.032