BAYESIAN STATISTICAL INFERENCE FOR TUMOR MICROENVIRONMENT COMPOSITIONS
The complex interactions between a tumor and its microenvironment are essential for oncogenesis and for tumor survival and growth. These interactions allow the tumor to take up nutrients from its environment and to evade immune surveillance. Understanding these interactions is fundamental to the design of immunotherapies and other targeted therapies. Advances in sequencing technologies have enabled measurement of gene transcription and regulation across large cohorts of cancer patients and down to single-cell resolution. In this work, using glioblastoma (GBM) as a model system, I present a bioinformatic characterization of tumors and their microenvironment, together with statistical models that work toward an unsupervised and automated way of understanding tumor composition. The first part describes a new sequencing method, Chromatin Run-on Sequencing (ChRO-seq), and its use in characterizing the transcriptional regulatory landscape of primary glioblastoma. Taking advantage of the ability of ChRO-seq to quantify nascent RNAs directly from solid tissues, I developed a bioinformatic tool called dREG-HD to map the genome-wide positions of transcription regulatory elements (TREs) based on their nascent RNA patterns, which formed the basis for quantifying enhancer activity. Because ChRO-seq also enables simultaneous quantification of the transcriptional activity of genes, I developed the tool tfTarget to map the network formed among transcription factors, TREs, and target genes. Using tfTarget, I identified tumor-associated transcription modules and regulatory networks associated with known GBM subtypes. More importantly, I identified three transcription factors from the immune module that were negatively correlated with patient survival. This work shows that ChRO-seq is a powerful tool for understanding transcription regulation in complex diseases and highlights the clinical importance of the tumor microenvironment in GBM.
The second part develops a Bayesian statistical model for understanding tumor composition using bulk RNA-seq and/or ChRO-seq data collected from large patient cohorts, in conjunction with prior knowledge learned from single-cell RNA-seq and/or ATAC-seq data collected from normal and tumor tissues. This model is expected to address the following questions of central importance in cancer biology. First, which transcription pathways are ectopically regulated in tumor patients, and to what extent in each patient? Second, what is the cell type composition of the tumor microenvironment in each patient? Lastly, do any of the pathways or cell types present in the microenvironment interact with one another? Answers to these questions should provide insight into new druggable targets that act by modulating the tumor microenvironment.
Bayesian Inference; Cancer Epigenomics; ChRO-seq; Computational Biology; Statistical Learning; Transcription Regulation
Danko, Charles G.
Joachims, Thorsten; Sethupathy, Praveen
Ph. D., Computational Biology
Doctor of Philosophy
dissertation or thesis