Precision Medicine in the Age of Big Data: Leveraging Machine Learning and Genomics for Drug Discovery
Targeted therapies directed at molecules involved in carcinogenesis have achieved remarkable antitumor efficacy. However, resistance inevitably develops, and many cancer patients are not candidates for these therapies. Furthermore, the clinical attrition rate continues to rise and remains a barrier to the development of novel targeted therapies. Integrating extensive genomics datasets with large drug databases allows us to begin to tackle questions about target discovery and drug toxicity, with the ultimate goal of accelerating personalized anticancer drug discovery. The purpose of this dissertation was to address these problems through the development of drug repurposing, toxicity prediction, and drug synergy prediction models. First, to target the role of transcription factors as drivers of oncogenic activity, we developed a computational drug repositioning approach (CRAFTT) that predicts drugs that specifically disrupt transcription factor activity. To do this, CRAFTT integrates transcription factor binding site information with drug-induced expression profiling. We found that CRAFTT recovered a significant number of known drug-transcription factor interactions and identified a novel interaction that we subsequently validated. Our work in drug discovery led us to ask what makes a drug safe. We developed a data-driven approach (PrOCTOR) that integrates the properties of a compound’s targets and its structure to directly predict the likelihood of toxicity in clinical trials; PrOCTOR accurately classified known safe and toxic drugs. Finally, to address the problem of drug resistance, we developed a machine learning approach to identify synergistic and effective drug combinations based on single drug efficacy information and limited drug combination testing.
When applied to mutant BRAF melanoma, this approach exhibited significant predictive power in cross-validation and in further experimental testing of previously untested drug combinations in cell lines independent of the training set. Altogether, this work demonstrates how integrating orthogonal datasets gives us the power to address difficult questions that are critical for precision medicine and drug discovery. Approaches such as these have the potential to directly affect how patients are treated and to help prioritize and guide additional focused studies.
Genomics; Machine Learning
Computational Biology and Medicine
Doctor of Philosophy
Attribution-NonCommercial-NoDerivatives 4.0 International
dissertation or thesis