Markov Methods For Identifying ChIP-seq Peaks

Abstract
Chromatin immunoprecipitation sequencing (ChIP-seq) is used to analyze protein interactions with DNA. It relies on high-throughput sequencing technologies to map millions of short DNA "reads" to a reference genome. Because the majority of reads map to binding regions of the protein of interest, a large read count at a given position indicates the presence of a binding region, so scientists look for "peaks," areas of high counts along the genome. This thesis presents several methods for identifying binding regions using hidden Markov models. Unlike existing methods, the final model, HiDe-Peak, accounts both for several major covariates, including mappability and GC content, and for the dependence between counts present in the dataset. On real data, HiDe-Peak performs in line with existing methods; in simulations, it outperforms its competitors.
Date Issued
2013-01-28
Committee Chair
Booth, James
Committee Member
Hooker, Giles J.
Wells, Martin Timothy
Degree Discipline
Statistics
Degree Name
Ph. D., Statistics
Degree Level
Doctor of Philosophy
Types
dissertation or thesis