eCommons

 

Markov Methods For Identifying ChIP-Seq Peaks

Abstract

Chromatin immunoprecipitation sequencing (ChIP-seq), a technique for analyzing protein interactions with DNA, uses high-throughput sequencing technologies to map millions of short DNA "reads" to a reference genome. Because the majority of reads map to binding regions of the protein of interest, a large read count at a given position indicates the presence of a binding region, so scientists look for "peaks," areas of high counts along the genome. This thesis presents several methods for identifying binding regions using hidden Markov models. Unlike existing methods, the final model, HiDe-Peak, accounts both for several major covariates, including mappability and GC content, and for the dependence between counts present in the dataset. On real data, HiDe-Peak performs in line with existing methods, and in simulations it outperforms its competitors.

Date Issued

2013-01-28

Committee Chair

Booth, James

Committee Member

Hooker, Giles J.
Wells, Martin Timothy

Degree Discipline

Statistics

Degree Name

Ph.D., Statistics

Degree Level

Doctor of Philosophy

Types

dissertation or thesis
