Correcting BLAST e-values for low-complexity segments

Author
Yona, Golan; Sharon, Itai; Birkland, Aaron; Chang, Kuan; El-Yaniv, Ran
Abstract
The statistical estimates of BLAST and PSI-BLAST are extremely important for determining the biological relevance of sequence matches. While very effective in evaluating most matches, these estimates usually overestimate the significance of matches in the presence of low-complexity segments. In this paper we present a model, based on divergence measures and statistics of the alignment structure, that corrects BLAST e-values for low-complexity sequences without filtering or excluding them. We evaluate our method and compare it to other known methods using the Gene Ontology (GO) knowledge resource as a benchmark. Various performance measures, including ROC analysis, indicate that the new model improves over the state of the art. The program is available at biozon.org/ftp/ and www.cs.technion.ac.il/~itaish/lowcomp/
Date Issued
2004-08-25
Publisher
Cornell University
Subject
computer science; technical report
Previously Published As
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cis/TR2004-1962
Type
technical report