A Theory of Term Importance in Automatic Text Analysis

Author
Salton, Gerard; Yang, C. S.; Yu, C. T.
Abstract
Most existing automatic content analysis and indexing techniques are based on word frequency characteristics applied largely in an ad hoc manner. Contradictory requirements arise in this connection, in that terms exhibiting high occurrence frequencies in individual documents are often useful for high recall performance (to retrieve many relevant items), whereas terms with low frequency in the whole collection are useful for high precision (to reject nonrelevant items).
Date Issued
1974-07
Publisher
Cornell University
Subject
computer science; technical report
Previously Published As
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR74-208
Type
technical report