A Vector Space Model for Automatic Indexing
Salton, Gerard; Wong, A.; Yang, C. S.
In a document retrieval or other pattern-matching environment, where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; that is, retrieval performance correlates inversely with space density. This result is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstrating the usefulness of the model.
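The notion of space density in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so this example assumes density is measured as the mean pairwise cosine similarity between document vectors; the function names and the toy term-weight vectors are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def space_density(doc_vectors):
    """Assumed density measure: mean cosine similarity over all document pairs."""
    pairs = list(combinations(doc_vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical collections indexed over a three-term vocabulary.
clustered = [[1, 1, 0], [1, 1, 0], [1, 1, 1]]  # documents share most terms
spread = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]     # documents share no terms

print(space_density(clustered))
print(space_density(spread))
```

Under this assumed measure, the collection whose documents share no terms has the lower density, which is the configuration the abstract associates with better retrieval performance.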
computer science; technical report