The Effectiveness of the Thesaurus Method in Automatic Information Retrieval
Yu, C.T.; Salton, Gerard
Term grouping and thesaurus methods have frequently been incorporated into automatic content analysis programs as devices for recognizing synonymous expressions and linguistic entities that are semantically similar but syntactically distinct. While it has often been asserted that the recognition of synonyms is essential in language analysis, actual proofs of the usefulness of a thesaurus in automatic information retrieval are still lacking. In the present study, formal proofs are given of the effectiveness of the thesaurus method in information retrieval under well-defined conditions. It is shown, in particular, that when certain semantically related terms are added to the information queries originally submitted by the user population, a superior retrieval system is obtained, in the sense that at every level of recall the retrieval precision is at least as good for the altered queries as for the original ones.
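The comparison described in the abstract can be illustrated with a small Python sketch. It is only a toy, not the report's method or proof: the overlap-count scoring function, the tie-breaking rule, the four-document collection, and the names `expand_query`, `rank`, and `precision_at_recall_points` are all assumptions introduced here for illustration. It expands a query with thesaurus terms and compares precision at each point where a relevant document is retrieved.

```python
from typing import Dict, List, Set


def expand_query(query: Set[str], thesaurus: Dict[str, Set[str]]) -> Set[str]:
    """Add the thesaurus entries of every query term to the query."""
    expanded = set(query)
    for term in query:
        expanded |= thesaurus.get(term, set())
    return expanded


def rank(query: Set[str], docs: Dict[str, Set[str]]) -> List[str]:
    """Rank documents by number of shared terms (assumed scoring); ties by id."""
    scores = {doc_id: len(query & terms) for doc_id, terms in docs.items()}
    return sorted(docs, key=lambda doc_id: (-scores[doc_id], doc_id))


def precision_at_recall_points(ranking: List[str], relevant: Set[str]) -> List[float]:
    """Precision measured each time another relevant document is retrieved."""
    precisions = []
    hits = 0
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / position)
    return precisions


# Hypothetical toy collection: d1 and d2 are relevant, but d2 uses a synonym.
DOCS = {
    "d1": {"car", "engine"},
    "d2": {"automobile", "repair"},
    "d3": {"banana", "recipe"},
    "d4": {"car", "banana"},
}
RELEVANT = {"d1", "d2"}
THESAURUS = {"car": {"automobile"}}
QUERY = {"car"}

if __name__ == "__main__":
    original = precision_at_recall_points(rank(QUERY, DOCS), RELEVANT)
    altered = precision_at_recall_points(
        rank(expand_query(QUERY, THESAURUS), DOCS), RELEVANT
    )
    print("original query:", original)
    print("expanded query:", altered)
```

In this toy collection the expanded query retrieves the synonym-bearing document earlier, so its precision is no lower than that of the original query at either recall point. A single example of this kind shows what the claim means; it does not establish the conditions under which the report proves it.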
computer science; technical report