Effective Automatic Indexing Using Single Terms, Term Phrases and Thesaurus Class Assignments
Yu, C. T.; Salton, Gerard; Siu, M. K.
In a retrieval environment, indexing is the task of assigning to stored records and incoming information requests content identifiers capable of representing record or query content. It is known that effective content identifiers (index terms) must exhibit the correct level of specificity in a given collection environment. Terms that are too broad must be rendered more specific by being used in term phrases, while terms that are too narrow must be broadened by supplying synonymous or related terms, normally extracted from a thesaurus. The present study gives formal proofs of the retrieval effectiveness of indexing policies that use single terms, term phrases and thesaurus class assignments for purposes of content representation.
Keywords and Phrases: automatic information retrieval, automatic indexing, content analysis, term phrases, thesaurus classes, term addition, term deletion, retrieval evaluation, recall and precision.
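The specificity idea in the abstract can be illustrated with a minimal sketch. It is not the report's method, and it does not reproduce the formal proofs. It assumes that document frequency stands in for term specificity, that the cutoffs `low` and `high` are arbitrary illustrative values, that adjacent broad terms are simply joined into a phrase, and that the thesaurus is a plain mapping from terms to class labels. All function names, thresholds and class labels are hypothetical.

```python
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def index_document(tokens, df, n_docs, thesaurus, low=0.3, high=0.6):
    """Return content identifiers for one tokenized document.

    Assumed policy (illustrative only):
      - ratio > high : broad term; adjacent broad terms form a term phrase
      - ratio < low  : narrow term; replaced by its thesaurus class if known
      - otherwise    : kept as a single term
    """
    identifiers = []
    broad_run = []  # consecutive broad terms waiting to be joined

    def flush():
        # Join two or more adjacent broad terms into one phrase;
        # a lone broad term is kept unchanged.
        if broad_run:
            identifiers.append(" ".join(broad_run))
            broad_run.clear()

    for term in tokens:
        ratio = df[term] / n_docs
        if ratio > high:
            broad_run.append(term)
            continue
        flush()
        if ratio < low:
            identifiers.append(thesaurus.get(term, term))
        else:
            identifiers.append(term)
    flush()
    return identifiers


# Hypothetical toy collection and thesaurus.
docs = [
    ["information", "retrieval", "system", "thesaurus"],
    ["information", "retrieval", "evaluation"],
    ["information", "system", "lexicon"],
    ["retrieval", "system", "indexing"],
    ["information", "retrieval", "system", "recall"],
]
thesaurus = {"thesaurus": "CLASS_VOCABULARY", "lexicon": "CLASS_VOCABULARY"}

df = document_frequencies(docs)
for tokens in docs:
    print(index_document(tokens, df, len(docs), thesaurus))
```

In this toy collection the frequent terms "information", "retrieval" and "system" are only used inside phrases, while the rare terms "thesaurus" and "lexicon" collapse into one shared class, so documents 1 and 3 gain a common identifier that their single terms did not provide.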
computer science; technical report
Previously Published As