Interactive Query Formulation and Feedback Experiments in Information Retrieval

The effective use of information retrieval systems by end-users has been limited by their lack of knowledge of the particular organization of the databases searched and by their limited experience in formulating and modifying search statements. This thesis explores and evaluates two mechanisms for improving retrieval performance by end-users. The first mechanism complements the formulation of a query by allowing users to interactively add term phrases. These phrases are generated either from the query text or from known relevant documents. The addition of term phrases to a query is suggested by the term discrimination model as a precision-enhancing device. An interactive front-end for the SMART information retrieval system was developed to perform the interactive experiments needed to evaluate different phrase-addition strategies. The second aspect of retrieval improvement studied is the evaluation of two database organizations that can be used to obtain new relevant documents by looking in the neighborhood of known relevant documents, that is, by browsing. Browsing in cluster hierarchies and in nearest-neighbor networks is compared to relevance feedback in non-restrictive experiments.

The results obtained for the phrase-addition methodology show that simple non-interactive addition of phrases can perform as well as interactive addition. Even an optimal selection of phrases, based on the relevant documents not yet retrieved, did not significantly improve performance over simply adding all the phrases generated. Many useful phrases are not selected by users because they look like random associations of terms. These phrases are nevertheless useful because they are either pieces of larger, semantically meaningful phrases or combinations of local synonyms specific to the document collection used.
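A minimal sketch of the phrase-addition idea described above, assuming a simple pairing heuristic over adjacent content words; the function names, stopword list, and strategy flag are illustrative assumptions, not the SMART implementation:

```python
import re

# Candidate phrases are formed from adjacent content-word pairs in the
# query text; users (or a non-interactive strategy) then choose which
# phrases to append to the query as extra search terms.
STOPWORDS = {"the", "of", "a", "an", "in", "on", "for", "and", "or", "to", "by"}

def candidate_phrases(query_text):
    """Return adjacent content-word pairs as candidate term phrases."""
    words = [w for w in re.findall(r"[a-z]+", query_text.lower())
             if w not in STOPWORDS]
    return [f"{a} {b}" for a, b in zip(words, words[1:])]

def add_phrases(query_terms, phrases, selected=None):
    """Append phrases to the query.

    `selected=None` appends every generated phrase, mimicking the
    simple non-interactive strategy that performed as well as
    interactive selection in the experiments.
    """
    chosen = phrases if selected is None else [p for p in phrases if p in selected]
    return list(query_terms) + chosen

phrases = candidate_phrases("retrieval of documents by term phrases")
print(phrases)  # ['retrieval documents', 'documents term', 'term phrases']
```

Note that a pair like "documents term" looks like a random association of terms, which is exactly why users tend to skip such phrases even when they help retrieval.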
The browsing experiments in cluster hierarchies and nearest-neighbor networks showed that the latter organization consistently performs better than relevance feedback across different collections. Cluster browsing is more dependent on the characteristics of the collection, but when circumstances are favorable it can produce larger improvements in retrieval than network browsing. Retrieval in both structures is much faster than relevance feedback, since only a small portion of the database needs to be inspected.
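The neighborhood search underlying network browsing can be sketched as follows; the toy document vectors, link table, and function names are invented for illustration and are not taken from the thesis:

```python
import math

# Browsing a nearest-neighbor network means ranking only the precomputed
# neighbors of a known relevant document, instead of re-weighting the
# query and re-scoring the whole collection as relevance feedback does.

def cosine(u, v):
    """Cosine similarity between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy weighted term vectors; a real collection would use indexed terms.
docs = {
    "d1": [1.0, 0.0, 1.0],
    "d2": [0.9, 0.1, 0.8],
    "d3": [0.0, 1.0, 0.0],
    "d4": [0.8, 0.0, 0.9],
}

# Precomputed nearest-neighbor links (the browsing structure): only these
# few documents are inspected, which is why browsing is fast.
neighbors = {"d1": ["d2", "d3", "d4"]}

def browse(known_relevant):
    """Rank the stored neighbors of a known relevant document."""
    return sorted(neighbors[known_relevant],
                  key=lambda d: cosine(docs[known_relevant], docs[d]),
                  reverse=True)

print(browse("d1"))  # ['d4', 'd2', 'd3']
```

The speed advantage in the abstract follows directly from the structure: `browse` touches only the linked neighbors, while feedback methods must score every document in the database against the modified query.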

Cornell University


computer science; technical report


technical report
