Interactive Query Formulation and Feedback Experiments in Information Retrieval

dc.contributor.author: Araya, Jose E.
dc.description.abstract: The effective use of information retrieval systems by end users has been limited by their lack of knowledge of the particular organization of the databases searched and by their limited experience in formulating and modifying search statements. This thesis explores and evaluates two mechanisms for improving retrieval performance by end users. The first mechanism complements query formulation by allowing users to interactively add term phrases. These phrases are generated either from the query text or from known relevant documents. The addition of term phrases to a query is suggested by the term discrimination model as a precision-enhancing device. An interactive front end for the SMART information retrieval system was developed to perform the interactive experiments needed to evaluate different phrase-addition strategies. The second aspect of retrieval improvement studied is the evaluation of two database organizations that can be used to obtain new relevant documents by browsing, that is, by looking in the neighborhood of known relevant documents. Browsing in cluster hierarchies and in nearest-neighbor networks is compared to relevance feedback in non-restrictive experiments. The results obtained for the phrase-addition methodology showed that simple non-interactive addition of phrases can perform as well as interactive addition. Even an optimal selection of phrases, based on the relevant documents not yet retrieved, did not significantly improve performance over simply adding all the phrases generated. Many useful phrases are not selected by users because they look like random associations of terms. The usefulness of these phrases comes from the fact that they are either pieces of larger (semantically meaningful) phrases or made up of local synonyms specific to the document collection used.
The browsing experiments in cluster hierarchies and nearest-neighbor networks showed that the latter organization consistently performs better than relevance feedback across different collections. Cluster browsing is more dependent on the characteristics of the collection, but when circumstances are favorable it can produce larger improvements in retrieval than network browsing. Retrieval in both structures is much faster than relevance feedback, since only a small portion of the database needs to be inspected.
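The core browsing operation described above, ranking the collection by similarity to a known relevant document and inspecting its nearest neighbors, can be sketched as follows. This is a minimal illustration, not the thesis's actual SMART implementation: the toy document vectors, ids, and function names are invented for the example, and cosine similarity is assumed as the document-document similarity measure.

```python
import math

# Hypothetical toy collection: document id -> sparse term-weight vector.
# These documents are illustrative only, not from the thesis's test collections.
DOCS = {
    "d1": {"retrieval": 1.0, "query": 0.8, "feedback": 0.5},
    "d2": {"retrieval": 0.9, "cluster": 0.7, "browsing": 0.6},
    "d3": {"network": 0.8, "browsing": 0.9, "retrieval": 0.4},
    "d4": {"parsing": 1.0, "grammar": 0.9},
}

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def browse_neighbors(doc_id, k=2):
    """Browse the neighborhood of a known relevant document:
    rank every other document by similarity and return the top k,
    which the user would then inspect for further relevant items."""
    sims = [(other, cosine(DOCS[doc_id], vec))
            for other, vec in DOCS.items() if other != doc_id]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return [d for d, _ in sims[:k]]

# Starting from known relevant document d1, the user would inspect
# its nearest neighbors rather than reformulating the query.
print(browse_neighbors("d1"))
```

A precomputed nearest-neighbor network stores these top-k links per document in advance, which is why browsing only touches a small portion of the database at retrieval time.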
dc.format.extent: 11392652 bytes
dc.format.extent: 2570980 bytes
dc.publisher: Cornell University
dc.subject: computer science
dc.subject: technical report
dc.title: Interactive Query Formulation and Feedback Experiments in Information Retrieval
dc.type: technical report

