Aspects of the P-Norm Model of Information Retrieval: Syntactic Query Generation, Efficiency, and Theoretical Properties

Other Titles


A practical information retrieval system must be easy to use by untrained users, and it must provide prompt responses to a user's search requests. In this thesis, these practical aspects of the p-norm model of information retrieval are explored. In addition, a study of theoretical properties of the p-norm model is presented. A syntactic method for generating p-norm queries from parse trees generated by the PLNLP syntactic analyzer is presented. The effectiveness of the syntactically generated queries is shown to be comparable to the effectiveness of manually constructed queries, and much better than that of statistically generated queries. The efficiency of a p-norm retrieval is significantly improved with a new p-norm retrieval algorithm which evaluates the entire document collection in one recursive traversal of the query tree. This algorithm is compared against the straightforward algorithm, which requires a traversal of the query tree for each document that is evaluated. The new algorithm is shown to be better both asymptotically and experimentally. The infinity-one model is introduced as a means of approximating the p-norm model without requiring exponentiation. Experimental results show that infinity-one retrieval is essentially as effective as p-norm retrieval, but much faster. List pruning methods for further efficiency improvements are also introduced and are shown to reduce retrieval time significantly without affecting the precision of top-ranked documents. The retrieval time of the infinity-one model with list pruning is shown to be comparable to that of pure Boolean retrieval. A theoretical study is also presented in which certain Boolean algebra properties, such as associativity, are shown to be unsatisfiable by any extended Boolean system with weak operators. The p-norm model is shown to satisfy all those properties that can be satisfied. In addition, the p-norm model is evaluated with respect to the Waller-Kraft wish list for extended Boolean systems.

Journal / Series

Volume & Issue



Date Issued



Cornell University


computer science; technical report


Effective Date

Expiration Date




Union Local


Number of Workers

Committee Chair

Committee Co-Chair

Committee Member

Degree Discipline

Degree Name

Degree Level

Related Version

Related DOI

Related To

Related Part

Based on Related Item

Has Other Format(s)

Part of Related Item

Related To

Related Publication(s)

Link(s) to Related Publication(s)


Link(s) to Reference(s)

Previously Published As

Government Document




Other Identifiers


Rights URI


technical report

Accessibility Feature

Accessibility Hazard

Accessibility Summary

Link(s) to Catalog Record