Automatic Structuring of Text Files
Salton, Gerard; Buckley, Chris; Allan, James
In many practical information retrieval situations, it is necessary to process heterogeneous text databases that vary greatly in scope and coverage, and deal with many different subjects. In such an environment it is important to provide flexible access to individual text pieces, and to structure the collection so that related text elements are identified and appropriately linked. Methods are described in this study for the automatic structuring of heterogeneous text collections, and the construction of browsing tools and access procedures that facilitate collection use. The proposed methods are illustrated by performing searches with a large automated encyclopedia.
computer science; technical report
Previously Published As