Automatic Structuring and Retrieval of Large Text Files

File(s)
92-1286.pdf (3.66 MB)
92-1286.ps (826.88 KB)
Permanent Link(s)
https://hdl.handle.net/1813/7126
Collections
Computer Science Technical Reports
Author
Salton, Gerard
Allan, James
Buckley, Chris
Abstract

In many operational environments, large text files covering a wide variety of topic areas must be processed. Users then need aids that support collection browsing and make it possible to locate particular items on demand. Conventional text analysis methods, which rely on preconstructed knowledge bases and other vocabulary-control tools, are difficult to apply when the subject coverage is unrestricted. An alternative approach, applicable to text collections in any subject area, is introduced; it uses the document collections themselves as the basis for text analysis, together with sophisticated text matching operations carried out at several levels of detail. Methods are described for relating semantically similar pieces of text and for using the resulting hypertext structures for collection browsing and information retrieval.

Date Issued
1992-06
Publisher
Cornell University
Keywords
computer science • technical report
Previously Published as
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR92-1286
Type
technical report
