Collecting web resources: selecting, harvesting, cataloging
The proliferation of freely available web resources of high research value has far outpaced libraries' efforts to collect and preserve them, forcing our collection development models to evolve. Building on the work of a 2008 planning grant and with continued support from the Andrew W. Mellon Foundation, Columbia University Libraries has embarked on a three-year project to develop a Web Resources Collection Program. The project's objective is to develop best practices and standards for collecting and preserving freely available web resources and to integrate this activity into existing library collection development and technical services programs. The collection of web resources by libraries is still at an early stage, and new, sustainable models are needed for virtually every phase of the work. Although there are parallels in traditional processes for selection, acquisition, cataloging, archival management, and preservation, web resources demand different ways of thinking about, organizing, and staffing these activities. Columbia's evolving program is drawing on metadata practices from library cataloging, archival finding aids, and digital collections, while also seeking to incorporate web-appropriate techniques of social networking and tagging. While still in its early stages, the project is testing various methods for defining collecting scope, securing permission to archive, harvesting websites, and describing and organizing resources at various levels of detail. A key objective of the project is to broadly engage librarians and scholars, at Columbia and beyond, in these activities, and to explore opportunities for collaboration.
Presented to the CUL Metadata Working Group on April 23, 2010.
web archiving; collection development