A Quantitative Evaluation of Dissemination-Time Preservation Metadata
Proceedings of the 12th European Conference on Research and Advanced Technology for Digital Libraries (ECDL'08). Aarhus, Denmark. September 2008.
J.A. Smith and M.L. Nelson.
Download: ecdl08_Final.pdf
One of many challenges facing web preservation efforts is the lack of metadata
available for web resources. In prior work, we proposed a model that takes
advantage of a site’s own web server to prepare its resources for preservation.
When responding to a request from an archiving repository, the server applies a
series of metadata utilities, such as JHOVE and Exif, to the requested
resource. The output from each utility is included in the HTTP response along
with the resource itself. This paper addresses the question of feasibility: Is
it in fact practical to use the site’s web server as a just-in-time metadata
generator, or does the extra processing create an unacceptable deterioration in
server responsiveness to quotidian events? Our tests indicate that (a) this
approach can work effectively for both the crawler and the server; and that (b)
utility selection is an important factor in overall performance.
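
The just-in-time model described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the response layout, and the `SIZE_TOOL` stand-in utility are all assumptions made for illustration. In a real deployment the utility list would hold command lines for tools such as JHOVE or an Exif reader. Each utility is timed separately, since the abstract notes that utility selection affects overall performance.

```python
import base64
import subprocess
import sys
import time

# Hypothetical stand-in for a real metadata utility: a Python one-liner
# that prints the file size. It is used only to keep the sketch runnable.
SIZE_TOOL = (
    "size",
    [sys.executable, "-c", "import os, sys; print(os.path.getsize(sys.argv[1]))"],
)


def run_utilities(path, utilities):
    """Run each (name, argv) utility on `path` and record output and elapsed time."""
    results = []
    for name, argv in utilities:
        start = time.perf_counter()
        proc = subprocess.run(argv + [path], capture_output=True, text=True)
        elapsed = time.perf_counter() - start
        results.append(
            {
                "utility": name,
                "exit_code": proc.returncode,
                "output": proc.stdout.strip(),
                "seconds": elapsed,
            }
        )
    return results


def build_response(path, utilities):
    """Bundle the resource itself with the output of every metadata utility."""
    with open(path, "rb") as f:
        payload = f.read()
    return {
        "resource": base64.b64encode(payload).decode("ascii"),
        "metadata": run_utilities(path, utilities),
    }
```

Calling `build_response(some_path, [SIZE_TOOL])` returns the encoded resource together with one metadata record per utility. Summing the `seconds` fields gives a rough idea of the extra processing cost per request.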
@inproceedings{jas:ecdl08,
author = {Joan A. Smith and Michael L. Nelson},
title = {A Quantitative Evaluation of Dissemination-Time Preservation Metadata},
series = {Lecture Notes in Computer Science},
volume = {5173},
year = {2008},
month = {September},
booktitle = {Proceedings of the 12th European Conference on Research and Advanced Technology for Digital Libraries ({ECDL 2008})}
}