Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Open Access. Powered by Scholars. Published by Universities.®

PDF

Old Dominion University

Computer Science Faculty Publications

Series

Archives & records

Articles 1 - 2 of 2

Full-Text Articles in Social and Behavioral Sciences

Hashes Are Not Suitable To Verify Fixity Of The Public Archived Web, Mohamed Aturban, Martin Klein, Herbert Van De Sompel, Sawood Alam, Michael L. Nelson, Michele C. Weigle Jan 2023

Hashes Are Not Suitable To Verify Fixity Of The Public Archived Web, Mohamed Aturban, Martin Klein, Herbert Van De Sompel, Sawood Alam, Michael L. Nelson, Michele C. Weigle

Computer Science Faculty Publications

Web archives, such as the Internet Archive, preserve the web and allow access to prior states of web pages. We implicitly trust their versions of archived pages, but as their role moves from preserving curios of the past to facilitating present day adjudication, we are concerned with verifying the fixity of archived web pages, or mementos, to ensure they have always remained unaltered. A widely used technique in digital preservation to verify the fixity of an archived resource is to periodically compute a cryptographic hash value on a resource and then compare it with a previous hash value. If the …


Moved But Not Gone: An Evaluation Of Real-Time Methods For Discovering Replacement Web Pages, Martin Klein, Michael L. Nelson Jan 2014

Moved But Not Gone: An Evaluation Of Real-Time Methods For Discovering Replacement Web Pages, Martin Klein, Michael L. Nelson

Computer Science Faculty Publications

Inaccessible Web pages and 404 “Page Not Found” responses are a common Web phenomenon and a detriment to the user’s browsing experience. The rediscovery of missing Web pages is, therefore, a relevant research topic in the digital preservation as well as in the Information Retrieval realm. In this article, we bring these two areas together by analyzing four content- and link-based methods to rediscover missing Web pages. We investigate the retrieval performance of the methods individually as well as their combinations and give an insight into how effective these methods are over time. As the main result of this work, …