Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 2 of 2
Full-Text Articles in Engineering
Using The Web Infrastructure For Real Time Recovery Of Missing Web Pages, Martin Klein
Using The Web Infrastructure For Real Time Recovery Of Missing Web Pages, Martin Klein
Computer Science Theses & Dissertations
Given the dynamic nature of the World Wide Web, missing web pages, or "404 Page not Found" responses, are part of our web browsing experience. It is our intuition that information on the web is rarely completely lost, it is just missing. In whole or in part, content often moves from one URI to another and hence it just needs to be (re-)discovered. We evaluate several methods for a \justin- time" approach to web page preservation. We investigate the suitability of lexical signatures and web page titles to rediscover missing content. It is understood that web pages change over time …
Lazy Preservation: Reconstructing Websites From The Web Infrastructure, Frank Mccown
Lazy Preservation: Reconstructing Websites From The Web Infrastructure, Frank Mccown
Computer Science Theses & Dissertations
Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, webmasters or concerned third parties have attempted to recover some of their websites from the Internet Archive. Still others have sought to retrieve missing resources from the caches of commercial search engines. Inspired by these post hoc reconstruction attempts, this dissertation introduces the concept of lazy preservation{ digital preservation performed as a result of the normal operations of the Web Infrastructure (web archives, search engines and caches). First, the Web Infrastructure (WI) is characterized by its preservation …