Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Mementomap: A Web Archive Profiling Framework For Efficient Memento Routing, Sawood Alam
Computer Science Theses & Dissertations
With the proliferation of public web archives, it is becoming more important to profile their contents, both to understand their immense holdings and to support routing of requests in Memento aggregators. A memento is a past version of a web page, and a Memento aggregator is a tool or service that aggregates mementos from many different web archives. To save resources, the Memento aggregator should poll only the archives that are likely to have a copy of the requested Uniform Resource Identifier (URI). Using the Crawler Index (CDX), we generate profiles of the archives that summarize their …
Bootstrapping Web Archive Collections From Micro-Collections In Social Media, Alexander C. Nwala
Computer Science Theses & Dissertations
In a Web plagued by disappearing resources, Web archive collections provide a valuable means of preserving Web resources important to the study of past events. These archived collections start with seed URIs (Uniform Resource Identifiers) hand-selected by curators. Curators produce high-quality seeds by removing non-relevant URIs and adding URIs from credible and authoritative sources, but this ability comes at a cost: it is time-consuming to collect these seeds. The result is a shortage of curators, a lack of Web archive collections for various important news events, and a need for an automatic system for generating seeds. …
A Framework For Verifying The Fixity Of Archived Web Resources, Mohamed Aturban
Computer Science Theses & Dissertations
The number of public and private web archives has increased, and we implicitly trust content delivered by these archives. Fixity is checked to ensure that an archived resource has remained unaltered (i.e., fixed) since the time it was captured. Currently, end users do not have the ability to easily verify the fixity of content preserved in web archives. For instance, if a web page is archived in 1999 and replayed in 2019, how do we know that it has not been tampered with during those 20 years? In order for the users of web archives to verify that archived web …