Databases and Information Systems | Open Access Articles

Creating A Reproducible Metadata Transformation Pipeline Using Technology Best Practices, Cara Key, Mike Waugh Apr 2018

Digital Initiatives Symposium

Over the course of two years, a team of librarians and programmers from LSU Libraries migrated the 186 collections of the Louisiana Digital Library from OCLC's CONTENTdm platform to the open-source Islandora platform.

Early in the process, the team recognized the value of a reproducible metadata transformation pipeline: the project began with many unknowns, and mistakes were certain to be made. This presentation will describe how the team used innovative and collaborative tools, such as Trello, Ansible, Vagrant, VirtualBox, Git, and GitHub, to accomplish the task.

Client-Assisted Memento Aggregation Using The Prefer Header, Mat Kelly, Sawood Alam, Michael L. Nelson, Michele C. Weigle Jan 2018

Computer Science Faculty Publications

[First paragraph] Preservation of the Web ensures that future generations have a picture of how the Web was. Web archives like the Internet Archive's Wayback Machine, WebCite, and archive.is allow individuals to submit URIs to be archived, but the captures they preserve then reside at the archives. Traversing these captures in time as preserved by multiple archive sources (using Memento [8]) provides a more comprehensive picture of the past Web than relying on a single archive. Some content on the Web, such as content behind authentication, may be unsuitable or inaccessible for preservation by these organizations. Furthermore, this content may be …

Swimming In A Sea Of Javascript Or: How I Learned To Stop Worrying And Love High-Fidelity Replay, John A. Berlin, Michael L. Nelson, Michele C. Weigle Jan 2018

Computer Science Faculty Publications

[First paragraph] Preserving and replaying modern web pages in high fidelity has become an increasingly difficult task due to the increased usage of JavaScript. Reliance on server-side rewriting alone results in live-leakage and/or the inability to replay a page due to the preserved JavaScript performing an action not permissible from the archive. The current state-of-the-art high-fidelity archival preservation and replay solutions rely on handcrafted client-side URL rewriting libraries specifically tailored for the archive, namely Webrecorder's and Pywb's wombat.js [12]. Web archives not utilizing client-side rewriting rely on server-side rewriting that misses URLs used in a manner not accounted for …
