Open Access. Powered by Scholars. Published by Universities.®

Library and Information Science Commons™
- Keyword
- Web archiving (5)
- Digital libraries (3)
- Framework (3)
- Archives (2)
- Memento (2)
- Social media (2)
- Storytelling (2)
- Summarization (2)
- Web archives (2)
- Archived web pages (1)
- Buckets (1)
- Collections (1)
- Digital preservation (1)
- Information retrieval (1)
- Instagram (1)
- Intelligent sampling (1)
- JavaScript (1)
- Machine learning (1)
- Micro-collection (1)
- News (1)
- Off-topic (1)
- Privacy (1)
- Seed (1)
- Smart objects (1)
- Verifying fixity (1)
- Visualization (1)
- Web (1)
- Web crawling (1)
- Web science (1)
Articles 1 - 8 of 8
Full-Text Articles in Library and Information Science
Supporting Account-Based Queries For Archived Instagram Posts, Himarsha R. Jayanetti
Computer Science Theses & Dissertations
Social media has become one of the primary modes of communication in recent times, with popular platforms such as Facebook, Twitter, and Instagram leading the way. Despite its popularity, Instagram has not received as much attention in academic research compared to Facebook and Twitter, and its significant role in contemporary society is often overlooked. Web archives are making efforts to preserve social media content despite the challenges posed by the dynamic nature of these sites. The goal of our research is to facilitate the easy discovery of archived copies, or mementos, of all posts belonging to a specific Instagram account …
Improving Collection Understanding For Web Archives With Storytelling: Shining Light Into Dark And Stormy Archives, Shawn M. Jones
Computer Science Theses & Dissertations
Collections are the tools that people use to make sense of an ever-increasing number of archived web pages. As collections themselves grow, we need tools to make sense of them. Tools that work on the general web, like search engines, are not a good fit for these collections because search engines do not currently represent multiple document versions well. Web archive collections are vast, some containing hundreds of thousands of documents. Thousands of collections exist, many of which cover the same topic. Few collections include standardized metadata. Too many documents from too many collections with insufficient metadata makes collection understanding …
Bootstrapping Web Archive Collections From Micro-Collections In Social Media, Alexander C. Nwala
Computer Science Theses & Dissertations
In a Web plagued by disappearing resources, Web archive collections provide a valuable means of preserving Web resources important to the study of past events. These archived collections start with seed URIs (Uniform Resource Identifiers) hand-selected by curators. Curators produce high-quality seeds by removing non-relevant URIs and adding URIs from credible and authoritative sources, but this ability comes at a cost: it is time-consuming to collect these seeds. The result of this is a shortage of curators, a lack of Web archive collections for various important news events, and a need for an automatic system for generating seeds. …
A Framework For Verifying The Fixity Of Archived Web Resources, Mohamed Aturban
Computer Science Theses & Dissertations
The number of public and private web archives has increased, and we implicitly trust content delivered by these archives. Fixity is checked to ensure that an archived resource has remained unaltered (i.e., fixed) since the time it was captured. Currently, end users do not have the ability to easily verify the fixity of content preserved in web archives. For instance, if a web page is archived in 1999 and replayed in 2019, how do we know that it has not been tampered with during those 20 years? In order for the users of web archives to verify that archived web …
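The idea of a fixity check described above can be illustrated with a cryptographic hash. The following is a minimal sketch, not the framework proposed in the thesis: it assumes a SHA-256 digest is recorded when a page is captured and compared against the bytes served at replay time. The function names and the sample page content are hypothetical.

```python
import hashlib


def fixity_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of an archived resource's bytes."""
    return hashlib.sha256(content).hexdigest()


def verify_fixity(content: bytes, recorded_digest: str) -> bool:
    """Return True if the replayed bytes match the digest recorded at capture."""
    return fixity_digest(content) == recorded_digest


# Hypothetical sample data standing in for a page captured in 1999.
captured = b"<html><body>Archived page</body></html>"
recorded = fixity_digest(captured)  # digest stored at capture time

# Two hypothetical replays in 2019: one unchanged, one altered.
replayed_ok = captured
replayed_tampered = captured.replace(b"Archived", b"Altered")

print(verify_fixity(replayed_ok, recorded))
print(verify_fixity(replayed_tampered, recorded))
```

A check like this only detects change relative to the recorded digest, so the digest itself would have to be stored somewhere the archive cannot silently rewrite.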
Aggregating Private And Public Web Archives Using The Mementity Framework, Matthew R. Kelly
Computer Science Theses & Dissertations
Web archives preserve the live Web for posterity, but the content on the Web one cares about may not be preserved. The ability to access this content in the future requires the assurance that those sites will continue to exist on the Web until the content is requested and that the content will remain accessible. It is ultimately the responsibility of the individual to preserve this content, but attempting to replay personally preserved pages segregates archived pages by individuals and organizations of personal, private, and public Web content. This is misrepresentative of the Web as it was. While the Memento …
Using Web Archives To Enrich The Live Web Experience Through Storytelling, Yasmin Alnoamany
Computer Science Theses & Dissertations
Much of our cultural discourse occurs primarily on the Web. Thus, Web preservation is a fundamental precondition for multiple disciplines. Archiving Web pages into themed collections is a method for ensuring these resources are available for posterity. Services such as Archive-It exist to allow institutions to develop, curate, and preserve collections of Web resources. Understanding the contents and boundaries of these archived collections is a challenge for most people, resulting in a paradox: the larger the collection, the harder it is to understand. Meanwhile, as the sheer volume of data grows on the Web, "storytelling" is becoming a popular …
Scripts In A Frame: A Framework For Archiving Deferred Representations, Justin F. Brunelle
Computer Science Theses & Dissertations
Web archives provide a view of the Web as seen by Web crawlers. Because of rapid advancements and adoption of client-side technologies like JavaScript and Ajax, coupled with the inability of crawlers to execute these technologies effectively, Web resources become harder to archive as they become more interactive. At Web scale, we cannot capture client-side representations using the current state-of-the-art toolsets because of the migration from Web pages to Web applications. Web applications increasingly rely on JavaScript and other client-side programming languages to load embedded resources and change client-side state. We demonstrate that Web crawlers and other automatic archival …
Buckets: Smart Objects For Digital Libraries, Michael L. Nelson
Computer Science Theses & Dissertations
Discussion of digital libraries (DLs) is often dominated by the merits of various archives, repositories, search engines, search interfaces and database systems. While these technologies are necessary for information management, information content and information retrieval systems should progress on independent paths and each should make limited assumptions about the status or capabilities of the other. Information content is more important than the systems used for its storage and retrieval. Digital information should have the same long-term survivability prospects as traditional hardcopy information and should not be impacted by evolving search engine technologies or vendor vagaries in database management systems.
Digital …