Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Utah State University

Journal of Western Archives

2017

Web archiving, RSS feeds

Articles 1 - 1 of 1

Full-Text Articles in Social and Behavioral Sciences

Using RSS To Improve Web Harvest Results For News Web Sites, Gina M. Jones, Michael Neubert, Mar 2017

In the last several years, the Library of Congress Web archiving program has grown to include large sites that publish news. Over more than a year, we learned that these sites present serious challenges. After thinking through the use cases for archived online news sites, we realized that completeness of harvest was paramount. As we developed our understanding of the deficiencies in the completeness of harvests of these kinds of sites, we began to test the use of RSS feeds to build customized seed lists for shallow crawls as the primary way these sites are crawled. Over time we discovered that while completeness of harvest was greatly …
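
As an illustration of the general idea of building a seed list from an RSS feed, the following is a minimal sketch and not the authors' actual tooling, which the abstract does not describe. It assumes an RSS 2.0 feed whose items carry a link element. The function name seeds_from_rss and the example.org URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def seeds_from_rss(rss_xml: str) -> list[str]:
    """Return the unique item links of an RSS 2.0 document, in feed order.

    Hypothetical helper for illustration only: each returned URL could
    serve as one seed for a shallow crawl.
    """
    root = ET.fromstring(rss_xml)
    seeds: list[str] = []
    seen: set[str] = set()
    # In RSS 2.0, article URLs live at rss/channel/item/link.
    for link in root.iterfind("./channel/item/link"):
        url = (link.text or "").strip()
        if url and url not in seen:  # skip empty and repeated links
            seen.add(url)
            seeds.append(url)
    return seeds


# Made-up feed content (assumption): three items, one of them a repeat.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.org/</link>
    <item><title>Story A</title><link>https://example.org/a</link></item>
    <item><title>Story B</title><link>https://example.org/b</link></item>
    <item><title>Story A again</title><link>https://example.org/a</link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    # Print one seed URL per line, a common plain-text seed-list layout.
    for seed in seeds_from_rss(SAMPLE_FEED):
        print(seed)
```

The sketch reads only item links and ignores the channel-level link, so the resulting list points at individual stories rather than at the site's front page.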