Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

DIAL 2004 Working Group Report On Acquisition Quality Control, William A. Barrett, Henry Baird, Frank Le Bourgeois, Xiaofan Lin, George Nagy, Steve Simske, Elisa H. Barney Smith Apr 2006

Faculty Publications

This report summarizes the discussions of the Working Group on Acquisition Quality at the International Workshop on Document Image Analysis for Libraries, Palo Alto, CA, 23-24 January 2004. Acquisition of the image is one of the most time-intensive components of forming a digital library, and the quality of the acquisition will affect all later stages of the digital library project. The current state of the art in acquisition is analyzed. Problems and suggested improvements for image acquisition and storage formats follow, along with the special problems associated with acquisition from microfilm. A list of general suggestions was developed which was …


Observed Web Robot Behavior On Decaying Web Subsites, Joan A. Smith, Frank McCown, Michael L. Nelson Jan 2006

Computer Science Faculty Publications

We describe the observed crawling patterns of various search engines (including Google, Yahoo and MSN) as they traverse a series of web subsites whose contents decay at predetermined rates. We plot the progress of the crawlers through the subsites and their behavior toward the various file types the subsites contain. We chose decaying subsites because we were originally interested in tracking the implications of using search engine caches for digital preservation. However, some of the crawling behaviors themselves proved to be interesting and have implications for using a search engine as an interface to a digital library.