Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

A Sophisticated Library Search Strategy Using Folksonomies And Similarity Matching, William Lund, Yiu-Kai Ng, Maria Pera Sep 2014

William Lund

Libraries, private and public, offer valuable resources to library patrons. As of today, the only way to locate information archived exclusively in libraries is through their catalogs. Library patrons, however, often find it difficult to formulate a proper query, which requires using specific keywords assigned to different fields of the desired library catalog records, to obtain relevant results. These improperly formulated queries often yield irrelevant results or no results at all. This negative experience with existing library systems turns library patrons away from library catalogs; instead, they rely on Web search engines to perform their searches first and upon …


Ensemble Methods For Historical Machine-Printed Document Recognition, William Lund Sep 2014

William Lund

The usefulness of digitized documents is directly related to the quality of the extracted text. Optical Character Recognition (OCR) has reached a point where well-formatted and clean machine-printed documents are easily recognizable by current commercial OCR products; however, older or degraded machine-printed documents present problems to OCR engines, resulting in word error rates (WER) that severely limit either automated or manual use of the extracted text. Major archives of historical machine-printed documents are being assembled around the globe, requiring an accurate transcription of the text for the automated creation of descriptive metadata, full-text searching, and information extraction. Given document …