Evaluation Of Page Quality Using Simple Features, Luis Ricardo Blando
UNLV Retrospective Theses & Dissertations
A classifier that determines page quality from an Optical Character Recognition (OCR) perspective is developed. It classifies a given page image as either "good" (i.e., high OCR accuracy is expected) or "bad" (i.e., low OCR accuracy is expected). The classifier is based on measuring the amount of white speckle, the amount of broken pieces, and the overall size information in the page. Two sets of test data were used to evaluate the classifier: the Test dataset, containing 439 pages, and the Magazine dataset, containing 200 pages. The classifier correctly classified 85% of the pages in the Test dataset. However, approximately …