Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Theses and Dissertations

2018

Historical Document Processing

Full-Text Articles in Physical Sciences and Mathematics

Fully Convolutional Neural Networks For Pixel Classification In Historical Document Images, Seth Andrew Stewart Oct 2018

Theses and Dissertations

We use a Fully Convolutional Neural Network (FCNN) to classify pixels in historical document images, enabling the extraction of high-quality, pixel-precise and semantically consistent layers of masked content. We also analyze a dataset of hand-labeled historical form images of unprecedented detail and complexity. The semantic categories we consider in this new dataset include handwriting, machine-printed text, dotted and solid lines, and stamps. Segmentation of document images into distinct layers allows handwriting, machine print, and other content to be processed and recognized discriminatively, and therefore more intelligently than might be possible with content-unaware methods. We show that an efficient FCNN with …
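To make the idea of "layers of masked content" concrete, here is a minimal sketch of how a per-pixel class map could be split into one binary mask per semantic category. It is an illustration only, not the thesis's implementation: the function name `split_into_layers`, the class ordering, and the extra `background` class are assumptions, and the small hand-written grid stands in for the per-pixel predictions a network would produce.

```python
# Category names follow the abstract; "background" and the index order
# are assumptions made for this illustration.
CLASSES = [
    "background",
    "handwriting",
    "machine_print",
    "dotted_line",
    "solid_line",
    "stamp",
]


def split_into_layers(label_map, classes=CLASSES):
    """Return {class_name: binary mask} for a 2-D grid of per-pixel class indices."""
    layers = {}
    for index, name in enumerate(classes):
        # A pixel is 1 in this layer only if it was assigned this class.
        layers[name] = [
            [1 if pixel == index else 0 for pixel in row] for row in label_map
        ]
    return layers


# A tiny 3x4 label map standing in for a network's per-pixel predictions.
label_map = [
    [0, 1, 1, 0],
    [2, 2, 0, 5],
    [4, 4, 4, 4],
]

layers = split_into_layers(label_map)
```

Because every pixel carries exactly one class index, the resulting masks are disjoint and together cover the whole image, so each layer (for example `layers["handwriting"]`) could be passed to a recognizer suited to that content type.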