Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Data Science

Machine Learning

Full-Text Articles in Computer Sciences

Study Of Augmentations On Historical Manuscripts Using TrOCR, Erez Meoded, Dec 2023

Theses and Dissertations

Historical manuscripts are an essential source of original content. For many reasons, these manuscripts are hard to transcribe into machine-readable text. This thesis used a state-of-the-art handwritten text recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode the encoded representation into text. We showed that careful image preprocessing and purpose-designed augmentations can improve the performance of TrOCR. We suggest an ensemble of augmented models to achieve even better performance.
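
The abstract does not say how the ensemble combines the outputs of its member models. As a minimal sketch of one possible combination rule, and not necessarily the one used in the thesis, the Python below picks the candidate transcription with the smallest total edit distance to all the others, a simple consensus choice. The function names `edit_distance` and `ensemble_transcription` and the sample strings are hypothetical and introduced only for illustration; in practice each candidate would come from a separately fine-tuned recognizer.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ensemble_transcription(candidates: list[str]) -> str:
    """Return the candidate closest, in summed edit distance, to all candidates.

    Assumption: this consensus rule is illustrative only. Ties go to the
    candidate that appears first in the input list.
    """
    if not candidates:
        raise ValueError("need at least one candidate transcription")
    return min(candidates,
               key=lambda c: sum(edit_distance(c, other) for other in candidates))


# Hypothetical outputs of three models for the same text-line image.
candidates = [
    "the quick brovvn fox",
    "the quick brown fox",
    "the quick brown fax",
]
best = ensemble_transcription(candidates)
```

This rule needs only the final strings from each model, so it works without access to token probabilities. A scheme that averages decoder scores across models would be another option, but it requires more than the transcriptions alone.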