Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Selected Works

2005

Document and Text Processing

Full-Text Articles in Computer Sciences

Boosted Decision Trees For Word Recognition In Handwritten Document Retrieval, Nicholas R. Howe, Toni M. Rath, R. Manmatha Dec 2004

R. Manmatha

Recognition and retrieval of historical handwritten material is an unsolved problem. We propose a novel approach to recognizing and retrieving handwritten manuscripts, based upon word image classification as a key step. Decision trees with normalized pixels as features form the basis of a highly accurate AdaBoost classifier, trained on a corpus of word images that have been resized and sampled at a pyramid of resolutions. To stem problems from the highly skewed distribution of class frequencies, word classes with very few training samples are augmented with stochastically altered versions of the originals. This increases recognition performance substantially. On a standard …
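The augmentation step described in the abstract can be illustrated with a minimal sketch. It assumes word images are small 2-D grids of normalized pixel values and that a "stochastically altered version" is a copy shifted by a random offset of at most one pixel. The abstract does not specify the alteration the authors used, so the `jitter` function, the `augment_rare_classes` name, and the `min_count` threshold are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def jitter(image, rng, max_shift=1):
    """Return a copy of a 2-D pixel grid shifted by a small random offset.

    Assumption: a random translation stands in for the paper's
    unspecified stochastic alteration. Vacated pixels are set to 0.0.
    """
    height, width = len(image), len(image[0])
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    shifted = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def augment_rare_classes(images, labels, min_count=5, seed=0):
    """Pad every class with fewer than min_count samples using jittered copies.

    min_count is a hypothetical threshold, not a value from the paper.
    The original samples are kept unchanged at the front of the output.
    """
    rng = random.Random(seed)
    counts = Counter(labels)

    # Group the original images by word class.
    by_class = {}
    for image, label in zip(images, labels):
        by_class.setdefault(label, []).append(image)

    out_images, out_labels = list(images), list(labels)
    for label, originals in by_class.items():
        missing = max(0, min_count - counts[label])
        for _ in range(missing):
            out_images.append(jitter(rng.choice(originals), rng))
            out_labels.append(label)
    return out_images, out_labels
```

The augmented set would then be passed to whatever boosted decision tree learner is in use. Frequent classes are left untouched, so only the rare end of the skewed class distribution grows.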