Physical Sciences and Mathematics | Open Access Articles

Multiple Bernoulli Relevance Models For Image And Video Annotation, S. Feng, R. Manmatha, V. Lavrenko Jan 2004

R. Manmatha

Retrieving images in response to textual queries requires some knowledge of the semantics of the picture. Here, we show how to perform both automatic image annotation and retrieval (using one-word queries) on images and videos using a multiple Bernoulli relevance model. The model assumes that a training set of images or videos along with keyword annotations is provided. Multiple keywords are provided for an image, and the specific correspondence between a keyword and an image is not provided. Each image is partitioned into a set of rectangular regions and a real-valued feature vector is computed over these regions. …

Automatic Image Annotation Of News Images With Large Vocabularies And Low Quality Training Data, J. Jeon, R. Manmatha Jan 2004

A traditional approach to retrieving images is to manually annotate the image with textual keywords and then retrieve images using these keywords. Manual annotation is expensive, and recently a few approaches have been proposed for automatically annotating images. These techniques usually learn a statistical model using a training set of images annotated with keywords and use this model to automatically annotate test images. While promising, these techniques have generally been tested on a few thousand images, with vocabularies of a few hundred words or fewer, and using relatively high-quality training data where the keywords are categories/objects and are directly …

Statistical Models For Automatic Video Annotation And Retrieval, V. Lavrenko, S. L. Feng, R. Manmatha Dec 2003

We apply a continuous relevance model (CRM) to the problem of directly retrieving the visual content of videos using text queries. The model computes a joint probability model for image features and words using a training set of annotated images. The model may then be used to annotate unseen test images. The probabilistic annotations are used for retrieval using text queries. We also propose a modified model, the normalized CRM, which substantially improves performance on a subset of the TREC Video dataset.
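
The retrieval step described in this abstract, ranking images by their probabilistic annotations for a text query, can be illustrated with a minimal sketch. It does not implement the CRM itself: the per-image word probabilities below are made-up values standing in for model output, and the names `annotations`, `rank_by_query`, and `top_annotations` are hypothetical, chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical probabilistic annotations: P(word | image) for each test image.
# A trained annotation model would supply these; the numbers here are
# invented purely for illustration.
annotations: Dict[str, Dict[str, float]] = {
    "frame_001": {"sky": 0.50, "water": 0.30, "people": 0.20},
    "frame_002": {"sky": 0.10, "water": 0.15, "people": 0.75},
    "frame_003": {"sky": 0.35, "water": 0.60, "people": 0.05},
}


def rank_by_query(
    query_word: str, ann: Dict[str, Dict[str, float]]
) -> List[Tuple[str, float]]:
    """Rank images by the probability their annotation assigns to the query word."""
    scored = [(image_id, probs.get(query_word, 0.0)) for image_id, probs in ann.items()]
    # Highest probability first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def top_annotations(
    image_id: str, ann: Dict[str, Dict[str, float]], k: int = 2
) -> List[str]:
    """Return the k most probable annotation words for one image."""
    probs = ann[image_id]
    return sorted(probs, key=lambda word: probs[word], reverse=True)[:k]


if __name__ == "__main__":
    print(rank_by_query("water", annotations))
    print(top_annotations("frame_002", annotations))
```

Under these assumed values, a one-word query simply sorts the images by the probability mass each annotation places on that word, so no hard keyword assignment is needed before retrieval.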

A Scale Space Approach For Automatically Segmenting Words From Historical Handwritten Documents, R. Manmatha, Jamie L. Rothfeder Dec 2003

Many libraries, museums, and other organizations hold large collections of handwritten historical documents, for example, the papers of early presidents like George Washington at the Library of Congress. The first step in providing recognition/retrieval tools is to automatically segment handwritten pages into words. State-of-the-art segmentation techniques like the gap metrics algorithm have mostly been developed and tested on highly constrained documents like bank checks and postal addresses. There has been little work on full handwritten pages, and this work has usually involved testing on clean artificial documents created for the purpose of research. Historical manuscript images, …

Holistic Word Recognition For Handwritten Historical Documents, Victor Lavrenko, Toni M. Rath, R. Manmatha Dec 2003

Most offline handwriting recognition approaches proceed by segmenting words into smaller pieces (usually characters) which are recognized separately. The recognition result of a word is then the composition of the individually recognized parts. Inspired by results in cognitive psychology, researchers have begun to focus on holistic word recognition approaches. Here we present a holistic word recognition approach for single-author historical documents, which is motivated by the fact that for severely degraded documents a segmentation of words into characters will produce very poor results. The quality of the original documents does not allow us to recognize them with high accuracy - …

A Search Engine For Historical Manuscript Images, Toni M. Rath, R. Manmatha, Victor Lavrenko Dec 2003

Many museum and library archives are digitizing their large collections of handwritten historical manuscripts to enable public access to them. These collections are only available in image formats and require expensive manual annotation work for access to them. Current handwriting recognizers have word error rates in excess of 50% and therefore cannot be used for such material. We describe two statistical models for retrieval in large collections of handwritten manuscripts given a text query. Both use a set of transcribed page images to learn a joint probability distribution between features computed from word images and their transcriptions. The models can …

An Inference Network Approach To Image Retrieval, Donald Metzler, R. Manmatha Dec 2003

Most image retrieval systems only allow a fragment of text or an example image as a query. Most users have more complex information needs that are not easily expressed in either of these forms. This paper proposes a model based on the Inference Network framework from information retrieval that employs a powerful query language that allows structured query operators, term weighting, and the combination of text and images within a query. The model uses non-parametric methods to estimate probabilities within the inference network. Image annotation and retrieval results are reported and compared against other published systems and illustrative structured and …

Using Maximum Entropy For Automatic Image Annotation, Jiwoon Jeon, R. Manmatha Dec 2003

In this paper, we propose the use of the Maximum Entropy approach for the task of automatic image annotation. Given labeled training data, Maximum Entropy is a statistical technique which allows one to predict the probability of a label given test data. The technique allows relationships between features to be captured effectively and has been successfully applied to a number of language tasks, including machine translation. In our case, we view the image annotation task as one where a training data set of images labeled with keywords is provided and we need to automatically label the test images with …
