Southern Methodist University

Full-Text Articles in Artificial Intelligence and Robotics

Multi-Modal Classification Using Images And Text, Stuart J. Miller, Justin Howard, Paul Adams, Mel Schwan, Robert Slater Jan 2021

SMU Data Science Review

This paper proposes a method for integrating natural language understanding into image classification, improving classification accuracy by making use of the metadata associated with images. Traditionally, only image features have been used in the classification process, even though metadata accompanies images from many sources. This study implemented a multi-modal image classification model that combines convolutional methods with natural language understanding of descriptions, titles, and tags. The novelty of the approach is that it learns from these additional external features using natural language understanding with transfer learning. It was found that the combination of ResNet-50 image …
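
The abstract does not say how the image and text representations are combined, so the following is only a minimal sketch of one common option, late fusion by concatenation, and not the authors' architecture. The function names (`fuse`, `classify`), the random stand-in feature vectors, the text dimension of 768, and the three-class output are all hypothetical choices made for illustration. The image dimension of 2048 matches the pooled output of a standard ResNet-50.

```python
import math
import random


def fuse(image_vec, text_vec):
    """Late fusion by concatenation: one joint feature vector per example."""
    return list(image_vec) + list(text_vec)


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image_vec, text_vec, weights, bias):
    """Apply a single linear layer to the fused features, then softmax."""
    fused = fuse(image_vec, text_vec)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


random.seed(0)

IMG_DIM = 2048   # pooled feature size of a standard ResNet-50
TXT_DIM = 768    # assumed text-encoder size (hypothetical, not from the paper)
N_CLASSES = 3    # hypothetical number of classes

# Random stand-ins for features that real encoders would produce.
image_vec = [random.gauss(0.0, 1.0) for _ in range(IMG_DIM)]
text_vec = [random.gauss(0.0, 1.0) for _ in range(TXT_DIM)]

# Untrained classifier weights, one row per class.
weights = [
    [random.gauss(0.0, 0.01) for _ in range(IMG_DIM + TXT_DIM)]
    for _ in range(N_CLASSES)
]
bias = [0.0] * N_CLASSES

probs = classify(image_vec, text_vec, weights, bias)
print(len(fuse(image_vec, text_vec)), len(probs))
```

In a real pipeline the two vectors would come from pretrained encoders fine-tuned via transfer learning, and the linear layer would be trained on labeled examples. Concatenation is used here only because it is the simplest way to let one classifier see both modalities.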