Open Access. Powered by Scholars. Published by Universities.®

Theory and Algorithms Commons

2012

Graphics and Human Computer Interfaces

Research Collection School Of Computing and Information Systems

Articles 1 - 1 of 1

Full-Text Articles in Theory and Algorithms

Snap-And-Ask: Answering Multimodal Question By Naming Visual Instance, Wei Zhang, Lei Pang, Chong-Wah Ngo, Nov 2012

In real life, it is easier to provide a visual cue when asking a question about a possibly unfamiliar topic, for example, “Where was this crop circle found?” Providing an image of the instance is far more convenient than typing a verbose description of its visual properties, especially when the name of the query instance is not known. Nevertheless, having to identify the visual instance before processing the question and eventually returning the answer makes multimodal question-answering technically challenging. This paper addresses the problem of visual-to-text naming through the paradigm of answering-by-search in a two-stage computational framework, which …