Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Graphics and Human Computer Interfaces

Singapore Management University

Research Collection School Of Computing and Information Systems

Series

Cross-modal retrieval

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Conquer: Contextual Query-Aware Ranking For Video Corpus Moment Retrieval, Zhijian Hou, Chong-Wah Ngo, W. K. Chan Oct 2021

This paper tackles a recently proposed Video Corpus Moment Retrieval task. This task is essential because advanced video retrieval applications should enable users to retrieve a precise moment from a large video corpus. We propose a novel CONtextual QUery-awarE Ranking (CONQUER) model for effective moment localization and ranking. CONQUER explores query context for multi-modal fusion and representation learning in two different steps. The first step derives fusion weights for the adaptive combination of multi-modal video content. The second step performs bi-directional attention to tightly couple video and query as a single joint representation for moment localization. As query context is …


Cross-Modal Recipe Retrieval With Stacked Attention Model, Jing-Jing Chen, Lei Pang, Chong-Wah Ngo Nov 2018

Taking pictures of delicious food and sharing them on social media has become a popular trend. The ability to recommend recipes alongside such pictures would benefit users who want to cook a particular dish, but this feature is not yet available. The challenge of recipe retrieval comes from two aspects. First, current food recognition technology scales only to a few hundred categories, which is not yet practical for recognizing tens of thousands of food categories. Second, even one food category can have recipe variants that differ in ingredient composition. Finding the best-match recipe …


Cross-Modal Recipe Retrieval With Rich Food Attributes, Jingjing Chen, Chong-Wah Ngo, Tat-Seng Chua Oct 2017

Food is rich in visible (e.g., colour, shape) and procedural (e.g., cutting, cooking) attributes. Proper leveraging of these attributes, particularly the interplay among ingredients, cutting and cooking methods, for health-related applications has not been previously explored. This paper investigates cross-modal retrieval of recipes, specifically retrieving a text-based recipe given a food picture as the query. As similar ingredient compositions can end up as wildly different dishes depending on the cooking and cutting procedures, the difficulty of retrieval originates from fine-grained recognition of rich attributes from pictures. With a multi-task deep learning model, this paper provides insights on the feasibility of …


Cross-Modal Recipe Retrieval: How To Cook This Dish?, Jingjing Chen, Lei Pang, Chong-Wah Ngo Jan 2017

On social media, users like to share food pictures. One intelligent feature, potentially attractive to amateur chefs, is the recommendation of a recipe along with the food. Providing this feature, unfortunately, is still technically challenging. First, current food recognition technology scales only to a few hundred categories, which is not yet practical for recognizing tens of thousands of food categories. Second, even one food category can have recipe variants that differ in ingredient composition. Finding the best-match recipe requires knowledge of ingredients, which is a fine-grained recognition problem. In this paper, we consider the problem from …


Deep Multimodal Learning For Affective Analysis And Retrieval, Lei Pang, Shiai Zhu, Chong-Wah Ngo Nov 2015

Social media has become a convenient platform for voicing opinions through posted messages, ranging from a short tweet to an uploaded media file, or any combination of the two. Understanding the perceived emotions underlying this user-generated content (UGC) could shed light on emerging applications such as advertising and media analytics. Existing research on affective computation is mostly dedicated to a single medium, either text captions or visual content. Few attempts at combined analysis of multiple media have been made, even though emotion can be viewed as an expression of multimodal experience. In this paper, we explore the learning of highly …