Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 5 of 5
Full-Text Articles in Physical Sciences and Mathematics
Conquer: Contextual Query-Aware Ranking For Video Corpus Moment Retrieval, Zhijian Hou, Chong-Wah Ngo, W. K. Chan
Research Collection School Of Computing and Information Systems
This paper tackles a recently proposed Video Corpus Moment Retrieval task. This task is essential because advanced video retrieval applications should enable users to retrieve a precise moment from a large video corpus. We propose a novel CONtextual QUery-awarE Ranking (CONQUER) model for effective moment localization and ranking. CONQUER explores query context for multi-modal fusion and representation learning in two different steps. The first step derives fusion weights for the adaptive combination of multi-modal video content. The second step performs bi-directional attention to tightly couple video and query as a single joint representation for moment localization. As query context is …
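The two steps the abstract describes can be sketched roughly as follows. This is a minimal NumPy illustration of the general idea, not the paper's actual CONQUER implementation: all shapes, weight names, and the exact attention formulation are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Step 1 (illustrative): derive query-conditioned fusion weights to combine
# two modality streams of video content (e.g. appearance and motion features).
def fuse_modalities(query_vec, appearance, motion, w_gate):
    gate = softmax(w_gate @ query_vec)               # (2,) fusion weights
    return gate[0] * appearance + gate[1] * motion   # (T, d) fused clip features

# Step 2 (illustrative): bi-directional attention coupling video clips and
# query tokens into a single joint representation for moment localization.
def bidirectional_attention(video, query):
    sim = video @ query.T                            # (T, L) clip-token similarity
    v2q = softmax(sim, axis=1) @ query               # video-to-query attended features
    q2v = softmax(sim.max(axis=1)) @ video           # query-to-video summary vector
    return np.concatenate([video, v2q, video * v2q, video * q2v], axis=-1)

rng = np.random.default_rng(0)
d, T, L = 8, 5, 4                                    # toy dimensions
fused = fuse_modalities(rng.normal(size=d),
                        rng.normal(size=(T, d)), rng.normal(size=(T, d)),
                        rng.normal(size=(2, d)))
joint = bidirectional_attention(fused, rng.normal(size=(L, d)))
print(joint.shape)  # → (5, 32)
```

The joint representation concatenates the fused clip features with both attention directions, so each clip carries query context for the downstream localization and ranking heads.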
Cross-Modal Recipe Retrieval With Stacked Attention Model, Jing-Jing Chen, Lei Pang, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
Taking a picture of delicious food and sharing it on social media has become a popular trend. The ability to recommend recipes along with such pictures would benefit users who want to cook a particular dish, yet this feature is not available. The challenge of recipe retrieval, nevertheless, comes from two aspects. First, current food-recognition technology can only scale up to a few hundred categories, which is far from practical for recognizing tens of thousands of food categories. Second, even one food category can have variant recipes that differ in ingredient composition. Finding the best-match recipe …
Cross-Modal Recipe Retrieval With Rich Food Attributes, Jingjing Chen, Chong-Wah Ngo, Tat-Seng Chua
Research Collection School Of Computing and Information Systems
Food is rich in visible (e.g., colour, shape) and procedural (e.g., cutting, cooking) attributes. Properly leveraging these attributes, particularly the interplay among ingredients, cutting and cooking methods, for health-related applications has not been previously explored. This paper investigates cross-modal retrieval of recipes, specifically retrieving a text-based recipe given a food picture as the query. As similar ingredient compositions can end up as wildly different dishes depending on the cooking and cutting procedures, the difficulty of retrieval originates from fine-grained recognition of rich attributes from pictures. With a multi-task deep learning model, this paper provides insights on the feasibility of …
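A multi-task layout of the kind the abstract mentions can be sketched as a shared image embedding feeding separate attribute heads. This is an illustrative sketch only, not the paper's architecture: the head names, dimensions, and weights below are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_ingr, n_cut, n_cook = 16, 100, 6, 8             # toy dimensions

shared = rng.normal(size=d)                          # shared visual embedding
W_ingr = rng.normal(size=(n_ingr, d))                # ingredient head (multi-label)
W_cut = rng.normal(size=(n_cut, d))                  # cutting-method head
W_cook = rng.normal(size=(n_cook, d))                # cooking-method head

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Ingredients are independent presence probabilities; cutting and cooking
# methods are each treated as a single-choice distribution here.
ingredients = sigmoid(W_ingr @ shared)
cutting = softmax(W_cut @ shared)
cooking = softmax(W_cook @ shared)
print(ingredients.shape, cutting.shape, cooking.shape)
```

Sharing one embedding across the heads is what lets the interplay among ingredients, cutting and cooking attributes be learned jointly rather than by three separate recognizers.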
Cross-Modal Recipe Retrieval: How To Cook This Dish?, Jingjing Chen, Lei Pang, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
On social media, users like to share food pictures. One intelligent feature, potentially attractive to amateur chefs, is the recommendation of a recipe along with a food picture. Having this feature, unfortunately, is still technically challenging. First, current food-recognition technology can only scale up to a few hundred categories, which is far from practical for recognizing tens of thousands of food categories. Second, even one food category can have variant recipes that differ in ingredient composition. Finding the best-match recipe requires knowledge of ingredients, which is a fine-grained recognition problem. In this paper, we consider the problem from …
Deep Multimodal Learning For Affective Analysis And Retrieval, Lei Pang, Shiai Zhu, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
Social media has been a convenient platform for voicing opinions by posting messages, ranging from tweeting a short text to uploading a media file, or any combination of messages. Understanding the perceived emotions underlying this user-generated content (UGC) could shed light on emerging applications such as advertising and media analytics. Existing research efforts on affective computation are mostly dedicated to a single medium, either text captions or visual content. Few attempts at the combined analysis of multiple media have been made, even though emotion can be viewed as an expression of multimodal experience. In this paper, we explore the learning of highly …