Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Graphics and Human Computer Interfaces

Series

2019

Articles 1 - 30 of 58

Full-Text Articles in Physical Sciences and Mathematics

Image Classification Using Fuzzy FCA, Niruktha Roy Gotoor Dec 2019

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory used for data analysis and knowledge representation. Over the last few decades it has been used for data analysis and ontology in domains such as data mining, machine learning, the semantic web, and the sciences. Various extensions of FCA are being researched to expand its scope to more fields. In this thesis, we review the theory of Formal Concept Analysis (FCA) and its extension, Fuzzy FCA. Many studies have pursued the use of FCA in data mining and text learning. We extend these studies to include …


Improved Generalisation Bounds For Deep Learning Through L∞ Covering Numbers, Antoine Ledent, Yunwen Lei, Marius Kloft Dec 2019

Research Collection School Of Computing and Information Systems

Using proof techniques involving L∞ covering numbers, we show generalisation error bounds for deep learning with two main improvements over the state of the art. First, our bounds have no explicit dependence on the number of classes except for logarithmic factors. This holds even when formulating the bounds in terms of the L2 norm of the weight matrices, while previous bounds exhibit at least a square-root dependence on the number of classes in this case. Second, we adapt the Rademacher analysis of DNNs to incorporate weight sharing, a task of fundamental theoretical importance which was previously attempted only under very …


Improving Medication Information Presentation Through Interactive Visualization In Mobile Apps: Human Factors Design, Don Roosan, Yan Li, Anandi Law, Huy Truong, Mazharul Karim, Jay Chok, Moom Roosan Nov 2019

Pharmacy Faculty Articles and Research

Background: Despite the detailed patient package inserts (PPIs) with prescription drugs that communicate crucial information about safety, there is a critical gap between patient understanding and the knowledge presented. As a result, patients may suffer from adverse events. We propose using human factors design methodologies such as hierarchical task analysis (HTA) and interactive visualization to bridge this gap. We hypothesize that an innovative mobile app employing human factors design with an interactive visualization can deliver PPI information aligned with patients’ information processing heuristics. Such an app may help patients gain an improved overall knowledge of medications.

Objective: The …


Semi-Supervised Entity Alignment Via Joint Knowledge Embedding Model And Cross-Graph Model, Chengjiang Li, Yixin Cao, Lei Hou, Jiaxin Shi, Juanzi Li, Tat-Seng Chua Nov 2019

Research Collection School Of Computing and Information Systems

Entity alignment aims at integrating complementary knowledge graphs (KGs) from different sources or languages, which may benefit many knowledge-driven applications. It is challenging due to the heterogeneity of KGs and limited seed alignments. In this paper, we propose a semi-supervised entity alignment method by joint Knowledge Embedding model and Cross-Graph model (KECG). It can make better use of seed alignments to propagate over the entire graphs with KG-based constraints. Specifically, as for the knowledge embedding model, we utilize TransE to implicitly complete two KGs towards consistency and learn relational constraints between entities. As for the cross-graph model, we extend Graph …
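
As a rough illustration of the TransE component named in the abstract above: TransE treats a relation as a translation in embedding space, so a triple (head, relation, tail) is considered plausible when head + relation lands close to tail. The sketch below is a minimal, assumed version using plain Python lists; the function name `transe_score` and the sample vectors are hypothetical, and it is not the KECG implementation, whose details are not given in this listing.

```python
import math


def transe_score(head, relation, tail):
    """L2 distance between (head + relation) and tail.

    Lower scores mean the triple fits the translation assumption better.
    All three arguments are equal-length sequences of floats (hypothetical
    embeddings, not learned ones).
    """
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Illustrative vectors only: the first triple satisfies head + relation == tail.
good = transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
bad = transe_score([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
print(good, bad)
```

In training, scores like these are typically fed into a margin-based ranking loss that pushes observed triples below corrupted ones; that step is omitted here.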


Gender And Racial Diversity In Commercial Brands' Advertising Images On Social Media, Jisun An, Haewoon Kwak Nov 2019

Research Collection School Of Computing and Information Systems

Gender and racial diversity in the mediated images from the media shape our perception of different demographic groups. In this work, we investigate gender and racial diversity of 85,957 advertising images shared by the 73 top international brands on Instagram and Facebook. We hope that our analyses give guidelines on how to build a fully automated watchdog for gender and racial diversity in online advertisements.


Vireojd-Mm @ TRECVid 2019: Activities In Extended Video (ActEV), Zhijian Hou, Ying-Wei Pan, Ting Yao, Chong-Wah Ngo Nov 2019

Research Collection School Of Computing and Information Systems

In this paper, we describe the system developed for the Activities in Extended Video (ActEV) task at TRECVid 2019 [1] and the achieved results. Activities in Extended Video (ActEV): The goal of Activities in Extended Video is to spatially and temporally localize the action instances in a surveillance setting. We participated in the previous ActEV prize challenge. Since the only difference between the two challenges is the evaluation metric, we keep the previous pipeline [2] for this challenge. The pipeline has three stages: object detection, tubelet generation and temporal action localization. This time we extend the system in two aspects separately: better object detection and …


Shellnet: Efficient Point Cloud Convolutional Neural Networks Using Concentric Shells Statistics, Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung Nov 2019

Research Collection School Of Computing and Information Systems

Deep learning with 3D data has progressed significantly since the introduction of convolutional neural networks that can handle point order ambiguity in point cloud data. While being able to achieve good accuracies in various scene understanding tasks, previous methods often have low training speed and complex network architecture. In this paper, we address these problems by proposing an efficient end-to-end permutation invariant convolution for point cloud deep learning. Our simple yet effective convolution operator named ShellConv uses statistics from concentric spherical shells to define representative features and resolve the point order ambiguity, allowing traditional convolution to perform on such features. …


Special Issue On Multimedia Recommendation And Multi-Modal Data Analysis, Xiangnan He, Zhenguang Liu, Hanwang Zhang, Chong-Wah Ngo, Svebor Karaman, Yongfeng Zhang Nov 2019

Research Collection School Of Computing and Information Systems

Rich multimedia content dominates the Web. On popular social media platforms such as Facebook, Twitter, and Instagram, millions of multimedia items are created by users on a daily basis. Meanwhile, multimedia data consist of data in multiple modalities, such as text, images, audio, and so on. Users are heavily overloaded by the massive multi-modal data, and it becomes critical to explore advanced techniques for heterogeneous big data analytics and multimedia recommendation. Traditional multimedia recommendation and data analysis technologies cannot adequately address the problem of understanding users’ preferences in feature-rich multimedia content, and have …


Revisiting Collaboration Through Mixed Reality: The Evolution Of Groupware, Barrett Ens, Joel Lanir, Anthony Tang, Scott Bateman, Gun Lee, Thammathip Piumsomboon, Mark Billinghurst Nov 2019

Research Collection School Of Computing and Information Systems

Collaborative Mixed Reality (MR) systems are at a critical point in time as they are soon to become more commonplace. However, MR technology has only recently matured to the point where researchers can focus deeply on the nuances of supporting collaboration, rather than needing to focus on creating the enabling technology. In parallel, but largely independently, the field of Computer Supported Cooperative Work (CSCW) has focused on the fundamental concerns that underlie human communication and collaboration over the past 30-plus years. Since MR research is now on the brink of moving into the real world, we reflect on three decades …


Low-Resource Name Tagging Learned With Weakly Labeled Data, Yixin Cao, Zikun Hu, Tat-Seng Chua, Zhiyuan Liu, Heng Ji Nov 2019

Research Collection School Of Computing and Information Systems

Name tagging in low-resource languages or domains suffers from inadequate training data. Existing work relies heavily on additional information, while leaving unexplored the noisy annotations that exist extensively on the web. In this paper, we propose a novel neural model for name tagging solely based on weakly labeled (WL) data, so that it can be applied in any low-resource setting. To take the best advantage of all WL sentences, we split them into high-quality and noisy portions for two modules, respectively: (1) a classification module focusing on the large portion of noisy data can efficiently and robustly pretrain the tag …


EURECOM At TRECVID AVS 2019, Danny Francis, Phuong Anh Nguyen, Benoit Huet, Chong-Wah Ngo Nov 2019

Research Collection School Of Computing and Information Systems

This notebook reports the model and results of the EURECOM runs at TRECVID AVS 2019.


Vireo-EURECOM @ TRECVID 2019: Ad-Hoc Video Search (AVS), Phuong Anh Nguyen, Jiaxin Wu, Chong-Wah Ngo, Francis Danny, Benoit Huet Nov 2019

Research Collection School Of Computing and Information Systems

In this paper, we describe the systems developed for the Ad-hoc Video Search (AVS) task at TRECVID 2019 [1] and the achieved results.


Digital Addiction: A Conceptual Overview, Amarjit Kumar Singh, Pawan Kumar Singh Oct 2019

Library Philosophy and Practice (e-journal)

Digital addiction refers to an impulse control disorder that involves the obsessive use of digital devices, digital technologies, and digital platforms, i.e. the internet, video games, online platforms, mobile devices, digital gadgets, and social network platforms. It is an emerging domain of Cyberpsychology (Singh, Amarjit Kumar and Pawan Kumar Singh; 2019), which explores problematic, obsessive, and excessive usage of digital media, devices, and platforms. This article analyses and reviews the current research and establishes a conceptual overview of digital addiction. The research literature on digital addiction has proliferated. However, we tried to categorise digital addiction according …


Vrsensory: Designing Inclusive Virtual Games With Neurodiverse Children, Ben Wasserman, Derek Prate, Bryce Purnell, Alex Muse, Kaitlyn Abdo, Kendra Day, Louanne Boyd Oct 2019

Engineering Faculty Articles and Research

We explore virtual environments and accompanying interaction styles to enable inclusive play. In designing games for three neurodiverse children, we explore how designing for sensory diversity can be understood through a formal game design framework. Our process reveals that by using sensory processing needs as requirements, we can make play spaces that are both sensorily and socially accessible. We contribute empirical findings for accommodating sensory differences for neurodiverse children in a way that supports inclusive play. Specifically, we detail the sensory-driven design choices that not only support the enjoyability of the leisure activities, but that also support the social inclusion of sensory-diverse …


Fusion Of Multimodal Embeddings For Ad-Hoc Video Search, Danny Francis, Phuong Anh Nguyen, Benoit Huet, Chong-Wah Ngo Oct 2019

Research Collection School Of Computing and Information Systems

The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query description. Bridging the semantic gap between AVS queries and videos becomes highly difficult, as evidenced by the low retrieval accuracy of AVS benchmarking in TRECVID. In this paper, we study a new method to fuse multimodal embeddings which have been derived from completely disjoint datasets. This method is tested on two datasets for two distinct tasks: on MSR-VTT for unique video retrieval and on V3C1 for multiple-video retrieval.


Nonuniform Timeslicing Of Dynamic Graphs Based On Visual Complexity, Yong Wang, Daniel Archambault, Hammad Haleem, Torsten Moeller, Yanhong Wu, Huamin Qu Oct 2019

Research Collection School Of Computing and Information Systems

Uniform timeslicing of dynamic graphs has been used due to its convenience and uniformity across the time dimension. However, uniform timeslicing does not take the data set into account, which can generate cluttered timeslices with edge bursts and empty timeslices with few interactions. The graph mining field has explored nonuniform timeslicing methods specifically designed to preserve graph features for mining tasks. In this paper, we propose a nonuniform timeslicing approach for dynamic graph visualization. Our goal is to create timeslices of equal visual complexity. To this end, we adapt histogram equalization to create timeslices with a similar number of events, …
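
The idea of giving each timeslice a similar number of events can be sketched as follows. This is a minimal illustration that places slice boundaries at equal-count quantiles of the sorted event times, which is the core step of histogram equalization; the function names and sample data are hypothetical, and the paper's actual method (including how it measures visual complexity) may well differ.

```python
from bisect import bisect_left, bisect_right


def equal_count_timeslices(timestamps, num_slices):
    """Return num_slices + 1 boundaries so slices hold similar event counts."""
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    ts = sorted(timestamps)
    if not ts:
        return []
    n = len(ts)
    boundaries = [ts[0]]
    for k in range(1, num_slices):
        # Event sitting at the k/num_slices quantile of the empirical CDF.
        idx = min(n - 1, (k * n) // num_slices)
        boundaries.append(ts[idx])
    boundaries.append(ts[-1])
    return boundaries


def count_events_per_slice(timestamps, boundaries):
    """Count events in each half-open slice [lo, hi); the last slice is closed."""
    ts = sorted(timestamps)
    counts = []
    for i in range(len(boundaries) - 1):
        lo, hi = boundaries[i], boundaries[i + 1]
        is_last = i == len(boundaries) - 2
        start = bisect_left(ts, lo)
        end = bisect_right(ts, hi) if is_last else bisect_left(ts, hi)
        counts.append(end - start)
    return counts


# Hypothetical event times: a quiet start, a burst around t=10, a sparse tail.
events = [0, 1, 2, 3, 10, 10.1, 10.2, 10.3, 50, 60, 70, 80]
bounds = equal_count_timeslices(events, 3)
print(bounds, count_events_per_slice(events, bounds))
```

With uniform slicing over the same range, the burst around t=10 and the quiet start would share one crowded slice, while the sparse tail would be spread thinly over the others. Repeated timestamps can make the counts uneven, which this sketch does not try to handle.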


Reachnn: Reachability Analysis Of Neural-Network Controlled Systems, Chao Huang, Jiameng Fan, Wenchao Li, Xin Chen, Qi Zhu Oct 2019

Computer Science Faculty Publications

Applying neural networks as controllers in dynamical systems has shown great promise. However, it is critical yet challenging to verify the safety of such control systems with neural-network controllers in the loop. Previous methods for verifying neural network controlled systems are limited to a few specific activation functions. In this work, we propose a new reachability analysis approach based on Bernstein polynomials that can verify neural-network controlled systems with a more general form of activation functions, i.e., as long as they ensure that the neural networks are Lipschitz continuous. Specifically, we consider abstracting feedforward neural networks with Bernstein polynomials for …


Mixed-Dish Recognition With Contextual Relation Networks, Lixi Deng, Jingjing Chen, Qianru Sun, Xiangnan He, Sheng Tang, Zhaoyan Ming, Yongdong Zhang, Tat-Seng Chua Oct 2019

Research Collection School Of Computing and Information Systems

Mixed dish is a food category that contains different dishes mixed on one plate, and is popular in Eastern and Southeast Asia. Recognizing individual dishes in a mixed dish image is important for health-related applications, e.g. calculating the nutrition values. However, most existing methods that focus on single dish classification are not applicable to mixed-dish recognition. The new challenges in recognizing mixed-dish images are the complex ingredient combination and severe overlap among different dishes. In order to tackle these problems, we propose a novel approach called contextual relation networks (CR-Nets) that encodes the implicit and explicit contextual relations among …


Rotation Invariant Convolutions For 3D Point Clouds Deep Learning, Zhiyuan Zhang, Binh-Son Hua, David W. Rosen, Sai-Kit Yeung Sep 2019

Research Collection School Of Computing and Information Systems

Recent progress in 3D deep learning has shown that it is possible to design special convolution operators to consume point cloud data. However, a typical drawback is that rotation invariance is often not guaranteed, resulting in networks that generalize poorly to arbitrary rotations. In this paper, we introduce a novel convolution operator for point clouds that achieves rotation invariance. Our core idea is to use low-level rotation invariant geometric features such as distances and angles to design a convolution operator for point cloud learning. The well-known point ordering problem is also addressed by a binning approach seamlessly built into the …
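
To make the notion of a rotation invariant feature concrete, the sketch below computes each point's distance to the cloud centroid and checks that these values are unchanged when the cloud is rotated. It is only an assumed, minimal example: the helper names are hypothetical, the rotation is restricted to the z-axis for brevity, and sorting the distances is a stand-in for (not a reproduction of) the binning approach the abstract mentions.

```python
import math


def centroid(points):
    """Mean of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def distances_to_centroid(points):
    """Sorted point-to-centroid distances.

    Distances do not change under rotation, and sorting removes the
    dependence on the order in which points are listed.
    """
    c = centroid(points)
    return sorted(math.dist(p, c) for p in points)


def rotate_z(points, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y, sin_a * x + cos_a * y, z) for x, y, z in points]


# Hypothetical four-point cloud.
cloud = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
before = distances_to_centroid(cloud)
after = distances_to_centroid(rotate_z(cloud, 0.7))
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after)))
```

Raw coordinates, by contrast, change under the same rotation, which is why a network fed coordinates directly has to learn rotation robustness from data.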


Anticipating Widespread Augmented Reality: Insights From The 2018 AR Visioning Workshop, Gregory F. Welch, Gerd Bruder, Peter Squire, Ryan Schubert Aug 2019

Faculty Scholarship and Creative Works

In August of 2018 a group of academic, government, and industry experts in the field of Augmented Reality gathered for four days to consider potential technological and societal issues and opportunities that could accompany a future where AR is pervasive in location and duration of use. This report is intended to summarize some of the most novel and potentially impactful insights and opportunities identified by the group.

Our target audience includes AR researchers, government leaders, and thought leaders in general. It is our intent to share some compelling technological and societal questions that we believe are unique to AR, and …


Kgat: Knowledge Graph Attention Network For Recommendation, Xiang Wang, Xiangnan He, Yixin Cao, Meng Liu, Tat-Seng Chua Aug 2019

Research Collection School Of Computing and Information Systems

To provide more accurate, diverse, and explainable recommendation, it is necessary to go beyond modeling user-item interactions and take side information into account. Traditional methods like factorization machine (FM) cast it as a supervised learning problem, which treats each interaction as an independent instance with side information encoded. Because they overlook the relations among instances or items (e.g., the director of a movie is also an actor of another movie), these methods are insufficient to distill the collaborative signal from the collective behaviors of users. In this work, we investigate the utility of knowledge graph (KG), which breaks …


Multimodal Transformer Networks For End-To-End Video-Grounded Dialogue Systems, Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi Aug 2019

Research Collection School Of Computing and Information Systems

Developing Video-Grounded Dialogue Systems (VGDS), where a dialogue is conducted based on visual and audio aspects of a given video, is significantly more challenging than traditional image or text-grounded dialogue systems because (1) the feature space of videos spans multiple picture frames, making it difficult to obtain semantic information; and (2) a dialogue agent must perceive and process information from different modalities (audio, video, caption, etc.) to obtain a comprehensive understanding. Most existing work is based on RNNs and sequence-to-sequence architectures, which are not very effective for capturing complex long-term dependencies (as in videos). To overcome this, we propose Multimodal …


Personalized Fashion Recommendation With Visual Explanations Based On Multimodal Attention Network: Towards Visually Explainable Recommendation, Xu Chen, Hanxiong Chen, Hongteng Xu, Yongfeng Zhang, Yixin Cao, Zheng Qin, Hongyuan Zha Jul 2019

Research Collection School Of Computing and Information Systems

Fashion recommendation has attracted increasing attention from both industry and academic communities. This paper proposes a novel neural architecture for fashion recommendation based on both image region-level features and user review information. Our basic intuition is that, for a fashion image, not all the regions are equally important for the users, i.e., people usually care about a few parts of the fashion image. To model this human sense, we learn an attention model over many pre-segmented image regions, based on which we can understand which parts of the image a user is really interested in, and correspondingly, represent the image in …


An Introduction To Declarative Programming In CLIPS And PROLOG, Jack L. Watkin, Adam C. Volk, Saverio Perugini Jul 2019

Computer Science Faculty Publications

We provide a brief introduction to CLIPS, a declarative/logic programming language for implementing expert systems, and PROLOG, a declarative/logic programming language based on first-order predicate calculus. Unlike imperative languages in which the programmer specifies how to compute a solution to a problem, in a declarative language, the programmer specifies what they want to find, and the system uses a search strategy built into the language. We also briefly discuss applications of CLIPS and PROLOG.


Multi-Channel Graph Neural Network For Entity Alignment, Yixin Cao, Zhiyuan Liu, Chengjiang Li, Zhiyuan Liu, Juanzi Li, Tat-Seng Chua Jul 2019

Research Collection School Of Computing and Information Systems

Entity alignment typically suffers from the issues of structural heterogeneity and limited seed alignments. In this paper, we propose a novel Multi-channel Graph Neural Network model (MuGNN) to learn alignment-oriented knowledge graph (KG) embeddings by robustly encoding two KGs via multiple channels. Each channel encodes KGs via different relation weighting schemes with respect to self-attention towards KG completion and cross-KG attention for pruning exclusive entities respectively, which are further combined via pooling techniques. Moreover, we also infer and transfer rule knowledge for completing two KGs consistently. MuGNN is expected to reconcile the structural differences of two KGs, and thus make …


Evaluating The Readability Of Force Directed Graph Layouts: A Deep Learning Approach, Hammad Haleem, Yong Wang, Abishek Puri, Sahil Wadhwa, Huamin Qu Jul 2019

Research Collection School Of Computing and Information Systems

Existing graph layout algorithms are usually not able to optimize all the aesthetic properties desired in a graph layout. To evaluate how well the desired visual features are reflected in a graph layout, many readability metrics have been proposed in the past decades. However, the calculation of these readability metrics often requires access to the node and edge coordinates and is usually computationally inefficient, especially for dense graphs. Importantly, when the node and edge coordinates are not accessible, it becomes impossible to evaluate the graph layouts quantitatively. In this paper, we present a novel deep learning-based approach to evaluate the …


"Flagella Base Model" And "Flagellin Monomer", Brandon Lasalle, Rebecca Roston Jun 2019

3-D Printed Model Structural Files

"Flagella Base Model" and "Flagellin monomer"

Description: This is a teaching model of the proteins that make up a bacterial flagellum. All models are depicted in space-fill. The Flagellin monomer and the Flagella base can slot together to show protein quaternary structure and filamentous protein assembly.

Printable models are already uploaded to Shapeways.com in the MacroMolecules shop under the names "Flagella Base Model" and "Flagellin monomer".

This model has been printed successfully using these parameters on Shapeways’ laser sintering printer in the following material: Processed Versatile Plastic (Strong & Flexible Plastic).

Model designer: Brandon Lasalle
Authors: Brandon Lasalle and Rebecca Roston …


Exploring Object Relation In Mean Teacher For Cross-Domain Detection, Qi Cai, Yingwei Pan, Chong-Wah Ngo, Xinmei Tian, Lingyu Duan, Ting Yao Jun 2019

Research Collection School Of Computing and Information Systems

Rendering synthetic data (e.g., 3D CAD-rendered images) to generate annotations for learning deep models in vision tasks has attracted increasing attention in recent years. However, simply applying the models learnt on synthetic images may lead to high generalization error on real images due to domain shift. To address this issue, recent progress in cross-domain recognition has featured the Mean Teacher, which directly simulates unsupervised domain adaptation as semi-supervised learning. The domain gap is thus naturally bridged with consistency regularization in a teacher-student scheme. In this work, we advance this Mean Teacher paradigm to be applicable for cross-domain detection. Specifically, we …


Mixed Dish Recognition Through Multi-Label Learning, Yunan Wang, Jing-Jing Chen, Chong-Wah Ngo, Tat-Seng Chua, Wanli Zuo, Zhaoyan Ming Jun 2019

Research Collection School Of Computing and Information Systems

Mixed dish recognition, whose goal is to identify each dish type presented on one plate, is generally regarded as a difficult problem. The major challenge of this problem is that different dishes presented on one plate may overlap with each other and there may be no clear boundaries among them. Therefore, labeling the bounding box of each dish type is difficult and does not necessarily lead to good results. This paper studies the problem from the perspective of multi-label learning. Specifically, we propose to perform dish recognition on region level with multiple granularities. For experimental purposes, we collect two …


Dietlens-Eout: Large Scale Restaurant Food Photo Recognition, Zhipeng Wei, Jingjing Chen, Zhaoyan Ming, Chong-Wah Ngo, Tat-Seng Chua, Fengfeng Zhou Jun 2019

Research Collection School Of Computing and Information Systems

Restaurant dishes represent a significant portion of food that people consume in their daily life. While people are becoming health-conscious in their food intake, convenient restaurant food tracking becomes an essential task in wellness and fitness applications. Given the huge number of dishes (food categories) involved, it becomes extremely challenging for traditional food photo classification to be feasible in both algorithm design and training data availability. In this work, we present a demo that runs on restaurant dish images in a city of millions of residents and tens of thousands of restaurants. We propose a rank-loss based convolutional neural network to …