Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network


Articles 1 - 23 of 23

Full-Text Articles in Entire DC Network

Cast2face: Assigning Character Names Onto Faces In Movie With Actor-Character Correspondence, Guangyu Gao, Mengdi Xu, Jialie Shen, Huangdong Ma, Shuicheng Yan Dec 2016


Research Collection School Of Computing and Information Systems

Automatically identifying characters in movies has attracted researchers' interest and led to several significant and interesting applications. However, due to the vast variation in character appearance as well as the weakness and ambiguity of available annotation, it is still a challenging problem. In this paper, we investigate this problem with the supervision of actor-character name correspondence provided by the movie cast. Our proposed framework, namely, Cast2Face, is featured by: 1) we restrict the assigned names within the set of character names in the cast; 2) for each character, by using the corresponding actor and movie name as keywords, we retrieve …


Hashtag Recommendation With Topical Attention-Based Lstm, Yang Li, Ting Liu, Jing Jiang, Liang Zhang Dec 2016


Research Collection School Of Computing and Information Systems

Microblogging services allow users to create hashtags to categorize their posts. In recent years, the task of recommending hashtags for microblogs has been given increasing attention. However, most existing methods depend on hand-crafted features. Motivated by the successful use of long short-term memory (LSTM) for many natural language processing tasks, in this paper, we adopt LSTM to learn the representation of a microblog post. Observing that hashtags indicate the primary topics of microblog posts, we propose a novel attention-based LSTM model which incorporates topic modeling into the LSTM architecture through an attention mechanism. We evaluate our model using a large real-world dataset. Experimental results show that …


Reducing Adaptation Latency For Multi-Concept Visual Perception In Outdoor Environments, Maggie Wigness, John G. Rogers, Luis Ernesto Navarro-Serment, Arne Suppe, Bruce A. Draper Nov 2016


Research Collection School Of Computing and Information Systems

Multi-concept visual classification is emerging as a common environment perception technique, with applications in autonomous mobile robot navigation. Supervised visual classifiers are typically trained with large sets of images, hand annotated by humans with region boundary outlines followed by label assignment. This annotation is time consuming, and unfortunately, a change in environment requires new or additional labeling to adapt visual perception. The time it takes for a human to label new data is what we call adaptation latency. High adaptation latency is not simply undesirable but may be infeasible for scenarios with limited labeling time and resources. In this paper, …


Hierarchical Visualization Of Video Search Results For Topic-Based Browsing, Yu-Gang Jiang, Jiajun Wang, Qiang Wang, Wei Liu, Chong-Wah Ngo Nov 2016


Research Collection School Of Computing and Information Systems

Existing video search engines return a ranked list of videos for each user query, which is not convenient for browsing the results of query topics that have multiple facets, such as the "early life," "personal life," and "presidency" of a query "Barack Obama." Organizing video search results into semantically structured hierarchies with nodes covering different topic facets can significantly improve the browsing efficiency for such queries. In this paper, we introduce a hierarchical visualization approach for video search result browsing, which can help users quickly understand the multiple facets of a query topic in a very well-organized manner. Given a …


Vireo@Trecvid 2016: Multimedia Event Detection, Ad-Hoc Video Search, Video To Text Description, Hao Zhang, Lei Pang, Yi-Jie Lu, Chong-Wah Ngo Nov 2016


Research Collection School Of Computing and Information Systems

The VIREO group participates in three tasks: multimedia event detection, ad-hoc video search, and video-to-text description. In this paper, we present the frameworks for these tasks separately and discuss experimental results.


Deep-Based Ingredient Recognition For Cooking Recipe Retrieval, Jingjing Chen, Chong-Wah Ngo Oct 2016


Research Collection School Of Computing and Information Systems

Retrieving recipes corresponding to given dish pictures facilitates the estimation of nutrition facts, which is crucial to various health-related applications. Current approaches mostly focus on recognizing the food category based on global dish appearance, without explicit analysis of ingredient composition. Such approaches are incapable of retrieving recipes with unknown food categories, a problem referred to as zero-shot retrieval. On the other hand, content-based retrieval without knowledge of food categories also struggles to attain satisfactory performance due to large visual variations in food appearance and ingredient composition. As the number of ingredients is far less than food …


Fast Covariant Vlad For Image Search, Wan-Lei Zhao, Chong-Wah Ngo, Hanzi Wang Sep 2016


Research Collection School Of Computing and Information Systems

Vector of locally aggregated descriptors (VLAD) is a popular image encoding approach owing to its simplicity and better scalability compared with the conventional bag-of-visual-words approach. To enhance its distinctiveness and geometric invariance, covariant VLAD (CVLAD) was proposed to pool local features based on their dominant orientations/characteristic scales, which leads to a geometry-aware representation. This representation achieves rotation/scale invariance when combined with circular matching. However, circular matching incurs several times the computation overhead, which makes CVLAD hardly suitable for large-scale retrieval tasks. In this paper, the issue of computation overhead is alleviated by performing the circular matching in CVLAD's frequency …


Cross-Modal Self-Taught Hashing For Large-Scale Image Retrieval, Liang Xie, Lei Zhu, Peng Pan, Yansheng Lu Jul 2016


Research Collection School Of Computing and Information Systems

Cross-modal hashing integrates the advantages of traditional cross-modal retrieval and hashing; it can solve large-scale cross-modal retrieval effectively and efficiently. However, existing cross-modal hashing methods either rely on labeled training data or lack semantic analysis. In this paper, we propose Cross-Modal Self-Taught Hashing (CMSTH) for large-scale cross-modal and unimodal image retrieval. CMSTH can effectively capture the semantic correlation from unlabeled training data. Its learning process contains three steps: first, we propose Hierarchical Multi-Modal Topic Learning (HMMTL) to detect multi-modal topics with semantic information. Then we use Robust Matrix Factorization (RMF) to transfer the multi-modal topics to hash codes which are …


Automatic Hookworm Detection In Wireless Capsule Endoscopy Images, Xiao Wu, Honghan Chen, Tao Gan, Junzhou Chen, Chong-Wah Ngo, Qiang Peng Jul 2016


Research Collection School Of Computing and Information Systems

Wireless capsule endoscopy (WCE) has become a widely used diagnostic technique to examine inflammatory bowel diseases and disorders. Hookworm, one of the most common human helminths, is a small tubular structure with a grayish white or pinkish semi-transparent body, and it infects 600 million people around the world. Automatic hookworm detection is a challenging task due to the poor quality of images, the presence of extraneous matter, the complex structure of the gastrointestinal tract, and diverse appearances in terms of color and texture. This is one of the first few works to comprehensively explore automatic hookworm detection for WCE images. …


Video Modeling And Learning On Riemannian Manifold For Emotion Recognition In The Wild, Mengyi Liu, Ruiping Wang, Shaoxin Li, Zhiwu Huang, Shiguang Shan, Xilin Chen Jun 2016


Research Collection School Of Computing and Information Systems

In this paper, we present the method for our submission to the emotion recognition in the wild challenge (EmotiW). The challenge is to automatically classify the emotions acted by human subjects in video clips in real-world environments. In our method, each video clip can be represented by three types of image set models (i.e. linear subspace, covariance matrix, and Gaussian distribution) respectively, which can all be viewed as points residing on some Riemannian manifolds. Then different Riemannian kernels are employed on these set models correspondingly for similarity/distance measurement. For classification, three types of classifiers, i.e. kernel SVM, logistic regression, …


Serendipity-Driven Celebrity Video Hyperlinking, Shujun Yang, Lei Pang, Chong-Wah Ngo, Benoit Huet Jun 2016


Research Collection School Of Computing and Information Systems

This demo showcases the utility of video hyperlinks with celebrities as the link anchors and their social circles as targets, aiming to help users quickly explore the aboutness of a celebrity by link traversal. Through content analysis, our system embeds hyperlinks into videos such that users can click-and-jump between celebrity faces in different videos to get-to-know their social circles. One peculiar feature is the ability of the system in providing links that maximize users' chance encounter, or serendipitous experience, beyond information need. Our system is enabled by two key components, name-face association and diversity-based ranking, for the aboutness and serendipity …


Event Detection With Zero Example: Select The Right And Suppress The Wrong Concepts, Yi-Jie Lu, Hao Zhang, Maaike De Boer, Chong-Wah Ngo Jun 2016


Research Collection School Of Computing and Information Systems

Complex video event detection without visual examples is a very challenging issue in multimedia retrieval. We present a state-of-the-art framework for event search without any need for exemplar videos or textual metadata in the search corpus. To perform event search given only query words, the core of our framework is a large, pre-built bank of concept detectors which can understand the content of a video from the perspective of object, scene, action and activity concepts. Leveraging such knowledge can effectively narrow the semantic gap between the textual query and the visual content of videos. Besides the large concept bank, this paper focuses …


Exemplar-Driven Top-Down Saliency Detection Via Deep Association, Shengfeng He, Rynson W. H. Lau, Qingxiong Yang Jun 2016


Research Collection School Of Computing and Information Systems

Top-down saliency detection is a knowledge-driven search task. While some previous methods aim to learn this "knowledge" from category-specific data, others transfer existing annotations in a large dataset through appearance matching. In contrast, we propose in this paper a locate-by-exemplar strategy. This approach is challenging, as we only use a few exemplars (up to 4) and the appearances among the query object and the exemplars can be very different. To address it, we design a two-stage deep model to learn the intra-class association between the exemplars and query objects. The first stage is for learning object-to-object association, and the second …


Efficient 3d Dental Identification Via Signed Feature Histogram And Learning Keypoint Detection, Zhiyuan Zhang, Sim Heng Ong, Xin Zhong, Kelvin W. C. Foong May 2016


Research Collection School Of Computing and Information Systems

Current methods of dental identification are mainly based on 2D dental radiographs which suffer from speed and accuracy limitations. In this paper, we present an efficient dental identification approach based on 3D dental models. We propose a novel shape descriptor, the Signed Feature Histogram (SFH), which is highly discriminative and can be easily computed to describe the local surface. Based on the SFH, a learning keypoint detection method is adopted to accurately detect the desired keypoints on both antemortem (AM) and postmortem (PM) models. For a given PM model, the optimal initial alignment to the AM model to be matched …


An Autonomous Agent For Learning Spatiotemporal Models Of Human Daily Activities, Shan Gao, Ah-Hwee Tan May 2016


Research Collection School Of Computing and Information Systems

Activities of Daily Living (ADLs) refer to activities performed by individuals on a daily basis. As ADLs are indicative of a person’s habits, lifestyle, and well-being, learning people’s ADL routines has great value in the healthcare and consumer domains. In this paper, we propose an autonomous agent, named Agent for Spatia-Temporal Activity Pattern Modeling (ASTAPM), that is able to learn spatial and temporal patterns of human ADLs. ASTAPM utilises a self-organizing neural network model named Spatiotemporal - Adaptive Resonance Theory (ST-ART). ST-ART is capable of integrating multimodal contextual information, involving the time and space, wherein the ADL …


Modeling Human-Like Non-Rationality For Social Agents, Jaroslaw Kochanowicz, Ah-Hwee Tan, Daniel Thalmann May 2016


Research Collection School Of Computing and Information Systems

Humans are not rational beings. Deviations from rationality in human thinking are currently well documented [25] as non-reducible to rational pursuit of egoistic benefit or its occasional distortion with temporary emotional excitation, as it is often assumed. This occurs not only outside conceptual reasoning or rational goal realization but also subconsciously and often in certainty that they did not and could not take place ‘in my case’. Non-rationality can no longer be perceived as a rare affective abnormality in otherwise rational thinking, but as a systemic, permanent quality, ‘a design feature’ of human cognition. While social psychology has systematically addressed …


You Are Being Watched: Bystanders' Perspective On The Use Of Camera Devices In Public Spaces, Samarth Singhal, Carman Neustaedter, Thecla Schiphorst, Anthony Tang, Abhisekh Patra, Rui Pan May 2016


Research Collection School Of Computing and Information Systems

We are observing an increase in the use of smartphones and wearable devices in public places for streaming and recording video. Yet the use of cameras in these devices can infringe upon the privacy of the people in the surrounding environment by inadvertently capturing them. This paper presents findings from an in-situ exploratory study that investigates bystanders' reactions and feelings towards streaming and recording videos with smartphones and wearable glasses in public spaces. We use the interview results to guide an exploration of design directions for mobile video.


Foreword To Special Section On Graphics Interface 2015, Hao Zhang, Anthony Tang Apr 2016


Research Collection School Of Computing and Information Systems

This special section of the Computers & Graphics (C&G) Journal features expanded versions of five of the top graphics and interactions papers [1–5] that were originally presented at Graphics Interface (GI) 2015, which took place in Halifax, Nova Scotia, Canada, between June 3rd and 5th. GI, sponsored by the Canadian Human–Computer Communications Society, is an annual international conference devoted to computer graphics and human–computer interaction (HCI). With a graphics track and an HCI track having equal weights in the conference, GI offers a unique venue for a meeting of minds working on computer graphics and interactive techniques. GI is …


Understanding The Determinants Of Human Computation Game Acceptance: The Effects Of Aesthetic Experience And Output Quality, Xiaohui Wang, Dion Hoe-Lian Goh, Ee-Peng Lim, Wei Liang Adrian Vu Apr 2016


Research Collection School Of Computing and Information Systems

Purpose: Human computation games (HCGs) that blend gaming with utilitarian purposes are a potentially effective channel for content creation. The purpose of this paper is to investigate the driving factors behind players’ adoption of HCGs through a music video tagging game. The effects of perceived aesthetic experience (PAE) and perceived output quality (POQ) on HCG acceptance are empirically examined. Design/methodology/approach: An integrative structural model is developed to explain how hedonic and utilitarian factors, including PAE and POQ, working with another salient factor – perceived usefulness (PU) – affect the acceptance of HCGs. The structural equation modeling method is used to …


Opinion Question Answering By Sentiment Clip Localization, Lei Pang, Chong-Wah Ngo Mar 2016


Research Collection School Of Computing and Information Systems

This article considers multimedia question answering beyond factoid and how-to questions. We are interested in searching videos for answering opinion-oriented questions that are controversial and hotly debated. Examples of questions include "Should Edward Snowden be pardoned?" and "Obamacare-unconstitutional or not?". These questions often invoke emotional response, either positively or negatively, hence are likely to be better answered by videos than texts, due to the vivid display of emotional signals visible through facial expression and speaking tone. Nevertheless, a potential answer of duration 60s may be embedded in a video of 10min, resulting in degraded user experience compared to reading the …


A Tool-Free Calibration Method For Turntable-Based 3d Scanning Systems, Xufang Pang, Rynson W.H. Lau, Zhan Song, Shengfeng He Jan 2016


Research Collection School Of Computing and Information Systems

Turntable-based 3D scanners are popular but require calibration of the turntable axis. Existing methods for turntable calibration typically make use of specially designed tools, such as a chessboard or criterion sphere, which users must manually install and dismount. In this article, the authors propose an automatic method to calibrate the turntable axis without any calibration tools. Given a scan sequence of the input object, they first recover the initial rotation axis from an automatic registration step. Then they apply an iterative procedure to obtain the optimized turntable axis. This iterative procedure alternates between two steps: refining the initial pose of …


Object Pooling For Multimedia Event Detection And Evidence Localization, Ho Zhang, Chong-Wah Ngo Jan 2016


Research Collection School Of Computing and Information Systems

Multimedia event detection (MED) and evidence hunting are two primary topics in the area of multimedia event search. The former serves to retrieve a list of relevant videos given an event query, whereas the latter reasons about why and to what degree a retrieved video answers that query. Common practice deals with these two topics using separate methods; in this paper, however, we combine MED and evidence hunting into a joint framework. We propose a refined semantic representation named object pooling, which can dynamically extract visual snippets corresponding to when and where evidence might appear. The main …


Ambiguityvis: Visualization Of Ambiguity In Graph Layouts, Yong Wang, Qiaomu Shen, Zhiguang Zhou, Min Zhu, Sixiao Yang, Qu Huamin Jan 2016


Research Collection School Of Computing and Information Systems

Node-link diagrams provide an intuitive way to explore networks and have inspired a large number of automated graph layout strategies that optimize aesthetic criteria. However, any particular drawing approach cannot fully satisfy all these criteria simultaneously, producing drawings with visual ambiguities that can impede the understanding of network structure. To bring attention to these potentially problematic areas present in the drawing, this paper presents a technique that highlights common types of visual ambiguities: ambiguous spatial relationships between nodes and edges, visual overlap between community structures, and ambiguity in edge bundling and metanodes. Metrics, including newly proposed metrics for abnormal edge …