Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 9 of 9

Full-Text Articles in Databases and Information Systems

From Footprint To Evidence: An Exploratory Study Of Mining Social Data For Credit Scoring, Guangming Guo, Feida Zhu, Enhong Chen, Qi Liu, Le Wu, Chu Guan Dec 2016

Research Collection School Of Computing and Information Systems

With the booming popularity of online social networks like Twitter and Weibo, online user footprints are accumulating rapidly on the social web. Simultaneously, the question of how to leverage large-scale user-generated social media data for personal credit scoring has come to the attention of both researchers and practitioners. It has also become a topic of great importance and growing interest in the P2P lending industry. However, compared with traditional financial data, heterogeneous social data presents both opportunities and challenges for personal credit scoring. In this article, we seek a deep understanding of how to learn users’ credit labels from social …


Unsupervised Feature Selection For Outlier Detection By Modelling Hierarchical Value-Feature Couplings, Guansong Pang, Longbing Cao, Ling Chen, Huan Liu Dec 2016

Research Collection School Of Computing and Information Systems

Proper feature selection for unsupervised outlier detection can improve detection performance but is very challenging due to complex feature interactions, the mixture of relevant features with noisy/redundant features in imbalanced data, and the unavailability of class labels. Little work has been done on this challenge. This paper proposes a novel Coupled Unsupervised Feature Selection framework (CUFS for short) to filter out noisy or redundant features for subsequent outlier detection in categorical data. CUFS quantifies the outlierness (or relevance) of features by learning and integrating both the feature value couplings and feature couplings. Such value-to-feature couplings capture intrinsic data characteristics and …


Summarization Of Egocentric Videos: A Comprehensive Survey, Ana Garcia Del Molino, Cheston Tan, Joo-Hwee Lim, Ah-Hwee Tan Nov 2016

Research Collection School Of Computing and Information Systems

The introduction of wearable video cameras (e.g., GoPro) in the consumer market has promoted video life-logging, motivating users to generate large amounts of video data. This increasing flow of first-person video has led to a growing need for automatic video summarization adapted to the characteristics and applications of egocentric video. With this paper, we provide the first comprehensive survey of the techniques used specifically to summarize egocentric videos. We present a framework for first-person view summarization and compare the segmentation methods and selection algorithms used by the related work in the literature. Next, we describe the existing egocentric video datasets …


MetaFlow: A Scalable Metadata Lookup Service For Distributed File Systems In Data Centers, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Haiyong Xie Sep 2016

Research Collection School Of Computing and Information Systems

In large-scale distributed file systems, efficient metadata operations are critical since most file operations have to interact with metadata servers first. In existing distributed hash table (DHT) based metadata management systems, the lookup service could be a performance bottleneck due to its significant CPU overhead. Our investigations showed that the lookup service could reduce system throughput by up to 70%, and increase system latency by a factor of up to 8 compared to ideal scenarios. In this paper, we present MetaFlow, a scalable metadata lookup service utilizing software-defined networking (SDN) techniques to distribute lookup workload over network components. MetaFlow tackles …


Outlier Detection In Complex Categorical Data By Modeling The Feature Value Couplings, Guansong Pang, Longbing Cao, Ling Chen Jul 2016

Research Collection School Of Computing and Information Systems

This paper introduces a novel unsupervised outlier detection method, namely Coupled Biased Random Walks (CBRW), for identifying outliers in categorical data with diversified frequency distributions and many noisy features. Existing pattern-based outlier detection methods are ineffective in handling such complex scenarios, as they misfit such data. CBRW estimates outlier scores of feature values by modelling feature value level couplings, which carry intrinsic data characteristics, via biased random walks to handle this complex data. The outlier scores of feature values can either measure the outlierness of an object or facilitate the existing methods as a feature weighting and selection indicator. Substantial …


Poster: Improving Communication And Communicability With Smarter Use Of Text-Based Messages On Mobile And Wearable Devices, Kenny T. W. Choo Jun 2016

Research Collection School Of Computing and Information Systems

While smartphones have undoubtedly afforded many modern conveniences such as email, instant messaging, and web search, their notifications also affect our lives through a deluge of information and through stress arising from the expectation that we should turn our immediate attention to them (e.g., work emails). In my latest research, we find that the glanceability of smartwatches may provide an opportunity to reduce the perceived disruption from mobile notifications. Text is a common medium for communication in smart devices, and the application of natural language processing to text, together with the physical affordances of smartwatches, presents exciting opportunities for research to …


EFSPredictor: Predicting Configuration Bugs With Ensemble Feature Selection, Bowen Xu, David Lo, Xin Xia, Ashish Sureka, Shanping Li May 2016

Research Collection School Of Computing and Information Systems

The configuration of a system determines the system's behavior, and wrong configuration settings can adversely impact the system's availability, performance, and correctness. We refer to these wrong configuration settings as configuration bugs. The importance of configuration bugs has prompted many researchers to study them, and past studies can be grouped into three categories: detection, localization, and fixing of configuration bugs. In this work, we focus on the detection of configuration bugs; in particular, we follow the line of work that tries to predict whether a bug report is caused by a wrong configuration setting. Automatic prediction of whether a bug is a configuration …


Interactive Teachable Cognitive Agents: Smart Building Blocks For Multiagent Systems, Budhitama Subagdja, Ah-Hwee Tan Mar 2016

Research Collection School Of Computing and Information Systems

Developing a complex intelligent system by abstracting its behaviors, functionalities, and reasoning mechanisms can be tedious and time-consuming. In this paper, we present a framework for developing an application or software system based on smart autonomous components that collaborate with the developer or user to realize the entire system. Inspired by teachable approaches and programming-by-demonstration methods in robotics and end-user development, we treat intelligent agents as teachable components that make up the system to be built. Each agent serves different functionalities and may have prebuilt operations to accomplish its own design objectives. However, each agent may also be equipped …


IoT+Small Data: Transforming In-Store Shopping Analytics And Services, Meera Radhakrishnan, Sougata Sen, Vigneshwaran Subbaraju, Archan Misra, Rajesh Balan Jan 2016

Research Collection School Of Computing and Information Systems

We espouse a vision of small data-based immersive retail analytics, where a combination of sensor data, from personal wearable devices and store-deployed sensors & IoT devices, is used to create real-time, individualized services for in-store shoppers. Key challenges include (a) appropriate joint mining of sensor & wearable data to capture a shopper’s product-level interactions, and (b) judicious triggering of power-hungry wearable sensors (e.g., camera) to capture only relevant portions of a shopper’s in-store activities. To explore the feasibility of our vision, we conducted experiments with 5 smartwatch-wearing users who interacted with objects placed on cupboard racks in our lab (to …