Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons


Singapore Management University

2021


Articles 211 - 233 of 233

Full-Text Articles in Databases and Information Systems

Evidence Aware Neural Pornographic Text Identification For Child Protection, Kaisong Song, Yangyang Kang, Wei Gao, Zhe Gao, Changlong Sun, Xiaozhong Liu Feb 2021


Research Collection School Of Computing and Information Systems

Identifying pornographic text online is practically useful to protect children from access to such adult content. However, some authors may intentionally avoid using sensitive words in their pornographic texts to take advantage of the lack of human audits. Without prior knowledge guidance, the real semantics of such pornographic text are difficult for existing methods to understand due to its high context-sensitivity and heavy usage of figurative language, which brings huge challenges to the porn detection systems used in social media platforms. In this paper, we approach the problem as a document-level porn identification task by locating and integrating sentence-level evidence …


An Exploratory Study On The Introduction And Removal Of Different Types Of Technical Debt In Deep Learning Frameworks, Jiakun Liu, Qiao Huang, Xin Xia, Emad Shihab, David Lo, Shanping Li Feb 2021


Research Collection School Of Computing and Information Systems

To complete tasks faster, developers often have to sacrifice the quality of the software. Such compromises increase the burden on developers in future development. The metaphor of technical debt describes this practice. Prior research has illustrated the negative impact of technical debt, and many researchers have investigated how developers deal with a certain type of technical debt. However, few studies have focused on the removal of different types of technical debt in practice. To fill this gap, we use the introduction and removal of different types of self-admitted technical debt (i.e., SATD) in 7 deep learning frameworks as an example. …


Evoking Empathy: A Framework For Describing Empathy Tools, Sydney Pratte, Anthony Tang, Lora Oehlberg Feb 2021


Research Collection School Of Computing and Information Systems

Empathy tools are experiences designed to evoke empathetic responses by placing the user in another’s lived and felt experience. The problem is that designers do not have a common vocabulary to describe empathy tool experiences; consequently, it is difficult to compare/contrast empathy tool designs or to think about their efficacy. To address this problem, we analyzed 26 publications on empathy tools to develop a descriptive framework for designers of empathy tools. Based on our analysis, we found that empathy tools can be described along three dimensions: (i) the amount of agency the tool allows, (ii) the user’s perspective while using …


Facial Emotion Recognition With Noisy Multi-Task Annotations, S. Zhang, Zhiwu Huang, D.P. Paudel, L. Van Gool Jan 2021


Research Collection School Of Computing and Information Systems

Human emotions can be inferred from facial expressions. However, the annotations of facial expressions are often highly noisy in common emotion coding models, including categorical and dimensional ones. To reduce human labelling effort on multi-task labels, we introduce a new problem of facial emotion recognition with noisy multi-task annotations. For this new problem, we suggest a formulation from the viewpoint of joint distribution matching, which aims at learning more reliable correlations among raw facial images and multi-task labels, resulting in the reduction of noise influence. In our formulation, we exploit a new method to enable the emotion prediction and …


Why My Code Summarization Model Does Not Work: Code Comment Improvement With Category Prediction, Qiuyuan Chen, Xin Xia, Han Hu, David Lo, Shanping Li Jan 2021


Research Collection School Of Computing and Information Systems

Code summarization aims at generating a code comment given a block of source code, and it is normally performed by training machine learning algorithms on existing code block-comment pairs. Code comments in practice have different intentions. For example, some code comments might explain how the methods work, while others explain why some methods are written. Previous works have shown that a relationship exists between a code block and the category of a comment associated with it. In this article, we aim to investigate to what extent we can exploit this relationship to improve code summarization performance. We first classify comments …


Adversarial Specification Mining, Hong Jin Kang, David Lo Jan 2021


Research Collection School Of Computing and Information Systems

There have been numerous studies on mining temporal specifications from execution traces. These approaches learn finite-state automata (FSA) from execution traces when running tests. To learn accurate specifications of a software system, many tests are required. Existing approaches generalize from a limited number of traces or use simple test generation strategies. Unfortunately, these strategies may not exercise uncommon usage patterns of a software system. To address this problem, we propose a new approach, adversarial specification mining, and develop a prototype, DICE (Diversity through Counter-Examples). DICE has two components: DICE-Tester and DICE-Miner. After mining Linear Temporal Logic specifications from an input …
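
The idea of learning a finite-state automaton from execution traces can be illustrated with a deliberately naive sketch. It is not DICE and not any miner evaluated in the paper: it assumes each state is simply the last event observed, and every name in it (`mine_automaton`, `accepts`, `START`) is hypothetical. It also shows why a small set of traces is limiting, since any ordering of events that no test exercised is rejected.

```python
from typing import Dict, Iterable, List, Set

START = "<start>"  # hypothetical marker for the initial state


def mine_automaton(traces: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Record which event was observed to follow which state.

    Assumption for this sketch: the state is the most recent event,
    so the automaton only remembers one step of history.
    """
    transitions: Dict[str, Set[str]] = {}
    for trace in traces:
        state = START
        for event in trace:
            transitions.setdefault(state, set()).add(event)
            state = event
    return transitions


def accepts(transitions: Dict[str, Set[str]], trace: List[str]) -> bool:
    """Return True if every step of the trace was seen during mining."""
    state = START
    for event in trace:
        if event not in transitions.get(state, set()):
            return False
        state = event
    return True


# Two traces collected from hypothetical test runs of a file API.
observed = [
    ["open", "read", "close"],
    ["open", "write", "close"],
]
automaton = mine_automaton(observed)
```

With only these two traces, `accepts(automaton, ["open", "read", "write", "close"])` is false even though the usage may be legal, because no test exercised a read followed by a write.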


Sustainability Of Rewards-Based Crowdfunding: A Quasi-Experimental Analysis Of Funding Targets And Backer Satisfaction, Michael Wessel, Rob Gleasure, Robert John Kauffman Jan 2021


Research Collection School Of Computing and Information Systems

Rewards-based crowdfunding presents an information asymmetry for participants due to the funding mechanism used. Campaign-backers trust creators to complete projects and deliver rewards as outlined prior to the fundraising process, but creators may discover better opportunities as they progress with a project. Despite this, the all-or-nothing (AON) mechanism on crowdfunding platforms incentivizes creators to set meager funding-targets that are easier to achieve but may offer limited slack when creators wish to simultaneously pursue emerging opportunities later in the project. We explore the related issues of how funding targets seem to be selected by the creators, and how dissatisfaction with the …


Unsupervised Representation Learning By Predicting Random Distances, Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma Jan 2021


Research Collection School Of Computing and Information Systems

Deep neural networks have gained great success in a broad range of tasks due to their remarkable capability to learn semantically rich features from high-dimensional data. However, they often require large-scale labelled data to successfully learn such features, which significantly hinders their adoption in unsupervised learning tasks, such as anomaly detection and clustering, and limits their applications to critical domains where obtaining massive labelled data is prohibitively expensive. To enable unsupervised learning on those domains, in this work we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected …
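
The label-free supervisory signal described here, distances in a randomly projected space, can be sketched as follows. This is only an illustration under stated assumptions: a Gaussian random matrix, Euclidean distance, and hypothetical helper names. It computes the distance targets and omits the neural network that would be trained to predict them.

```python
import random
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def random_projection(dim_in: int, dim_out: int, seed: int = 0) -> Matrix:
    """Draw a fixed Gaussian projection matrix (an assumption of this sketch)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(dim_out)]


def project(matrix: Matrix, x: Sequence[float]) -> List[float]:
    """Map a data point into the randomly projected space."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def projected_distance(matrix: Matrix, a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points after random projection."""
    pa, pb = project(matrix, a), project(matrix, b)
    return sum((u - v) ** 2 for u, v in zip(pa, pb)) ** 0.5


def distance_targets(data: Sequence[Sequence[float]], matrix: Matrix) -> Dict[Tuple[int, int], float]:
    """Build one regression target per unordered pair of data points."""
    return {
        (i, j): projected_distance(matrix, data[i], data[j])
        for i in range(len(data))
        for j in range(i + 1, len(data))
    }


# Unlabelled toy data; the targets need no human annotation.
data = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [5.0, -1.0, 3.0]]
projection = random_projection(dim_in=3, dim_out=2, seed=42)
targets = distance_targets(data, projection)
```

A network trained on such pairs would take two inputs and regress the corresponding entry of `targets`; how the paper structures that network and loss is not shown in the truncated abstract.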


Smart Contracts: Will Fintech Be The Catalyst For The Next Global Financial Crisis?, Randall Duran, Paul Griffin Jan 2021


Research Collection School Of Computing and Information Systems

Purpose: This paper aims to examine the risks associated with smart contracts, a disruptive financial technology (FinTech) innovation, and assess how in the future they could threaten the integrity of the global financial system. Design/methodology/approach: A qualitative approach is used to identify risk factors related to the use of new financial innovations, by examining how over-the-counter (OTC) derivatives contributed to the Global Financial Crisis (GFC) which occurred during 2007 and 2008. Based on this analysis, the potential for similar concerns with smart contracts is evaluated, drawing on the failure of The DAO on the Ethereum blockchain, which involved the loss …


Novel Techniques In Recovering, Embedding, And Enforcing Policies For Control-Flow Integrity, Yan Lin Jan 2021


Dissertations and Theses Collection (Open Access)

Control-Flow Integrity (CFI) is an attractive security property with which most injected and code-reuse attacks can be defeated, including advanced attacking techniques like Return-Oriented Programming (ROP). CFI extracts a control-flow graph (CFG) for a given program and instruments the program to respect the CFG. Specifically, checks are inserted before indirect branch instructions. Before these instructions are executed during runtime, the checks consult the CFG to ensure that the indirect branch is allowed to reach the intended target. Hence, any sort of control-flow hijacking would be prevented. There are three fundamental components in CFI enforcement. The first component is accurately recovering …
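
The check-before-indirect-branch idea can be sketched at a high level. Real CFI instruments machine code; the Python below is only an analogy, and the call-site label, the `ALLOWED_TARGETS` table, and all function names are hypothetical rather than taken from the dissertation.

```python
from typing import Callable, Dict, FrozenSet


class ControlFlowViolation(Exception):
    """Raised when an indirect branch targets a function the CFG does not allow."""


def greet() -> str:
    return "hello"


def farewell() -> str:
    return "goodbye"


def dangerous() -> str:
    return "must not be reachable from site_a"


# Hypothetical recovered CFG: each indirect call site maps to its permitted targets.
ALLOWED_TARGETS: Dict[str, FrozenSet[str]] = {
    "site_a": frozenset({"greet", "farewell"}),
}


def is_allowed(site: str, target: Callable[[], str]) -> bool:
    """Consult the CFG: may this call site branch to this target?"""
    return target.__name__ in ALLOWED_TARGETS.get(site, frozenset())


def checked_indirect_call(site: str, target: Callable[[], str]) -> str:
    """Stand-in for an instrumented indirect branch: check first, then transfer control."""
    if not is_allowed(site, target):
        raise ControlFlowViolation(f"{site} may not branch to {target.__name__}")
    return target()
```

Calling `checked_indirect_call("site_a", greet)` proceeds normally, while passing `dangerous` raises `ControlFlowViolation` before the target runs, which mirrors how a hijacked branch would be stopped.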


Attribute-Aware Pedestrian Detection In A Crowd, Jialiang Zhang, Lixiang Lin, Jianke Zhu, Yang Li, Yun-Chen Chen, Yao Hu, Steven C. H. Hoi Jan 2021


Research Collection School Of Computing and Information Systems

Pedestrian detection is an initial step to perform outdoor scene analysis, which plays an essential role in many real-world applications. Although it has enjoyed the merits of deep learning frameworks from the generic object detectors, pedestrian detection is still a very challenging task due to heavy occlusions and highly crowded groups. Generally, the conventional detectors are unable to differentiate individuals from each other effectively under such a dense environment. To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion. Besides the typical semantic features, center position, target's scale, …


Learning ADL Daily Routines With Spatiotemporal Neural Networks, Shan Gao, Ah-Hwee Tan, Rossi Setchi Jan 2021


Research Collection School Of Computing and Information Systems

The activities of daily living (ADLs) refer to the activities performed by individuals on a daily basis and are the indicators of a person’s habits, lifestyle, and wellbeing. Learning an individual’s ADL daily routines has significant value in the healthcare domain. Specifically, ADL recognition and inter-ADL pattern learning problems have been studied extensively in the past couple of decades. However, discovering the patterns performed in a day and clustering them into ADL daily routines has been a relatively unexplored research area. In this paper, a self-organizing neural network model, called the Spatiotemporal ADL Adaptive Resonance Theory (STADLART), is proposed for …


Coherence And Identity Learning For Arbitrary-Length Face Video Generation, Shuquan Ye, Chu Han, Jiaying Lin, Guoqiang Han, Shengfeng He Jan 2021


Research Collection School Of Computing and Information Systems

Face synthesis is an interesting yet challenging task in computer vision. It is much harder to generate a portrait video than a single image. In this paper, we propose a novel video generation framework for synthesizing arbitrary-length face videos without any face exemplar or landmark. To overcome the synthesis ambiguity of face video, we propose a divide-and-conquer strategy to separately address the video face synthesis problem from two aspects: face identity synthesis and rearrangement. To this end, we design a cascaded network which contains three components: Identity-aware GAN (IA-GAN), Face Coherence Network, and Interpolation Network. IA-GAN is proposed to …


Technical Q&A Site Answer Recommendation Via Question Boosting, Zhipeng Gao, Xin Xia, David Lo, John Grundy Jan 2021


Research Collection School Of Computing and Information Systems

Software developers have heavily used online question and answer platforms to seek help to solve their technical problems. However, a major problem with these technical Q&A sites is "answer hungriness", i.e., a large number of questions remain unanswered or unresolved, and users have to wait for a long time or painstakingly go through the provided answers with various levels of quality. To alleviate this time-consuming problem, we propose a novel DeepAns neural network-based approach to identify the most relevant answer among a set of answer candidates. Our approach follows a three-stage process: question boosting, label establishment, and answer recommendation. Given …


Creators And Backers In Rewards-Based Crowdfunding: Will Incentive Misalignment Affect Kickstarter's Sustainability?, Michael Wessel, Rob Gleasure, Robert John Kauffman Jan 2021


Research Collection School Of Computing and Information Systems

Incentive misalignment in rewards-based crowdfunding occurs because creators may benefit disproportionately from fundraising, while backers may benefit disproportionately from the quality of project deliverables. The resulting principal-agent relationship means backers rely on campaign information to identify signs of moral hazard, adverse selection, and risk attitude asymmetry. We analyze campaign information related to fundraising, and compare how different information affects eventual backer satisfaction, based on an extensive dataset from Kickstarter. The data analysis uses a multi-model comparison to reveal similarities and contrasts in the estimated drivers of dependent variables that capture different outcomes in Kickstarter’s funding campaigns, using a linear probability …


Analyzing Tweets On New Norm: Work From Home During COVID-19 Outbreak, Swapna Gottipati, Kyong Jin Shim, Hui Hian Teo, Karthik Nityanand, Shreyansh Shivam Jan 2021


Research Collection School Of Computing and Information Systems

The COVID-19 pandemic triggered a large-scale work-from-home trend globally in recent months. In this paper, we study the phenomenon of “work-from-home” (WFH) by performing social listening. We propose an analytics pipeline designed to crawl social media data and perform text mining analyses on textual data from tweets scraped based on hashtags related to WFH in the COVID-19 situation. We apply text mining and NLP techniques to analyze the tweets for extracting the WFH themes and sentiments (positive and negative). Our Twitter theme analysis adds further value by summarizing the common key topics, allowing employers to gain more insights on areas of …


Proxy-Free Privacy-Preserving Task Matching With Efficient Revocation In Crowdsourcing, Jiangang Shu, Kan Yang, Xiaohua Jia, Ximeng Liu, Cong Wang, Robert H. Deng Jan 2021


Research Collection School Of Computing and Information Systems

Task matching in crowdsourcing has been extensively explored with the increasing popularity of crowdsourcing. However, privacy of tasks and workers is usually ignored in most existing solutions. In this paper, we study the problem of privacy-preserving task matching for crowdsourcing with multiple requesters and multiple workers. Instead of utilizing proxy re-encryption, we propose a proxy-free task matching scheme for multi-requester/multi-worker crowdsourcing, which achieves task-worker matching over encrypted data with scalability and non-interaction. We further design two different mechanisms for worker revocation including Server-Local Revocation (SLR) and Global Revocation (GR), which realize efficient worker revocation with minimal overhead on the …


Deep Unsupervised Anomaly Detection, Tangqing Li, Zheng Wang, Siying Liu, Wen-Yan Lin Jan 2021


Research Collection School Of Computing and Information Systems

This paper proposes a novel method to detect anomalies in large datasets under a fully unsupervised setting. The key idea behind our algorithm is to learn the representation underlying normal data. To this end, we leverage the latest clustering technique suitable for handling high dimensional data. This hypothesis provides a reliable starting point for normal data selection. We train an autoencoder from the normal data subset, and iterate between hypothesizing a normal candidate subset based on clustering and representation learning. The reconstruction error from the learned autoencoder serves as a scoring function to assess the normality of the data. Experimental results …
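
Using reconstruction error as a normality score can be illustrated with a small stand-in. The sketch below swaps the deep autoencoder for a one-dimensional linear bottleneck fitted by power iteration, and it assumes the normal subset has already been chosen; the clustering step and the iteration between the two are not shown. All function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def fit_linear_bottleneck(normal: Sequence[Sequence[float]],
                          iters: int = 100, seed: int = 0) -> Tuple[Vector, Vector]:
    """Fit a mean and one unit direction to the hypothesised normal subset.

    This linear, one-dimensional bottleneck is a stand-in for an autoencoder.
    """
    dim = len(normal[0])
    mean = [sum(x[d] for x in normal) / len(normal) for d in range(dim)]
    centred = [[x[d] - mean[d] for d in range(dim)] for x in normal]
    rng = random.Random(seed)
    start = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    start_norm = math.sqrt(sum(s * s for s in start))
    direction = [s / start_norm for s in start]
    for _ in range(iters):
        # Power-iteration step: apply the unnormalised covariance to the direction.
        updated = [0.0] * dim
        for x in centred:
            coeff = sum(a * b for a, b in zip(x, direction))
            for d in range(dim):
                updated[d] += coeff * x[d]
        norm = math.sqrt(sum(u * u for u in updated))
        if norm == 0.0:
            break  # degenerate data; keep the current unit direction
        direction = [u / norm for u in updated]
    return mean, direction


def anomaly_score(mean: Vector, direction: Vector, x: Sequence[float]) -> float:
    """Reconstruction error: distance between x and its encode-decode image."""
    centred = [v - m for v, m in zip(x, mean)]
    code = sum(c * d for c, d in zip(centred, direction))   # encode
    reconstructed = [code * d for d in direction]           # decode
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(centred, reconstructed)))


# Hypothesised normal subset lying on a line, plus one off-line test point.
normal_subset = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 5.0]]
mean, direction = fit_linear_bottleneck(normal_subset)
score_normal = anomaly_score(mean, direction, [4.0, 4.0])
score_outlier = anomaly_score(mean, direction, [3.0, -3.0])
```

Points that follow the structure of the normal subset reconstruct almost perfectly and score near zero, while the off-line point scores much higher.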


Context-Aware Retrieval-Based Deep Commit Message Generation, Haoye Wang, Xin Xia, David Lo, Qiang He, Xinyu Wang, John Grundy Jan 2021


Research Collection School Of Computing and Information Systems

Commit messages recorded in version control systems contain valuable information for software development, maintenance, and comprehension. Unfortunately, developers often commit code with empty or poor-quality commit messages. To address this issue, several studies have proposed approaches to generate commit messages from commit diffs. Recent studies make use of neural machine translation algorithms to try and translate git diffs into commit messages and have achieved some promising results. However, these learning-based methods tend to generate high-frequency words but ignore low-frequency ones. In addition, they suffer from exposure bias issues, which lead to a gap between the training phase and the testing phase. …


Do Blockchain And IoT Architecture Create Informedness To Support Provenance Tracking In The Product Lifecycle?, Somnath Mazumdar, Thomas Jensen, Raghava Rao Mukkamala, Robert John Kauffman, Jan Damsgaard Jan 2021


Research Collection School Of Computing and Information Systems

Consumers often lack information about the origin and provenance of the products they buy. They may ask: Is a food product truly organic? Or, what is the origin of the gemstone in the ring I purchased? They also may have sustainability concerns about the footprint of a product at the end of its life. Producers and sellers, meanwhile, wish to know how longitudinal tracking of the provenance of products and their components can boost their sales prices and after-market value, and reveal new business opportunities. We focus on how the product lifecycle (PLC) can be leveraged to track information …


A Continual Deepfake Detection Benchmark: Dataset, Methods, And Essentials, Chuqiao Li, Zhiwu Huang, Danda Pani Paudel, Yabin Wang, Mohamad Shahbazi, Xiaopeng Hong, Luc Van Gool Jan 2021


Research Collection School Of Computing and Information Systems

A number of benchmarks and techniques have emerged for the detection of deepfakes. However, very few works study the detection of incrementally appearing deepfakes in real-world scenarios. To simulate the wild scenes, this paper suggests a continual deepfake detection benchmark (CDDB) over a new collection of deepfakes from both known and unknown generative models. The suggested CDDB designs multiple evaluations on the detection over easy, hard, and long sequences of deepfake tasks, with a set of appropriate measures. In addition, we exploit multiple approaches to adapt multiclass incremental learning methods, commonly used in the continual visual recognition, …


The Value Of Humanization In Customer Service, Yang Gao, Huaxia Rui, Shujing Sun Jan 2021


Research Collection School Of Computing and Information Systems

As algorithm-based agents become increasingly capable of handling customer service queries, customers are often uncertain whether they are served by humans or algorithms, and managers are left to question the value of human agents once the technology matures. The current paper studies this question by quantifying the impact of customers' enhanced perception of being served by human agents on customer service interactions. Our identification strategy hinges on the abrupt implementation by Southwest Airlines of a signature policy, which requires the inclusion of an agent's first name in responses on Twitter, thereby making the agent more humanized in the eyes of …


Chronic Customers Or Increased Awareness? The Dynamics Of Social Media Customer Service, Shujing Sun, Yang Gao, Huaxia Rui Jan 2021


Research Collection School Of Computing and Information Systems

Although social media has become a promising alternative to traditional call centers, managers hesitate to fully harness its power because they worry that active service intervention may encourage excessive use of the channel by disgruntled customers. This paper sheds light on such a concern by examining the dynamics between brand-level customer complaints and service interventions on social media. Using details of customer-brand interactions of 40 airlines on Twitter, we find that more service interventions indeed cause more customer complaints, accounting for the online customer population and service quality. However, the increased complaints are primarily driven by the awareness enhancement …