Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 31

Full-Text Articles in Physical Sciences and Mathematics

A Layered Hidden Markov Model For Predicting Human Trajectories In A Multi-Floor Building, Qian Li, Hoong Chuin Lau Dec 2015

Research Collection School Of Computing and Information Systems

Tracking and modeling the movement of a large number of users in a multi-floor building with wireless devices is a challenging task, due to the complexity of crowd movement and signal sensing accuracy. In this paper, we use a Layered Hidden Markov Model (LHMM) to fit spatial-temporal trajectories that contain a large number of missing values. We decompose the problem into distinct layers in which Hidden Markov Models (HMMs) operate separately at different spatial granularities. The Baum-Welch and Viterbi algorithms are used to find the probable location sequences at each layer. By measuring the predicted result of trajectories, we compared the predicted results of both single …
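As a rough illustration of the Viterbi decoding step named in the abstract above, the following minimal sketch recovers the most probable hidden location sequence from a sequence of sensor readings for a single, flat HMM. It is not the authors' layered model and does not handle missing values; the state names, observation symbols, and probabilities are hypothetical and chosen only for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor r that maximizes the path probability into s.
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Hypothetical toy model: two floors, three kinds of access-point readings.
states = ["floor1", "floor2"]
start_p = {"floor1": 0.6, "floor2": 0.4}
trans_p = {
    "floor1": {"floor1": 0.7, "floor2": 0.3},
    "floor2": {"floor1": 0.4, "floor2": 0.6},
}
emit_p = {
    "floor1": {"ap_a": 0.5, "ap_b": 0.4, "ap_c": 0.1},
    "floor2": {"ap_a": 0.1, "ap_b": 0.3, "ap_c": 0.6},
}
decoded = viterbi(["ap_a", "ap_b", "ap_c"], states, start_p, trans_p, emit_p)
print(decoded)
```

In a layered setting one would presumably run such a decoder once per spatial granularity, but how the layers are coupled is not described in the truncated abstract.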


Active Crowdsourcing For Annotation, Shuji Hao, Chunyan Miao, Steven C. H. Hoi, Peilin Zhao Dec 2015

Research Collection School Of Computing and Information Systems

Crowdsourcing has shown great potential in obtaining large-scale and cheap labels for different tasks. However, obtaining reliable labels is challenging for several reasons, such as noisy annotators and limited budgets. State-of-the-art approaches either suffer in some noisy scenarios or rely on unlimited resources to acquire reliable labels. In this article, we adopt the learning with expert (AKA worker in crowdsourcing) advice framework to robustly infer accurate labels by considering the reliability of each worker. However, in order to accurately predict the reliability of each worker, traditional learning with expert advice will consult with external oracles (AKA domain experts) …


Incorporating Analytics Into A Business Process Modelling Course, Swapna Gottipati, Venky Shankararaman Dec 2015

Research Collection School Of Computing and Information Systems

Embedding analytics is about integrating data analytics into operational systems that are part of an organization’s business processes. Currently, most organizations focus on automating business processes and enhancing productivity. However, going forward, in order to stay competitive, organizations have to go beyond automating their processes and make them more intelligent by embedding analytics into their processes and business applications. Therefore, there is a need to enhance the knowledge and skills of BPM professionals with know-how on improving a business process by embedding analytics into the workflow. In this paper, the authors share their experience on how an existing process modelling, …


Building Crowd Movement Model Using Sample-Based Mobility Survey, Larry J. J. Lin, Shih-Fen Cheng, Hoong Chuin Lau Dec 2015

Research Collection School Of Computing and Information Systems

Crowd simulation is a well-studied topic, yet it usually focuses on visualization. In this paper, we study a special class of crowd simulation, where individual agents have diverse backgrounds, ad hoc objectives, and non-repeating visits. Such crowd simulation is particularly useful when modeling human agents' movement in leisure settings such as visiting museums or theme parks. In these settings, we are interested in accurately estimating aggregate crowd-related movement statistics. As comprehensive monitoring is usually not feasible for a large crowd, we propose to conduct mobility surveys on only a small group of sampled individuals. We demonstrate via simulation that we …


Whom Should We Sense In 'Social Sensing' - Analyzing Which Users Work Best For Social Media Now-Casting, Jisun An, Ingmar Weber Nov 2015

Research Collection School Of Computing and Information Systems

Given the ever increasing amount of publicly available social media data, there is growing interest in using online data to study and quantify phenomena in the offline 'real' world. As social media data can be obtained in near real-time and at low cost, it is often used for 'now-casting' indices such as levels of flu activity or unemployment. The term 'social sensing' is often used in this context to describe the idea that users act as 'sensors', publicly reporting their health status or job losses. Sensor activity during a time period is then typically aggregated in a 'one tweet, one …


Using Digital Genomics To Create An Intelligent Enterprise, Mario Domingo Nov 2015

Asian Management Insights

Every business knows that it needs to leverage customer data, but few know the potential it has to transform business processes, decisions and performance.


Scheduled Approximation For Personalized PageRank With Utility-Based Hub Selection, Fanwei Zhu, Yuan Fang, Kevin Chen-Chuan Chang, Jing Ying Oct 2015

Research Collection School Of Computing and Information Systems

As Personalized PageRank has been widely leveraged for ranking on a graph, the efficient computation of Personalized PageRank Vector (PPV) becomes a prominent issue. In this paper, we propose FastPPV, an approximate PPV computation algorithm that is incremental and accuracy-aware. Our approach hinges on a novel paradigm of scheduled approximation: the computation is partitioned and scheduled for processing in an “organized” way, such that we can gradually improve our PPV estimation in an incremental manner and quantify the accuracy of our approximation at query time. Guided by this principle, we develop an efficient hub-based realization, where we adopt the metric …
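For orientation, a Personalized PageRank Vector can be computed exactly (up to convergence) by plain power iteration, which is the baseline that approximate methods aim to speed up. The sketch below is that baseline only, not FastPPV or its hub-based scheduling; the function name, the teleport probability `alpha`, and the toy graph are assumptions made for illustration.

```python
def personalized_pagerank(graph, source, alpha=0.15, iterations=100):
    """Power iteration for the Personalized PageRank Vector of `source`.

    `graph` maps every node to a list of its out-neighbors (every node must
    appear as a key). `alpha` is the assumed teleport probability.
    """
    nodes = list(graph)
    rank = {n: 0.0 for n in nodes}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        # With probability alpha the random walk teleports back to the source.
        nxt[source] += alpha
        for n in nodes:
            out = graph[n]
            if out:
                share = (1 - alpha) * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its remaining mass back to the source.
                nxt[source] += (1 - alpha) * rank[n]
        rank = nxt
    return rank


# Hypothetical three-node cycle: a -> b -> c -> a.
toy_graph = {"a": ["b"], "b": ["c"], "c": ["a"]}
ppv = personalized_pagerank(toy_graph, "a")
print(sorted(ppv.items(), key=lambda kv: -kv[1]))
```

Each iteration touches every edge once, which is why incremental or scheduled approximations become attractive on large graphs.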


The Importance Of Being Isolated: An Empirical Study On Chromium Reviews, Subhajit Datta, Devarshi Bhatt, Manish Jain, Proshanta Sarkar, Santonu Sarkar Oct 2015

Research Collection School Of Computing and Information Systems

As large scale software development has become more collaborative, and software teams more globally distributed, several studies have explored how developer interaction influences software development outcomes. The emphasis so far has been largely on outcomes like defect count, the time to close modification requests etc. In the paper, we examine data from the Chromium project to understand how different aspects of developer discussion relate to the closure time of reviews. On the basis of analyzing reviews discussed by 2000+ developers, our results indicate that quicker closure of reviews owned by a developer relates to higher reception of information and insights …


Choosing Your Weapons: On Sentiment Analysis Tools For Software Engineering Research, Robbert Jongeling, Subhajit Datta, Alexander Serebrenik Oct 2015

Research Collection School Of Computing and Information Systems

Recent years have seen increasing attention to the social aspects of software engineering, including studies of emotions and sentiments experienced and expressed by software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews and, therefore, their results might not be applicable in the software engineering domain. In this paper we study whether the sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study) as well as with each other. Furthermore, we evaluate the impact …


Need Accurate User Behaviour?: Pay Attention To Groups!, Kasthuri Jayarajah, Youngki Lee, Archan Misra, Rajesh Krishna Balan Sep 2015

Research Collection School Of Computing and Information Systems

In this paper, we show that characterizing user behaviour from location or smartphone usage traces, without accounting for the interaction of individuals in physical-world groups, can lead to erroneous results. We conducted one of the largest studies in the UbiComp domain thus far, involving indoor location traces of more than 6,000 users, collected over a 4-month period at our university campus, and further studied fine-grained App usage of a subset of 156 Android users. We apply a state-of-the-art group detection algorithm to annotate such location traces with group vs. individual context, and then show that individuals vs. groups exhibit significant …


Trace Element Composition Of PM2.5 And PM10 From Kolkata - A Heavily Polluted Indian Metropolis, Reshmi Das, Bahareh Khezri, Bijayen Srivastava, Subhajit Datta, Pradip Kumar Sikdar, Richard D. Webster, Xianfeng Wang Sep 2015

Research Collection School Of Computing and Information Systems

Elemental composition of PM2.5 and PM10 was measured from 16 locations in Greater Kolkata in Eastern India. Sampling was carried out in the winter months of 2013–2014. PM2.5 and PM10 mass concentrations ranged from 83–783 μg/m³ and 167–928 μg/m³ respectively. 20 elements were measured with an Agilent 7700 series ICP–MS equipped with a 3rd generation He reaction/collision cell following closed vessel microwave digestion. In both size fractions Fe, Na, Al, K, Ca were present in high concentrations (>1000 ng/m³), Mn, Zn and Pb demonstrated medium concentrations (>100 ng/m …


Evaluation And Improvement Of Procurement Process With Data Analytics, Melvin H. C. Tan, Wee Leong Lee Sep 2015

Research Collection School Of Computing and Information Systems

Analytics can be applied in procurement to benefit organizations beyond just prevention and detection of fraud. This study aims to demonstrate how advanced data mining techniques such as text mining and cluster analysis can be used to improve visibility of procurement patterns and provide decision-makers with insight to develop more efficient sourcing strategies, in terms of cost and effort. A case study of an organization’s effort to improve its procurement process is presented in this paper. The findings from this study suggest that opportunities exist for organizations to aggregate common goods and services among the purchases made under and across …


Deep Learning For Just-In-Time Defect Prediction, Xinli Yang, David Lo, Xin Xia, Yun Zhang, Jianling Sun Aug 2015

Research Collection School Of Computing and Information Systems

Defect prediction is an important topic, particularly at the change level. Change-level defect prediction, also referred to as just-in-time defect prediction, can not only help ensure software quality during development but also prompt developers to check and fix defects in time. Deep learning is currently a hot topic in the machine learning literature, but whether it can improve the performance of just-in-time defect prediction has not yet been investigated. To bridge this research gap, we propose an approach, Deeper, which leverages deep learning techniques to predict defect-prone changes. We first build a set of …


Cooperation In Delay-Tolerant Networks With Wireless Energy Transfer: Performance Analysis And Optimization, Dusit Niyato, Ping Wang, Hwee-Pink Tan, Walid Saad, Dong In Kim Aug 2015

Research Collection School Of Computing and Information Systems

We consider a delay-tolerant network (DTN) whose mobile nodes are assigned to collect packets from data sources and deliver them to a sink (i.e., a gateway). Each mobile node operates by using energy transferred wirelessly from the gateway. For such a network, two main issues are studied. First, when a mobile node is at the data source, this node must decide on whether to accept the packet received from the data source or not. In contrast, whenever a mobile node is at the gateway, it has to decide on whether to transmit the packets collected from the data sources or …


Enabling Real Time In-Situ Context Based Experimentation To Observe User Behaviour, Kartik Muralidaran Aug 2015

Dissertations and Theses Collection (Open Access)

Today’s mobile phones represent a rich and powerful computing platform, given their sensing, processing and communication capabilities. These devices are also part of the everyday life of millions of people, and coupled with the unprecedented access to personal context, make them the ideal tool for conducting behavioural experiments in an unobtrusive way. Transforming the mobile device from a mere observer of human context to an enabler of behavioural experiments however, requires not only providing experimenters access to the deep, near-real time human context (e.g., location, activity, group dynamics) but also exposing a disciplined scientific experimentation service that frees them from …


Structured Learning From Heterogeneous Behavior For Social Identity Linkage, Siyuan Liu, Shuhui Wang, Feida Zhu Jul 2015

Research Collection School Of Computing and Information Systems

Social identity linkage across different social media platforms is of critical importance to business intelligence because it yields a deeper understanding and more accurate profiling of users from social data. In this paper, we propose a solution framework, HYDRA, which consists of three key steps: (I) we model heterogeneous behavior by long-term topical distribution analysis and multi-resolution temporal behavior matching to cope with high noise and missing information, and the behavior similarity is described by a multi-dimensional similarity vector for each user pair; (II) we build structure consistency models to maximize the structure and behavior consistency on users' core social structure across different platforms, …


Message Passing For Collective Graphical Models, Tao Sun, Daniel Sheldon, Akshat Kumar Jul 2015

Research Collection School Of Computing and Information Systems

Collective graphical models (CGMs) are a formalism for inference and learning about a population of independent and identically distributed individuals when only noisy aggregate data are available. We highlight a close connection between approximate MAP inference in CGMs and marginal inference in standard graphical models. The connection leads us to derive a novel Belief Propagation (BP) style algorithm for collective graphical models. Mathematically, the algorithm is a strict generalization of BP—it can be viewed as an extension to minimize the Bethe free energy plus additional energy terms that are non-linear functions of the marginals. For CGMs, the algorithm is much …


Improving Patient Flow With Data-Driven Patient Prioritization Method In The Emergency Department, Kar Way Tan, Sean Shao Wei Lam Jul 2015

Research Collection School Of Computing and Information Systems

We aim to improve the length-of-stay (LOS) of patients in the Emergency Department (ED) ambulatory care area. We propose the use of real-time computerized physician order entry data and ED patient flow management system to estimate the consultation time of patients re-entering the queue to consult a doctor again after receiving treatment or results of tests. The estimation allows decision-makers to apply dynamic prioritization strategies that help the ED to identify patients who can complete their ED treatment process quickly, freeing up resources in the ED and lowering overall LOS.


Should We Use The Sample? Analyzing Datasets Sampled From Twitter's Stream API, Yazhe Wang, Jamie Callan, Baihua Zheng Jun 2015

Research Collection School Of Computing and Information Systems

Researchers have begun studying content obtained from microblogging services such as Twitter to address a variety of technological, social, and commercial research questions. The large number of Twitter users and even larger volume of tweets often make it impractical to collect and maintain a complete record of activity; therefore, most research and some commercial software applications rely on samples, often relatively small samples, of Twitter data. For the most part, sample sizes have been based on availability and practical considerations. Relatively little attention has been paid to how well these samples represent the underlying stream of Twitter data. To fill …


Efficient Reverse Top-K Boolean Spatial Keyword Queries On Road Networks, Yunjun Gao, Xu Qin, Baihua Zheng, Gang Chen May 2015

Research Collection School Of Computing and Information Systems

Reverse k nearest neighbor (RkNN) queries have a broad application base such as decision support, profile-based marketing, and resource allocation. Previous work on RkNN search either does not take textual information into consideration or is limited to Euclidean space. In the real world, however, most spatial objects are associated with textual information and lie on road networks. In this paper, we introduce a new type of query, namely, reverse top-k Boolean spatial keyword (RkBSK) retrieval, which assumes objects are on the road network and considers both spatial and textual information. Given a data set P on a road network and a …


Moving Average Reversion Strategy For On-Line Portfolio Selection, Bin Li, Steven C. H. Hoi, Doyen Sahoo, Zhi-Yong Liu May 2015

Research Collection School Of Computing and Information Systems

On-line portfolio selection, a fundamental problem in computational finance, has attracted increasing interest from artificial intelligence and machine learning communities in recent years. Empirical evidence shows that stock's high and low prices are temporary and stock prices are likely to follow the mean reversion phenomenon. While existing mean reversion strategies are shown to achieve good empirical performance on many real datasets, they often make the single-period mean reversion assumption, which is not always satisfied, leading to poor performance in certain real datasets. To overcome this limitation, this article proposes a multiple-period mean reversion, or so-called "Moving Average Reversion" (MAR), and …


CoFaçade: A Customizable Assistive Approach For Elders And Their Helpers, Jason Chen Zhao, Richard Christopher Davis, Pin Sym Foong, Shengdong Zhao Apr 2015

Research Collection School Of Computing and Information Systems

We present CoFaçade, a novel approach to helping elders reach their goals with IT products by working collaboratively with helpers. In this approach, the elder uses an interface with a small number of triggers, where each trigger is a single button (or card) that can execute a procedure. The helper uses a customization interface to link triggers to procedures that accomplish frequently-recurring high-level goals with IT products. Customization can be done either locally or remotely. We conducted an experiment to compare the CoFaçade approach with a baseline approach where helpers taught elders to perform IT tasks. Our results showed that …


Using Support Vector Machine Ensembles For Target Audience Classification On Twitter, Siaw Ling Lo, Raymond Chiong, David Cornforth Apr 2015

Research Collection School Of Computing and Information Systems

The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results …


Review Selection Using Micro-Reviews, Thanh-Son Nguyen, Hady W. Lauw, Panayiotis Tsaparas Apr 2015

Research Collection School Of Computing and Information Systems

Given the proliferation of review content, and the fact that reviews are highly diverse and often unnecessarily verbose, users frequently face the problem of selecting the appropriate reviews to consume. Micro-reviews are emerging as a new type of online review content in the social media. Micro-reviews are posted by users of check-in services such as Foursquare. They are concise (up to 200 characters long) and highly focused, in contrast to the comprehensive and verbose reviews. In this paper, we propose a novel mining problem, which brings together these two disparate sources of review content. Specifically, we use coverage of micro-reviews …


Best Upgrade Plans For Single And Multiple Source-Destination Pairs, Yimin Lin, Kyriakos Mouratidis Apr 2015

Research Collection School Of Computing and Information Systems

In this paper, we study Resource Constrained Best Upgrade Plan (BUP) computation in road network databases. Consider a transportation network (weighted graph) G where a subset of the edges are upgradable, i.e., each such edge has a cost which, if spent, reduces the weight of the edge to a specific new value. In the single-pair version of BUP, the input includes a source and a destination in G, and a budget B (resource constraint). The goal is to identify which upgradable edges should be upgraded so that the shortest path distance between source and …


Beyond Support And Confidence: Exploring Interestingness Measures For Rule-Based Specification Mining, Bui Tien Duy Le, David Lo Mar 2015

Research Collection School Of Computing and Information Systems

Numerous rule-based specification mining approaches have been proposed in the literature. Many of these approaches analyze a set of execution traces to discover interesting usage rules, e.g., whenever lock() is invoked, eventually unlock() is invoked. These techniques often generate and enumerate a set of candidate rules and compute some interestingness scores. Rules whose interestingness scores are above a certain threshold are then output. In past studies, two well-known measures, namely support and confidence, are often used to compute these scores. However, aside from these two, many other interestingness measures have been proposed. It is thus unclear …
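To make the two baseline measures concrete, the sketch below scores a candidate rule of the form "whenever the premise occurs, the consequent eventually occurs" over a set of execution traces. The trace-level definitions used here are an assumption for illustration (support counts the traces that contain the premise and satisfy the rule; confidence divides that count by the number of traces containing the premise); the paper may define the measures differently, and the function names and traces are hypothetical.

```python
def rule_holds(trace, premise, consequent):
    """True if every occurrence of `premise` is eventually followed by `consequent`."""
    for i, event in enumerate(trace):
        if event == premise and consequent not in trace[i + 1:]:
            return False
    return True


def support_and_confidence(traces, premise, consequent):
    """Trace-level support and confidence of the rule premise -> eventually consequent."""
    # Only traces that actually contain the premise say anything about the rule.
    relevant = [t for t in traces if premise in t]
    satisfied = [t for t in relevant if rule_holds(t, premise, consequent)]
    support = len(satisfied)
    confidence = support / len(relevant) if relevant else 0.0
    return support, confidence


# Hypothetical execution traces, each a list of invoked method names.
traces = [
    ["lock", "read", "unlock"],
    ["lock", "write"],                    # violates the rule: no unlock
    ["open", "close"],                    # premise never occurs
    ["lock", "unlock", "lock", "unlock"],
]
print(support_and_confidence(traces, "lock", "unlock"))
```

A miner would keep the rule only if both values exceed chosen thresholds; other interestingness measures would replace the last few lines of `support_and_confidence`.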


On Efficient K-Optimal-Location-Selection Query Processing In Metric Spaces, Yunjun Gao, Shuyao Qi, Lu Chen, Baihua Zheng, Xinhan Li Mar 2015

Research Collection School Of Computing and Information Systems

This paper studies the problem of k-optimal-location-selection (kOLS) retrieval in metric spaces. Given a set DA of customers, a set DB of locations, a constrained region R, and a critical distance dc, a metric kOLS (MkOLS) query retrieves k locations in DB that are outside R but have the maximal optimality scores. Here, the optimality score of a location l∈DB located outside R is defined as the number of the customers in DA that are inside R and meanwhile have their distances to l bounded by …


Joint Search By Social And Spatial Proximity, Kyriakos Mouratidis, Jing Li, Yu Tang, Nikos Mamoulis Mar 2015

Research Collection School Of Computing and Information Systems

The diffusion of social networks introduces new challenges and opportunities for advanced services, especially so with their ongoing addition of location-based features. We show how applications like company and friend recommendation could significantly benefit from incorporating social and spatial proximity, and study a query type that captures these two-fold semantics. We develop highly scalable algorithms for its processing, and enhance them with elaborate optimizations. Finally, we use real social network data to empirically verify the efficiency and efficacy of our solutions.


Will This Be Quick? A Case Study Of Bug Resolution Times Across Industrial Projects, Subhajit Datta, Prasanth Lade Feb 2015

Research Collection School Of Computing and Information Systems

Resolution of problem tickets is a source of significant revenue in the worldwide software services industry. Due to the high volume of problem tickets in any large scale customer engagement, automated techniques are necessary to segregate related incoming tickets into groups. Existing techniques focus on this classification problem. In this paper, we present a case study built around the position that predicting the category of resolution times within a class of tickets and also the actual resolution times, is strongly beneficial to ticket resolution. We present an approach based on topic analysis to predict the category of resolution times of …


Bridging The Vocabulary Gap Between Health Seekers And Healthcare Knowledge, Liqiang Nie, Yiliang Zhao, Akbari Mohammad, Jialie Shen, Tat-Seng Chua Feb 2015

Research Collection School Of Computing and Information Systems

The vocabulary gap between health seekers and providers has hindered cross-system operability and inter-user reusability. To bridge this gap, this paper presents a novel scheme to code medical records by jointly utilizing local mining and global learning approaches, which are tightly linked and mutually reinforced. Local mining attempts to code the individual medical record by independently extracting the medical concepts from the medical record itself and then mapping them to authenticated terminologies. A corpus-aware terminology vocabulary is naturally constructed as a byproduct, which is used as the terminology space for global learning. The local mining approach, however, may …