Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 30 of 63

Full-Text Articles in Engineering

Proactive Sequential Resource (Re)Distribution For Improving Efficiency In Urban Environments, Supriyo Ghosh Dec 2017

Dissertations and Theses Collection (Open Access)

Due to the increasing population and lack of coordination, there is a mismatch in supply and demand of common resources (e.g., shared bikes, ambulances, taxis) in urban environments, which has deteriorated a wide variety of quality of life metrics such as success rate in issuing shared bikes, response times for emergency needs, and waiting times in queues. Thus, in my thesis, I propose efficient algorithms that optimise the quality of life metrics by proactively redistributing the resources using intelligent operational (day-to-day) and strategic (long-term) decisions in the context of urban transportation and health & safety. For urban transportation, Bike Sharing …


Policy Gradient With Value Function Approximation For Collective Multiagent Planning, Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau Dec 2017

Research Collection School Of Computing and Information Systems

Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address such a subclass called CDec-POMDP where the collective behavior of a population of agents affects the joint-reward and environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDec-POMDP policies. Vanilla AC has slow convergence for larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and also derive a new way to …


Efficient Gate System Operations For A Multipurpose Port Using Simulation Optimization, Ketki Kulkarni, Trong Khiem Tran, Hai Wang, Hoong Chuin Lau Dec 2017

Research Collection School Of Computing and Information Systems

Port capacity is determined by three major infrastructural resources namely, berths, yards and gates. The advertised capacity is constrained by the least of the capacities of the three resources. While a lot of attention has been paid to optimizing berth and yard capacities, not much attention has been given to analyzing the gate capacity. The gates are a key node between the land-side and sea-side operations in an ocean-to-cities value chain. The gate system under consideration, located at an important port in an Asian city, is a multi-class parallel queuing system with non-homogeneous Poisson arrivals. It is hard to obtain a closed form analytic approach for …
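For readers unfamiliar with non-homogeneous Poisson arrivals, the following is a minimal sketch of how such an arrival stream could be simulated with the standard thinning method. It is not the authors' simulation model: the function names, the 24-hour horizon, and the sinusoidal rate function are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_nhpp(rate, rate_max, horizon, rng):
    """Simulate arrival times on [0, horizon] by thinning.

    rate     -- function t -> arrival rate at time t (assumed <= rate_max)
    rate_max -- upper bound on the rate over the whole horizon
    horizon  -- length of the simulated period
    rng      -- a random.Random instance
    """
    t = 0.0
    arrivals = []
    while True:
        # Candidate arrivals come from a homogeneous process at rate_max.
        t += rng.expovariate(rate_max)
        if t > horizon:
            return arrivals
        # Keep each candidate with probability rate(t) / rate_max.
        if rng.random() <= rate(t) / rate_max:
            arrivals.append(t)


def hourly_rate(t):
    """Hypothetical truck arrival rate (per hour), peaking mid-day."""
    return 20.0 + 15.0 * math.sin(math.pi * t / 12.0) ** 2


arrivals = simulate_nhpp(hourly_rate, 35.0, 24.0, random.Random(0))
```

A simulation-optimization study would feed arrival times like these into a queueing model of the gates; that part is outside the scope of this sketch.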


GraphMP: An Efficient Semi-External-Memory Big Graph Processing System On A Single Machine, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Xiaokui Xiao Dec 2017

Research Collection School Of Computing and Information Systems

Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge …


Law Enforcement Resource Optimization With Response Time Guarantees, Jonathan Chase, Jiali Du, Na Fu, Truc Viet Le, Hoong Chuin Lau Dec 2017

Research Collection School Of Computing and Information Systems

In a security-conscious world, and with the rapid increase in the global urbanized population, there is a growing challenge for law enforcement agencies to efficiently respond to emergency calls. We consider the problem of spatially and temporally optimizing the allocation of law enforcement resources such that the quality of service (QoS) in terms of emergency response time can be guaranteed. To solve this problem, we provide a spatio-temporal MILP optimization model, which we learn from a real-world dataset of incidents and dispatching records, and solve by existing solvers. One key feature of our proposed model is the introduction of risk …


VKSE-MO: Verifiable Keyword Search Over Encrypted Data In Multi-Owner Settings, Yinbin Miao, Jianfeng Ma, Ximeng Liu, Junwei Zhang, Zhiquan Liu Dec 2017

Research Collection School Of Computing and Information Systems

Searchable encryption (SE) techniques allow cloud clients to easily store data and search encrypted data in a privacy-preserving manner, where most SE schemes treat the cloud server as honest-but-curious. However, in practice, the cloud server is a semi-honest-but-curious third party, which only executes a fraction of search operations and returns a fraction of false search results to save its computational and bandwidth resources. Thus, it is important to provide a results verification method to guarantee the correctness of the search results. Existing SE schemes allow multiple data owners to upload different records to the cloud server, but these schemes have …


A Selective-Discrete Particle Swarm Optimization Algorithm For Solving A Class Of Orienteering Problems, Aldy Gunawan, Vincent F. Yu, Perwira Redi, Parida Jewpanya, Hoong Chuin Lau Dec 2017

Research Collection School Of Computing and Information Systems

This study addresses a class of NP-hard problems called the Orienteering Problem (OP), which belongs to a well-known class of vehicle routing problems. In the OP, a set of nodes, each associated with a location and a score, is given. The time required to travel between each pair of nodes is known in advance. The total travel time is limited by a predetermined time budget. The objective is to select a subset of nodes to be visited that maximizes the total collected score within a path. The Team OP (TOP) is an extension of OP that incorporates multiple paths. Another …
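The OP objective described above can be illustrated with a small sketch that scores one candidate path and checks it against the time budget. This is only an evaluation helper, not the selective-discrete PSO algorithm of the paper; the function name, the travel-time matrix, and the sample scores are hypothetical.

```python
from typing import Sequence, Tuple


def evaluate_path(path: Sequence[int],
                  scores: Sequence[float],
                  travel_time: Sequence[Sequence[float]],
                  budget: float) -> Tuple[float, float, bool]:
    """Return (total score, total travel time, feasible) for one OP path.

    path        -- node indices in visiting order (start node first, end node last)
    scores      -- score collected at each node
    travel_time -- travel_time[a][b] is the known time to go from a to b
    budget      -- maximum total travel time allowed
    """
    # Sum the travel time over consecutive node pairs along the path.
    total_time = sum(travel_time[a][b] for a, b in zip(path, path[1:]))
    # Each node's score is collected at most once.
    total_score = sum(scores[n] for n in set(path))
    return total_score, total_time, total_time <= budget


# Hypothetical instance: node 0 is the start, node 4 the end.
scores = [0, 5, 8, 3, 0]
travel_time = [
    [0, 2, 4, 3, 6],
    [2, 0, 2, 4, 5],
    [4, 2, 0, 2, 3],
    [3, 4, 2, 0, 2],
    [6, 5, 3, 2, 0],
]
result = evaluate_path([0, 1, 2, 4], scores, travel_time, budget=8)
```

A search heuristic would call a helper like this repeatedly to compare candidate paths; for the TOP, the same check would apply to each path in the team.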


Who Are Your Users? Comparing Media Professionals' Preconception Of Users To Data-Driven Personas, Lene Nielsen, Soon-Gyu Jung, Jisun An, Joni Salminen, Haewoon Kwak, Bernard J. Jansen Dec 2017

Research Collection School Of Computing and Information Systems

One of the reasons for using personas is to align user understandings across project teams and sites. As part of a larger persona study, at Al Jazeera English (AJE), we conducted 16 qualitative interviews with media producers, the end users of persona descriptions. We asked the participants about their understanding of a typical AJE media consumer, and the variety of answers shows that the understandings are not aligned and are built on a mix of own experiences, own self, assumptions, and data given by the company. The answers are sometimes aligned with the data-driven personas and sometimes not. The end …


BikeMate: Bike Riding Behavior Monitoring With Smartphones, Weixi Gu, Zimu Zhou, Yuxun Zhou, Han Zou, Yunxin Liu, Costas J. Spanos, Lin Zhang Nov 2017

Research Collection School Of Computing and Information Systems

Detecting dangerous riding behaviors is of great importance to improve bicycling safety. Existing bike safety precautionary measures rely on dedicated infrastructures that incur high installation costs. In this work, we propose BikeMate, a ubiquitous bicycling behavior monitoring system with smartphones. BikeMate invokes smartphone sensors to infer dangerous riding behaviors including lane weaving, standing pedalling and wrong-way riding. For easy adoption, BikeMate leverages transfer learning to reduce the overhead of training models for different users, and applies crowdsourcing to infer legal riding directions without prior knowledge. Experiments with 12 participants show that BikeMate achieves an overall accuracy of 86.8% for lane …


Intent Recognition In Smart Living Through Deep Recurrent Neural Networks, Xiang Zhang, Lina Yao, Chaoran Huang, Quan Z. Sheng, Xianzhi Wang Nov 2017

Research Collection School Of Computing and Information Systems

Electroencephalography (EEG) signal-based intent recognition has recently attracted much attention in both academia and industry, because it can help elderly or motor-disabled people control smart devices and communicate with the outer world. However, the utilization of EEG signals is challenged by low accuracy and by arduous, time-consuming feature extraction. This paper proposes a 7-layer deep learning model to classify raw EEG signals with the aim of recognizing subjects’ intents, to avoid the time consumed in pre-processing and feature extraction. The hyper-parameters are selected by an Orthogonal Array experiment method for efficiency. Our model is applied to an open EEG dataset provided …


Understanding Inactive Yet Available Assignees In GitHub, Jing Jiang, David Lo, Xinyu Ma, Fuli Feng, Li Zhang Nov 2017

Research Collection School Of Computing and Information Systems

Context: In GitHub, an issue or a pull request can be assigned to a specific assignee who is responsible for working on this issue or pull request. Due to the principle of voluntary participation, available assignees may remain inactive in projects. If assignees ever participate in projects, they are active assignees; otherwise, they are inactive yet available assignees (inactive assignees for short). Objective: Our objective in this paper is to provide a comprehensive analysis of inactive yet available assignees in GitHub. Method: We collect 2,374,474 records of activities in 37 popular projects, and 797,756 records of activities in 687 projects …


Selective Value Coupling Learning For Detecting Outliers In High-Dimensional Categorical Data, Guansong Pang, Hongzuo Xu, Cao Longbing, Wentao Zhao Nov 2017

Research Collection School Of Computing and Information Systems

This paper introduces a novel framework, namely SelectVC and its instance POP, for learning selective value couplings (i.e., interactions between the full value set and a set of outlying values) to identify outliers in high-dimensional categorical data. Existing outlier detection methods work on a full data space or feature subspaces that are identified independently from subsequent outlier scoring. As a result, they are significantly challenged by overwhelming irrelevant features in high-dimensional data due to the noise brought by the irrelevant features and its huge search space. In contrast, SelectVC works on a clean and condensed data space spanned by selective …


Enabling Phased Array Signal Processing For Mobile WiFi Devices, Kun Qian, Chenshu Wu, Zheng Yang, Zimu Zhou, Xu Wang, Yunhao Liu Nov 2017

Research Collection School Of Computing and Information Systems

Modern mobile devices are equipped with multiple antennas, which enable various wireless sensing applications such as accurate localization, contactless human detection, and wireless human-device interaction. A key enabler for these applications is phased array signal processing, especially Angle of Arrival (AoA) estimation. However, accurate AoA estimation on commodity devices is non-trivial due to the limited number of antennas and uncertain phase offsets. Previous works either rely on elaborate calibration or involve contrived human interactions. In this paper, we aim to enable practical AoA measurements on commodity off-the-shelf (COTS) mobile devices. The key insight is to involve users’ natural rotation to formulate …


SourceVote: Fusing Multi-Valued Data Via Inter-Source Agreements, Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Mahmoud Barhamgi, Lina Yao, Anne H.H. Ngu Nov 2017

Research Collection School Of Computing and Information Systems

Data fusion is a fundamental research problem of identifying true values of data items of interest from conflicting multi-sourced data. Although considerable research efforts have been conducted on this topic, existing approaches generally assume every data item has exactly one true value, which fails to reflect the real world where data items with multiple true values widely exist. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Then two aspects of source reliability are derived from these graphs and …


Spatio-Temporal Analysis And Prediction Of Cellular Traffic In Metropolis, Xu Wang, Zimu Zhou, Zheng Yang, Yunhao Liu, Chunyi Peng Oct 2017

Research Collection School Of Computing and Information Systems

Understanding and predicting cellular traffic at large scale and fine granularity is beneficial and valuable to mobile users, wireless carriers and city authorities. Predicting cellular traffic in a modern metropolis is particularly challenging because of the tremendous temporal and spatial dynamics introduced by diverse user Internet behaviours and frequent user mobility citywide. In this paper, we characterize and investigate the root causes of such dynamics in cellular traffic through a big cellular usage dataset covering 1.5 million users and 5,929 cell towers in a major city of China. We reveal intensive spatio-temporal dependency even among distant cell towers, which is largely overlooked in …


Cross-Modal Recipe Retrieval With Rich Food Attributes, Jingjing Chen, Chong-Wah Ngo, Tat-Seng Chua Oct 2017

Research Collection School Of Computing and Information Systems

Food is rich in visible (e.g., colour, shape) and procedural (e.g., cutting, cooking) attributes. Proper leveraging of these attributes, particularly the interplay among ingredients, cutting and cooking methods, for health-related applications has not been previously explored. This paper investigates cross-modal retrieval of recipes, specifically to retrieve a text-based recipe given a food picture as query. As similar ingredient composition can end up with wildly different dishes depending on the cooking and cutting procedures, the difficulty of retrieval originates from fine-grained recognition of rich attributes from pictures. With a multi-task deep learning model, this paper provides insights on the feasibility of …


Combinatorial Auction For Transportation Matching Service: Formulation And Adaptive Large Neighborhood Search Heuristic, Baoxiang Li, Hoong Chuin Lau Oct 2017

Research Collection School Of Computing and Information Systems

This paper considers the problem of matching multiple shippers and multi-transporters for pickups and drop-offs, where the goal is to select a subset of group jobs (shipper bids) that maximizes profit. This is the underlying winner determination problem in an online auction-based vehicle sharing platform that matches transportation demand and supply, particularly in a B2B last-mile setting. Each shipper bid contains multiple jobs, and each job has a weight, volume, pickup location, delivery location and time window. On the other hand, each transporter bid specifies the vehicle capacity, available time periods, and a cost structure. This double-sided auction will be …


Semantic Reasoning In Zero Example Video Event Retrieval, M. H. T. De Boer, Yi-Jie Lu, Hao Zhang, Klamer Schutte, Chong-Wah Ngo, Wessel Kraaij Oct 2017

Research Collection School Of Computing and Information Systems

Searching in digital video data for high-level events, such as a parade or a car accident, is challenging when the query is textual and lacks visual example images or videos. Current research in deep neural networks is highly beneficial for the retrieval of high-level events using visual examples, but without examples it is still hard to (1) determine which concepts are useful to pre-train (Vocabulary challenge) and (2) which pre-trained concept detectors are relevant for a certain unseen high-level event (Concept Selection challenge). In our article, we present our Semantic Event Retrieval System which (1) shows the importance of high-level concepts …


SugarMate: Non-Intrusive Blood Glucose Monitoring With Smartphones, Weixi Gu, Yuxun Zhou, Zimu Zhou, Xi Liu, Han Zou, Pei Zhang, Costas J. Spanos, Lin Zhang Sep 2017

Research Collection School Of Computing and Information Systems

Inferring abnormal glucose events such as hyperglycemia and hypoglycemia is crucial for the health of both diabetic patients and non-diabetic people. However, regular blood glucose monitoring can be invasive and inconvenient in everyday life. We present SugarMate, a first smartphone-based blood glucose inference system as a temporary alternative to continuous blood glucose monitors (CGM) when they are uncomfortable or inconvenient to wear. In addition to the records of food, drug and insulin intake, it leverages smartphone sensors to measure physical activities and sleep quality automatically. Provided with the imbalanced and often limited measurements, a challenge of SugarMate is the inference …


Personalized Microtopic Recommendation On Microblogs, Yang Li, Jing Jiang, Ting Liu, Minghui Qiu, Xiaofei Sun Sep 2017

Research Collection School Of Computing and Information Systems

Microblogging services such as Sina Weibo and Twitter allow users to create tags explicitly indicated by the # symbol. In Sina Weibo, these tags are called microtopics, and in Twitter, they are called hashtags. In Sina Weibo, each microtopic has a designated page and can be directly visited or commented on. Recommending these microtopics to users based on their interests can help users efficiently acquire information. However, it is non-trivial to recommend microtopics to users to satisfy their information needs. In this article, we investigate the task of personalized microtopic recommendation, which exhibits two challenges. First, users usually do not …


AudioSense: Sound-Based Shopper Behavior Analysis System, Amit Sharma, Youngki Lee Sep 2017

Research Collection School Of Computing and Information Systems

This paper presents AudioSense, a system that monitors user-item interactions inside a store, enabling precisely customized promotions. A shopper's smartwatch emits sound every time the shopper picks up or touches an item inside a store. This sound is then localized, in 2D space, by calculating the angles of arrival captured by multiple microphones deployed on the racks. Lastly, the 2D location is mapped to specific items on the rack based on the rack layout information. In our initial experiments conducted with a single rack with 16 compartments, we could localize the shopper's smartwatch with a median estimation error of …
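The 2D localization step can be pictured with a small geometric sketch: given two microphones at known positions and the angle of arrival measured at each, the source lies where the two bearing lines cross. This is an illustrative assumption about the geometry only, not the AudioSense implementation, which may combine more microphones and handle measurement noise; the function name and coordinates are hypothetical.

```python
import math


def localize_from_bearings(p1, theta1, p2, theta2):
    """Intersect two bearing lines in the plane.

    p1, p2         -- (x, y) positions of the two microphones
    theta1, theta2 -- angles of arrival in radians, measured from the +x axis
    Returns the (x, y) point where the two bearing lines meet.
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # 2D cross product of the directions; zero means the lines are parallel.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        raise ValueError("bearings are parallel; no unique intersection")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Distance along the first bearing line to the intersection point.
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


# Hypothetical setup: microphones at (0, 0) and (2, 0), source at (1, 1).
estimate = localize_from_bearings((0.0, 0.0), math.pi / 4,
                                  (2.0, 0.0), 3 * math.pi / 4)
```

Mapping the estimated point to a rack compartment would then be a lookup against the rack layout, as the abstract describes.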


Well-Tuned Algorithms For The Team Orienteering Problem With Time Windows, Aldy Gunawan, Hoong Chuin Lau, Pieter Vansteenwegen, Kun Lu Aug 2017

Research Collection School Of Computing and Information Systems

The Team Orienteering Problem with Time Windows (TOPTW) is the extension of the Orienteering Problem (OP) where each node is limited by a predefined time window during which the service has to start. The objective of the TOPTW is to maximize the total collected score by visiting a set of nodes with a limited number of paths. We propose two algorithms, Iterated Local Search and a hybridization of Simulated Annealing and Iterated Local Search (SAILS), to solve the TOPTW. As indicated in multiple research works on algorithms for the OP and its variants, determining appropriate parameter values in a statistical …


Time-Aware Conversion Prediction, Wendi Ji, Xiaoling Wang, Feida Zhu Aug 2017

Research Collection School Of Computing and Information Systems

The importance of product recommendation has been well recognized as a central task in business intelligence for e-commerce websites. Interestingly, what is less widely recognized is that different products take different time periods for conversion. “Conversion” here refers to a more general set of pre-defined actions, including, for example, purchases or registrations in recommendation and advertising systems. The mismatch between the product’s actual conversion period and the application’s target conversion period has been the subtle culprit compromising many existing recommendation algorithms. The challenging question: what products should be recommended for a given time period to maximize …


Mechanism Design For Strategic Project Scheduling, Pradeep Varakantham, Na Fu Aug 2017

Research Collection School Of Computing and Information Systems

Organizing large scale projects (e.g., Conferences, IT Shows, F1 race) requires precise scheduling of multiple dependent tasks on common resources where multiple selfish entities are competing to execute the individual tasks. In this paper, we consider a well studied and rich scheduling model referred to as RCPSP (Resource Constrained Project Scheduling Problem). The key change to this model that we consider in this paper is the presence of selfish entities competing to perform individual tasks with the aim of maximizing their own utility. Due to the selfish entities in play, the goal of the scheduling problem is no longer only …


Embedding-Based Representation Of Categorical Data By Hierarchical Value Coupling Learning, Songlei Jian, Longbing Cao, Guansong Pang, Kai Lu, Hang Gao Aug 2017

Research Collection School Of Computing and Information Systems

Learning the representation of categorical data with hierarchical value coupling relationships is very challenging but critical for the effective analysis and learning of such data. This paper proposes a novel coupled unsupervised categorical data representation (CURE) framework and its instantiation, i.e., a coupled data embedding (CDE) method, for representing categorical data by hierarchical value-to-value cluster coupling learning. Unlike existing embedding- and similarity-based representation methods which can capture only a part or none of these complex couplings, CDE explicitly incorporates the hierarchical couplings into its embedding representation. CDE first learns two complementary feature value couplings which are then used to cluster …


Learning Homophily Couplings From Non-IID Data For Joint Feature Selection And Noise-Resilient Outlier Detection, Guansong Pang, Longbing Cao, Ling Chen, Huan Liu Aug 2017

Research Collection School Of Computing and Information Systems

This paper introduces a novel wrapper-based outlier detection framework (WrapperOD) and its instance (HOUR) for identifying outliers in noisy data (i.e., data with noisy features) with strong couplings between outlying behaviors. Existing subspace or feature selection-based methods are significantly challenged by such data, as their search of feature subset(s) is independent of outlier scoring and thus can be misled by noisy features. In contrast, HOUR takes a wrapper approach to iteratively optimize the feature subset selection and outlier scoring using a top-k outlier ranking evaluation measure as its objective function. HOUR learns homophily couplings between outlying behaviors (i.e., abnormal behaviors …


Geometric Approaches For Top-K Queries [Tutorial], Kyriakos Mouratidis Aug 2017

Research Collection School Of Computing and Information Systems

Top-k processing is a well-studied problem with numerous applications that is becoming increasingly relevant with the growing availability of recommendation systems and decision-making software. The objective of this tutorial is twofold. First, we will delve into the geometric aspects of top-k processing. Second, we will cover complementary features to top-k queries, with strong practical relevance and important applications, that have a computational geometric nature. The tutorial will close with insights into the effect of dimensionality on the meaningfulness of top-k queries, and interesting similarities to nearest neighbor search.
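As background for readers new to the topic, a top-k query is commonly posed with a linear scoring function: each record is a point, the user preference is a weight vector, and the score is their dot product. The sketch below assumes that linear form, which the abstract itself does not state; the function name and the sample records are hypothetical.

```python
import heapq


def top_k(records, weights, k):
    """Return the k records with the highest weighted-sum score."""
    def score(record):
        # Dot product of the record's attributes with the preference weights.
        return sum(w * x for w, x in zip(weights, record))
    return heapq.nlargest(k, records, key=score)


# Hypothetical records with two attributes, e.g. (rating, value for money).
records = [(9, 2), (4, 6), (3, 9), (1, 1)]

balanced = top_k(records, (0.5, 0.5), 2)       # equal weight on both attributes
rating_first = top_k(records, (0.9, 0.1), 2)   # mostly the first attribute
```

Changing the weight vector changes which records are returned, which is one way to see why the geometry of the weight space matters for top-k processing.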


Semantic Visualization For Short Texts With Word Embeddings, Van Minh Tuan Le, Hady W. Lauw Aug 2017

Research Collection School Of Computing and Information Systems

Semantic visualization integrates topic modeling and visualization, such that every document is associated with a topic distribution as well as visualization coordinates on a low-dimensional Euclidean space. We address the problem of semantic visualization for short texts. Such documents are increasingly common, including tweets, search snippets, news headlines, or status updates. Due to their short lengths, it is difficult to model semantics as the word co-occurrences in such a corpus are very sparse. Our approach is to incorporate auxiliary information, such as word embeddings from a larger corpus, to supplement the lack of co-occurrences. This requires the development of a …


Toward Accurate Network Delay Measurement On Android Phones, Weichao Li, Daoyuan Wu, Rocky K. C. Chang, Ricky K. P. Mok Aug 2017

Research Collection School Of Computing and Information Systems

Measuring and understanding the performance of mobile networks is becoming very important for end users and operators. Despite the availability of many measurement apps, their measurement accuracy has not received sufficient scrutiny. In this paper, we appraise the accuracy of smartphone-based network performance measurement using the Android platform and the network round-trip time (RTT) as the metric. We show that two of the most popular measurement apps, Ookla Speedtest and MobiPerf, have their RTT measurements inflated. We build three test apps that cover three common measurement methods and evaluate them in a testbed. We overcome the main challenge of obtaining a complete …