Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 28 of 28
Full-Text Articles in Physical Sciences and Mathematics
ExploreLah: Personalised And Smart Trip Planner For Mobile Tourism, Aldy Gunawan, Siu Loon Hoe, Xun Yi Lim, Linh Chi Tran, Dang Viet Anh Nguyen
Research Collection School Of Computing and Information Systems
Various recommender systems for mobile tourism have been developed over the years. However, most of these recommender systems tend to overwhelm users with too much information and may not be personalised to user preferences. In this paper, we introduce ExploreLah, a personalised and smart trip planner for exploring Points of Interest (POIs) in Singapore. The user preferences are categorised into five groups: shopping, art & culture, outdoor activity, adventure, and nightlife. The problem is modelled as the Team Orienteering Problem with Time Windows, and an algorithm is developed to generate itineraries. Simulated experiments using test cases were performed to evaluate and …
Towards Distributed Node Similarity Search On Graphs, Tianming Zhang, Yunjun Gao, Baihua Zheng, Lu Chen, Shiting Wen, Wei Guo
Research Collection School Of Computing and Information Systems
Node similarity search on graphs has wide applications in recommendation and link prediction, to name just a few. However, existing studies are insufficient for two reasons: (i) the scale of real-world graphs is growing rapidly, and (ii) vertices are always associated with complex attributes. In this paper, we propose an efficient distributed framework to support node similarity search on massive graphs, which considers both graph structure correlation and node attribute similarity in metric spaces. The framework consists of a preprocessing stage and a query stage. In the preprocessing stage, a parallel KD-tree construction (KDC) algorithm is developed to form a newly …
Efficient Distributed Reachability Querying Of Massive Temporal Graphs, Tianming Zhang, Yunjun Gao, Chen Lu, Wei Guo, Shiliang Pu, Baihua Zheng, Christian S. Jensen
Research Collection School Of Computing and Information Systems
Reachability computation is a fundamental graph functionality with a wide range of applications. In spite of this, little work has as yet been done on efficient reachability queries over temporal graphs, which are used extensively to model time-varying networks, such as communication networks, social networks, and transportation schedule networks. Moreover, we are faced with increasingly large real-world temporal networks that may be distributed across multiple data centers. This state of affairs motivates the paper's study of efficient reachability queries on distributed temporal graphs. We propose an efficient index, called Temporal Vertex Labeling (TVL), which is a labeling scheme for distributed …
Distributed Similarity Queries In Metric Spaces, Keyu Yang, Xin Ding, Yuanliang Zhang, Lu Chen, Baihua Zheng, Yunjun Gao
Research Collection School Of Computing and Information Systems
Similarity queries, including range queries and k nearest neighbor (kNN) queries, in metric spaces have applications in many areas such as multimedia retrieval, computational biology and location-based services. With the growing volumes of data, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In …
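As a rough illustration of the pivot-mapping idea mentioned in this abstract, the single-machine sketch below maps each object to its distances from a few pivots and uses the triangle inequality to discard candidates before a metric range query computes exact distances. It is not the AMDS system: the function names (`pivot_map`, `range_query`), the choice of pivots, and the Euclidean distance are all assumptions made for the example.

```python
import math

def pivot_map(objects, pivots, dist):
    """Map every object to its vector of distances to the pivots."""
    return [[dist(o, p) for p in pivots] for o in objects]

def range_query(objects, mapped, pivots, dist, q, r):
    """Return all objects within distance r of q, pruning with pivot distances."""
    q_map = [dist(q, p) for p in pivots]
    hits = []
    for o, o_map in zip(objects, mapped):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o) for any pivot p,
        # so a gap larger than r proves that o lies outside the query range.
        if any(abs(a - b) > r for a, b in zip(q_map, o_map)):
            continue
        # Only the surviving candidates need an exact distance computation.
        if dist(q, o) <= r:
            hits.append(o)
    return hits

# Hypothetical toy data in the plane with Euclidean distance.
objects = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
pivots = [(0.0, 0.0)]
mapped = pivot_map(objects, pivots, math.dist)
result = range_query(objects, mapped, pivots, math.dist, (1.0, 0.0), 1.5)
```

The pruning step only ever removes objects that cannot qualify, so the result matches a brute-force scan while computing fewer exact distances when the pivots are informative.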
Maximizing Multifaceted Network Influence, Yuchen Li, Ju Fan, George V. Ovchinnikov, Panagiotis Karras
Research Collection School Of Computing and Information Systems
An information dissemination campaign is often multifaceted, involving several facets or pieces of information disseminating from different sources. The question then arises, how should we assign such pieces to eligible sources so as to achieve the best viral dissemination results? Past research has studied the problem of Influence Maximization (IM), which is to select a set of k promoters that maximizes the expected reach of a message over a network. However, in this classical IM problem, each promoter spreads out the same unitary piece of information. In this paper, we propose the Optimal Influential Pieces Assignment (OIPA) problem, which is …
Distributed K-Nearest Neighbor Queries In Metric Spaces, Xin Ding, Yuanliang Zhang, Lu Chen, Yunjun Gao, Baihua Zheng
Research Collection School Of Computing and Information Systems
Metric k nearest neighbor (MkNN) queries have applications in many areas such as multimedia retrieval, computational biology, and location-based services. With the growing volumes of data, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS), which uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In addition, we develop an efficient estimation-based MkNN method using AMDS to improve the query efficiency. Extensive experiments …
Understanding The Effects Of Taxi Ride-Sharing: A Case Study Of Singapore, Yazhe Wang, Baihua Zheng, Ee Peng Lim
Research Collection School Of Computing and Information Systems
This paper studies the effects of ride-sharing among those calling on taxis in Singapore for similar origin and destination pairs at nearly the same time of day. It proposes a simple yet practical framework for taxi ride-sharing and scheduling, to reduce waiting times and travel times during peak demand periods. The solution method helps taxi users save money while helping taxi drivers serve multiple requests per day, thus increasing their earnings. A comprehensive simulation study is conducted, based on real taxi booking data for the city of Singapore, to evaluate the effect of various factors of the ride-sharing practice, e.g., …
Metric Similarity Joins Using MapReduce, Yunjun Gao, Keyu Yang, Lu Chen, Baihua Zheng, Gang Chen, Chun Chen
Research Collection School Of Computing and Information Systems
Given two object sets Q and O, a metric similarity join finds similar object pairs according to a certain criterion. This operation has a wide variety of applications in data cleaning and data mining, to name but a few. However, the rapidly growing volume of data nowadays challenges traditional metric similarity join methods, and thus, a distributed method is required. In this paper, we adopt a popular distributed framework, namely, MapReduce, to support scalable metric similarity joins. To ensure load balancing, we present two sampling-based partition methods. One utilizes the pivot and the space-filling curve mappings to cluster …
Answering Why-Not And Why Questions On Reverse Top-K Queries, Qing Liu, Yunjun Gao, Gang Chen, Baihua Zheng, Linlin Zhou
Research Collection School Of Computing and Information Systems
Why-not and why questions can be posed by database users to seek clarifications on unexpected query results. Specifically, why-not questions aim to explain why certain expected tuples are absent from the query results, while why questions try to clarify why certain unexpected tuples are present in the query results. This paper systematically explores the why-not and why questions on reverse top-k queries, owing to their importance in multi-criteria decision making. We first formalize why-not questions on reverse top-k queries, which try to include the missing objects in the reverse top-k query results, and then, we propose a unified framework called …
Efficient Collective Spatial Keyword Query Processing On Road Networks, Yunjun Gao, Jingwen Zhao, Baihua Zheng, Gang Chen
Research Collection School Of Computing and Information Systems
The collective spatial keyword query (CSKQ), an important variant of spatial keyword queries, aims to find a set of the objects that collectively cover users' queried keywords, and those objects are close to the query location and have small inter-object distances. Existing works only focus on the CSKQ problem in the Euclidean space, although we observe that, in many real-life applications, the closeness of two spatial objects is measured by their road network distance. Thus, existing methods cannot solve the problem of network-based CSKQ efficiently. In this paper, we study the problem of collective spatial keyword query processing on road …
Top-K Dominating Queries On Incomplete Data, Xiaoye Miao, Yunjun Gao, Baihua Zheng, Gang Chen, Huiyong Cui
Research Collection School Of Computing and Information Systems
The top-k dominating (TKD) query returns the k objects that dominate the maximum number of objects in a given dataset. It combines the advantages of skyline and top-k queries, and plays an important role in many decision support applications. Incomplete data exists in a wide spectrum of real datasets, due to device failure, privacy preservation, data loss, and so on. In this paper, for the first time, we carry out a systematic study of TKD queries on incomplete data, which involves the data having some missing dimensional value(s). We formalize this problem, and propose a suite of efficient algorithms for …
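The TKD definition in this abstract can be sketched by brute force on complete data. The example below is only an illustration under stated assumptions: it handles complete data only (how the paper defines dominance when values are missing is not given in this listing), it assumes smaller values are better in every dimension, and the names `dominates` and `top_k_dominating` are hypothetical.

```python
from typing import List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse in every dimension and strictly better in one.

    Assumption: smaller values are better.
    """
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)

def top_k_dominating(points: List[Sequence[float]], k: int) -> List[Tuple[int, int]]:
    """Return (index, score) for the k points that dominate the most other points."""
    scored = [
        (sum(dominates(p, o) for o in points), i)
        for i, p in enumerate(points)
    ]
    # Highest score first; ties are broken by the smaller index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k]]

# Hypothetical toy data set with two dimensions.
points = [(1, 1), (2, 2), (3, 1), (0, 5)]
best = top_k_dominating(points, 2)
```

This quadratic scan compares every pair of objects, which is the baseline cost that index-based or pruning-based algorithms aim to avoid.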
On Robust Image Spam Filtering Via Comprehensive Visual Modeling, Jialie Shen, Robert H. Deng, Zhiyong Cheng, Liqiang Nie, Shuicheng Yan
Research Collection School Of Computing and Information Systems
The Internet has brought about fundamental changes in the way people generate and exchange media information. Over the last decade, unsolicited message images (image spam) have become one of the most serious problems for Internet service providers (ISPs), business firms and general end users. In this paper, we report a novel system called RoBoTs (Robust BoosTrap based spam detector) to support accurate and robust image spam filtering. The system is developed based on multiple visual properties extracted from different levels of granularity, aiming to capture more discriminative contents for effective spam image identification. In addition, a resampling-based learning framework …
On Processing Reverse K-Skyband And Ranked Reverse Skyline Queries, Yunjun Gao, Qing Liu, Baihua Zheng, Mou Li, Gang Chen, Qing Li
Research Collection School Of Computing and Information Systems
In this paper, for the first time, we identify and solve the problem of efficient reverse k-skyband (RkSB) query processing. Given a set P of multi-dimensional points and a query point q, an RkSB query returns all the points in P whose dynamic k-skyband contains q. We formalize RkSB retrieval, and then propose five algorithms for computing the RkSB of an arbitrary query point efficiently. Our methods utilize a conventional data-partitioning index (e.g., R-tree) on the dataset, and employ pre-computation, reuse and pruning techniques to boost the query efficiency. In addition, we extend our solutions to tackle an interesting variant …
On Efficient Reverse Skyline Query Processing, Yunjun Gao, Qing Liu, Baihua Zheng, Gang Chen
Research Collection School Of Computing and Information Systems
Given a D-dimensional data set P and a query point q, a reverse skyline query (RSQ) returns all the data objects in P whose dynamic skyline contains q. It is important for many real-life applications such as business planning and environmental monitoring. Currently, the state-of-the-art algorithm for answering the RSQ is the reverse skyline using skyline approximations (RSSA) algorithm, which is based on the precomputed approximations of the skylines. Although RSSA has some desirable features, e.g., applicability to arbitrary data distributions and dimensions, it requires multiple accesses of the same nodes, incurring redundant I/O and CPU costs. In …
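The RSQ definition above can be made concrete with a brute-force sketch. It assumes the dynamic dominance notion commonly used in the reverse skyline literature (comparing per-dimension absolute differences to a reference point), which this listing does not spell out. It is not RSSA or the paper's algorithm, and the function names are hypothetical.

```python
from typing import List, Sequence

def dyn_dominates(a: Sequence[float], b: Sequence[float], origin: Sequence[float]) -> bool:
    """a dynamically dominates b with respect to origin.

    Assumed definition: a is at least as close to origin as b in every
    dimension and strictly closer in at least one.
    """
    da = [abs(x - o) for x, o in zip(a, origin)]
    db = [abs(x - o) for x, o in zip(b, origin)]
    pairs = list(zip(da, db))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)

def reverse_skyline(points: List[Sequence[float]], q: Sequence[float]) -> List[Sequence[float]]:
    """Return every p in points whose dynamic skyline contains q."""
    result = []
    for i, p in enumerate(points):
        others = (o for j, o in enumerate(points) if j != i)
        # q is in the dynamic skyline of p if no other point dominates q w.r.t. p.
        if not any(dyn_dominates(o, q, p) for o in others):
            result.append(p)
    return result

# Hypothetical toy data set.
points = [(1, 1), (2, 2), (10, 10)]
rsl = reverse_skyline(points, (0, 0))
```

In this toy input only (1, 1) qualifies, because (1, 1) lies closer than the query point to both (2, 2) and (10, 10) in every dimension.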
Anomaly Detection On Social Data, Hanbo Dai
Dissertations and Theses Collection (Open Access)
The advent of online social media including Facebook, Twitter, Flickr and YouTube has drawn massive attention in recent years. These online platforms generate massive data capturing the behavior of multiple types of human actors as they interact with one another and with resources such as pictures, books and videos. Unfortunately, the openness of these platforms often leaves them highly susceptible to abuse by suspicious entities such as spammers. It therefore becomes increasingly important to automatically identify these suspicious entities and eliminate their threats. We call these suspicious entities anomalies in social data, as they often hold different agendas compared to …
Business Intelligence And Analytics: Research Directions, Ee Peng Lim, Hsinchun Chen, Guoqing Chen
Research Collection School Of Computing and Information Systems
Business intelligence and analytics (BIA) is about the development of technologies, systems, practices, and applications to analyze critical business data so as to gain new insights about business and markets. The new insights can be used for improving products and services, achieving better operational efficiency, and fostering customer relationships. In this article, we will categorize BIA research activities into three broad research directions: (a) big data analytics, (b) text analytics, and (c) network analytics. The article aims to review the state-of-the-art techniques and models and to summarize their use in BIA applications. For each research direction, we will also determine …
Structural And Functional Analysis Of Multi-Interface Domains, Liang Zhao, Steven C. H. Hoi, Limsoon Wong, Tobias Hamp, Jinyan Li
Research Collection School Of Computing and Information Systems
A multi-interface domain is a domain that can shape multiple and distinctive binding sites to contact with many other domains, forming a hub in domain-domain interaction networks. The functions played by the multiple interfaces are usually different, but there is no strict bijection between the functions and interfaces as some subsets of the interfaces play the same function. This work applies graph theory and algorithms to discover fingerprints for the multiple interfaces of a domain and to establish associations between the interfaces and functions, based on a huge set of multi-interface proteins from PDB. We found that about 40% of …
Beyond Search: Event-Driven Summarization For Web Videos, Richard Hong, Jinhui Tang, Hung-Khoon Tan, Chong-Wah Ngo, Shuicheng Yan, Tat-Seng Chua
Research Collection School Of Computing and Information Systems
The explosive growth of Web videos raises the challenge of how to efficiently browse hundreds or even thousands of videos at a glance. Given an event-driven query, social media Web sites usually return a large number of videos that are diverse and noisy in a ranking list. Exploring such results is time-consuming and thus degrades the user experience. This article presents a novel scheme that is able to summarize the content of video search results by mining and threading "key" shots, such that users can get an overview of the main content of these videos at a glance. The proposed …
Continuous Visible Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Xiaofa Guo
Research Collection School Of Computing and Information Systems
In this paper, we identify and solve a new type of spatial queries, called continuous visible nearest neighbor (CVNN) search. Given a data set P, an obstacle set O, and a query line segment q in a two-dimensional space, a CVNN query returns a set of $\langle p, R\rangle$ tuples such that $p \in P$ is the nearest neighbor to every point r along the interval $R \subseteq q$ as well as p is visible to r. Note that p may be NULL, meaning that all points in P are invisible to all points in R due to the obstruction of …
Efficient Mutual Nearest Neighbor Query Processing For Moving Object Trajectories, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Chun Chen, Gang Chen
Research Collection School Of Computing and Information Systems
Given a set D of trajectories, a query object q, and a query time extent Γ, a mutual (i.e., symmetric) nearest neighbor (MNN) query over trajectories finds from D the set of trajectories that are among the k1 nearest neighbors (NNs) of q within Γ and, meanwhile, have q as one of their k2 NNs. This type of query is useful in many applications such as decision making, data mining, and pattern recognition, as it considers both the proximity of the trajectories to q and the proximity of q to the trajectories. In this paper, we first formalize MNN search …
Algorithms For Constrained K-Nearest Neighbor Queries Over Moving Object Trajectories, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Chun Chen
Research Collection School Of Computing and Information Systems
An important query for spatio-temporal databases is to find nearest trajectories of moving objects. Existing work on this topic focuses on the closest trajectories in the whole data space. In this paper, we introduce and solve constrained k-nearest neighbor (CkNN) queries and historical continuous CkNN (HCCkNN) queries on R-tree-like structures storing historical information about moving object trajectories. Given a trajectory set D, a query object (point or trajectory) q, a temporal extent T, and a constrained region CR, (i) a CkNN query over trajectories retrieves from D within T, the k (≥ 1) trajectories that lie closest to q and …
On Efficient Mutual Nearest Neighbor Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li
Research Collection School Of Computing and Information Systems
This paper studies a new form of nearest neighbor queries in spatial databases, namely, mutual nearest neighbor (MNN) search. Given a set D of objects and a query object q, an MNN query returns from D the set of objects that are among the k1 (≥ 1) nearest neighbors (NNs) of q and, meanwhile, have q as one of their k2 (≥ 1) NNs. Although MNN queries are useful in many applications involving decision making, data mining, and pattern recognition, they cannot be efficiently handled by existing spatial query processing approaches. In this paper, we present …
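The MNN definition above translates directly into a brute-force check, sketched below. It assumes Euclidean distance, treats q as a competitor among the other data objects when ranking the neighbors of an object p, and breaks distance ties in favor of q. None of these choices come from the abstract, and `mutual_nn` is a hypothetical name. The paper's own algorithms are not shown in this listing.

```python
import math
from typing import List, Sequence

def mutual_nn(data: List[Sequence[float]], q: Sequence[float], k1: int, k2: int) -> List[Sequence[float]]:
    """Return objects that are among q's k1 NNs and have q among their k2 NNs."""
    dist_to_q = [math.dist(q, p) for p in data]
    # Indices of the k1 data objects closest to q.
    candidates = sorted(range(len(data)), key=lambda i: dist_to_q[i])[:k1]
    result = []
    for i in candidates:
        # Count the other data objects that are strictly closer to data[i] than q is.
        closer = sum(
            1
            for j, o in enumerate(data)
            if j != i and math.dist(data[i], o) < dist_to_q[i]
        )
        # q is one of the k2 NNs of data[i] if fewer than k2 objects beat it.
        if closer < k2:
            result.append(data[i])
    return result

# Hypothetical toy data on a line.
data = [(1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
mnn = mutual_nn(data, (0.0, 0.0), 2, 1)
```

Here (5, 0) is among the two nearest neighbors of q but is dropped, because both (1, 0) and (6, 0) are closer to it than q is. That asymmetry is what the mutual condition filters out.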
Optimal-Location-Selection Query Processing In Spatial Databases, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li
Research Collection School Of Computing and Information Systems
This paper introduces and solves a novel type of spatial queries, namely, Optimal-Location-Selection (OLS) search, which has many applications in real life. Given a data object set $D_A$, a target object set $D_B$, a spatial region R, and a critical distance $d_c$ in a multidimensional space, an OLS query retrieves those target objects in $D_B$ that are outside R but have maximal optimality. Here, the optimality of a target object $b \in D_B$ located outside R is defined as the number of the data objects from $D_A$ that are inside R and meanwhile have their distances to b not exceeding …
Finding A Length-Constrained Maximum-Sum Or Maximum-Density Subtree And Its Application To Logistics, Hoong Chuin Lau, Trung Hieu Ngo, Bao Nguyen Nguyen
Research Collection School Of Computing and Information Systems
We study the problem of finding a length-constrained maximum-density path in a tree with weight and length on each edge. This problem was proposed in [R.R. Lin, W.H. Kuo, K.M. Chao, Finding a length-constrained maximum-density path in a tree, Journal of Combinatorial Optimization 9 (2005) 147–156] and solved in O(nU) time when the edge lengths are positive integers, where n is the number of nodes in the tree and U is the length upper bound of the path. We present an algorithm that runs in O(n log^2 n) time for the generalized case when the edge lengths are positive real numbers, which …
TCP HACK: TCP Header Checksum Option To Improve Performance Over Lossy Links, Rajesh Krishna Balan, Boon Peng Lee, Renjish Kumar, Jacob Lillykutty, Winston Seah, A. L. Ananda
Research Collection School Of Computing and Information Systems
Wireless networks have become increasingly common and an increasing number of devices are communicating with each other over lossy links. Unfortunately, TCP performs poorly over lossy links as it is unable to differentiate the loss due to packet corruption from that due to congestion. We present an extension to TCP which enables TCP to distinguish packet corruption from congestion in lossy environments resulting in improved performance. We refer to this extension as the HeAder ChecKsum option (HACK). We implemented our algorithm in the Linux kernel and performed various tests to determine its effectiveness. Our results have shown that HACK performs …
Predictive Adaptive Resonance Theory And Knowledge Discovery In Databases, Ah-Hwee Tan, Hui-Shin Vivien Soon
Research Collection School Of Computing and Information Systems
This paper investigates the scalability of predictive Adaptive Resonance Theory (ART) networks for knowledge discovery in very large databases. Although predictive ART performs fast and incremental learning, the number of recognition categories or rules that it creates during learning may become substantially large and cause the learning speed to slow down. To tackle this problem, we introduce an on-line algorithm for evaluating and pruning categories during learning. Benchmark experiments on a large scale data set show that on-line pruning has been effective in reducing the number of the recognition categories and the time for convergence. Interestingly, the pruned networks also …
Correction To "Redundancy Optimization Of General Systems", H. Sivaramakrishnan, Arcot Desai Narasimhalu
Research Collection School Of Computing and Information Systems
Reader Aids-
Purpose: Report a correction
Special math needed: Probability
Results useful to: Reliability Theoreticians
A Rapid Algorithm For Reliability Optimization Of Parallel Redundant Systems, Arcot Desai Narasimhalu, H. Sivaramakrishnan
Research Collection School Of Computing and Information Systems
A rapid method is proposed for optimization of reliability of multiconstraint parallel redundant systems. The constraints need not be linear. This method provides good starting values, which are close to the boundary of the feasible region, for the number of redundant units in each subsystem. No proof has been presented to establish the optimality obtained by this method. Yet, for the examples tried, this method provides optimal or near-optimal solutions.