Efficient Skyline Maintenance For Streaming Data With Partially-Ordered Domains,
2010
Singapore Management University
Efficient Skyline Maintenance For Streaming Data With Partially-Ordered Domains, Yuan Fang, Chee-Yong Chan
Research Collection School Of Computing and Information Systems
We address the problem of skyline query processing for a count-based window of continuous streaming data that involves both totally- and partially-ordered attribute domains. In this problem, a fixed-size buffer of the N most recent tuples is dynamically maintained and the key challenge is how to efficiently maintain the skyline of the sliding window of N tuples as new tuples arrive and old tuples expire. We identify the limitations of the state-of-the-art approach STARS, and propose two new approaches, STARS+ and SkyGrid, to address its drawbacks. STARS+ is an enhancement of STARS with three new optimization techniques, while SkyGrid is …
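As a rough illustration of the problem setting only (not of STARS, STARS+, or SkyGrid), the sketch below recomputes the skyline of a count-based window from scratch after every arrival. It assumes two-attribute tuples: a number where smaller is better, and a category compared under a hypothetical partial order `PREFERS` that is assumed to be transitively closed. All names are illustrative.

```python
from collections import deque

# Hypothetical partial order on the categorical attribute: (x, y) means x is
# preferred to y. The set is assumed to be transitively closed.
PREFERS = {("a", "b"), ("a", "c")}


def at_least_as_good(x, y):
    """True if category x equals y or is preferred to y."""
    return x == y or (x, y) in PREFERS


def dominates(s, t):
    """s dominates t: no worse on both attributes and strictly better on one."""
    no_worse = s[0] <= t[0] and at_least_as_good(s[1], t[1])
    strictly_better = s[0] < t[0] or (s[1], t[1]) in PREFERS
    return no_worse and strictly_better


def skyline(window):
    """Naive quadratic skyline: keep tuples that no other tuple dominates."""
    return [t for t in window if not any(dominates(s, t) for s in window)]


def stream_skylines(stream, n):
    """Yield the skyline of the n most recent tuples after each arrival."""
    window = deque(maxlen=n)  # the oldest tuple expires automatically
    for t in stream:
        window.append(t)
        yield skyline(window)
```

Recomputing from scratch costs quadratic time per arrival, which is the kind of overhead that incremental maintenance schemes aim to avoid. Note that `("c", "b")` is incomparable under `PREFERS`, so two tuples that differ only by those categories can both stay in the skyline.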
Optimal Matching Between Spatial Datasets Under Capacity Constraints,
2010
University of Hong Kong
Optimal Matching Between Spatial Datasets Under Capacity Constraints, Hou U Leong, Kyriakos Mouratidis, Man Lung Yiu, Nikos Mamoulis
Research Collection School Of Computing and Information Systems
Consider a set of customers (e.g., WiFi receivers) and a set of service providers (e.g., wireless access points), where each provider has a capacity and the quality of service offered to its customers is inversely proportional to their distance. The capacity constrained assignment (CCA) is a matching between the two sets such that (i) each customer is assigned to at most one provider, (ii) every provider serves no more customers than its capacity, (iii) the maximum possible number of customers are served, and (iv) the sum of Euclidean distances within the assigned provider-customer pairs is minimized. Although max-flow algorithms are applicable …
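The four CCA conditions can be made concrete with an exhaustive search over tiny inputs. This is only a sketch of the definition, not the paper's algorithm; the function name and argument layout are assumptions, and the running time is exponential in the number of customers.

```python
import math
from itertools import product


def cca_brute_force(customers, providers, capacity):
    """Return (assignment, served, total_distance) for a tiny CCA instance.

    assignment[i] is the provider index for customer i, or None if unserved.
    """
    options = list(range(len(providers))) + [None]
    best_key, best = None, None
    for assign in product(options, repeat=len(customers)):
        # Condition (ii): no provider exceeds its capacity.
        load = [0] * len(providers)
        for p in assign:
            if p is not None:
                load[p] += 1
        if any(used > cap for used, cap in zip(load, capacity)):
            continue
        served = sum(p is not None for p in assign)
        cost = sum(
            math.dist(customers[i], providers[p])
            for i, p in enumerate(assign)
            if p is not None
        )
        # Conditions (iii) and (iv): maximize served, then minimize distance.
        key = (-served, cost)
        if best_key is None or key < best_key:
            best_key, best = key, assign
    return best, -best_key[0], best_key[1]
```

Condition (i) holds by construction, because each customer receives exactly one entry in the assignment tuple.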
Mining Diversity On Networks,
2010
Tsinghua University
Mining Diversity On Networks, Lu Liu, Feida Zhu, Chen Chen, Xifeng Yan, Jiawei Han, Philip Yu, Shiqiang Yang
Research Collection School Of Computing and Information Systems
Despite the recent emergence of many large-scale networks in different application domains, an important measure that captures a participant’s diversity in the network has been largely neglected in previous studies. Namely, diversity characterizes how diversely a given node connects with its peers. In this paper, we give a comprehensive study of this concept. We first lay out two criteria that capture the semantic meaning of diversity, and then propose a compliant definition which is simple enough to embed the idea. An efficient top-k diversity ranking algorithm is developed for computation on dynamic networks. Experiments on both synthetic and real datasets …
Managing Media Rich Geo-Spatial Annotations For A Map-Based Mobile Application Using Clustering,
2010
Nanyang Technological University
Managing Media Rich Geo-Spatial Annotations For A Map-Based Mobile Application Using Clustering, Khasfariyati Razikin, Dion Hoe-Lian Goh, Ee Peng Lim, Aixin Sun, Yin-Leng Theng, Thi Nhu Quynh Kim, Kalyani Chatterjea, Chew-Hung Chang
Research Collection School Of Computing and Information Systems
With the prevalence of mobile devices that are equipped with wireless Internet capabilities and Global Positioning System (GPS) functionality, the creation and access of user-generated content are extended to users on the go. Such content is tied to real world objects, in the form of geospatial annotations, and it is only natural that these annotations are visualized using a map-based approach. However, viewing maps that are filled with annotations could hinder the serendipitous discovery of data, especially on the small screens of mobile devices. This calls for a way to manage the annotations. In this paper, we introduce a mobile …
Algorithms For Constrained K-Nearest Neighbor Queries Over Moving Object Trajectories,
2010
Singapore Management University
Algorithms For Constrained K-Nearest Neighbor Queries Over Moving Object Trajectories, Yunjun Gao, Baihua Zheng, Gencai Chen, Qing Li, Chun Chen
Research Collection School Of Computing and Information Systems
An important query for spatio-temporal databases is to find nearest trajectories of moving objects. Existing work on this topic focuses on the closest trajectories in the whole data space. In this paper, we introduce and solve constrained k-nearest neighbor (CkNN) queries and historical continuous CkNN (HCCkNN) queries on R-tree-like structures storing historical information about moving object trajectories. Given a trajectory set D, a query object (point or trajectory) q, a temporal extent T, and a constrained region CR, (i) a CkNN query over trajectories retrieves from D within T, the k (≥ 1) trajectories that lie closest to q and …
Generating Synonyms Based On Query Log Data,
2010
Singapore Management University
Generating Synonyms Based On Query Log Data, Stelios Paparizos, Tao Cheng, Hady W. Lauw
Research Collection School Of Computing and Information Systems
An approach is described for generating synonyms to supplement at least one information item, such as, in one case, a set of related items. The approach can involve an expansion phase, a clean-up phase, and a reduction phase. In the expansion phase, the approach identifies, for each related item, a set of initial synonym candidates. In the clean-up phase, the approach removes noise from the set of initial synonym candidates (if such noise exists), to provide a set of filtered synonym candidate items. In the reduction phase, the approach ranks and applies a threshold (or thresholds) to the set of …
A Social Network Based Study Of Software Team Dynamics,
2010
Singapore Management University
A Social Network Based Study Of Software Team Dynamics, Subhajit Datta, Vikrant S. Kaulgoud, Vibhu Saujanya Sharma, Nishant Kumar
Research Collection School Of Computing and Information Systems
Members of software project teams have specific roles and responsibilities which are formally defined during project inception or at the start of a life cycle activity. Often, the team structure undergoes spontaneous changes as delivery deadlines draw near and critical tasks have to be completed. Some members -- depending on their skill or seniority -- need to take on more responsibilities, while others end up being peripheral to the project's execution. We posit that this kind of ad hoc reorganization of a team's structure can be discerned from the project's bug tracker. In this paper, we extract a social network …
Adaptive Ensemble Classification In P2p Networks,
2010
Nanyang Technological University
Adaptive Ensemble Classification In P2p Networks, Hock Hee Ang, Vivekanand Gopalkrishnan, Steven C. H. Hoi, Wee Keong Ng
Research Collection School Of Computing and Information Systems
Classification in P2P networks has become an important research problem in data mining due to the popularity of P2P computing environments. This is still an open difficult research problem due to a variety of challenges, such as non-i.i.d. data distribution, skewed or disjoint class distribution, scalability, peer dynamism and asynchronism. In this paper, we present a novel P2P Adaptive Classification Ensemble (PACE) framework to perform classification in P2P networks. Unlike regular ensemble classification approaches, our new framework adapts to the test data distribution and dynamically adjusts the voting scheme by combining a subset of classifiers/peers according to the test data …
Continuous Spatial Assignment Of Moving Users,
2010
University of Hong Kong
Continuous Spatial Assignment Of Moving Users, Hou U Leong, Kyriakos Mouratidis, Nikos Mamoulis
Research Collection School Of Computing and Information Systems
Consider a set of servers and a set of users, where each server has a coverage region (i.e., an area of service) and a capacity (i.e., a maximum number of users it can serve). Our task is to assign every user to one server subject to the coverage and capacity constraints. To offer the highest quality of service, we wish to minimize the average distance between users and their assigned server. This is an instance of a well-studied problem in operations research, termed optimal assignment. Even though there exist several solutions for the static case (where user locations are fixed), …
Pagesense: Style-Wise Web Page Advertising,
2010
Singapore Management University
Pagesense: Style-Wise Web Page Advertising, Lusong Li, Tao Mei, Xiang Niu, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
This paper presents an innovative style-wise advertising platform for web pages. Web page “style” mainly refers to visual effects, such as color and layout. Unlike popular ad networks such as Google AdSense, which require publishers to change the original structure of their pages and define the position and style of the embedded ads manually, style-wise page advertising aims to automatically deliver style-consistent ads at proper positions within the web page, without breaking the layout of the original page. Our system is motivated by the fact that almost 90% of web pages contain blank regions without any content. Given a web …
Data Mining Based Predictive Models For Overall Health Indices,
2010
University of Minnesota - Twin Cities
Data Mining Based Predictive Models For Overall Health Indices, Ridhima Rajkumar, Kyong Jin Shim, Jaideep Srivastava
Research Collection School Of Computing and Information Systems
In this study, we infer health care indices of individuals using their pharmacy medical and prescription claims. Specifically, we focus on the widely used Charlson Index. We use data mining techniques to formulate the problem of classifying Charlson Index (CI) and build predictive models to predict individual health index score. First, we present comparative analyses of several classification algorithms. Second, our study shows that certain ensemble algorithms lead to higher prediction accuracy in comparison to base algorithms. Third, we introduce cost-sensitive learning to the classification algorithms and show that the inclusion of cost-sensitive learning leads to improved prediction accuracy. The …
Do You Trust To Get Trust? A Study Of Trust Reciprocity Behaviors And Reciprocal Trust Prediction,
2010
Singapore Management University
Do You Trust To Get Trust? A Study Of Trust Reciprocity Behaviors And Reciprocal Trust Prediction, Viet-An Nguyen, Ee Peng Lim, Hwee Hoon Tan, Jing Jiang, Aixin Sun
Research Collection School Of Computing and Information Systems
Trust reciprocity, a special form of link reciprocity, exists in many networks of trust among users. In this paper, we seek to determine the extent to which reciprocity exists in a trust network and develop quantitative models for measuring reciprocity and reciprocity-related behaviors. We identify several reciprocity behaviors and their respective measures. These behavior measures can be employed for predicting if a trustee will return trust to her trustor given that the latter initiates a trust link earlier. We develop for this reciprocal trust prediction task a number of ranking and classification methods, and evaluate them on an …
Incommensurability And Multi-Paradigm Grounding In Design Science Research: Implications For Creating Knowledge,
2010
Bond University
Incommensurability And Multi-Paradigm Grounding In Design Science Research: Implications For Creating Knowledge, Dirk S. Hovorka
Dirk Hovorka
The ‘problem identification-design-build-evaluate-theorize’ structure of Design Science Research has been proposed as an approach to creating knowledge in information systems and in broader organizational and social domains. Although the approach has merit, the philosophical foundations of two specific components warrant attention. First, the grounding of design theory on potentially incommensurate kernel theories may produce incoherent design theory. In addition, the newly designed theory has no strong logical connection to the kernel theories, and so cannot be used to test or validate the contributing kernel theories. Second, the philosophical grounding of evaluation may inadvertently shift from functionally-based measures of utility and …
Tredd—A Database For Tandem Repeats Over The Edit Distance,
2010
CUNY Brooklyn College
Tredd—A Database For Tandem Repeats Over The Edit Distance, Dina Sokol, Firat Atagun
Publications and Research
A ‘tandem repeat’ in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats are common in the genomes of both eukaryotic and prokaryotic organisms. They are significant markers for human identity testing, disease diagnosis, sequence homology and population studies. In this article, we describe a new database, TRedD, which contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon …
Multi-Objective Constraint Satisfaction For Mobile Robot Area Defense,
2010
Air Force Institute of Technology
Multi-Objective Constraint Satisfaction For Mobile Robot Area Defense, Kenneth W. Mayo
Theses and Dissertations
In developing multi-robot cooperative systems, there are often competing objectives that need to be met. For example, in automating area defense systems, multiple robots must work together to explore the entire area and maintain consistent communications to alert the other agents and ensure trust in the system. This research presents an algorithm that tasks robots to meet the two specific goals of exploration and communication maintenance in an uncoordinated environment, reducing the need for a user to pre-balance the objectives. This multi-objective problem is defined as a constraint satisfaction problem solved using the Non-dominated Sorting Genetic Algorithm II (NSGA-II). Both …
Bay Audio Repair Website & Data Management Application,
2010
California Polytechnic State University - San Luis Obispo
Bay Audio Repair Website & Data Management Application, Michael Shelley
Computer Science and Software Engineering
The goal of this senior project was to build a website and software application to receive and manage audio equipment repair requests for a small startup company called Bay Audio Repair (BAR). Furthermore, it allowed me to gain experience in web development and software engineering practices, specifically requirements gathering, design and implementation. The website provides an online interface for BAR’s customers to request repairs and the application allows BAR employees to update the progress of a repair. Several technologies were used in the system’s construction: HTML, XML, PHP, and C#.
Top-K Aggregation Queries Over Large Networks,
2010
University of California at Santa Barbara, USA
Top-K Aggregation Queries Over Large Networks, Xifeng Yan, Bin He, Feida Zhu, Jiawei Han
Research Collection School Of Computing and Information Systems
Searching and mining large graphs today is critical to a variety of application domains, ranging from personalized recommendation in social networks, to searches for functional associations in biological pathways. In these domains, there is a need to perform aggregation operations on large-scale networks. Unfortunately the existing implementation of aggregation operations on relational databases does not guarantee superior performance in network space, especially when it involves edge traversals and joins of gigantic tables. In this paper, we investigate the neighborhood aggregation queries: Find nodes that have top-k highest aggregate values over their h-hop neighbors. While these basic queries are common in …
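A naive baseline for the neighborhood aggregation query can be sketched as follows. It assumes SUM as the aggregate, an adjacency-list dictionary for the graph, and that a node's own score is excluded from its h-hop neighborhood; these choices and all names are illustrative and are not taken from the paper.

```python
import heapq
from collections import deque


def h_hop_neighbors(graph, source, h):
    """Nodes reachable from source within h hops (source excluded), via BFS."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == h:
            continue  # do not expand beyond h hops
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    seen.discard(source)
    return seen


def top_k_by_neighborhood_sum(graph, score, h, k):
    """Rank every node by the sum of scores over its h-hop neighbors."""
    totals = {
        v: sum(score[u] for u in h_hop_neighbors(graph, v, h)) for v in graph
    }
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])
```

This baseline runs one BFS per node, so its cost grows quickly with graph size and h.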
K-Anonymity In The Presence Of External Databases,
2010
Singapore Management University
K-Anonymity In The Presence Of External Databases, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitris Papadias
Research Collection School Of Computing and Information Systems
The concept of k-anonymity has received considerable attention due to the need of several organizations to release microdata without revealing the identity of individuals. Although all previous k-anonymity techniques assume the existence of a public database (PD) that can be used to breach privacy, none utilizes PD during the anonymization process. Specifically, existing generalization algorithms create anonymous tables using only the microdata table (MT) to be published, independently of the external knowledge available. This omission leads to high information loss. Motivated by this observation we first introduce the concept of k-join-anonymity (KJA), which permits more effective generalization to reduce the …
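For orientation, the standard k-anonymity condition (not the k-join-anonymity variant introduced in the paper) can be checked with a short sketch. It assumes the table is a list of dictionaries and that the caller names the quasi-identifier columns; the function and column names are hypothetical.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Illustrative generalized table: ages bucketed, zip codes truncated.
table = [
    {"age": "30-39", "zip": "123**", "disease": "flu"},
    {"age": "30-39", "zip": "123**", "disease": "cold"},
    {"age": "40-49", "zip": "456**", "disease": "flu"},
]
```

In this sample, the single row in the `("40-49", "456**")` group is what prevents the table from being 2-anonymous.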
Symphony: A Platform For Search-Driven Applications,
2010
Microsoft Research
Symphony: A Platform For Search-Driven Applications, John C. Shafer, Rakesh Agrawal, Hady W. Lauw
Research Collection School Of Computing and Information Systems
We present the design of Symphony, a platform that enables non-developers to build and deploy a new class of search-driven applications that combine their data and domain expertise with content from search engines and other web services. The Symphony prototype has been built on top of Microsoft's Bing infrastructure. While Symphony naturally makes use of the customization capabilities exposed by Bing, its distinguishing feature is the capability it provides to the application creator to combine their proprietary data and domain expertise with content obtained from Bing. They can also integrate specialized data obtained from web services to enhance the richness …
Differentiating Knowledge Processes In Organisational Learning: A Case Of “Two Solitudes”,
2010
Singapore Management University
Differentiating Knowledge Processes In Organisational Learning: A Case Of “Two Solitudes”, Siu Loon Hoe, Steven Mcshane
Research Collection School Of Computing and Information Systems
The fields of organizational behavior (OB)/strategy and marketing have taken different paths over the past two decades to understanding organizational learning. OB/strategy has been pre-occupied with theory development and case study illustrations, whereas marketing has taken a highly quantitative path. Although relying on essentially the same foundation theory, the two disciplines have had minimal cross-fertilization. Furthermore, both fields tend to blur or usually ignore the distinction between structural and informal knowledge processes. The purpose of the paper is to highlight the distinction between informal and structural knowledge acquisition and dissemination processes and propose new definitions to differentiate them. Future research …