Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 30 of 35

Full-Text Articles in Databases and Information Systems

Direct Neighbor Search, Jilian Zhang, Kyriakos Mouratidis, Hwee Hwa Pang Jun 2014

Kyriakos MOURATIDIS

In this paper we study a novel query type, called direct neighbor query. Two objects in a dataset are direct neighbors (DNs) if a window selection may exclusively retrieve these two objects. Given a source object, a DN search computes all of its direct neighbors in the dataset. The DNs define a new type of affinity that differs from existing formulations (e.g., nearest neighbors, nearest surrounders, reverse nearest neighbors, etc.) and finds application in domains where user interests are expressed in the form of windows, i.e., multi-attribute range selections. Drawing on key properties of the DN relationship, we develop an …
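The DN definition above can be checked by brute force. Any window that contains two objects also contains their minimum bounding rectangle, so (assuming closed range boundaries) two objects are direct neighbors exactly when that rectangle holds no third object. The sketch below applies this test to every candidate; the function names are hypothetical, and it illustrates only the definition, not the algorithm developed in the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def in_window(p: Point, lo: Point, hi: Point) -> bool:
    """True if p lies inside the closed window [lo, hi] on every attribute."""
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))


def direct_neighbors(source_idx: int, data: Sequence[Point]) -> List[int]:
    """Return indices of all objects that are direct neighbors of data[source_idx].

    Brute force, O(n^2): for each candidate, build the minimum bounding
    rectangle of the pair and check that no third object falls inside it.
    """
    s = data[source_idx]
    result = []
    for j, q in enumerate(data):
        if j == source_idx:
            continue
        lo = tuple(min(a, b) for a, b in zip(s, q))
        hi = tuple(max(a, b) for a, b in zip(s, q))
        blocked = any(
            in_window(r, lo, hi)
            for k, r in enumerate(data)
            if k != source_idx and k != j
        )
        if not blocked:
            result.append(j)
    return result


data = [(0, 0), (1, 1), (2, 2), (0, 3)]
print(direct_neighbors(0, data))
```

In this toy dataset, (2, 2) is not a direct neighbor of (0, 0) because any window covering both must also cover (1, 1).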


Shortlisting Top-K Assignments, Yimin Lin, Kyriakos Mouratidis Aug 2013

Kyriakos MOURATIDIS

In this paper we identify a novel query type, the top-K assignment query (αTop-K). Consider a set of objects and a set of suppliers, where each object must be assigned to one supplier. Assume that there is a cost associated with every object-supplier pair. If we allocate each object to the supplier with the smallest cost (for the specific object), the derived overall assignment will have the minimum total cost. In many scenarios, however, runner-up assignments may be required too, for example when a decision maker needs to take into account additional considerations not captured by individual object-supplier costs. In this …
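To make the query concrete, the following is a minimal brute-force sketch that enumerates every assignment of objects to suppliers and keeps the K cheapest. The cost-matrix layout and the function name are assumptions made for illustration; exhaustive enumeration is exponential in the number of objects and is not the method proposed in the paper.

```python
import heapq
from itertools import product
from typing import List, Sequence, Tuple


def top_k_assignments(
    costs: Sequence[Sequence[float]], k: int
) -> List[Tuple[float, Tuple[int, ...]]]:
    """Return the k cheapest assignments as (total_cost, assignment) pairs.

    costs[o][s] is the cost of giving object o to supplier s.
    assignment[o] is the supplier chosen for object o.
    """
    num_suppliers = len(costs[0])
    # Every object independently picks one supplier.
    all_assignments = product(range(num_suppliers), repeat=len(costs))
    scored = (
        (sum(costs[o][s] for o, s in enumerate(assignment)), assignment)
        for assignment in all_assignments
    )
    return heapq.nsmallest(k, scored)


costs = [[1, 4],   # object 0: supplier 0 costs 1, supplier 1 costs 4
         [2, 3]]   # object 1: supplier 0 costs 2, supplier 1 costs 3
print(top_k_assignments(costs, 2))
```

The first result always coincides with picking the cheapest supplier for each object separately; the remaining entries are the runner-up assignments.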


Best Upgrade Plans For Large Road Networks, Yimin Lin, Kyriakos Mouratidis Aug 2013

Kyriakos MOURATIDIS

In this paper, we consider a new problem in the context of road network databases, named Resource Constrained Best Upgrade Plan computation (BUP, for short). Consider a transportation network (weighted graph) G where a subset of the edges are upgradable, i.e., for each such edge there is a cost which, if spent, reduces the weight of the edge to a specific new value. Given a source and a destination in G, and a budget (resource constraint) B, the BUP problem is to identify which upgradable edges should be upgraded so that the shortest path distance between source and …
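The problem statement can be illustrated with an exhaustive sketch: try every subset of upgradable edges whose total cost fits the budget, and run Dijkstra's algorithm on the upgraded graph. The undirected-graph assumption, the data layout, and all names below are hypothetical choices for this example; enumerating subsets is exponential and serves only to pin down the objective, not to reproduce the paper's techniques.

```python
import heapq
from itertools import combinations
from math import inf
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight), assumed undirected


def shortest_distance(num_nodes: int, edges: List[Edge], source: int, target: int) -> float:
    """Plain Dijkstra over an undirected weighted graph."""
    adj: List[List[Tuple[int, float]]] = [[] for _ in range(num_nodes)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [inf] * num_nodes
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == target:
            return d
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist[target]


def best_upgrade_plan(
    num_nodes: int,
    edges: List[Edge],
    upgrades: Dict[int, Tuple[float, float]],  # edge index -> (cost, new weight)
    budget: float,
    source: int,
    target: int,
) -> Tuple[float, FrozenSet[int]]:
    """Return (shortest distance, upgraded edge indices) for the best affordable plan."""
    best = (shortest_distance(num_nodes, edges, source, target), frozenset())
    ids = sorted(upgrades)
    for r in range(1, len(ids) + 1):
        for chosen in combinations(ids, r):
            if sum(upgrades[e][0] for e in chosen) > budget:
                continue  # plan exceeds the resource constraint
            upgraded = [
                (u, v, upgrades[i][1]) if i in chosen else (u, v, w)
                for i, (u, v, w) in enumerate(edges)
            ]
            d = shortest_distance(num_nodes, upgraded, source, target)
            if d < best[0]:
                best = (d, frozenset(chosen))
    return best


edges = [(0, 1, 5), (1, 2, 5), (0, 2, 12)]
upgrades = {0: (3, 1), 2: (4, 7)}  # edge 0: pay 3 to get weight 1; edge 2: pay 4 to get weight 7
print(best_upgrade_plan(3, edges, upgrades, budget=4, source=0, target=2))
```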


Computing Immutable Regions For Subspace Top-K Queries, Kyriakos Mouratidis, Hwee Hwa Pang Dec 2012

Kyriakos MOURATIDIS

Given a high-dimensional dataset, a top-k query can be used to shortlist the k tuples that best match the user’s preferences. Typically, these preferences regard a subset of the available dimensions (i.e., attributes) whose relative significance is expressed by user-specified weights. Along with the query result, we propose to compute for each involved dimension the maximal deviation to the corresponding weight for which the query result remains valid. The derived weight ranges, called immutable regions, are useful for performing sensitivity analysis, for fine-tuning the query weights, etc. In this paper, we focus on top-k queries with linear preference functions over …


Shortest Path Computation With No Information Leakage, Kyriakos Mouratidis, Man Lung Yiu Jul 2012

Kyriakos MOURATIDIS

Shortest path computation is one of the most common queries in location-based services (LBSs). Although particularly useful, such queries raise serious privacy concerns. Exposing to a (potentially untrusted) LBS the client’s position and her destination may reveal personal information, such as social habits, health condition, shopping preferences, lifestyle choices, etc. The only existing method for privacy-preserving shortest path computation follows the obfuscation paradigm; it prevents the LBS from inferring the source and destination of the query with a probability higher than a threshold. This implies, however, that the LBS still deduces some information (albeit not exact) about the client’s location …


Enhancing Access Privacy Of Range Retrievals Over B+Trees, Hwee Hwa Pang, Jilian Zhang, Kyriakos Mouratidis Jul 2012

Kyriakos MOURATIDIS

Users of databases that are hosted on shared servers cannot take for granted that their queries will not be disclosed to unauthorized parties. Even if the database is encrypted, an adversary who is monitoring the I/O activity on the server may still be able to infer some information about a user query. For the particular case of a B+-tree that has its nodes encrypted, we identify properties that enable the ordering among the leaf nodes to be deduced. These properties allow us to construct adversarial algorithms to recover the B+-tree structure from the I/O traces generated by range queries. Combining …


Heuristic Algorithms For Balanced Multi-Way Number Partitioning, Jilian Zhang, Kyriakos Mouratidis, Hwee Hwa Pang Jul 2012

Kyriakos MOURATIDIS

Balanced multi-way number partitioning (BMNP) seeks to split a collection of numbers into subsets with (roughly) the same cardinality and subset sum. The problem is NP-hard, and there are several exact and approximate algorithms for it. However, existing exact algorithms solve only the simpler, balanced two-way number partitioning variant, whereas the most effective approximate algorithm, BLDM, may produce widely varying subset sums. In this paper, we introduce the LRM algorithm that lowers the expected spread in subset sums to one third that of BLDM for uniformly distributed numbers and odd subset cardinalities. We also propose Meld, a novel strategy for …


Spatial Queries In The Presence Of Obstacles, Jun Zhang, Dimitris Papadias, Kyriakos Mouratidis, Manli Zhu Dec 2010

Kyriakos MOURATIDIS

Despite the existence of obstacles in many database applications, traditional spatial query processing utilizes the Euclidean distance metric assuming that points in space are directly reachable. In this paper, we study spatial queries in the presence of obstacles, where the obstructed distance between two points is defined as the length of the shortest path that connects them without crossing any obstacles. We propose efficient algorithms for the most important query types, namely, range search, nearest neighbors, e-distance joins and closest pairs, considering that both data objects and obstacles are indexed by R-trees. The effectiveness of the proposed solutions is verified …


Tree-Based Partition Querying: A Methodology For Computing Medoids In Large Spatial Datasets, Kyriakos Mouratidis, Dimitris Papadias, Spiros Papadimitriou Dec 2010

Kyriakos MOURATIDIS

Besides traditional domains (e.g., resource allocation, data mining applications), algorithms for medoid computation and related problems will play an important role in numerous emerging fields, such as location-based services and sensor networks. Since the k-medoid problem is NP-hard, all existing work deals with approximate solutions on relatively small datasets. This paper aims at efficient methods for very large spatial databases, motivated by: (i) the high and ever-increasing availability of spatial data, and (ii) the need for novel query types and improved services. The proposed solutions exploit the intrinsic grouping properties of a data partition index in order …


Partially Materialized Digest Scheme: An Efficient Verification Method For Outsourced Databases, Kyriakos Mouratidis, Dimitris Sacharidis, Hwee Hwa Pang Dec 2010

Kyriakos MOURATIDIS

In the outsourced database model, a data owner publishes her database through a third-party server; i.e., the server hosts the data and answers user queries on behalf of the owner. Since the server may not be trusted, or may be compromised, users need a means to verify that answers received are both authentic and complete, i.e., that the returned data have not been tampered with, and that no qualifying results have been omitted. We propose a result verification approach for one-dimensional queries, called Partially Materialized Digest scheme (PMD), that applies to both static and dynamic databases. PMD uses separate indexes …


Preventing Location-Based Identity Inference In Anonymous Spatial Queries, Panos Kalnis, Gabriel Ghinita, Kyriakos Mouratidis, Dimitris Papadias Dec 2010

Kyriakos MOURATIDIS

The increasing trend of embedding positioning capabilities (for example, GPS) in mobile devices facilitates the widespread use of location-based services. For such applications to succeed, privacy and confidentiality are essential. Existing privacy-enhancing techniques rely on encryption to safeguard communication channels, and on pseudonyms to protect user identities. Nevertheless, the query contents may disclose the physical location of the user. In this paper, we present a framework for preventing location-based identity inference of users who issue spatial queries to location-based services. We propose transformations based on the well-established K-anonymity concept to compute exact answers for range and nearest neighbor search, without …


Anonymous Query Processing In Road Networks, Kyriakos Mouratidis, Man Lung Yiu Dec 2010

Kyriakos MOURATIDIS

The increasing availability of location-aware mobile devices has given rise to a flurry of location-based services (LBSs). Due to the nature of spatial queries, an LBS needs the user position in order to process her requests. On the other hand, revealing exact user locations to a (potentially untrusted) LBS may pinpoint their identities and breach their privacy. To address this issue, spatial anonymity techniques obfuscate user locations, forwarding to the LBS a sufficiently large region instead. Existing methods explicitly target processing in the Euclidean space and do not apply when proximity to the users is defined according to network distance …


Scalable Verification For Outsourced Dynamic Databases, Hwee Hwa Pang, Jilian Zhang, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

Query answers from servers operated by third parties need to be verified, as the third parties may not be trusted or their servers may be compromised. Most of the existing authentication methods construct validity proofs based on the Merkle hash tree (MHT). The MHT, however, imposes severe concurrency constraints that slow down data updates. We introduce a protocol, built upon signature aggregation, for checking the authenticity, completeness and freshness of query answers. The protocol offers the important property of allowing new data to be disseminated immediately, while ensuring that outdated values beyond a pre-set age can be detected. We also …


Continuous Nearest Neighbor Queries Over Sliding Windows, Kyriakos Mouratidis, Dimitris Papadias Dec 2010

Kyriakos MOURATIDIS

Recent research has focused on continuous monitoring of nearest neighbors (NN) in highly dynamic scenarios, where the queries and the data objects move frequently and arbitrarily. All existing methods, however, assume the Euclidean distance metric. In this paper we study k-NN monitoring in road networks, where the distance between a query and a data object is determined by the length of the shortest path connecting them. We propose two methods that can handle arbitrary object and query moving patterns, as well as fluctuations of edge weights. The first one maintains the query results by processing only updates that may invalidate …


K-Anonymity In The Presence Of External Databases, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitris Papadias Dec 2010

Kyriakos MOURATIDIS

The concept of k-anonymity has received considerable attention due to the need of several organizations to release microdata without revealing the identity of individuals. Although all previous k-anonymity techniques assume the existence of a public database (PD) that can be used to breach privacy, none utilizes PD during the anonymization process. Specifically, existing generalization algorithms create anonymous tables using only the microdata table (MT) to be published, independently of the external knowledge available. This omission leads to high information loss. Motivated by this observation we first introduce the concept of k-join-anonymity (KJA), which permits more effective generalization to reduce the …


Aggregate Nearest Neighbor Queries In Spatial Databases, Dimitris Papadias, Yufei Tao, Kyriakos Mouratidis, Chun Kit Hui Dec 2010

Kyriakos MOURATIDIS

Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations q1, ..., qn, an ANN query outputs the facility p ∈ P that minimizes the sum of distances |pqi| for 1 ≤ i ≤ n that the users have to travel in order to meet there. Similarly, another ANN query may report the point p ∈ P that minimizes the maximum distance that …
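As a small illustration of the query semantics, the sketch below scans P linearly and returns the point with the smallest aggregate Euclidean distance to Q, for either the sum or the maximum. The function name and the `aggregate` parameter are hypothetical; a linear scan ignores any spatial index and is only meant to show what the query returns.

```python
from math import dist
from typing import Callable, Iterable, Sequence, Tuple

Point = Tuple[float, float]


def aggregate_nn(
    P: Sequence[Point],
    Q: Sequence[Point],
    aggregate: Callable[[Iterable[float]], float] = sum,
) -> Point:
    """Return the point of P minimizing aggregate(|pq| for q in Q).

    Pass aggregate=sum for total travel distance, or aggregate=max for
    the longest distance any single user has to travel.
    """
    return min(P, key=lambda p: aggregate(dist(p, q) for q in Q))


facilities = [(0, 0), (5, 0)]
users = [(0, 0), (0, 0), (10, 0)]
print(aggregate_nn(facilities, users, sum))  # minimizes total distance
print(aggregate_nn(facilities, users, max))  # minimizes the worst-case distance
```

The two aggregates can disagree: here two of the three users sit on the first facility, which favors it under the sum, while the second facility is more central and wins under the maximum.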


Continuous Medoid Queries Over Moving Objects, Stavros Papadopoulos, Dimitris Sacharidis, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

In the k-medoid problem, given a dataset P, we are asked to choose k points in P as the medoids. The optimal medoid set minimizes the average Euclidean distance between the points in P and their closest medoid. Finding the optimal k medoids is NP-hard, and existing algorithms aim at approximate answers, i.e., they compute medoids that achieve a small, yet not minimal, average distance. In this paper, we also aim at approximate solutions. We consider, however, the continuous version of the problem, where the points in P move and our task is to maintain the medoid set on-the-fly …


Query Processing In Spatial Databases Containing Obstacles, Jun Zhang, Dimitris Papadias, Kyriakos Mouratidis, Manli Zhu Dec 2010

Kyriakos MOURATIDIS

Despite the existence of obstacles in many database applications, traditional spatial query processing assumes that points in space are directly reachable and utilizes the Euclidean distance metric. In this paper, we study spatial queries in the presence of obstacles, where the obstructed distance between two points is defined as the length of the shortest path that connects them without crossing any obstacles. We propose efficient algorithms for the most important query types, namely, range search, nearest neighbours, e-distance joins, closest pairs and distance semi-joins, assuming that both data objects and obstacles are indexed by R-trees. The effectiveness of the proposed …


Continuous Monitoring Of Spatial Queries In Wireless Broadcast Environments, Kyriakos Mouratidis, Spiridon Bakiras, Dimitris Papadias Dec 2010

Kyriakos MOURATIDIS

Wireless data broadcast is a promising technique for information dissemination that leverages the computational capabilities of the mobile devices in order to enhance the scalability of the system. Under this environment, the data are continuously broadcast by the server, interleaved with some indexing information for query processing. Clients may then tune in to the broadcast channel and process their queries locally without contacting the server. Previous work on spatial query processing for wireless broadcast systems has only considered snapshot queries over static data. In this paper, we propose an air indexing framework that 1) outperforms the existing (i.e., snapshot) techniques in …


Authenticating The Query Results Of Text Search Engines, Hwee Hwa Pang, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

The number of successful attacks on the Internet shows that it is very difficult to guarantee the security of online search engines. A breached server that is not detected in time may return incorrect results to the users. To prevent that, we introduce a methodology for generating an integrity proof for each search result. Our solution is targeted at search engines that perform similarity-based document retrieval, and utilize an inverted list implementation (as most search engines do). We formulate the properties that define a correct result, map the task of processing a text search query to adaptations of existing threshold-based …


Efficient Evaluation Of Continuous Text Search Queries, Kyriakos Mouratidis, Hwee Hwa Pang Dec 2010

Kyriakos MOURATIDIS

Consider a text filtering server that monitors a stream of incoming documents for a set of users, who register their interests in the form of continuous text search queries. The task of the server is to constantly maintain for each query a ranked result list, comprising the recent documents (drawn from a sliding window) with the highest similarity to the query. Such a system underlies many text monitoring applications that need to cope with heavy document traffic, such as news and email monitoring. In this paper, we propose the first solution for processing continuous text queries efficiently. Our objective is to …


Conceptual Partitioning: An Efficient Method For Continuous Nearest Neighbor Monitoring, Kyriakos Mouratidis, Marios Hadjieleftheriou, Dimitris Papadias Dec 2010

Kyriakos MOURATIDIS

Given a set of objects P and a query point q, a k nearest neighbor (k-NN) query retrieves the k objects in P that lie closest to q. Even though the problem is well-studied for static datasets, the traditional methods do not extend to highly dynamic environments where multiple continuous queries require real-time results, and both objects and queries receive frequent location updates. In this paper we propose conceptual partitioning (CPM), a comprehensive technique for the efficient monitoring of continuous NN queries. CPM achieves low running time by handling location updates only from objects that fall in the vicinity of …


An Incremental Threshold Method For Continuous Text Search Queries, Kyriakos Mouratidis, Hwee Hwa Pang Dec 2010

Kyriakos MOURATIDIS

A text filtering system monitors a stream of incoming documents, to identify those that match the interest profiles of its users. The user interests are registered at a server as continuous text search queries. The server constantly maintains for each query a ranked result list, comprising the recent documents (drawn from a sliding window) with the highest similarity to the query. Such a system underlies many text monitoring applications that need to cope with heavy document traffic, such as news and email monitoring. In this paper, we propose the first solution for processing continuous text queries efficiently. Our objective is …


Spatial Cloaking Revisited: Distinguishing Information Leakage From Anonymity, Kar Way Tan, Yimin Lin, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

Location-based services (LBSs) are gaining popularity as they provide mobile users with on-demand information. The use of these services, however, poses privacy issues as the user locations and queries are exposed to untrusted LBSs. Spatial cloaking techniques provide privacy in the form of k-anonymity; i.e., they guarantee that the (location of the) querying user u is indistinguishable from at least k-1 others, where k is a parameter specified by u at query time. To achieve this, they form a group of k users, including u, and forward their minimum bounding rectangle (termed anonymizing spatial region, ASR) to …
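The cloaking step described above (group k users including u, then forward the group's minimum bounding rectangle) can be sketched as follows. Choosing the k-1 users nearest to u is an assumption made purely for illustration, as are the function name and data layout; actual cloaking schemes form the group differently, and the paper's title suggests that a carelessly chosen ASR may still leak information.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[Point, Point]  # (lower-left corner, upper-right corner)


def cloak(users: Sequence[Point], u_idx: int, k: int) -> Rect:
    """Return the minimum bounding rectangle of user u and k-1 other users.

    Assumption for this sketch: the group consists of u plus its k-1
    nearest users by Euclidean distance.
    """
    if k < 1 or k > len(users):
        raise ValueError("k must be between 1 and the number of users")
    u = users[u_idx]
    others: List[int] = sorted(
        (i for i in range(len(users)) if i != u_idx),
        key=lambda i: dist(users[i], u),
    )
    group = [u_idx] + others[: k - 1]
    xs = [users[i][0] for i in group]
    ys = [users[i][1] for i in group]
    return (min(xs), min(ys)), (max(xs), max(ys))


users = [(0, 0), (1, 2), (5, 5), (2, 1)]
print(cloak(users, u_idx=0, k=3))
```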


Group Nearest Neighbor Queries, Dimitris Papadias, Qiongmao Shen, Yufei Tao, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

Given two sets of points P and Q, a group nearest neighbor (GNN) query retrieves the point(s) of P with the smallest sum of distances to all points in Q. Consider, for instance, three users at locations q1, q2 and q3 that want to find a meeting point (e.g., a restaurant); the corresponding query returns the data point p that minimizes the sum of Euclidean distances |pqi| for 1 ≤ i ≤ 3. Assuming that Q fits in memory and P is indexed by an R-tree, we propose several algorithms for finding the group nearest neighbors efficiently. As a second step, …


Medoid Queries In Large Spatial Databases, Kyriakos Mouratidis, Dimitris Papadias, Spiros Papadimitriou Dec 2010

Kyriakos MOURATIDIS

Assume that a franchise plans to open k branches in a city, so that the average distance from each residential block to the closest branch is minimized. This is an instance of the k-medoids problem, where residential blocks constitute the input dataset and the k branch locations correspond to the medoids. Since the problem is NP-hard, research has focused on approximate solutions. Despite an avalanche of methods for small and moderate size datasets, currently there exists no technique applicable to very large databases. In this paper, we provide efficient algorithms that utilize an existing data-partition index to achieve low CPU …
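The objective in the franchise example can be written down directly. The sketch below computes the average distance from each point to its closest medoid and finds the exact optimum by trying every k-subset of the input; both function names are hypothetical. Exhaustive search is feasible only for tiny inputs, which is precisely why the literature, including this paper, turns to approximate methods.

```python
from itertools import combinations
from math import dist
from typing import Sequence, Tuple

Point = Tuple[float, float]


def average_distance(points: Sequence[Point], medoids: Sequence[Point]) -> float:
    """Average Euclidean distance from each point to its closest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points) / len(points)


def exact_k_medoids(points: Sequence[Point], k: int) -> Tuple[Point, ...]:
    """Try every k-subset of the input points and keep the cheapest one."""
    return min(
        combinations(points, k),
        key=lambda medoids: average_distance(points, medoids),
    )


blocks = [(0, 0), (1, 0), (10, 0), (11, 0)]  # residential blocks
branches = exact_k_medoids(blocks, 2)
print(branches, average_distance(blocks, branches))
```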


Efficient Verification Of Shortest Path Search Via Authenticated Hints, Man Lung Yiu, Yimin Lin, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

Shortest path search in transportation networks is unarguably one of the most important online search services nowadays (e.g., Google Maps, MapQuest, etc.), with applications spanning logistics, spatial optimization, and everyday driving decisions. Often, the owner of the road network data (e.g., a transport authority) provides its database to third-party query services, which are responsible for answering shortest path queries posed by their clients. The issue arising here is that a query service might be returning sub-optimal paths either purposely (in order to serve its own purposes like computational savings or commercial reasons) or because it has been compromised by …


Constrained Shortest Path Computation, Manolis Terrovitis, Spiridon Bakiras, Dimitris Papadias, Kyriakos Mouratidis Dec 2010

Kyriakos MOURATIDIS

This paper proposes and solves a-autonomy and k-stops shortest path problems in large spatial databases. Given a source s and a destination d, an a-autonomy query retrieves a sequence of data points connecting s and d, such that the distance between any two consecutive points in the path is not greater than a. A k-stops query retrieves a sequence that contains exactly k intermediate data points. In both cases our aim is to compute the shortest path subject to these constraints. Assuming that the dataset is indexed by a data-partitioning method, the proposed techniques initially compute a sub-optimal path by …


Optimal Matching Between Spatial Datasets Under Capacity Constraints, Hou U Leong, Kyriakos Mouratidis, Man Lung Yiu, Nikos Mamoulis Dec 2010

Kyriakos MOURATIDIS

Consider a set of customers (e.g., WiFi receivers) and a set of service providers (e.g., wireless access points), where each provider has a capacity and the quality of service offered to its customers is inversely proportional to their distance. The capacity constrained assignment (CCA) is a matching between the two sets such that (i) each customer is assigned to at most one provider, (ii) every provider serves no more customers than its capacity, (iii) the maximum possible number of customers are served, and (iv) the sum of Euclidean distances within the assigned provider-customer pairs is minimized. Although max-flow algorithms are applicable …


Capacity Constrained Assignment In Spatial Databases, Hou U Leong, Man Lung Yiu, Kyriakos Mouratidis, Nikos Mamoulis Dec 2010

Kyriakos MOURATIDIS

Given a point set P of customers (e.g., WiFi receivers) and a point set Q of service providers (e.g., wireless access points), where each q ∈ Q has a capacity q.k, the capacity constrained assignment (CCA) is a matching M ⊆ Q × P such that (i) each point q ∈ Q (p ∈ P) appears at most k times (at most once) in M, (ii) the size of M is maximized (i.e., it comprises min{|P|, Σ_{q∈Q} q.k} pairs), and (iii) the total assignment cost (i.e., the sum of Euclidean distances within all pairs) is minimized. Thus, the CCA problem is …