Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 15 of 15

Full-Text Articles in Databases and Information Systems

K-Anonymity In The Presence Of External Databases, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitris Papadias Dec 2010

K-Anonymity In The Presence Of External Databases, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitris Papadias

Kyriakos MOURATIDIS

The concept of k-anonymity has received considerable attention due to the need of several organizations to release microdata without revealing the identity of individuals. Although all previous k-anonymity techniques assume the existence of a public database (PD) that can be used to breach privacy, none utilizes PD during the anonymization process. Specifically, existing generalization algorithms create anonymous tables using only the microdata table (MT) to be published, independently of the external knowledge available. This omission leads to high information loss. Motivated by this observation we first introduce the concept of k-join-anonymity (KJA), which permits more effective generalization to reduce the …


Aggregate Nearest Neighbor Queries In Spatial Databases, Dimitris Papadias, Yufei Tao, Kyriakos Mouratidis, Chun Kit Hui Dec 2010

Aggregate Nearest Neighbor Queries In Spatial Databases, Dimitris Papadias, Yufei Tao, Kyriakos Mouratidis, Chun Kit Hui

Kyriakos MOURATIDIS

Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations q1,...qn, an ANN query outputs the facility p belongs to P that minimizes the sum of distances |pqi| for 1 is less than or equal to i is less than or equal to n that the users have to travel in order to meet there. Similarly, another ANN query may report the point p belongs to P that minimizes the maximum distance that …


Group Nearest Neighbor Queries, Dimitris Papadias, Qiongmao Shen, Yufei Tao, Kyriakos Mouratidis Dec 2010

Group Nearest Neighbor Queries, Dimitris Papadias, Qiongmao Shen, Yufei Tao, Kyriakos Mouratidis

Kyriakos MOURATIDIS

Given two sets of points P and Q, a group nearest neighbor (GNN) query retrieves the point(s) of P with the smallest sum of distances to all points in Q. Consider, for instance, three users at locations q1 , q2 and q3 that want to find a meeting point (e.g., a restaurant); the corresponding query returns the data point p that minimizes the sum of Euclidean distances |pqi| for 1 ≤i ≤3. Assuming that Q fits in memory and P is indexed by an R-tree, we propose several algorithms for finding the group nearest neighbors efficiently. As a second step, …


Constrained Shortest Path Computation, Manolis Terrovitis, Spiridon Bakiras, Dimitris Papadias, Kyriakos Mouratidis Dec 2010

Constrained Shortest Path Computation, Manolis Terrovitis, Spiridon Bakiras, Dimitris Papadias, Kyriakos Mouratidis

Kyriakos MOURATIDIS

This paper proposes and solves a-autonomy and k-stops shortest path problems in large spatial databases. Given a source s and a destination d, an aautonomy query retrieves a sequence of data points connecting s and d, such that the distance between any two consecutive points in the path is not greater than a. A k-stops query retrieves a sequence that contains exactly k intermediate data points. In both cases our aim is to compute the shortest path subject to these constraints. Assuming that the dataset is indexed by a data-partitioning method, the proposed techniques initially compute a sub-optimal path by …


Program Transformations For Information Personalization, Saverio Perugini, Naren Ramakrishnan Oct 2010

Program Transformations For Information Personalization, Saverio Perugini, Naren Ramakrishnan

Computer Science Faculty Publications

Personalization constitutes the mechanisms necessary to automatically customize information content, structure, and presentation to the end user to reduce information overload. Unlike traditional approaches to personalization, the central theme of our approach is to model a website as a program and conduct website transformation for personalization by program transformation (e.g., partial evaluation, program slicing). The goal of this paper is study personalization through a program transformation lens and develop a formal model, based on program transformations, for personalized interaction with hierarchical hypermedia. The specific research issues addressed involve identifying and developing program representations and transformations suitable for classes of hierarchical …


Merging Schemas In A Collaborative Faceted Classification System, Jianxiang Li Aug 2010

Merging Schemas In A Collaborative Faceted Classification System, Jianxiang Li

Computer Science Theses & Dissertations

We have developed a system that improves access to a large, growing image collection by allowing users to collaboratively build a global faceted (multi-perspective) classification schema. We are extending our system to support both global and local schemas, where global schema provides a complete and uniform view of the collection whereas local schema provides a personal, possibly incomplete and idiosyncratic view of the collection. We argue that although users usually focus on their personal schemas, it is still desirable to have a global schema for the entire collection even if such local schemas are available. In order to keep the …


Cloud Storage And Online Bin Packing, Swathi Venigella Aug 2010

Cloud Storage And Online Bin Packing, Swathi Venigella

UNLV Theses, Dissertations, Professional Papers, and Capstones

Cloud storage is the service provided by some corporations (such as Mozy and Carbonite) to store and backup computer files. We study the problem of allocating memory of servers in a data center based on online requests for storage. Over-the-net data backup has become increasingly easy and cheap due to cloud storage. Given an online sequence of storage requests and a cost associated with serving the request by allocating space on a certain server one seeks to select the minimum number of servers as to minimize total cost. We use two different algorithms and propose a third algorithm; we show …


Semi-Supervised Distance Metric Learning For Collaborative Image Retrieval And Clustering, Steven C. H. Hoi, Wei Liu, Shih-Fu Chang Aug 2010

Semi-Supervised Distance Metric Learning For Collaborative Image Retrieval And Clustering, Steven C. H. Hoi, Wei Liu, Shih-Fu Chang

Research Collection School Of Computing and Information Systems

Learning a good distance metric plays a vital role in many multimedia retrieval and data mining tasks. For example, a typical content-based image retrieval (CBIR) system often relies on an effective distance metric to measure similarity between any two images. Conventional CBIR systems simply adopting Euclidean distance metric often fail to return satisfactory results mainly due to the well-known semantic gap challenge. In this article, we present a novel framework of Semi-Supervised Distance Metric Learning for learning effective distance metrics by exploring the historical relevance feedback log data of a CBIR system and utilizing unlabeled data when log data are …


Information Hiding Using Stochastic Diffusion For The Covert Transmission Of Encrypted Images, Jonathan Blackledge Jun 2010

Information Hiding Using Stochastic Diffusion For The Covert Transmission Of Encrypted Images, Jonathan Blackledge

Conference papers

A principal weakness of all encryption systems is that the output data can be `seen' to be encrypted. In other words, encrypted data provides a 'flag' on the potential value of the information that has been encrypted. In this paper, we provide a novel approach to `hiding' encrypted data in a digital image. We consider an approach in which a plaintext image is encrypted with a cipher using the processes of `stochastic diffusion' and the output quantized into a 1-bit array generating a binary image cipher-text. This output is then `embedded' in a host image which is undertaken either in …


Personalization By Website Transformation: Theory And Practice, Saverio Perugini May 2010

Personalization By Website Transformation: Theory And Practice, Saverio Perugini

Computer Science Faculty Publications

We present an analysis of a progressive series of out-of-turn transformations on a hierarchical website to personalize a user’s interaction with the site. We formalize the transformation in graph-theoretic terms and describe a toolkit we built that enumerates all of the traversals enabled by every possible complete series of these transformations in any site and computes a variety of metrics while simulating each traversal therein to qualify the relationship between a site’s structure and the cumulative effect of support for the transformation in a site. We employed this toolkit in two websites. The results indicate that the transformation enables users …


K-Anonymity In The Presence Of External Databases, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitris Papadias Mar 2010

K-Anonymity In The Presence Of External Databases, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitris Papadias

Research Collection School Of Computing and Information Systems

The concept of k-anonymity has received considerable attention due to the need of several organizations to release microdata without revealing the identity of individuals. Although all previous k-anonymity techniques assume the existence of a public database (PD) that can be used to breach privacy, none utilizes PD during the anonymization process. Specifically, existing generalization algorithms create anonymous tables using only the microdata table (MT) to be published, independently of the external knowledge available. This omission leads to high information loss. Motivated by this observation we first introduce the concept of k-join-anonymity (KJA), which permits more effective generalization to reduce the …


Information-Quality Aware Routing In Event-Driven Sensor Networks, Hwee Xian Tan, Mun-Choon Chan, Wendong Xiao, Peng-Yong Kong, Chen-Khong Tham Mar 2010

Information-Quality Aware Routing In Event-Driven Sensor Networks, Hwee Xian Tan, Mun-Choon Chan, Wendong Xiao, Peng-Yong Kong, Chen-Khong Tham

Research Collection School Of Computing and Information Systems

Upon the occurrence of a phenomenon of interest in a wireless sensor network, multiple sensors may be activated, leading to data implosion and redundancy. Data aggregation and/or fusion techniques exploit spatio-temporal correlation among sensory data to reduce traffic load and mitigate congestion. However, this is often at the expense of loss in Information Quality (IQ) of data that is collected at the fusion center. In this work, we address the problem of finding the least-cost routing tree that satisfies a given IQ constraint. We note that the optimal least-cost routing solution is a variation of the classical NP-hard Steiner tree …


On The Applications Of Deterministic Chaos For Encrypting Data On The Cloud, Jonathan Blackledge, Nikolai Ptitsyn Jan 2010

On The Applications Of Deterministic Chaos For Encrypting Data On The Cloud, Jonathan Blackledge, Nikolai Ptitsyn

Conference papers

Cloud computing is expected to grow considerably in the future because it has so many advantages with regard to sale and cost, change management, next generation architectures, choice and agility. However, one of the principal concerns for users of the Cloud is lack of control and above all, data security. This paper considers an approach to encrypting information before it is ‘place’ on the Cloud where each user has access to their own encryption algorithm, an algorithm that is based on a set of Iterative Function Systems that outputs a chaotic number stream, designed to produce a cryptographically secure cipher. …


Supporting Multiple Paths To Objects In Information Hierarchies: Faceted Classification, Faceted Search, And Symbolic Links, Saverio Perugini Jan 2010

Supporting Multiple Paths To Objects In Information Hierarchies: Faceted Classification, Faceted Search, And Symbolic Links, Saverio Perugini

Computer Science Faculty Publications

We present three fundamental, interrelated approaches to support multiple access paths to each terminal object in information hierarchies: faceted classification, faceted search, and web directories with embedded symbolic links. This survey aims to demonstrate how each approach supports users who seek information from multiple perspectives. We achieve this by exploring each approach, the relationships between these approaches, including tradeoffs, and how they can be used in concert, while focusing on a core set of hypermedia elements common to all. This approach provides a foundation from which to study, understand, and synthesize applications which employ these techniques. This survey does not …


Cbtv: Visualising Case Bases For Similarity Measure Design And Selection, Brian Mac Namee, Sarah Jane Delany Jan 2010

Cbtv: Visualising Case Bases For Similarity Measure Design And Selection, Brian Mac Namee, Sarah Jane Delany

Conference papers

In CBR the design and selection of similarity measures is paramount. Selection can benefit from the use of exploratory visualisation- based techniques in parallel with techniques such as cross-validation ac- curacy comparison. In this paper we present the Case Base Topology Viewer (CBTV) which allows the application of different similarity mea- sures to a case base to be visualised so that system designers can explore the case base and the associated decision boundary space. We show, using a range of datasets and similarity measure types, how the idiosyncrasies of particular similarity measures can be illustrated and compared in CBTV allowing …