Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

The Impact Of Accessible Data On Cyberstalking, Elise Kwan Jan 2024

The Journal of Purdue Undergraduate Research

No abstract provided.


Sort Vs. Hash Join On Knights Landing Architecture, Victor L. Pan, Felix Lin Aug 2018

The Summer Undergraduate Research Fellowship (SURF) Symposium

As the amount of stored information grows, so does the need for efficient database algorithms. One of the most important database operations is “join”. This involves combining columns from two tables and grouping common values in the same row in order to minimize redundant data. The two main algorithms are hash join and sort-merge join. Hash join builds a hash table on one input so that matching rows can be found faster. Sort-merge join first sorts both tables so that values can be compared more efficiently. There has been considerable debate over which approach is superior. At first, hash join …
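The two strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not reflect Knights Landing specifics; the function names `hash_join` and `sort_merge_join`, the sample tables, and the choice of an equi-join on the first tuple element are assumptions made only for illustration.

```python
from collections import defaultdict


def hash_join(left, right):
    """Equi-join on the first tuple element (hypothetical helper).

    Builds a hash table on `left`, then probes it once per row of `right`.
    """
    table = defaultdict(list)
    for key, value in left:
        table[key].append(value)

    result = []
    for key, value in right:
        # .get avoids creating empty entries for keys that have no match.
        for left_value in table.get(key, []):
            result.append((key, left_value, value))
    return result


def sort_merge_join(left, right):
    """Equi-join on the first tuple element (hypothetical helper).

    Sorts both inputs by key, then advances two cursors in step.
    """
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])

    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of equal keys on each side, then pair them up.
            i_end = i
            while i_end < len(left) and left[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            for _, left_value in left[i:i_end]:
                for _, right_value in right[j:j_end]:
                    result.append((left_key, left_value, right_value))
            i, j = i_end, j_end
    return result


# Illustrative data only: (customer_id, name) and (customer_id, item).
customers = [(1, "Ada"), (2, "Bo"), (3, "Cy")]
orders = [(2, "pen"), (1, "ink"), (2, "pad"), (4, "cap")]

hash_rows = hash_join(customers, orders)
merge_rows = sort_merge_join(customers, orders)
```

Both functions return the same set of matched rows, possibly in a different order. The hash variant pays for building and probing a table, while the sort-merge variant pays for sorting both inputs up front; which cost dominates depends on the data and the hardware.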


Efficient Processing Of Similarity Queries With Applications, Mingjie Tang Dec 2016

Open Access Dissertations

Today, a myriad of data sources, from the Internet to business operations to scientific instruments, produce large volumes of data of many different types. Many application scenarios, e.g., marketing analysis, sensor networks, and medical and biological applications, call for identifying and processing similarities in "big" data. As a result, it is imperative to develop new similarity query processing approaches and systems that scale from low-dimensional data to high-dimensional data, from a single machine to clusters of hundreds of machines, and from disk-based to memory-based processing. This dissertation introduces and studies several similarity-aware query operators, and analyzes and optimizes their performance.

The first …


Comparison Of Clustered Rdf Data Stores, Venkata Patchigolla Jul 2011

Purdue Polytechnic Masters Theses

Storing data in RDF format makes data interchange among different researchers simpler than present approaches do. There has been a tremendous increase in the number of applications that use RDF data. RDF data also tends to grow explosively, which makes it necessary to consider retrieval time and scalability when selecting a suitable RDF data store for developing applications. The research concentrates on comparing the BigOWLIM, Bigdata, 4store, and Virtuoso RDF stores on the basis of their scalability and their performance in storing and retrieving cancer proteomics and mass spectrometry data using SPARQL queries. In this …


Beyond K-Anonymity: A Decision Theoretic Framework For Assessing Privacy Risk, Guy Lebanon, Monica Scannapieco, Mohamed Fouad, Elisa Bertino Jan 2009

Cyber Center Publications

An important issue that any organization or individual has to face when managing data containing sensitive information is the risk incurred when releasing such data. Even though data may be sanitized before being released, it is still possible for an adversary to reconstruct the original data using additional information, thus resulting in privacy violations. To date, however, a systematic approach to quantifying such risks is not available. In this paper we develop a framework, based on statistical decision theory, that assesses the relationship between the disclosed data and the resulting privacy risk. We model the problem of deciding …