Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 5 of 5
Full-Text Articles in Physical Sciences and Mathematics
The Impact Of Accessible Data On Cyberstalking, Elise Kwan
The Journal of Purdue Undergraduate Research
No abstract provided.
Sort Vs. Hash Join On Knights Landing Architecture, Victor L. Pan, Felix Lin
The Summer Undergraduate Research Fellowship (SURF) Symposium
With the increasing amount of information stored, there is a need for efficient database algorithms. One of the most important database operations is “join”. This involves combining columns from two tables and grouping common values in the same row in order to minimize redundant data. The two main algorithms used are hash join and sort merge join. Hash join builds a hash table to allow for faster searching. Sort merge join first sorts the two tables to make it more efficient when comparing values. There has been a lot of debate over which approach is superior. At first, hash join …
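The two approaches named in this abstract can be illustrated with a minimal Python sketch. It assumes each table is a list of `(key, value)` rows joined on the key; the function names `hash_join` and `sort_merge_join` are hypothetical and are not taken from the paper, and the sketch says nothing about the Knights Landing implementation the authors evaluate.

```python
from collections import defaultdict


def hash_join(left, right):
    """Equi-join two lists of (key, value) rows using a hash table."""
    # Build phase: index every left row by its key.
    table = defaultdict(list)
    for key, value in left:
        table[key].append(value)
    # Probe phase: look up each right row's key in the table.
    return [(key, lv, rv) for key, rv in right for lv in table.get(key, [])]


def sort_merge_join(left, right):
    """Equi-join two lists of (key, value) rows by sorting, then merging."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit every pairing of the two runs.
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out
```

Both functions return the same set of joined rows, possibly in a different order. The hash version does one pass to build and one to probe, while the sort-merge version pays for sorting up front and then scans both inputs once.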
Efficient Processing Of Similarity Queries With Applications, Mingjie Tang
Open Access Dissertations
Today, a myriad of data sources, from the Internet to business operations to scientific instruments, produce large volumes of data of many different types. Many application scenarios, e.g., marketing analysis, sensor networks, and medical and biological applications, call for identifying and processing similarities in "big" data. As a result, it is imperative to develop new similarity query processing approaches and systems that scale from low-dimensional data to high-dimensional data, from a single machine to clusters of hundreds of machines, and from disk-based to memory-based processing. This dissertation introduces and studies several similarity-aware query operators, and analyzes and optimizes their performance.
The first …
Comparison Of Clustered Rdf Data Stores, Venkata Patchigolla
Purdue Polytechnic Masters Theses
Storing data in RDF format simplifies data interchange among researchers compared to present approaches. There has been a tremendous increase in the applications that use RDF data. The nature of RDF data is such that it tends to grow explosively. This makes it necessary to consider retrieval time and scalability when selecting a suitable RDF data store for developing applications. The research concentrates on comparing the BigOWLIM, Bigdata, 4store and Virtuoso RDF stores on the basis of their scalability and their performance in storing and retrieving cancer proteomics and mass spectrometry data using SPARQL queries. In this …
Beyond K-Anonymity: A Decision Theoretic Framework For Assessing Privacy Risk, Guy Lebanon, Monica Scannapieco, Mohamed Fouad, Elisa Bertino
Cyber Center Publications
An important issue any organization or individual has to face when managing data containing sensitive information is the risk that can be incurred when releasing such data. Even though data may be sanitized before being released, it is still possible for an adversary to reconstruct the original data using additional information, thus resulting in privacy violations. To date, however, a systematic approach to quantify such risks is not available. In this paper, we develop a framework, based on statistical decision theory, that assesses the relationship between the disclosed data and the resulting privacy risk. We model the problem of deciding …