Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons


2015

Theses/Dissertations

Data Storage Systems

Articles 1 - 13 of 13

Full-Text Articles in Computer Engineering

Estimation On Gibbs Entropy For An Ensemble, Lekhya Sai Sake Dec 2015


Electronic Theses, Projects, and Dissertations

In a world of rapidly advancing technology, even a small improvement over the current state of the art can create a revolution. One such revolution in computer science is parallel computing. A single parallel execution is not sufficient to observe its non-deterministic behavior, since the same execution with the same data at a different time may follow a different path. Measuring how far the non-determinism of a parallel execution can extend therefore requires an ensemble of executions. This project implements a program to estimate the Gibbs entropy for an ensemble of parallel executions. The goal is …
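The abstract does not show the estimator itself; a minimal sketch of Gibbs entropy over observed execution paths, assuming each run is reduced to a hashable path signature (the `gibbs_entropy` helper and the sample ensemble below are illustrative, not from the thesis):

```python
from collections import Counter
from math import log

def gibbs_entropy(paths):
    """Estimate Gibbs entropy S = -sum(p_i * ln p_i), where p_i is the
    observed frequency of each distinct execution path in the ensemble."""
    counts = Counter(paths)
    n = len(paths)
    return -sum((c / n) * log(c / n) for c in counts.values())

# Ensemble of 4 runs: two distinct interleavings were observed.
ensemble = ["ABC", "ACB", "ABC", "ABC"]
print(round(gibbs_entropy(ensemble), 4))  # → 0.5623
```

A fully deterministic ensemble (every run takes the same path) yields zero entropy, so the estimate directly quantifies how non-deterministic the execution is.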


Modeling Information Reliability And Maintenance: A Systematic Literature Review, Daysi A. Guerra Garcia Dec 2015


Industrial Engineering Undergraduate Honors Theses

Operating a business efficiently depends on effective everyday decision-making. In turn, those decisions are influenced by the quality of the data used in the decision-making process, and maintaining good data quality becomes more challenging as a business expands. Protecting the quality of the data, and of the information it generates, is a challenge faced by companies across all industrial sectors. As companies begin to use data from these large databases, they will need to develop strategies for maintaining and assessing the reliability of the information they generate from this data. A considerable amount of literature exists on data …


Api-Based Acquisition Of Evidence From Cloud Storage Providers, Andres E. Barreto Aug 2015


University of New Orleans Theses and Dissertations

Cloud computing, and cloud storage services in particular, pose a new challenge to digital forensic investigations. Currently, evidence acquisition for such services still follows the traditional approach of collecting artifacts on a client device. In this work, we show that such an approach not only requires a substantial upfront investment in reverse engineering each service, but is also inherently incomplete: it misses prior versions of the artifacts, as well as cloud-only artifacts that have no standard serialized representation on the client.

In this work, we introduce the concept of API-based evidence acquisition for cloud services, which addresses these concerns …


Converting Medical Service Provider Data Into A Unified Format For Processing, Brandon Krugman Jul 2015


Master's Theses (2009 -)

Most organizations process flat files regularly. There are different options for processing files, including SQL Server Integration Services (SSIS), BizTalk, SQL import job, and other Extract, Transform, and Load (ETL) processes. All of these options have very strict requirements for file formats. If the format of the file changes, all of these options throw a catastrophic error, and implementing a fix to handle the new format is difficult. With each of the methods, the new format needs to be configured in the development environment, and the data flow must be modified to process all of the changes. Due to the …
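The abstract is truncated before any implementation detail; one common way to tolerate format drift is to map known header aliases onto a unified schema instead of relying on column positions, so a reordered or renamed column does not cause a catastrophic failure. A sketch under that assumption (all field and alias names here are hypothetical):

```python
import csv
import io

# Map each provider's possible header names onto one unified schema,
# so reordered or renamed columns do not break the load.
ALIASES = {
    "patient_id": {"patient_id", "patientid", "member_id"},
    "service_date": {"service_date", "dos", "date_of_service"},
    "amount": {"amount", "charge", "billed_amount"},
}

def to_unified(raw_text):
    """Parse a flat file by header name (not position) into unified rows;
    fields with no recognized alias come through as None instead of failing."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        normalized = {k.strip().lower(): v for k, v in record.items()}
        unified = {}
        for target, names in ALIASES.items():
            match = next((n for n in names if n in normalized), None)
            unified[target] = normalized[match] if match else None
        rows.append(unified)
    return rows

sample = "DOS,PatientID,Charge\n2015-07-01,42,19.99\n"
print(to_unified(sample))
# → [{'patient_id': '42', 'service_date': '2015-07-01', 'amount': '19.99'}]
```

Because unmatched fields degrade to None rather than raising, a new provider format surfaces as incomplete rows to review instead of a failed data flow.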


Oversubscribing Inotify On Embedded Platforms, Donald Percivalle, Scott Vanderlind Jun 2015


Computer Engineering

For most computers running the popular Linux operating system, the integrated kernel component inotify provides adequate functionality for monitoring changes to files on the filesystem. However, on certain embedded platforms where resources are very limited and filesystems are heavily populated, such as network attached storage (NAS) devices, inotify may not have enough resources to provide watchers for every file. This results in applications missing change notifications for files they are watching. This paper explores methods for using inotify most effectively on embedded systems by leveraging more latent storage. Benefits of this include a reduction in dropped notifications …
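The paper's own mitigation scheme is not shown in this excerpt; as background, oversubscription can be detected by comparing the number of directories to be watched against the kernel's per-user watch limit, since recursive monitoring consumes one inotify watch per directory. A minimal sketch (the helper names and the fallback default are assumptions):

```python
import os

# Linux exposes the per-user inotify watch limit at this path; the
# fallback below is a commonly seen default, used on non-Linux systems.
LIMIT_PATH = "/proc/sys/fs/inotify/max_user_watches"
DEFAULT_LIMIT = 8192

def watch_limit():
    """Read the kernel's watch limit, falling back to a typical default."""
    try:
        with open(LIMIT_PATH) as f:
            return int(f.read())
    except OSError:
        return DEFAULT_LIMIT

def is_oversubscribed(directory_count, limit=None):
    """Recursive monitoring needs one watch per directory; if the tree
    holds more directories than the limit, notifications will be lost."""
    limit = watch_limit() if limit is None else limit
    return directory_count > limit
```

On a populated NAS filesystem, running such a check before registering watches lets an application fall back to another strategy (e.g., periodic scanning of the least active subtrees) instead of silently dropping events.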


Rest Api To Access And Manage Geospatial Pipeline Integrity Data, Alexandra Michelle Francis Jun 2015


Master's Theses

Today’s economy and infrastructure are dependent on raw natural resources, like crude oil and natural gas, that are optimally transported through a network of hundreds of thousands of miles of pipelines throughout America [28]. A damaged pipe can negatively affect thousands of homes and businesses, so it is vital that pipelines are monitored and quickly repaired [1]. Ideally, pipeline operators would detect damage before it occurs, but ensuring the integrity of such a vast number of pipes is unrealistic and would take an impractical amount of time and manpower [1].

Natural disasters, like earthquakes, as well as construction are just …


Data Integrity Verification In Cloud Computing, Katanosh Morovat May 2015


Graduate Theses and Dissertations

Cloud computing is an architectural model that provides computing and storage capacity as a service over the internet. Cloud computing must provide secure services for both users and data owners. Cloud services are an entirely internet-based technology in which data are stored and maintained in the data center of a cloud provider. Lack of appropriate control over that data can lead to several security issues. As a result, some data stored in the cloud must be protected at all times. These types of data are called sensitive data. Sensitive data is defined as data that must be protected against …
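The thesis' specific verification protocol is not reproduced in this excerpt; the basic primitive behind most data-integrity schemes is a cryptographic digest recorded by the owner before upload and re-checked on retrieval. A minimal sketch using SHA-256 (the function names are illustrative):

```python
import hashlib

def digest(data: bytes) -> str:
    """SHA-256 fingerprint, recorded by the data owner before upload."""
    return hashlib.sha256(data).hexdigest()

def verify(data_from_cloud: bytes, recorded_digest: str) -> bool:
    """Recompute the fingerprint on retrieval; any modification or
    corruption at the provider changes the digest and fails the check."""
    return digest(data_from_cloud) == recorded_digest

original = b"sensitive record"
fingerprint = digest(original)
print(verify(original, fingerprint))         # prints True
print(verify(b"tampered record", fingerprint))  # prints False
```

Production schemes build on this primitive with provable-data-possession or proof-of-retrievability protocols so the owner need not download the full data to verify it, but the trust anchor is the same: a digest the provider cannot forge.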


Energy Agile Cluster Communication, Muhammad Zain Mustafa Mar 2015


Masters Theses

Computing researchers have long focused on improving energy-efficiency (the amount of computation per joule) under the implicit assumption that all energy is created equal. Energy, however, is not created equal: its cost and carbon footprint fluctuate over time due to a variety of factors. These fluctuations are expected to intensify as renewable penetration increases. Thus, in my work I introduce energy-agility, a design concept for a platform's ability to rapidly and efficiently adapt to such power fluctuations. I then introduce a representative application to assess energy-agility for the type of long-running, parallel, data-intensive tasks that are both common in data …


Universal Schema For Knowledge Representation From Text And Structured Data, Limin Yao Mar 2015


Doctoral Dissertations

In data integration, we transform information from a source into a target schema. A general problem in this task is loss of fidelity and coverage: the source expresses more knowledge than can fit into the target schema, or knowledge that is hard to fit into any schema at all. This problem is taken to an extreme in information extraction (IE), where the source is natural language---one of the most expressive forms of knowledge representation. To address this issue, one can either automatically learn a latent schema emergent in text (a brittle and ill-defined task), or manually define schemas. …


Hadoop Based Data Intensive Computation On Iaas Cloud Platforms, Sruthi Vijayakumar Jan 2015


UNF Graduate Theses and Dissertations

Cloud computing is a relatively new form of computing that uses virtualized resources. It is dynamically scalable and is often provided as a pay-per-use service over the Internet, an intranet, or both. With increasing demand for data storage in the cloud, the study of data-intensive applications is becoming a primary focus. Data-intensive applications are those that involve high CPU usage and process large volumes of data, typically hundreds of gigabytes, terabytes, or petabytes in size. The research in this thesis focuses on Amazon's Elastic Compute Cloud (EC2) and Amazon Elastic MapReduce (EMR) using HiBench Hadoop …


Automated Beverage Dispenser, Sonya Istocka Jan 2015


Williams Honors College, Honors Research Projects

The intention of this project is to define a new way of distributing liquor. The project will consist of a device that measures and tracks liquor being poured and associates each pour with a person, either a bartender or a bar patron. The challenges are controlling the flow of liquor, recording it extremely accurately, and processing data quickly enough that a pour can begin very soon after a person is identified. The liquor dispenser will open up the possibility of a person being able to dispense their own liquor in a controlled …


Comparing The Efficiency Of Heterogeneous And Homogeneous Data Center Workloads, Brandon Kimmons Jan 2015


Electronic Theses and Dissertations


Information Technology, as an industry, is growing very quickly to keep pace with increased data storage and computing needs. Data growth, if not planned and managed correctly, can have broad efficiency implications for the data center as a whole. The long-term reduction in efficiency will increase costs and operational overhead over time. Similarly, increases in processor efficiency have led to increased system density in data centers, which can also increase the cost and operational overhead of data center infrastructure.

This paper proposes the idea that balanced data center workloads are more efficient in comparison to similar levels of …


Testing Data Vault-Based Data Warehouse, Connard N. Williams Jan 2015


Electronic Theses and Dissertations

Data warehouse (DW) projects are undertakings that require integration of disparate sources of data, a well-defined mapping of the source data to the reconciled data, and effective Extract, Transform, and Load (ETL) processes. Owing to the complexity of data warehouse projects, great emphasis must be placed on an agile approach with properly developed and executed test plans throughout the stages of designing, developing, and implementing the data warehouse, to mitigate budget overruns, missed deadlines, low customer satisfaction, and outright project failures. Yet, there are often attempts to test the data warehouse exactly like traditional back-end databases and legacy …
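The abstract stops before describing concrete tests; one reconciliation-style check commonly used in DW test plans compares row counts and column checksums between source and target after a load. A small illustrative sketch (the `reconcile` helper and the sample rows are assumptions, not taken from the thesis):

```python
# After an ETL load, reconcile the staging source against the target:
# row counts must match, and a numeric column must sum to the same
# total on both sides (a cheap proxy for a column checksum).
def reconcile(source_rows, target_rows, amount_key="amount"):
    return {
        "row_count": len(source_rows) == len(target_rows),
        "amount_sum": round(sum(r[amount_key] for r in source_rows), 2)
                      == round(sum(r[amount_key] for r in target_rows), 2),
    }

src = [{"amount": 10.0}, {"amount": 5.5}]
tgt = [{"amount": 5.5}, {"amount": 10.0}]
print(reconcile(src, tgt))  # → {'row_count': True, 'amount_sum': True}
```

Checks like this are deliberately order-insensitive, since an ETL load rarely preserves row order; a failing check points the tester at dropped, duplicated, or mis-transformed rows rather than asserting anything about the warehouse's internal (e.g., data vault) structure.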