Engineering Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 17 of 17

Full-Text Articles in Engineering

Parallel Algorithms For Scalable Graph Mining: Applications On Big Data And Machine Learning, Naw Safrin Sattar Aug 2022

University of New Orleans Theses and Dissertations

Parallel computing plays a crucial role in processing large-scale graph data. Complex network analysis is an exciting area of research with applications in many scientific domains, e.g., sociology, biology, online media, and recommendation systems. Graph mining is an area of interest with diverse problems drawn from many domains of daily life. Due to advances in data and computing technologies, graph data is growing at an enormous rate; for example, the number of links in social networks grows every millisecond. Machine and deep learning play a significant role in technological advances for working with big data in modern …


Learning Analytics For The Formative Assessment Of New Media Skills, Negar Shabihi Mar 2022

Electronic Thesis and Dissertation Repository

Recent theories of education have shifted learning environments towards student-centred education. The advancement of technology and the need for skilled individuals in different areas have also led to the introduction of new media skills. Along with new pedagogies and content, these changes require new forms of assessment. However, assessment, as the core of learning, has not been updated as much as other aspects of education. Hence, much attention is required to develop assessment methods that reflect current educational requirements. To address this gap, we conducted two data-driven systematic literature reviews to identify the existing state of the field in the …


A Method For Monitoring Operating Equipment Effectiveness With The Internet Of Things And Big Data, Carl D. Hays III Jun 2021

Master's Theses

The purpose of this paper was to take the Overall Equipment Effectiveness productivity formula used in plant manufacturing and adapt it to measure the productivity of forklifts. Productivity for a forklift was defined as being available and picking up and moving containers at port locations in Seattle and Alaska. This research takes performance measures from plant manufacturing and applies them to mobile equipment in order to establish the most effective means of analyzing reliability and productivity. Using the Internet of Things, data were collected on fifteen forklift trucks in three different locations and then analyzed over a six-month period to …
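
As a rough illustration of the idea above (not the thesis's exact formulation), the conventional Overall Equipment Effectiveness score multiplies availability, performance, and quality; the sketch below adapts it to a forklift by treating container moves as the unit of output. The function name and all input figures are assumptions for illustration only.

```python
# Minimal sketch: adapting the conventional OEE formula
# (Availability x Performance x Quality) to a forklift.
# All inputs below are illustrative assumptions, not data from the study.

def forklift_oee(scheduled_hours, downtime_hours,
                 containers_moved, ideal_moves_per_hour,
                 moves_without_rework):
    """Return availability, performance, quality, and overall OEE."""
    run_hours = scheduled_hours - downtime_hours
    availability = run_hours / scheduled_hours
    performance = containers_moved / (ideal_moves_per_hour * run_hours)
    quality = moves_without_rework / containers_moved
    return availability, performance, quality, availability * performance * quality

if __name__ == "__main__":
    a, p, q, oee = forklift_oee(scheduled_hours=160, downtime_hours=24,
                                containers_moved=1700, ideal_moves_per_hour=15,
                                moves_without_rework=1650)
    print(f"Availability={a:.2f} Performance={p:.2f} Quality={q:.2f} OEE={oee:.2f}")
```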


A Study On The Improvement Of Data Collection In Data Centers And Its Analysis On Deep Learning-Based Applications, Dipak Kumar Singh Jun 2020

LSU Doctoral Dissertations

Big data are usually stored in data center networks for processing and analysis by various cloud applications. Such applications are collections of data-intensive jobs that often involve many parallel flows and are network-bound in a distributed environment. The recent networking abstraction, coflow, which expresses the communication requirements of data-parallel programming paradigms, has opened new opportunities for network scheduling in such applications. I therefore propose a coflow-based network scheduling algorithm, Coflourish, to improve the job completion time of such data-parallel applications in the presence of increased background traffic that mimics a cloud infrastructure environment. It outperforms …
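
To make the coflow abstraction concrete, here is a minimal, generic sketch of scheduling coflows by their bottleneck demand on a single shared link; it is not the Coflourish algorithm from the dissertation, and all class names, link rates, and data volumes are assumptions.

```python
# Generic illustration of the coflow abstraction (not the Coflourish
# algorithm): order coflows by their bottleneck demand and estimate
# completion times on a single shared link, serving one coflow at a time.
# All numbers and names are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Coflow:
    name: str
    flows_gb: list = field(default_factory=list)  # per-flow data volumes (GB)

    def bottleneck(self):
        # The largest flow bounds how quickly the whole coflow can finish.
        return max(self.flows_gb)

def schedule_smallest_bottleneck_first(coflows, link_gbps=10.0):
    """Return (coflow, finish_time_s) pairs under a simple serial schedule."""
    elapsed = 0.0
    results = []
    for cf in sorted(coflows, key=lambda c: c.bottleneck()):
        elapsed += sum(cf.flows_gb) * 8 / link_gbps  # GB -> Gb, then seconds
        results.append((cf.name, elapsed))
    return results

if __name__ == "__main__":
    jobs = [Coflow("shuffle-A", [2.0, 1.5, 0.5]), Coflow("shuffle-B", [0.8, 0.7])]
    for name, t in schedule_smallest_bottleneck_first(jobs):
        print(f"{name} finishes at {t:.1f} s")
```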


Design And Implementation Of Anomaly Detections For User Authentication Framework, Iman Abu Sulayman Dec 2019

Electronic Thesis and Dissertation Repository

Anomaly detection is quickly becoming a very significant tool for a variety of applications such as intrusion detection, fraud detection, fault detection, system health monitoring, and event detection in IoT devices. One application that lacks a strong implementation of anomaly detection is user trait modeling for user authentication. User trait models expose an up-to-date representation of the user so that changes in their interests, learning progress, or interactions with the system are noticed and interpreted. The lack of adoption in user trait modeling arises from the need for a continuous flow of high-volume data, that is …
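
As a hedged sketch of what continuous anomaly detection over user-behavior data can look like (not the framework designed in the thesis), the snippet below flags sessions whose typing speed deviates sharply from a sliding window of the user's recent behavior; the window size, threshold, and feature are assumptions.

```python
# Minimal sketch of streaming anomaly detection over a single user-behavior
# feature (here, typing speed per session). This is a generic z-score
# illustration, not the thesis framework; thresholds and feature choice
# are assumptions.

from collections import deque
import statistics

def flag_anomalies(stream, window=30, z_threshold=3.0):
    """Yield (value, is_anomaly) using a sliding window of recent behavior."""
    history = deque(maxlen=window)
    for value in stream:
        if len(history) >= 5:  # need a few samples before judging
            mean = statistics.fmean(history)
            stdev = statistics.pstdev(history) or 1e-9
            is_anomaly = abs(value - mean) / stdev > z_threshold
        else:
            is_anomaly = False
        history.append(value)
        yield value, is_anomaly

if __name__ == "__main__":
    sessions = [62, 60, 64, 61, 63, 59, 62, 120, 61, 60]  # words per minute
    for wpm, flagged in flag_anomalies(sessions):
        print(wpm, "ANOMALY" if flagged else "ok")
```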


Automatic Identification Of Animals In The Wild: A Comparative Study Between C-Capsule Networks And Deep Convolutional Neural Networks, Joel Kamdem Teto, Ying Xie Nov 2018

Master of Science in Computer Science Theses

The evolution of machine learning and computer vision has driven many improvements and innovations across several domains. We see them applied to credit decisions, insurance quotes, malware detection, fraud detection, email composition, and any other area with enough information to allow a machine to learn patterns. Over the years, the number of sensors, cameras, and cognitive pieces of equipment placed in the wilderness has grown exponentially. However, the human resources needed to turn these data into something meaningful are not growing at the same rate. For instance, a team of scientist volunteers took 8.4 years, …


Resampling Methods And Visualization Tools For Computer Performance Comparisons In The Presence Of Performance Variation, Samuel Oridge Irving Apr 2018

LSU Master's Theses

Performance variability, stemming from non-deterministic hardware and software behaviors or from deterministic behaviors such as measurement bias, is a well-known phenomenon of computer systems. It increases the difficulty of comparing computer performance metrics and is slated to become even more of a concern as interest in Big Data analytics increases. Conventional methods use various measures (such as the geometric mean) to quantify the performance of different benchmarks and compare computers without considering this variability, which may lead to wrong conclusions. In this paper, we propose three resampling methods for performance evaluation and comparison: a randomization test for a general performance comparison between …
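
For intuition, a randomization (permutation) test of the kind referred to above can be sketched as follows; this is a generic textbook version, not the exact procedure or data from the thesis, and the benchmark scores shown are made-up assumptions.

```python
# Hedged sketch of a randomization (permutation) test for comparing two
# computers' benchmark scores in the presence of run-to-run variability.

import random

def randomization_test(scores_a, scores_b, trials=10_000, seed=0):
    """Return a p-value for the observed difference in mean performance."""
    rng = random.Random(seed)
    observed = abs(sum(scores_a) / len(scores_a) - sum(scores_b) / len(scores_b))
    pooled = scores_a + scores_b
    n_a = len(scores_a)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # relabel runs at random
        diff = abs(sum(pooled[:n_a]) / n_a -
                   sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            extreme += 1
    return extreme / trials

if __name__ == "__main__":
    machine_a = [212.4, 215.1, 210.8, 214.0, 213.3]  # illustrative runtimes (s)
    machine_b = [216.9, 218.2, 215.7, 219.4, 217.1]
    print("p-value:", randomization_test(machine_a, machine_b))
```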


A Study Of Application-Awareness In Software-Defined Data Center Networks, Chui-Hui Chiu Nov 2017

LSU Doctoral Dissertations

A data center (DC) has been a fundamental infrastructure for academia and industry for many years. Applications in DCs have diverse communication requirements, placing heavy demands on data center network (DCN) control frameworks (CFs) to coordinate communication traffic. Simultaneously satisfying all demands is difficult and inefficient with traditional network devices and protocols. Recently, agile software-defined networking (SDN) has been introduced to DCNs to speed up the development of DCN control frameworks (DCNCFs). Application-awareness preserves application semantics, including the collective goals of communications. Previous works have illustrated that application-aware DCNCFs can allocate network resources much more efficiently by explicitly …


Wide-Area Measurement-Driven Approaches For Power System Modeling And Analytics, Hesen Liu Aug 2017

Doctoral Dissertations

This dissertation presents wide-area measurement-driven approaches for power system modeling and analytics. Accurate power system dynamic models are the very basis of power system analysis, control, and operation. Meanwhile, phasor measurement data provide first-hand knowledge of power system dynamic behaviors. The idea of building out innovative applications with synchrophasor data is promising.

Taking advantage of real-time wide-area measurements, one novel application of phasor measurements is to develop a synchrophasor-based auto-regressive with exogenous inputs (ARX) model that can be updated online to estimate or predict system dynamic responses.

Furthermore, since auto-regressive models form a large family, the ARX model …
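
For reference, a discrete-time ARX model is commonly written in the textbook form below; the symbols and model orders are generic notation, not values taken from the dissertation.

```latex
% Generic ARX (auto-regressive with exogenous inputs) model:
%   y(k): measured output (e.g., a system dynamic response)
%   u(k): exogenous input, e(k): noise, a_i and b_j: parameters updated online
\[
  y(k) + a_1 y(k-1) + \dots + a_{n_a} y(k-n_a)
       = b_1 u(k-1) + \dots + b_{n_b} u(k-n_b) + e(k)
\]
```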


Data Masking, Encryption, And Their Effect On Classification Performance: Trade-Offs Between Data Security And Utility, Juan C. Asenjo Jan 2017

CCE Theses and Dissertations

As data mining increasingly shapes organizational decision-making, the quality of its results must be questioned to ensure trust in the technology. Inaccuracies can mislead decision-makers and cause costly mistakes. With more data collected for analytical purposes, privacy is also a major concern. Data security policies and regulations are increasingly put in place to manage risks, but these policies and regulations often employ technologies that substitute and/or suppress sensitive details contained in the data sets being mined. Data masking and substitution and/or data encryption and suppression of sensitive attributes from data sets can limit access to important details. It is believed …
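
To illustrate the kind of security-versus-utility trade-off described above, the hedged sketch below trains the same classifier on original data and on data with one "sensitive" feature suppressed, then compares accuracy; the synthetic data, masking scheme, and model choice are assumptions, not the study's actual setup.

```python
# Hedged sketch of the utility-vs-security trade-off: compare a classifier
# trained on original data against one trained on data with a "sensitive"
# feature suppressed. Synthetic data and model choice are illustrative only.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

def accuracy(train, test):
    model = RandomForestClassifier(random_state=42).fit(train, y_train)
    return accuracy_score(y_test, model.predict(test))

# Suppress one assumed "sensitive" column by replacing it with a constant.
masked_train, masked_test = X_train.copy(), X_test.copy()
masked_train[:, 0] = 0.0
masked_test[:, 0] = 0.0

print("original accuracy:", accuracy(X_train, X_test))
print("masked accuracy:  ", accuracy(masked_train, masked_test))
```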


Conditional Correlation Analysis, Sanjeev Bhatta Jan 2017

Browse all Theses and Dissertations

Correlation analysis is a frequently used statistical measure for examining the relationship among variables in practical applications. However, traditional correlation analysis uses an overly simplistic method to do so: it measures how two variables are related by examining only their relationship over the entire underlying data space. As a result, traditional correlation analysis may miss a strong correlation between variables, especially when that relationship exists only in a small subpopulation of the larger data space. This is no longer acceptable, as it may lose a fair share of information in this era of Big Data, which …
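
A small synthetic example makes the point concrete: the sketch below builds data in which the overall correlation between two variables is weak, yet the correlation inside a small subpopulation is strong. The data-generating process and subgroup condition are assumptions for illustration.

```python
# Illustration: a correlation that is weak over the whole data space can be
# strong inside a subpopulation. Synthetic data for illustration only.

import numpy as np

rng = np.random.default_rng(0)
n = 5000
in_subgroup = rng.random(n) < 0.10          # small hidden subpopulation (~10%)
x = rng.normal(size=n)
# y follows x only inside the subgroup; elsewhere it is independent noise.
y = np.where(in_subgroup, 0.9 * x + 0.1 * rng.normal(size=n),
             rng.normal(size=n))

print(f"overall correlation: {np.corrcoef(x, y)[0, 1]:.2f}")
print(f"within the subgroup: {np.corrcoef(x[in_subgroup], y[in_subgroup])[0, 1]:.2f}")
```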


DCMS: A Data Analytics And Management System For Molecular Simulation, Meryem Berrada Mar 2015

USF Tampa Graduate Theses and Dissertations

Despite the fact that molecular simulation (MS) systems represent a major research tool in multiple scientific and engineering fields, there is still a lack of systems for effective data management and fast data retrieval and processing. This is mainly due to the nature of MS, which generates a very large amount of data: a system usually encompasses millions of data records, and one query usually runs over tens of thousands of time frames. For this purpose, we designed and developed a new application, DCMS (a Data Analytics and Management System for Molecular Simulation), that intends to speed up the process …


Browser Based Visualization For Parameter Spaces Of Big Data Using Client-Server Model, Kurtis M. Glendenning Jan 2015

Browse all Theses and Dissertations

Visualization is an important task in data analytics, as it allows researchers to view abstract patterns within the data instead of reading through extensive raw data. The ability to interact with visualizations is essential, since it lets users explore data intuitively and find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art techniques, such as Node.js, to split the visualization rendering …
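
As a generic illustration of the server-side idea (in Python rather than the project's Node.js stack), the sketch below splits a large point set into chunks, bins each chunk in parallel, and merges the results, so that a browser client would only need to receive a small 2-D histogram rather than the raw points; sizes, bin counts, and worker counts are assumptions.

```python
# Generic sketch: parallel server-side aggregation of a large dataset into a
# small 2-D histogram suitable for interactive browser visualization.
# Not the project's Node.js implementation; all sizes are assumptions.

from concurrent.futures import ProcessPoolExecutor
import numpy as np

BINS = 64

def bin_chunk(chunk):
    """Return a 2-D histogram of one chunk of (x, y) points in [0, 1)^2."""
    hist, _, _ = np.histogram2d(chunk[:, 0], chunk[:, 1],
                                bins=BINS, range=[[0, 1], [0, 1]])
    return hist

def parallel_histogram(points, workers=4):
    chunks = np.array_split(points, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(bin_chunk, chunks))

if __name__ == "__main__":
    pts = np.random.default_rng(1).random((1_000_000, 2))
    grid = parallel_histogram(pts)
    print(grid.shape, int(grid.sum()))  # (64, 64) 1000000
```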


Improvements On Scientific System Analysis, Vladimir Grupchev Jan 2015

USF Tampa Graduate Theses and Dissertations

Thanks to the advancement of modern computer simulation systems, many scientific applications generate, and require the manipulation of, large volumes of data. Scientific exploration relies substantially on effective and accurate data analysis. The sheer size of the generated data, however, imposes big challenges on the process of analyzing the system. In this dissertation we propose novel techniques, as well as novel uses of some known designs, in order to improve scientific data analysis.

We develop an efficient method to compute an analytical query called spatial distance histogram (SDH). Special heuristics are exploited to process SDH efficiently and accurately. …
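
For readers unfamiliar with the query, the sketch below is a naive brute-force baseline for the spatial distance histogram: it counts all pairwise particle distances into fixed-width buckets. It deliberately omits the efficient heuristics developed in the dissertation, and the box size and bucket width are assumptions.

```python
# Brute-force reference for the spatial distance histogram (SDH) query:
# count pairwise particle distances into fixed-width buckets. This is only
# a naive baseline, not the dissertation's efficient method.

import numpy as np

def sdh_naive(positions, bucket_width, n_buckets):
    """positions: (N, 3) array of particle coordinates."""
    counts = np.zeros(n_buckets, dtype=np.int64)
    n = len(positions)
    for i in range(n - 1):
        d = np.linalg.norm(positions[i + 1:] - positions[i], axis=1)
        idx = np.minimum((d // bucket_width).astype(int), n_buckets - 1)
        np.add.at(counts, idx, 1)
    return counts

if __name__ == "__main__":
    atoms = np.random.default_rng(7).random((2_000, 3)) * 10.0  # 10-unit box
    hist = sdh_naive(atoms, bucket_width=1.0, n_buckets=18)
    print(hist, hist.sum())  # sum == N*(N-1)/2 pairs
```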


Disaster Data Management In Cloud Environments, Katarina Grolinger Dec 2013

Electronic Thesis and Dissertation Repository

Facilitating decision-making in a vital discipline such as disaster management requires information gathering, sharing, and integration on a global scale and across governments, industries, communities, and academia. A large quantity of immensely heterogeneous disaster-related data is available; however, current data management solutions offer few or no integration capabilities and limited potential for collaboration. Moreover, recent advances in cloud computing, Big Data, and NoSQL have opened the door for new solutions in disaster data management.

In this thesis, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM) with the objectives of 1) facilitating information gathering …


Performance Evaluation Of Data Intensive Computing In The Cloud, Bhagavathi Kaza Jan 2013

UNF Graduate Theses and Dissertations

Big data is a topic of active research in the cloud community. With increasing demand for data storage in the cloud, the study of data-intensive applications is becoming a primary focus. Data-intensive applications involve high CPU usage for processing large volumes of data on the scale of terabytes or petabytes. While some research exists on the performance of data-intensive applications in the cloud, none of it compares the Amazon Elastic Compute Cloud (Amazon EC2) and Google Compute Engine (GCE) clouds using multiple benchmarks. This study performs extensive research on the Amazon EC2 and GCE clouds using the TeraSort, …


Efficient And Private Processing Of Analytical Queries In Scientific Datasets, Anand Kumar Jan 2013

USF Tampa Graduate Theses and Dissertations

A large amount of data is generated by applications used in basic-science research and development. The size of the data introduces great challenges in storage, analysis, and privacy preservation. This dissertation proposes novel techniques to efficiently analyze the data and reduce storage space requirements through data compression, while preserving privacy and providing data security.

We present an efficient technique to compute an analytical query called spatial distance histogram (SDH) using spatiotemporal properties of the data. Special spatiotemporal properties present in the data are exploited to process SDH efficiently on the fly. General purpose graphics processing units (GPGPU or just …