Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 10 of 10

Full-Text Articles in Engineering

Parallel Algorithms For Scalable Graph Mining: Applications On Big Data And Machine Learning, Naw Safrin Sattar Aug 2022

University of New Orleans Theses and Dissertations

Parallel computing plays a crucial role in processing large-scale graph data. Complex network analysis is an exciting area of research with applications in many scientific domains, e.g., sociology, biology, online media, and recommendation systems. Graph mining is an area of interest with diverse problems from different domains of our daily life. Due to the advancement of data and computing technologies, graph data is growing at an enormous rate; for example, the number of links in social networks is growing every millisecond. Machine/Deep learning plays a significant role for technological accomplishments to work with big data in modern …


Machine Learning With Big Data For Electrical Load Forecasting, Alexandra L'Heureux Jun 2022

Electronic Thesis and Dissertation Repository

Today, the amount of data collected is exploding at an unprecedented rate due to developments in Web technologies, social media, mobile and sensing devices and the internet of things (IoT). Data is gathered in every aspect of our lives: from financial information to smart home devices and everything in between. The driving force behind these extensive data collections is the promise of increased knowledge. Therefore, the potential of Big Data relies on our ability to extract value from these massive data sets. Machine learning is central to this quest because of its ability to learn from data and provide data-driven …


Deep Neural Network Learning-Based Classifier Design For Big-Data Analytics, Krishnan Raghavan Jan 2019

Doctoral Dissertations

"In this digital age, big-data sets are commonly found in fields such as healthcare and manufacturing, where sustainable analysis is necessary to create useful information. Big-data sets are often characterized by high dimensionality and massive sample size. High dimensionality refers to the presence of unwanted dimensions in the data, where challenges such as noise, spurious correlation and incidental endogeneity are observed. Massive sample size, on the other hand, introduces the problem of heterogeneity because complex and unstructured data types must be analyzed. To mitigate the impact of these challenges while considering the application of classification, a two-step analysis approach is …


Automatic Identification Of Animals In The Wild: A Comparative Study Between C-Capsule Networks And Deep Convolutional Neural Networks., Joel Kamdem Teto, Ying Xie Nov 2018

Master of Science in Computer Science Theses

The evolution of machine learning and computer vision has driven many improvements and innovations in several domains. We see it being applied to credit decisions, insurance quotes, malware detection, fraud detection, email composition, and any other area with enough information to allow the machine to learn patterns. Over the years, the number of sensors, cameras, and cognitive pieces of equipment placed in the wilderness has been growing exponentially. However, the human resources needed to turn these data into something meaningful are not growing at the same rate. For instance, a team of scientist volunteers took 8.4 years, …


Data Masking, Encryption, And Their Effect On Classification Performance: Trade-Offs Between Data Security And Utility, Juan C. Asenjo Jan 2017

CCE Theses and Dissertations

As data mining increasingly shapes organizational decision-making, the quality of its results must be questioned to ensure trust in the technology. Inaccuracies can mislead decision-makers and cause costly mistakes. With more data collected for analytical purposes, privacy is also a major concern. Data security policies and regulations are increasingly put in place to manage risks, but these policies and regulations often employ technologies that substitute and/or suppress sensitive details contained in the data sets being mined. Data masking and substitution and/or data encryption and suppression of sensitive attributes from data sets can limit access to important details. It is believed …


Conditional Correlation Analysis, Sanjeev Bhatta Jan 2017

Browse all Theses and Dissertations

Correlation analysis is a frequently used statistical technique for examining the relationship among variables in different practical applications. However, traditional correlation analysis uses an overly simplistic method to do so. It measures how two variables are related in an application by examining only their relationship in the entire underlying data space. As a result, traditional correlation analysis may miss a strong correlation between those variables, especially when that relationship exists only in a small subpopulation of the larger data space. This is no longer acceptable, as it may lose a fair share of information in this era of Big Data which …


Dcms: A Data Analytics And Management System For Molecular Simulation, Meryem Berrada Mar 2015

USF Tampa Graduate Theses and Dissertations

Although Molecular Simulation (MS) systems represent a major research tool in multiple scientific and engineering fields, there is still a lack of systems for effective data management and fast data retrieval and processing. This is mainly due to the nature of MS, which generates a very large amount of data: a system usually encompasses millions of data items, and one query usually runs for tens of thousands of time frames. For this purpose, we designed and developed a new application, DCMS (A data Analytics and Management System for molecular Simulation), that intends to speed up the process …


Browser Based Visualization For Parameter Spaces Of Big Data Using Client-Server Model, Kurtis M. Glendenning Jan 2015

Browse all Theses and Dissertations

Visualization is an important task in data analytics, as it allows researchers to view abstract patterns within the data instead of reading through extensive raw data. The ability to interact with the visualizations is essential, since it lets researchers explore data intuitively and find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art techniques, such as Node.js, to split the visualization rendering …


Disaster Data Management In Cloud Environments, Katarina Grolinger Dec 2013

Electronic Thesis and Dissertation Repository

Facilitating decision-making in a vital discipline such as disaster management requires information gathering, sharing, and integration on a global scale and across governments, industries, communities, and academia. A large quantity of immensely heterogeneous disaster-related data is available; however, current data management solutions offer few or no integration capabilities and limited potential for collaboration. Moreover, recent advances in cloud computing, Big Data, and NoSQL have opened the door for new solutions in disaster data management.

In this thesis, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM) with the objectives of 1) facilitating information gathering …


Efficient And Private Processing Of Analytical Queries In Scientific Datasets, Anand Kumar Jan 2013

USF Tampa Graduate Theses and Dissertations

A large amount of data is generated by applications used in basic-science research and development. The size of the data introduces great challenges in storage, analysis, and privacy preservation. This dissertation proposes novel techniques to efficiently analyze the data and reduce storage space requirements through a data compression technique while preserving privacy and providing data security.

We present an efficient technique to compute an analytical query called spatial distance histogram (SDH) using spatiotemporal properties of the data. Special spatiotemporal properties present in the data are exploited to process SDH efficiently on the fly. General purpose graphics processing units (GPGPU or just …