Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

2022

Machine learning

Articles 91 - 118 of 118

Full-Text Articles in Physical Sciences and Mathematics

Smoothing Of Convolutional Neural Network Classifications, Glen R. Drumm Mar 2022

Theses and Dissertations

Smoothing convolutional neural networks is investigated. When intermittent and random false predictions happen, a technique of average smoothing is applied to smooth out the incorrect predictions. While a simple problem environment shows proof of concept, obstacles remain for applying such a technique to a more operationally complex problem.
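The abstract does not say how the averaging is carried out, so the following is only a minimal sketch of one common form of average smoothing: a trailing moving average over per-frame class scores, followed by an argmax. The function name `smooth_predictions`, the `window` parameter, and the sample scores are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def smooth_predictions(probs: Sequence[Sequence[float]], window: int = 3) -> List[int]:
    """Return one class label per frame after averaging scores over a trailing window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    labels = []
    for t in range(len(probs)):
        # Average the current frame with up to (window - 1) preceding frames.
        start = max(0, t - window + 1)
        recent = probs[start:t + 1]
        n_classes = len(probs[t])
        mean = [sum(frame[c] for frame in recent) / len(recent) for c in range(n_classes)]
        # Pick the class with the highest averaged score.
        labels.append(max(range(n_classes), key=mean.__getitem__))
    return labels


# Hypothetical per-frame scores for two classes; the third frame is a spurious flip.
frames = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.85, 0.15]]
raw = [max(range(2), key=f.__getitem__) for f in frames]
smoothed = smooth_predictions(frames, window=3)
```

In this toy input, the isolated misclassification in the raw labels is outvoted by its neighbors after averaging. A longer window suppresses more noise but responds more slowly to genuine class changes, which is one plausible reason such a technique is harder to apply in a more complex setting.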


Automated Aircraft Visual Inspection With Artificial Data Generation Enabled Deep Learning, Nathan J. Gaul Mar 2022

Theses and Dissertations

Aircraft visual inspection, which is essential to daily maintenance of an aircraft, is expensive and time-consuming to perform. Augmenting trained maintenance technicians with automated UAVs to collect and analyze images for aircraft inspection is an active research topic and a potential application of CNNs. Training datasets for niche research topics such as aircraft visual inspection are small and challenging to produce, and the manual process of labeling these datasets often produces subjective annotations. Recently, researchers have produced several successful applications of artificially generated datasets with domain randomization for training CNNs for real-world computer vision problems. The research outlined herein builds …


An Empirical Study On The Efficacy Of Evolutionary Algorithms For Automated Neural Architecture Search, Andrew D. Cuccinello Jan 2022

Theses and Dissertations

The configuration and architecture design of neural networks is a time-consuming process that has been shown to provide significant training speed and prediction improvements. Traditionally, this process is done manually, but this requires a large amount of expert knowledge and significant investment of labor. As a result, it is beneficial to have automated ways to optimize model architectures. In this thesis, we study the use of evolutionary algorithms for neural architecture search (NAS). Moreover, we investigate the effect of integrating evolutionary NAS into deep reinforcement learning to learn control policies for Atari game playing. Empirical classification results on the …


A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo Jan 2022

Theses, Dissertations and Capstones

Cyberattack is a never-ending war that has greatly threatened secured information systems. The development of automated and intelligent systems gives hackers more computing power to steal information and destroy data or system resources, and has raised global security issues. Statistical and data mining tools have received continuous research and improvements. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, the advancement in technology and accessibility of information creates more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …


Supervised Machine Learning Techniques Applied To Low-Cost Air Quality Sensor Suites, Peter Wahman Jan 2022

All Undergraduate Theses and Capstone Projects

Low-cost PM sensors have garnered interest for their ability to reduce the cost of investigating PM concentrations in both indoor and outdoor spaces. They perform well in high concentration lab testing with correlation coefficients greater than 0.9. In real-world applications, the correlation coefficients drop significantly because of sensing floors and adverse ambient conditions. There are plenty of supervised machine learning techniques that aim to correct the measurements ranging from linear regression to more advanced neural networks and random forests. This work aims to use those more complicated techniques to adjust the measurements using other data sets gathered by a sensor …


Classification Of Electropherograms Using Machine Learning For Parkinson’s Disease, Soroush Dehghan Jan 2022

Electronic Theses and Dissertations

Parkinson’s disease (PD) is a neurodegenerative movement disorder that progresses gradually over time. The onset of symptoms in people who are suffering from PD can vary from case to case, and it depends on the progression of the disease in each patient. PD symptoms develop gradually and increasingly impair the patient’s movements over time. An early diagnosis of PD could improve the outcomes of treatments and could potentially delay the progression of this disorder, which makes discovering a new diagnostic method valuable. In this study, I investigate the feasibility of using a machine learning (ML) approach to classify PD …


Data-Driven Methods For Low-Energy Nuclear Theory, Jordan M.R. Fox Jan 2022

CGU Theses & Dissertations

The term data-driven describes computational methods for numerical problem solving which have been developed by the field of data science; these are at the intersection of computer science, mathematics, and statistics. When applied to a domain science like nuclear physics, especially with the goal of deepening scientific insight, data-driven methods form a core pillar of the computational science endeavor. In this dissertation I explore two problems related to theoretical nuclear physics: one in the framework of numerical statistics, and the other in the framework of machine learning. I) Historically our understanding of the structure of the atomic nucleus, the quantum many-body problem, has been …


A Validity-Based Approach For Feature Selection In Intrusion Detection Systems, Eljilani Hmouda Jan 2022

CCE Theses and Dissertations

Intrusion detection systems are tools that detect and remedy the presence of malicious activities. Intrusion detection systems face many challenges in terms of accurate analysis and evaluation. One such challenge is the involvement of many features during analysis, which leads to high data volume and ultimately excessive computational overhead. This research centers on the development of a new intrusion detection system that employs an entropy-based measure called v-measure to select significant features and reduce dimensionality. After the development of the intrusion detection system, this feature reduction technique was tested on public datasets by applying machine learning classifiers such as Decision Tree, …


A Surrogate Model Of Molecular Dynamics Simulations For Polar Fluids: Supervised Learning Methods For Molecular Polarization And Unsupervised Methods For Phase Classification, Zackerie W. Hjorth Jan 2022

Dissertations, Master's Theses and Master's Reports

Molecular dynamics (MD) simulation is a standard computational tool in soft matter physics. While very powerful, it is computationally expensive, with some simulations taking days or even weeks to complete depending on the size of the computer cluster. Finding computationally cheap surrogate models that can learn the output features of MD simulation is therefore well motivated. In this report I explore the use of deep neural network ensembles as well as support vector machine regressors as surrogate models for MD simulation. From the output of the surrogate models, we can then employ unsupervised learning methods to get insight into …


Finding The Best Predictors For Foot Traffic In US Seafood Restaurants, Isabel Paige Beaulieu Jan 2022

Honors Theses and Capstones

COVID-19 caused state- and nationwide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly because illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …


Using Landsat-Based Phenology Metrics, Terrain Variables, And Machine Learning For Mapping And Probabilistic Prediction Of Forest Community Types In West Virginia, Faith M. Hartley Jan 2022

Graduate Theses, Dissertations, and Problem Reports

This study investigates the mapping of forest community types for the entire state of West Virginia, USA using Global Land Analysis and Discovery (GLAD) Phenology Metrics analysis ready data (ARD) derived from the Landsat time series and digital terrain variables derived from a digital terrain model (DTM). Both classifications and probabilistic predictions were made using random forest (RF) machine learning (ML) and training data derived from ground plots provided by the West Virginia Natural Heritage Program (WVNHP). The primary goal of this study is to explore the use of globally consistent ARD data for operational forest type mapping over a …


Predicting League Of Legends Ranked Games Outcome, Ngoc Linh Chi Nguyen Jan 2022

Senior Projects Spring 2022

League of Legends (LoL) is one of the most popular multiplayer online battle arena (MOBA) games in the world. For LoL, the most competitive way to evaluate a player’s skill level, below the professional Esports level, is competitive ranked games. These ranked games utilize a matchmaking system based on the players’ ranks to form a fair team for each game. However, a ranked game's outcome cannot necessarily be predicted using just players’ ranks; a significant number of different variables impact a ranked game depending on how well each team plays. In this paper, I propose a method to …


Fine Scale Mapping Of Laurentian Mixed Forest Natural Habitat Communities Using Multispectral NAIP And UAV Datasets Combined With Machine Learning Methods, Parth P. Bhatt Jan 2022

Dissertations, Master's Theses and Master's Reports

Natural habitat communities are an important element of any forest ecosystem. Mapping and monitoring Laurentian Mixed Forest natural communities using high spatial resolution imagery is vital for management and conservation purposes. This study developed integrated spatial, spectral and Machine Learning (ML) approaches for mapping complex vegetation communities. The study utilized ultra-high and high spatial resolution National Agriculture Imagery Program (NAIP) and Unmanned Aerial Vehicle (UAV) datasets, and a Digital Elevation Model (DEM), to map complex natural vegetation community habitats in the Laurentian Mixed Forest of the Upper Midwest. A detailed workflow is presented to effectively process UAV imagery in a dense forest environment …


Multi-Modality Automatic Lung Tumor Segmentation Method Using Deep Learning And Radiomics, Siqiu Wang Jan 2022

Theses and Dissertations

Delineation of the tumor volume is the initial and fundamental step in the radiotherapy planning process. The current clinical practice of manual delineation is time-consuming and suffers from observer variability. This work seeks to develop an effective automatic framework to produce clinically usable lung tumor segmentations. First, to facilitate the development and validation of our methodology, an expansive database of planning CTs, diagnostic PETs, and manual tumor segmentations was curated, and an image registration and preprocessing pipeline was established. Then a deep learning neural network was constructed and optimized to utilize dual-modality PET and CT images for lung tumor segmentation. …


Framework For The Evaluation Of Perturbations In The Systems Biology Landscape And Inter-Sample Similarity From Transcriptomic Datasets — A Digital Twin Perspective, Mariah Marie Hoffman Jan 2022

Dissertations and Theses

One approach to interrogating the complexities of human systems in their well-regulated and dysregulated states is through the use of digital twins. Digital twins are virtual representations of physical systems that are descriptive of an individual's state of health, an object fundamentally related to precision medicine. A key element for building a functional digital twin type for a disease or predicting the therapeutic efficacy of a potential treatment is harmonized, machine-parsable domain knowledge. Hypothesis-driven investigations are the gold standard for representing subsystems, but their results encompass a limited knowledge of the full biosystem. Multi-omics data is one rich source of …


Developing And Validating A Machine Learning-Based Student Attentiveness Tracking System, Andrew L. Sanders Jan 2022

Electronic Theses and Dissertations

Academic instructors and institutions desire the ability to accurately and autonomously measure the attentiveness of students in the classroom. Generally, college departments use unreliable direct communication from students (i.e. emails, phone calls), distracting and Hawthorne effect-inducing observational sit-ins, and end-of-semester surveys to collect feedback regarding their courses. Each of these methods of collecting feedback is useful but does not provide automatic feedback regarding the pace and direction of lectures. Young et al. discuss that attention levels during passive classroom lectures generally drop after about ten to thirty minutes and can be restored to normal levels with regular breaks, novel activities, …


A Citizen-Science Approach For Urban Flood Risk Analysis Using Data Science And Machine Learning, Candace Agonafir Jan 2022

Dissertations and Theses

Street flooding is problematic in urban areas, where impervious surfaces, such as concrete, brick, and asphalt, prevail, impeding the infiltration of water into the ground. During rain events, water ponds and rises to levels that cause considerable economic damage and physical harm. The main goal of this dissertation is to develop novel approaches toward the comprehension of urban flood risk using data science techniques on crowd-sourced data. This is accomplished by developing a series of data-driven models to identify flood factors of significance and localized areas of flood vulnerability in New York City (NYC). First, the infrastructural (catch basin clogs, …


Faking Sensor Noise Information, Justin Chang Jan 2022

Master's Projects

Noise residue detection in digital images has recently been used as a method to classify images based on source camera model type. With their meteoric rise in popularity, neural network models have also been used in conjunction with the concept of noise residuals to classify source camera models. However, many papers gloss over the details of the methods for obtaining noise residuals and instead rely on the self-learning aspect of deep neural networks to implicitly discover these themselves. For this project I propose a method of obtaining noise residuals (“noiseprints”) and denoising an image, as well as …


Assessing Machine Learning Utility In Predicting Hydrologic And Nitrate Dynamics In Karst Agroecosystems, Timothy Mcgill Jan 2022

Theses and Dissertations--Biosystems and Agricultural Engineering

Seasonal hypoxia in the Gulf of Mexico and harmful algal blooms experienced in many inland freshwater bodies are partially driven by excessive nitrogen loading from agricultural watersheds. Within the Mississippi/Atchafalaya River Basin, many areas are underlain with karst features, and efforts to reduce nitrogen contributions from these areas have had varying success, owing to an incomplete understanding of nutrient dynamics in karst agricultural systems. To improve the understanding of nitrogen cycling in these systems, 35 months of high resolution in situ water quality and atmospheric data were collected and fed into a two-hidden layer extreme learning machine …


Applications Of A Combined Approach Of Kinetic Monte Carlo Simulations And Machine Learning To Model Atomic Layer Deposition (ALD) Of Metal Oxides, Emily Justus Jan 2022

MSU Graduate Theses

Metal oxides such as ZnO or Al2O3 synthesized through Atomic Layer Deposition (ALD) have been of great research interest as candidate materials for ultra-thin tunnel barriers. In this study, I have applied a 3D on-lattice Kinetic Monte Carlo (kMC) code developed by Timo Weckman’s group to simulate the growth mechanisms of the tunnel barrier layer and to evaluate the role of various experimentally relevant factors in the ALD processes. I have systematically studied the effect of parameters such as the chamber pressure, temperature, and pulse and purge times. The database generated from the kMC simulations was subsequently used …


Classifying Blood Glucose Levels Through Noninvasive Features, Rishi Reddy Jan 2022

Graduate Theses, Dissertations, and Problem Reports

Blood glucose monitoring is a key process in the prevention and management of certain chronic diseases, such as diabetes. Currently, those interested in monitoring their blood glucose levels are confronted with options that are primarily invasive and relatively costly. A growing topic of note is the development of non-invasive monitoring methods for blood glucose. This development holds significant promise for improving the quality of life of a significant portion of the population and is overall met with great enthusiasm from the scientific community as well as commercial interest. This work aims to develop a potential pipeline …


Exploring Cyberterrorism, Topic Models And Social Networks Of Jihadists Dark Web Forums: A Computational Social Science Approach, Vivian Fiona Guetler Jan 2022

Graduate Theses, Dissertations, and Problem Reports

This three-article dissertation focuses on cyber-related topics on terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists' Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in …


Establishing A Machine Learning Framework For Discovering Novel Phononic Crystal Designs, Drew Feltner Jan 2022

Browse all Theses and Dissertations

A phonon is a discrete unit of vibrational motion that occurs in a crystal lattice. Phonons and the frequency at which they propagate play a significant role in the thermal, optical, and electronic properties of a material. A phononic material/device is similar to a photonic material/device, except that it is fabricated to manipulate certain bands of acoustic waves instead of electromagnetic waves. Phononic materials and devices have been studied much less than their photonic analogues and as such current materials exhibit control over a smaller range of frequencies. This study aims to test the viability of machine learning, specifically neural …


Identifying Network Biomarkers For Each Breast Cancer Subtypes Along With Their Effective Single And Paired Repurposed Drugs Using Network-Based Machine Learning Techniques, Forough Firoozbakht Jan 2022

Electronic Theses and Dissertations

Breast cancer is a complex disease that can be classified into at least 10 different molecular subtypes. Appropriate diagnosis of specific subtypes is critical for ensuring the best possible patient treatment and response to therapy. Current computational methods for determining the subtypes are based on identifying differentially expressed genes (i.e., biomarkers) that can best discriminate the subtypes. Such approaches, however, are known to be unreliable since they yield different biomarker sets when applied to data sets from different studies. Gathering knowledge about the functional relationship among genes will identify “network biomarkers” that will enrich the criteria for biomarker selection. Cancer …


Few-Shot Malware Detection Using A Novel Adversarial Reprogramming Model, Ekula Praveen Kumar Jan 2022

Browse all Theses and Dissertations

The increasing sophistication of malware has made detecting and defending against new strains a major challenge for cybersecurity. One promising approach to this problem is using machine learning techniques that extract representative features and train classification models to detect malware in an early stage. However, training such machine learning-based malware detection models represents a significant challenge that requires a large number of high-quality labeled data samples while it is very costly to obtain them in real-world scenarios. In other words, training machine learning models for malware detection requires the capability to learn from only a few labeled examples. To address …


Machine Learning-Driven Surrogate Models For Electrolytes, Tong Gao Jan 2022

Dissertations, Master's Theses and Master's Reports

We have developed a lattice Monte Carlo (MC) simulation based on the diffusion-limited aggregation model that accounts for the effect of the physical properties of ionic liquids (ILs) on lithium dendrite growth. Our simulations show that the size asymmetry between the cation and anion, the dielectric constant, and the volume fraction of ILs are critical factors to significantly suppress the dendrite growth, primarily due to substantial changes in electric-field screening. Specifically, the volume fraction of ILs has the optimal value for dendrite suppression. The present simulation method indicates potential challenges for the model extension to macroscopic systems. Therefore, we also …


Incorporating Ontological Information In Biomedical Entity Linking Of Phrases In Clinical Text, Evan French Jan 2022

Theses and Dissertations

Biomedical Entity Linking (BEL) is the task of mapping spans of text within biomedical documents to normalized, unique identifiers within an ontology. Translational application of BEL on clinical notes has enormous potential for augmenting discretely captured data in electronic health records, but the existing paradigm for evaluating BEL systems developed in academia is not well aligned with real-world use cases. In this work, we demonstrate a proof of concept for incorporating ontological similarity into the training and evaluation of BEL systems to begin to rectify this misalignment. This thesis has two primary components: 1) a comprehensive literature review and 2) …


A Non-Deterministic Deep Learning Based Surrogate For Ice Sheet Modeling, Hannah Jordan Jan 2022

Graduate Student Theses, Dissertations, & Professional Papers

Surrogate modeling is a new and expanding field in the world of deep learning, providing a computationally inexpensive way to approximate results from computationally demanding high-fidelity simulations. Ice sheet modeling relies on such computationally expensive simulations; the model used in this study currently requires between 10 and 20 minutes to complete one simulation. While this process is adequate for certain applications, using sampling approaches to perform statistical inference becomes infeasible. This issue can be overcome by using a surrogate model to approximate the ice sheet model, bringing the time to produce output down to a tenth …