Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

2021

Machine learning

Articles 91 - 109 of 109

Full-Text Articles in Physical Sciences and Mathematics

Texture-Driven Image Clustering In Laser Powder Bed Fusion, Alexander H. Groeger Jan 2021

Browse all Theses and Dissertations

The additive manufacturing (AM) field is striving to identify anomalies in laser powder bed fusion (LPBF) using multi-sensor in-process monitoring paired with machine learning (ML). In-process monitoring can reveal the presence of anomalies, but creating an ML classifier requires labeled data. The present work approaches this problem by printing hundreds of Inconel-718 coupons with different processing parameters to capture a wide range of process monitoring imagery with multiple sensor types. Afterwards, the process monitoring images are encoded into feature vectors and clustered to isolate groups in each sensor modality. Four texture representations were learned by training two convolutional neural network …


Wind Turbine Parameter Calibration Using Deep Learning Approaches, Rebecca Mccubbin Jan 2021

Electronic Theses and Dissertations

The inertia and damping coefficients are critical to understanding the workings of a wind turbine, especially when it is in a transient state. However, many manufacturers do not provide this information about their turbines, so users must estimate these values themselves. This research seeks to design a multilayer perceptron (MLP) that can accurately predict the inertia and damping coefficients using the power data from a turbine during a transient state. To do this, a model of a wind turbine was built in MATLAB, and a simulation of a three-phase fault was used to collect realistic fault data to input into …


Increasing Software Reliability Using Mutation Testing And Machine Learning, Michael Allen Stewart Jan 2021

CCE Theses and Dissertations

Mutation testing is a type of software testing, proposed in the 1970s, in which program statements are deliberately changed to introduce simple errors so that test cases can be checked for their ability to detect those errors. The goal of mutation testing was to reduce complex program errors by preventing the related simple errors. Test cases are executed against the mutant code to determine whether at least one fails, thereby detecting the error and giving confidence that the program is correct. One major issue with this type of testing was that generating and testing all possible mutations for complex programs became computationally intensive.

This …
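As a rough illustration of the idea described in this abstract (not the thesis's own tooling), the sketch below hand-writes one mutant of a tiny function and checks whether two small test suites detect it. The function `is_adult`, the mutant, and both suites are hypothetical names introduced only for this example.

```python
def is_adult(age):
    """Original program under test (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: the relational operator >= has been changed to >."""
    return age > 18


def weak_suite(fn):
    """Checks only values far from the boundary; passes for both versions."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Adds the boundary case age == 18, which the mutant gets wrong."""
    return weak_suite(fn) and fn(18) is True


def kills(suite, original, mutant):
    """A suite kills a mutant if it passes on the original and fails on the mutant."""
    return suite(original) and not suite(mutant)


weak_kills = kills(weak_suite, is_adult, is_adult_mutant)
strong_kills = kills(strong_suite, is_adult, is_adult_mutant)
print("weak suite kills mutant:", weak_kills)
print("strong suite kills mutant:", strong_kills)
```

A surviving mutant, as with the weak suite here, points to a missing test case. Real mutation tools generate many such mutants automatically, which is where the computational cost mentioned above comes from.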


Analysis Of Classifier Weaknesses Based On Patterns And Corrective Methods, Nicholas Skapura Jan 2021

Browse all Theses and Dissertations

Classification is an important branch of machine learning that impacts many areas of modern life. Many classification algorithms (classifiers for short) have been developed. They have highly different levels of sophistication and classification accuracy. Classification problems often have highly different levels of hardness and complexity. Practitioners of classification modeling need a better understanding of those algorithms in order to select the optimal algorithm for given classification problems. Researchers of classification need new insight into how given classifiers are weak and how they can be improved by correcting their classification errors. This dissertation introduces new tools and concepts to analyze classifier weakness …


Using Text Mining And Machine Learning Classifiers To Analyze Stack Overflow, Taylor Morris Jan 2021

Dissertations, Master's Theses and Master's Reports

StackOverflow is an extensively used platform for programming questions. In this report, text mining and machine learning classifiers such as decision trees and Naive Bayes are used to evaluate whether a given question posted on StackOverflow will be closed or answered. While multiple models were used in the analysis, the performance of the models was no better than that of the majority classifier. Future work to develop better-performing classifiers to understand why a question is closed or answered will require additional natural language processing or methods to address the imbalanced data.
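To make the "majority classifier" baseline in this abstract concrete, here is a minimal sketch that always predicts the most frequent label and reports the resulting accuracy. The label counts are invented for illustration and are not taken from the report.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, majority_count / len(labels)


# Hypothetical, deliberately imbalanced label set (not the report's data).
labels = ["answered"] * 9 + ["closed"] * 1

label, accuracy = majority_baseline(labels)
print(f"always predict {label!r}: accuracy = {accuracy:.2f}")
```

On imbalanced data like this, a trained model has to exceed the baseline accuracy before it can be said to have learned anything about the minority class, which is why the abstract treats the majority classifier as the point of comparison.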


Reliable And Interpretable Machine Learning For Modeling Physical And Cyber Systems, Daniel L. Marino Lizarazo Jan 2021

Theses and Dissertations

Over the past decade, Machine Learning (ML) research has predominantly focused on building extremely complex models in order to improve predictive performance. The idea was that performance can be improved by adding complexity to the models. This approach proved to be successful in creating models that can approximate highly complex relationships while taking advantage of large datasets. However, this approach led to extremely complex black-box models that lack reliability and are difficult to interpret. By lack of reliability, we specifically refer to inconsistent (unpredictable) behavior in situations outside the training data. Lack of interpretability refers to the …


The Application Of Machine Learning In Analyzing Organic Compounds From NMR Spectral Data, Nicole Maia Powell Jan 2021

Senior Independent Study Theses

Nuclear magnetic resonance (NMR) is used in organic chemistry to identify unknown organic compounds. The data obtained from an NMR spectrometer are typically shown in the form of a spectrum, which is then analyzed by an analytical chemist. The action of analyzing a spectrum, especially one of a large and complex molecule, is a long and tedious process. In this project, Python is used to implement hierarchical clustering on NMR data obtained from an NMR spectrometer at the College of Wooster to explore its application in NMR analysis. MATLAB is used to build a decision tree from the same data, …


Implementing A Neural Network For Supervised Learning With A Random Configuration Of Layers And Nodes, Kane A. Phillips Jan 2021

Electronic Theses and Dissertations

Deep learning has a substantial number of real-life applications, making it an increasingly popular subset of artificial intelligence over the last decade. These applications come to fruition due to the tireless research and implementation of neural networks. This paper goes into detail on the implementation of supervised learning neural networks in MATLAB, with the aim of generating a neural network based on specifications given by a user. Such specifications include how many layers are in the network and how many nodes are in each layer. The neural network is then trained based on known sample values of a function …


Developing Natural Language Processing Instruments To Study Sociotechnical Systems, Thayer Alshaabi Jan 2021

Graduate College Dissertations and Theses

Identifying temporal linguistic patterns and tracing social amplification across communities has always been vital to understanding modern sociotechnical systems. Now, well into the age of information technology, the growing digitization of text archives powered by machine learning systems has enabled an enormous number of interdisciplinary studies to examine the coevolution of language and culture. However, most research in that domain investigates formal textual records, such as books and newspapers. In this work, I argue that the study of conversational text derived from social media is just as important. I present four case studies to identify and investigate societal developments in …


Feature Selection On Permissions, Intents And APIs For Android Malware Detection, Fred Guyton Jan 2021

CCE Theses and Dissertations

Malicious applications pose an enormous security threat to mobile computing devices. Currently, 85% of all smartphones run Android, Google’s open-source operating system, making that platform the primary threat vector for malware attacks. Android is a platform that hosts roughly 99% of known malware to date, and is the focus of most research efforts in mobile malware detection due to its open-source nature. One of the main tools used in this effort is supervised machine learning. While a decade of work has made a lot of progress in detection accuracy, there is an obstacle that each stream of research is …


Volcan De Fuego: A Machine Learning Approach In Understanding The Eruptive Cycles Using Precursory Tilt Signals, Kay Sivaraj Jan 2021

Dissertations, Master's Theses and Master's Reports

Volcan de Fuego is an active stratovolcano located in the Central Guatemalan segment of the 1100 km long Central America Volcanic Arc System (CAVAS). The Fuego-Acatenango massif consists of at least four major vents, of which the Fuego summit vent is the most active and the youngest member. The volcano exhibits primarily Strombolian and Vulcanian behavior along with occasional paroxysms and pyroclastic flows. Historically, Fuego has produced basaltic-andesitic rocks with more recent eruptions progressively trending towards maficity. Several studies have used short-term deployments of broadband seismometers, infrasound, and long-term remote sensing techniques to characterize the mechanism of Fuego. In our study, …


Machine Learning And Bioinformatic Insights Into Key Enzymes For A Bio-Based Circular Economy, Japheth E. Gado Jan 2021

Theses and Dissertations--Chemical and Materials Engineering

The world is presently faced with a sustainability crisis; it is becoming increasingly difficult to meet the energy and material needs of a growing global population without depleting and polluting our planet. Greenhouse gases released from the continuous combustion of fossil fuels engender accelerated climate change, and plastic waste accumulates in the environment. There is need for a circular economy, where energy and materials are renewably derived from waste items, rather than by consuming limited resources. Deconstruction of the recalcitrant linkages in natural and synthetic polymers is crucial for a circular economy, as deconstructed monomers can be used to manufacture …


Identification And Classification Of Radio Pulsar Signals Using Machine Learning, Di Pang Jan 2021

Graduate Theses, Dissertations, and Problem Reports

Automated single-pulse search approaches are necessary as the ever-increasing amount of observed data makes manual inspection impractical. Detecting radio pulsars using single-pulse searches, however, is a challenging problem for machine learning because pulsar signals often vary significantly in brightness, width, and shape and are only detected in a small fraction of observed data.

The research work presented in this dissertation is focused on the development of machine learning algorithms and approaches for single-pulse searches in the time domain. Specifically, (1) we developed a two-stage single-pulse search approach, named Single-Pulse Event Group IDentification (SPEGID), which automatically identifies and clas- …


Classification Of Chess Games: An Exploration Of Classifiers For Anomaly Detection In Chess, Masudul Hoque Jan 2021

All Graduate Theses, Dissertations, and Other Capstone Projects

Chess is a strategy board game with its inception dating back to the 15th century. The Covid-19 pandemic has led to a chess boom online, with 95,853,038 chess games being played in January 2021 on lichess.com. Along with the chess boom, instances of cheating have also become more rampant. Classifications have been used for anomaly detection in different fields, and thus it is a natural idea to develop classifiers to detect cheating in chess. However, there are no specific examples of this, and it is difficult to obtain data where cheating has occurred. So, in this paper, we develop 4 …


Searching Harder, Localizing Better, Classifying Faster: Optimizing Fast Radio Burst Detection And Analysis, Kshitij Aggarwal Jan 2021

Graduate Theses, Dissertations, and Problem Reports

Fast Radio Bursts (or FRBs) are millisecond-duration transients of extragalactic origin. They exhibit dispersion caused by propagation through an ionized medium, and quantified by Dispersion Measure (DM). Around 800 FRBs (24 repeaters) have been discovered; so far, 24 FRBs have been confidently associated with a host galaxy. In this thesis, we discuss multiple new FRB search and analysis techniques and the corresponding tools that enable us to search for FRBs harder, localize them better, and classify candidates faster.

We discuss five open-source software suites that can be used in FRB analysis. These suites are used to distinguish between FRBs and …
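The dispersion mentioned in this abstract can be illustrated with the standard cold-plasma delay relation: the arrival-time difference between two observing frequencies is approximately 4.148808 ms × DM × (ν_lo⁻² − ν_hi⁻²), with DM in pc cm⁻³ and frequencies in GHz. The sketch below is not taken from the thesis or its software suites; the function name and the example values are assumptions chosen for illustration.

```python
# Dispersion constant in ms GHz^2 pc^-1 cm^3 (standard value used in pulsar/FRB work).
K_DM_MS = 4.148808


def dispersion_delay_ms(dm, f_lo_ghz, f_hi_ghz):
    """Delay (ms) of the signal at f_lo_ghz relative to f_hi_ghz for a given DM.

    dm is in pc cm^-3 and both frequencies are in GHz. Lower frequencies
    arrive later, so the result is positive when f_lo_ghz < f_hi_ghz.
    """
    return K_DM_MS * dm * (f_lo_ghz ** -2 - f_hi_ghz ** -2)


# Hypothetical burst: DM = 500 pc cm^-3 observed across a 1.2-1.6 GHz band.
delay = dispersion_delay_ms(500.0, 1.2, 1.6)
print(f"sweep across the band: {delay:.1f} ms")
```

Search pipelines typically undo this frequency-dependent sweep ("dedispersion") at many trial DM values before looking for millisecond-duration pulses.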


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and the analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the targets are typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


Inference Of Surface Velocities From Oblique Time Lapse Photos And Terrestrial Based LiDAR At The Helheim Glacier, Franklyn T. Dunbar II Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Using time dependent observations derived from terrestrial LiDAR and oblique time-lapse imagery, we demonstrate that a Bayesian approach to glacial motion estimation provides a concise way to incorporate multiple data products into a single motion estimation procedure effectively producing surface velocity estimates with an associated uncertainty. This approach brings both improved computational efficiency, and greater scalability across observational time-frames when compared to existing methods. To gauge efficacy, we apply these methods to a set of observations from the Helheim Glacier, a critical actor in contemporary mass loss trends observed in the Greenland Ice Sheet. We find that …


Quality Of SQL Code Security On Stackoverflow And Methods Of Prevention, Robert Klock Jan 2021

Honors Papers

This paper explores the frequency at which SQL/PHP posts on the website Stackoverflow.com contain code susceptible to SQL Injection, a common database vulnerability. Specifically, we analyze whether other users give notice of the vulnerability or provide an answer that is secure. The majority of questions analyzed were vulnerable to SQL Injection and were not corrected in their answers or brought to the attention of the original poster. To mitigate this, we present a machine learning bot which analyzes the poster’s code and alerts them to potential injection vulnerabilities, if necessary.
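For readers unfamiliar with the vulnerability this abstract describes, the sketch below contrasts a query built by string concatenation with a parameterized query. The paper studies SQL/PHP posts; Python's standard `sqlite3` module is swapped in here purely for a self-contained illustration, and the table, rows, and input string are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b2")],
)

# Attacker-controlled input that closes the string literal and adds a tautology.
user_input = "nobody' OR '1'='1"

# Vulnerable: the input is concatenated into the SQL text, so it alters the query.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: a parameterized query treats the input strictly as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print("concatenated query returned", len(unsafe_rows), "rows")
print("parameterized query returned", len(safe_rows), "rows")
conn.close()
```

The concatenated query matches every row because the injected `OR '1'='1'` clause is always true, while the parameterized query matches none, since no user is literally named `nobody' OR '1'='1`.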