Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine learning

Applied Mathematics

Articles 1 - 30 of 40

Full-Text Articles in Physical Sciences and Mathematics

Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

Algorithmic progress in gene prediction has largely stagnated since the first gene-prediction algorithms were developed thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by adopting better-performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data. Genes are traditionally predicted using Hidden Markov Models (HMMs). These HMMs are constrained by strict assumptions about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …


Compatibility Of Clique Clustering Algorithm With Dimensionality Reduction, Uğur Madran, Duygu Soyoğlu Sep 2023

Applied Mathematics & Information Sciences

In our previous work, we introduced a clustering algorithm based on clique formation. Cliques, the obtained clusters, are constructed by choosing the densest complete subgraphs using similarity values between instances. The clique algorithm successfully reduces the number of instances in a data set without substantially changing the accuracy rate. In the current work, we focus on reducing the number of features. For this purpose, the effect of the clique clustering algorithm on dimensionality reduction has been analyzed. We propose a novel algorithm for support vector machine classification by combining these two techniques and applying different strategies by differentiating …


Data-Driven Exploration Of Coarse-Grained Equations: Harnessing Machine Learning, Elham Kianiharchegani Aug 2023

Electronic Thesis and Dissertation Repository

In scientific research, understanding and modeling physical systems often involves working with Partial Differential Equations (PDEs). These equations describe the relationships between variables and their derivatives, allowing us to analyze a wide range of phenomena, from fluid dynamics to quantum mechanics. Traditionally, the discovery of PDEs relied on mathematical derivations and expert knowledge. However, the advent of data-driven approaches and machine learning (ML) techniques has transformed the task of uncovering the complex equations that describe physical systems. The primary goal in …


Mathematics Behind Machine Learning, Rim Hammoud Aug 2023

Electronic Theses, Projects, and Dissertations

Artificial intelligence (AI) is a broad field of study that involves developing intelligent machines that can perform tasks that typically require human intelligence. Machine learning (ML) is often used as a tool to help create AI systems. The goal of ML is to create models that can learn and improve to make predictions or decisions based on given data. The goal of this thesis is to build a clear and rigorous exposition of the mathematical underpinnings of support vector machines (SVM), a popular method used in ML. As we will explore later on in the thesis, SVM can be implemented …


Learning The Game: Implementations Of Convolutional Networks In Automated Strategy Identification, Cameron Klig Jun 2023

Master's Theses

Games can be used to represent a wide variety of real-world problems, giving rise to many applications of game theory. Various computational methods have been proposed for identifying game strategies, including optimized tree search algorithms, game-specific heuristics, and artificial intelligence. In the last decade, systems like AlphaGo and AlphaZero have significantly exceeded the performance of the best human players in chess, Go, and other games. The most effective game engines to date employ convolutional neural networks (CNNs) to evaluate game boards, extract features, and predict the optimal next move. These engines are trained on billions of simulated games, wherein …


Continuum Modeling Of Active Nematics Via Data-Driven Equation Discovery, Connor Robertson May 2023

Dissertations

Data-driven modeling seeks to extract a parsimonious model for a physical system directly from measurement data. One of the most interpretable of these methods is Sparse Identification of Nonlinear Dynamics (SINDy), which selects a relatively sparse linear combination of model terms from a large set of (possibly nonlinear) candidates via optimization. This technique has shown promise for synthetic data generated by numerical simulations, but its application to real data is less developed. This dissertation applies SINDy to video data from a bio-inspired system of microtubule-motor protein assemblies, an example of nonequilibrium dynamics that has posed a significant …
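
As an illustration of the sparse selection step described in this abstract, the following is a minimal sketch of sequentially thresholded least squares, one common optimizer used with SINDy. It is not taken from the dissertation: the function names, the threshold value, and the use of normal equations are all assumptions made for this example, and it uses only the Python standard library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(theta, target, active):
    """Least-squares fit restricted to the active library columns."""
    gram = [[sum(row[i] * row[j] for row in theta) for j in active]
            for i in active]
    rhs = [sum(row[i] * y for row, y in zip(theta, target)) for i in active]
    return solve_linear(gram, rhs)


def stlsq(theta, target, threshold=0.1, iterations=10):
    """Fit, drop coefficients below the threshold, refit on the survivors."""
    n_terms = len(theta[0])
    active = list(range(n_terms))
    coeffs = [0.0] * n_terms
    for _ in range(iterations):
        coeffs = [0.0] * n_terms
        if not active:
            break
        for i, value in zip(active, least_squares(theta, target, active)):
            coeffs[i] = value
        kept = [i for i in active if abs(coeffs[i]) >= threshold]
        if kept == active:
            break
        active = kept
    # Zero any sub-threshold values left over if the iteration budget ran out.
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


# Hypothetical data: derivative samples of dx/dt = 1.5 x - 0.5 x^3.
xs = [-2 + 0.25 * i for i in range(17)]
library = [[1.0, x, x ** 2, x ** 3] for x in xs]  # candidate terms 1, x, x^2, x^3
derivatives = [1.5 * x - 0.5 * x ** 3 for x in xs]
coefficients = stlsq(library, derivatives, threshold=0.1)
```

On this noise-free toy data the constant and quadratic terms are pruned and only the linear and cubic terms survive. Real measurements would also require estimating the derivatives from data, which this sketch does not attempt.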


Multilevel Optimization With Dropout For Neural Networks, Gary Joseph Saavedra Apr 2023

Mathematics & Statistics ETDs

Large neural networks have become ubiquitous in machine learning. Despite their widespread use, the optimization process for training a neural network remains computationally expensive and does not necessarily create networks that generalize well to unseen data. In addition, the difficulty of training increases as the size of the neural network grows. In this thesis, we introduce the novel MGDrop and SMGDrop algorithms which use a multigrid optimization scheme with a dropout coarsening operator to train neural networks. In contrast to other standard neural network training schemes, MGDrop explicitly utilizes information from smaller sub-networks which act as approximations of the full …


Quantum Computing And Its Applications In Healthcare, Vu Giang Jan 2023

OUR Journal: ODU Undergraduate Research Journal

This paper reviews the state of quantum computing and its application in healthcare. The various ways quantum computing can be applied to healthcare are discussed, along with the limitations of the technology. With more and more effort being put into the development of these computers, their future is promising for advancing healthcare and various other industries.


Graph-Based Acoustic Clustering And Classification, Justin Youngho Sunu Jan 2023

CGU Theses & Dissertations

The rapid growth of audio data collection in various domains necessitates advanced techniques for efficient analysis and classification. This dissertation proposes new approaches for categorizing acoustic data, using both unsupervised and semi-supervised learning methods. Starting with raw audio, we preprocess the signal to segment it into time windows, each of which we consider as an independent data point. We use the short-time Fourier transform to describe the signal in a given time window as a set of Fourier coefficients. We interpret the resulting frequency signature as a high-dimensional feature description of each data point. We then develop a graph-based approach for …
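
The preprocessing described in this abstract (segmenting audio into time windows and describing each window by Fourier coefficients) can be sketched as follows. This is an illustrative example only, not the dissertation's code: the window size, hop length, rectangular window, and magnitude-only features are assumptions, and a naive discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import math


def frame_signal(signal, window_size, hop):
    """Split a signal into fixed-length, possibly overlapping time windows."""
    last_start = len(signal) - window_size
    return [signal[s:s + window_size] for s in range(0, last_start + 1, hop)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT coefficients of one window."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def stft_features(signal, window_size=8, hop=4):
    """One frequency signature (feature vector) per time window."""
    return [dft_magnitudes(f) for f in frame_signal(signal, window_size, hop)]


# Hypothetical input: a pure tone completing two cycles per 8-sample window.
tone = [math.cos(2 * math.pi * 2 * t / 8) for t in range(16)]
features = stft_features(tone, window_size=8, hop=4)
```

Each entry of `features` is the feature vector for one time window; for this tone, the energy concentrates in frequency bin 2 of every window. A practical pipeline would typically use an FFT routine and a tapered window function instead.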


Leveraging Subject Matter Expertise To Optimize Machine Learning Techniques For Air And Space Applications, Philip Y. Cho Sep 2022

Theses and Dissertations

We develop new machine learning and statistical methods that are tailored for Air and Space applications through the incorporation of subject matter expertise. In particular, we focus on three separate research thrusts that each represent a different type of subject matter knowledge, modeling approach, and application. In our first thrust, we incorporate knowledge of natural phenomena to design a neural network algorithm for localizing point defects in transmission electron microscopy (TEM) images of crystalline materials. In our second research thrust, we use Bayesian feature selection and regression to analyze the relationship between fighter pilot attributes and flight mishap rates. We …


A Kuramoto Model Approach To Predicting Chaotic Systems With Echo State Networks, Sophie Wu, Jackson Howe Aug 2022

Undergraduate Student Research Internships Conference

An Echo State Network (ESN) with an activation function based on the Kuramoto model (Kuramoto ESN) is implemented, which can successfully predict the logistic map for a non-trivial number of time steps. The reservoir in the prediction stage exhibits binary dynamics when a good prediction is made, but the oscillators in the reservoir display a larger variability in states as the ESN’s prediction becomes worse. Analytical approaches to quantify how the Kuramoto ESN’s dynamics relate to its prediction are explored, as well as how the dynamics of the Kuramoto ESN relate to another widely studied physical model, the Ising model.


Machine Learning Model Comparison And ARMA Simulation Of Exhaled Breath Signals Classifying COVID-19 Patients, Aaron Christopher Segura Aug 2022

Mathematics & Statistics ETDs

This study compared the performance of machine learning models in classifying COVID-19 patients using exhaled breath signals and simulated datasets. Ground truth classification was determined by the gold standard Polymerase Chain Reaction (PCR) test results. A residual bootstrapped method generated the simulated datasets by fitting signal data to Autoregressive Moving Average (ARMA) models. Classification models included neural networks, k-nearest neighbors, naïve Bayes, random forest, and support vector machines. A Recursive Feature Elimination (RFE) study was performed to determine whether reducing signal features would improve the classification models' performance, using Gini Importance scoring for the two classes. The top 25% of …


Stability And Differential Privacy Of Stochastic Gradient Methods, Zhenhuan Yang Aug 2022

Legacy Theses & Dissertations (2009 - 2024)

Recently, a considerable amount of work has been devoted to the study of algorithmic stability and differential privacy (DP) for stochastic gradient methods (SGM). However, most existing work focuses on empirical risk minimization (ERM) and population risk minimization problems. In this paper, we study two types of optimization problems that enjoy wide applications in modern machine learning, namely the minimax problem and the pairwise learning problem.


Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi Jun 2022

Mathematics & Statistics ETDs

The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. The piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of its wide range of device applicability. In this research, using rigorous DFT calculations, high-throughput ab initio simulations for 23 AlN alloys are generated.

This research is the first to report strong enhancements of piezoelectric properties …


Fine-Tuning A 𝑘-Nearest Neighbors Machine Learning Model For The Detection Of Insurance Fraud, Alliyah Stout Jun 2022

Honors Theses

Insurance companies lose billions of dollars to fraud. Large monetary losses force insurance companies to increase premium costs and/or restrict policies. This negatively affects a company’s loyal customers. Although this is a prevalent problem, companies are not urgently working toward bettering their machine learning algorithms. Underskilled workers paired with inefficient computer algorithms make it difficult to accurately and reliably detect fraud.

The goal of this study is to understand the idea of 𝑘-Nearest Neighbors (𝑘-NN) and to use this classification technique to accurately detect fraudulent auto insurance claims. Using 𝑘-NN requires choosing a 𝑘 value and a …
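
To make the nearest-neighbors idea in this abstract concrete, here is a minimal sketch of a majority-vote classifier. It is not the thesis's model: Euclidean distance, the default of three neighbors, and the toy two-feature "claims" below are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Euclidean distance is an assumption; other metrics are possible.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled features for six claims.
claims = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
labels = ["legitimate", "legitimate", "legitimate",
          "fraud", "fraud", "fraud"]

near_origin = knn_predict(claims, labels, (0.5, 0.5), k=3)
near_cluster = knn_predict(claims, labels, (5.5, 5.5), k=3)
```

With two classes, an odd number of neighbors avoids tied votes. In practice the features should be scaled before distances are computed, since a feature with a large numeric range would otherwise dominate the distance.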


Generating A Dataset For Comparing Linear Vs. Non-Linear Prediction Methods In Education Research, Jack Mauro, Elena Martinez, Anna Bargagliotti May 2022

Honors Thesis

Machine learning is often used to build predictive models by extracting patterns from large data sets. Such techniques are increasingly being utilized to predict outcomes in the social sciences. One such application is predicting student success. Machine learning can be applied to predicting student acceptance and success in academia. Using these tools for education-related data analysis may enable the evaluation of programs, resources, and curriculum. Currently, research is needed to examine application, admissions, and retention data in order to address equity in college computer science programs. However, most student-level data sets contain sensitive data that cannot be made public. To …


Evaluating The Behaviour Of Centrally Perforated Unreinforced Masonry Walls: Applications Of Numerical Analysis, Machine Learning, And Stochastic Methods, Mohsen Khaleghi, Javid Salimi, Visar Farhangi, Mohammad Javad Moradi, Moses Karakouzian May 2022

Civil and Environmental Engineering and Construction Faculty Research

The presence of openings greatly affects the response of unreinforced masonry (URM) walls, a topic that has attracted the attention of many researchers. This paper describes the analysis of perforated unreinforced masonry (PURM) walls under in-plane loads using the truss discretization method (TDM) along with several machine learning approaches, such as Multilayer Perceptron (MLP), Group Method of Data Handling (GMDH), and Radial Basis Function (RBF). A new, fast, and accurate method named Multi-pier (MP) is used to determine the behavior of PURM walls. The results of the MP method are expressed as a ratio of lateral load-bearing capacity and initial …


Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano Apr 2022

Electrical and Computer Engineering ETDs

Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …


Toward Suicidal Ideation Detection With Lexical Network Features And Machine Learning, Ulya Bayram, William Lee, Daniel Santel, Ali Minai, Peggy Clark, Tracy Glauser, John Pestian Apr 2022

Northeast Journal of Complex Systems (NEJCS)

In this study, we introduce a new network feature for detecting suicidal ideation from clinical texts and conduct various additional experiments to enrich the state of knowledge. We evaluate statistical features with and without stopwords, use lexical networks for feature extraction and classification, and compare the results with standard machine learning methods using a logistic classifier, a neural network, and a deep learning method. We utilize three text collections. The first two contain transcriptions of interviews conducted by experts with suicidal (n=161 patients that experienced severe ideation) and control subjects (n=153). The third collection consists of interviews conducted by experts …


Camouflaged Poisoning Attack On Graph Neural Networks, Chao Jiang, Yi He, Richard Chapman, Hongyi Wu Jan 2022

Computer Science Faculty Publications

Graph neural networks (GNNs) have enabled the automation of many web applications that entail node classification on graphs, such as scam detection in social media and event prediction in service networks. Nevertheless, recent studies have revealed that GNNs are vulnerable to adversarial attacks, where feeding GNNs poisoned data at training time can catastrophically degrade their test accuracy. This finding has intensified research on attacks and defenses for GNNs. However, prior studies mainly posit that adversaries enjoy free access to manipulate the original graph, while obtaining such access could be too costly in …


From MDP To AlphaZero, David Robert Sewell Nov 2021

Dissertations and Theses

In this paper I will explain the AlphaGo family of algorithms starting from first principles and requiring little previous knowledge from the reader. The focus will be on one of the more recent versions, AlphaZero, but I hope to explain the core principles that allowed these algorithms to be so successful. I will generally use AlphaZero to refer to this core set of principles and will make it clear when I am referring to a specific algorithm of the AlphaGo family. In short, AlphaZero combines Monte Carlo Tree Search (MCTS) with deep learning and self-play. We will see how these …


Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu Aug 2021

Electronic Theses and Dissertations

The Newsvendor problem is a classical supply chain problem used to develop strategies for inventory optimization. The goal of the newsvendor problem is to predict the optimal order quantity of a product to meet an uncertain demand in the future, given that the demand distribution itself is known. The Ice Cream Vendor Problem extends the classical newsvendor problem to an uncertain demand with unknown distribution, albeit a distribution that is known to depend on exogenous features. The goal is thus to estimate the order quantity that minimizes the total cost when demand does not follow any known statistical distribution. The …
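
For the classical setting mentioned in this abstract, where the demand distribution is known, the standard result is that the optimal order quantity is the demand quantile at the critical ratio cu / (cu + co), with cu the unit cost of ordering too little and co the unit cost of ordering too much. The sketch below illustrates that classical baseline only, not the thesis's deep learning method; the function names, cost values, and the use of an empirical sample as a stand-in for the known distribution are assumptions.

```python
import math


def newsvendor_quantity(demand_samples, underage_cost, overage_cost):
    """Smallest sampled demand q whose empirical CDF reaches cu / (cu + co)."""
    if not demand_samples:
        raise ValueError("need at least one demand sample")
    ratio = underage_cost / (underage_cost + overage_cost)
    ordered = sorted(demand_samples)
    # Rank of the sample at which the empirical CDF first reaches the ratio.
    rank = max(1, math.ceil(ratio * len(ordered)))
    return ordered[rank - 1]


def expected_cost(quantity, demand_samples, underage_cost, overage_cost):
    """Average cost of ordering `quantity` against the sampled demands."""
    total = 0.0
    for demand in demand_samples:
        total += underage_cost * max(demand - quantity, 0)
        total += overage_cost * max(quantity - demand, 0)
    return total / len(demand_samples)


# Hypothetical demand history and costs: a lost sale costs 3, a leftover costs 1.
demands = [10, 20, 30, 40, 50]
best = newsvendor_quantity(demands, underage_cost=3, overage_cost=1)
```

Here the critical ratio is 0.75, so `best` is 40, the fourth of the five sorted demands. Because a lost sale costs more than a leftover unit, the order sits above the median demand.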


Gene Selection And Classification In High-Throughput Biological Data With Integrated Machine Learning Algorithms And Bioinformatics Approaches, Abhijeet R Patil May 2021

Open Access Theses & Dissertations

With the rise of high-throughput technologies in biomedical research, large volumes of expression profiling, methylation profiling, and RNA-sequencing data are being generated. These high-dimensional data have a large number of features and a small number of samples, a characteristic called the "curse of dimensionality." The selection of optimal features, which largely affects the performance of classification algorithms in machine learning models, has led to challenging problems in bioinformatics analyses of such high-dimensional datasets. In this work, I focus on the design of two-stage frameworks of feature selection and classification and their applications in multiple sets of colorectal cancer data. The first …


Inference Of Surface Velocities From Oblique Time Lapse Photos And Terrestrial Based LiDAR At The Helheim Glacier, Franklyn T. Dunbar II Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Using time dependent observations derived from terrestrial LiDAR and oblique time-lapse imagery, we demonstrate that a Bayesian approach to glacial motion estimation provides a concise way to incorporate multiple data products into a single motion estimation procedure effectively producing surface velocity estimates with an associated uncertainty. This approach brings both improved computational efficiency, and greater scalability across observational time-frames when compared to existing methods. To gauge efficacy, we apply these methods to a set of observations from the Helheim Glacier, a critical actor in contemporary mass loss trends observed in the Greenland Ice Sheet. We find that …


Developing Natural Language Processing Instruments To Study Sociotechnical Systems, Thayer Alshaabi Jan 2021

Graduate College Dissertations and Theses

Identifying temporal linguistic patterns and tracing social amplification across communities has always been vital to understanding modern sociotechnical systems. Now, well into the age of information technology, the growing digitization of text archives powered by machine learning systems has enabled an enormous number of interdisciplinary studies to examine the coevolution of language and culture. However, most research in that domain investigates formal textual records, such as books and newspapers. In this work, I argue that the study of conversational text derived from social media is just as important. I present four case studies to identify and investigate societal developments in …


Implementing A Neural Network For Supervised Learning With A Random Configuration Of Layers And Nodes, Kane A. Phillips Jan 2021

Electronic Theses and Dissertations

Deep learning has a substantial number of real-life applications, making it an increasingly popular subset of artificial intelligence over the last decade. These applications come to fruition due to the tireless research and implementation of neural networks. This paper goes into detail on the implementation of supervised learning neural networks utilizing MATLAB, with the aim of generating a neural network based on specifications given by a user. Such specifications involve how many layers are in the network and how many nodes are in each layer. The neural network is then trained based on known sample values of a function …
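
The idea of generating a network from a user's specification of layers and nodes can be sketched as below. The thesis works in MATLAB; this Python sketch is an independent illustration, not a port of its code. The function names, the uniform weight initialization, the tanh hidden activation, and the linear output layer are all assumptions, and training is omitted.

```python
import math
import random


def build_network(layer_sizes, seed=0):
    """Create randomly initialized (weights, biases) pairs for consecutive layers.

    layer_sizes lists the node count of every layer, input first, output last.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one input vector through the network."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        # Hidden layers use tanh; the output layer is left linear.
        activations = sums if is_output else [math.tanh(s) for s in sums]
    return activations


# Hypothetical specification: 2 inputs, hidden layers of 4 and 3 nodes, 1 output.
network = build_network([2, 4, 3, 1])
prediction = forward(network, [0.5, -0.2])
```

Changing the list passed to `build_network` changes the depth and width of the network without touching the forward pass, which is the kind of user-driven configuration the abstract describes.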


Exploring The Potential Of Sparse Coding For Machine Learning, Sheng Yang Lundquist Oct 2020

Dissertations and Theses

While deep learning has proven to be successful for various tasks in the field of computer vision, there are several limitations of deep-learning models when compared to human performance. Specifically, human vision is largely robust to noise and distortions, whereas deep learning performance tends to be brittle to modifications of test images, including being susceptible to adversarial examples. Additionally, deep-learning methods typically require very large collections of training examples for good performance on a task, whereas humans can learn to perform the same task with a much smaller number of training examples.

In this dissertation, I investigate whether the use …


Evaluating An Ordinal Output Using Data Modeling, Algorithmic Modeling, And Numerical Analysis, Martin Keagan Wynne Brown Jan 2020

Murray State Theses and Dissertations

Data and algorithmic modeling are two different approaches used in predictive analytics. The models discussed from these two approaches include the proportional odds logit model (POLR), the vector generalized linear model (VGLM), the classification and regression tree model (CART), and the random forests model (RF). Patterns in the data were analyzed using trigonometric polynomial approximations and Fast Fourier Transforms. Predictive modeling is used frequently in statistics and data science to find the relationship between the explanatory (input) variables and a response (output) variable. Both approaches prove advantageous in different cases depending on the data set. In our case, the data …


Comparing Predictive Performance Of Statistical Learning Models On Medical Data, Francis Biney Jan 2020

Open Access Theses & Dissertations

This work investigates the predictive performance of 10 machine learning models on three medical datasets: breast cancer, heart disease, and prostate cancer. Furthermore, we use the models to identify risk factors that contribute significantly to these diseases.

The models considered include logistic regression with L1 and L2 penalties, principal component logistic regression (PCR-LR), partial least squares logistic regression (PLS-LR), multivariate adaptive regression splines (MARS), support vector machine with radial basis kernel (SVM-RBK), random forest (RF), gradient boosting machines (GBM), elastic net (Enet), and feedforward neural network (FFNN). The models were grouped according to their similarities and learning style: i) Linear regularized models: LR-Lasso, LR-Ridge and …


Adaptive Feature Engineering Modeling For Ultrasound Image Classification For Decision Support, Hatwib Mugasa Oct 2019

Doctoral Dissertations

Ultrasonography is considered a relatively safe option for the diagnosis of benign and malignant cancer lesions due to the low-energy sound waves used. However, the visual interpretation of ultrasound images is time-consuming and usually produces a high rate of false alerts due to speckle noise. Improved methods of collecting image-based data have been proposed to reduce noise in the images; however, these have not solved the problem, owing to the complex nature of images and the exponential growth of biomedical datasets. Secondly, the target class in real-world biomedical datasets, that is, the focus of interest of a biopsy, is usually …