Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Applied Mathematics

Machine learning

Articles 1 - 30 of 46

Full-Text Articles in Physical Sciences and Mathematics

A Meta-Ensemble Predictive Model For The Risk Of Lung Cancer, Sideeqoh Oluwaseun Olawale-Shosanya, Olayinka Olufunmilayo Olusanya, Adeyemi Omotayo Joseph, Kabir Oluwatobi Idowu, Oyelade Babatunde Eriwa, Adedeji Oladimeji Adebare, Morufat Adebola Usman Jun 2024

Al-Bahir Journal for Engineering and Pure Sciences

The lungs play a vital role in supplying oxygen to every cell, filtering air to keep out harmful substances, and supporting defense mechanisms. However, they remain susceptible to diseases such as infections, inflammation, and cancer. Meta-ensemble techniques are prominent methods used in machine learning to enhance the accuracy of classifier learning systems in making predictions. This work proposes a robust predictive model using a meta-ensemble method to identify individuals at high risk of lung cancer, so that early action can be taken to prevent long-term problems. The model is benchmarked on the Kaggle Machine Learning practitioners' Lung Cancer Dataset. Three machine learning …


Tools For Biomolecular Modeling And Simulation, Xin Yang Apr 2024

Mathematics Theses and Dissertations

Electrostatic interactions play a pivotal role in understanding biomolecular systems, influencing their structural stability and functional dynamics. The Poisson-Boltzmann (PB) equation, a prevalent implicit solvent model that treats the solvent as a continuum while describing the mobile ions using the Boltzmann distribution, has become a standard tool for detailed investigations into biomolecular electrostatics. There are two primary methodologies: grid-based finite difference or finite element methods and body-fitted boundary element methods. This dissertation focuses on developing fast and accurate PB solvers, leveraging both methodologies, to meet diverse scientific needs and overcome various obstacles in the field.


Predicting Biomolecular Properties And Interactions Using Numerical, Statistical And Machine Learning Methods, Elyssa Sliheet Apr 2024

Mathematics Theses and Dissertations

We investigate machine learning and electrostatic methods to predict biophysical properties of proteins, such as solvation energy and protein-ligand binding affinity, for the purpose of drug discovery and development. We focus on the Poisson-Boltzmann model and various high-performance computing considerations such as parallelization schemes.


Automatic Hemorrhage Segmentation In Brain CT Scans Using Curriculum-Based Semi-Supervised Learning, Solayman H. Emon, Tzu-Liang (Bill) Tseng, Michael Pokojovy, Peter McCaffrey, Scott Moen, Md Fashiar Rahman Jan 2024

Mathematics & Statistics Faculty Publications

One of the major neuropathological consequences of traumatic brain injury (TBI) is intracranial hemorrhage (ICH), which requires swift diagnosis to avert perilous outcomes. We present a new automatic hemorrhage segmentation technique via curriculum-based semi-supervised learning. It employs a pre-trained lightweight encoder-decoder framework (MobileNetV2) on labeled and unlabeled data. The model integrates consistency regularization for improved generalization, offering steady predictions from original and augmented versions of unlabeled data. The training procedure employs curriculum learning to progressively train the model at diverse complexity levels. We utilize the PhysioNet dataset to train and evaluate the proposed approach. The performance results surpass those of …


Bringing GANs To Medieval Times: Manuscript Translation Models, Tonilynn M. Holtz Jan 2024

Electronic Theses and Dissertations

Generative Adversarial Networks (GANs) have recently emerged as a powerful framework for producing new knowledge from existing knowledge. These models aim to learn patterns from input data and then use that knowledge to generate output data samples that plausibly appear to belong to the same set as the input data. The study of medieval manuscripts has been an important research area in the humanities for many decades. These rare manuscripts are often inaccessible to the general public, including students and scholars, and it is of great interest to provide digital support (including, but not limited to, translation and search) for …


Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

The task of gene prediction has been largely stagnant in algorithmic improvements compared to when algorithms were first developed for predicting genes thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by utilizing better performing models, most current approaches update existing tools through incorporating increasing amounts of extrinsic data to improve gene prediction performance. The traditional method of predicting genes is done using Hidden Markov Models (HMMs). These HMMs are constrained by having strict assumptions made about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …


Compatibility Of Clique Clustering Algorithm With Dimensionality Reduction, Uğur Madran, Duygu Soyoğlu Sep 2023

Applied Mathematics & Information Sciences

In our previous work, we introduced a clustering algorithm based on clique formation. Cliques, the obtained clusters, are constructed by choosing the densest complete subgraphs based on similarity values between instances. The clique algorithm successfully reduces the number of instances in a data set without substantially changing the accuracy rate. In this current work, we focused on reducing the number of features. For this purpose, the effect of the clique clustering algorithm on dimensionality reduction has been analyzed. We propose a novel algorithm for support vector machine classification by combining these two techniques and applying different strategies by differentiating …


Data-Driven Exploration Of Coarse-Grained Equations: Harnessing Machine Learning, Elham Kianiharchegani Aug 2023

Electronic Thesis and Dissertation Repository

In scientific research, understanding and modeling physical systems often involves working with complex equations called Partial Differential Equations (PDEs). These equations are essential for describing the relationships between variables and their derivatives, allowing us to analyze a wide range of phenomena, from fluid dynamics to quantum mechanics. Traditionally, the discovery of PDEs relied on mathematical derivations and expert knowledge. However, the advent of data-driven approaches and machine learning (ML) techniques has transformed this process. By harnessing ML techniques and data analysis methods, data-driven approaches have revolutionized the task of uncovering complex equations that describe physical systems. The primary goal in …


Mathematics Behind Machine Learning, Rim Hammoud Aug 2023

Electronic Theses, Projects, and Dissertations

Artificial intelligence (AI) is a broad field of study that involves developing intelligent machines that can perform tasks that typically require human intelligence. Machine learning (ML) is often used as a tool to help create AI systems. The goal of ML is to create models that can learn and improve to make predictions or decisions based on given data. The goal of this thesis is to build a clear and rigorous exposition of the mathematical underpinnings of support vector machines (SVM), a popular platform used in ML. As we will explore later on in the thesis, SVM can be implemented …


Learning The Game: Implementations Of Convolutional Networks In Automated Strategy Identification, Cameron Klig Jun 2023

Master's Theses

Games can be used to represent a wide variety of real world problems, giving rise to many applications of game theory. Various computational methods have been proposed for identifying game strategies, including optimized tree search algorithms, game-specific heuristics, and artificial intelligence. In the last decade, systems like AlphaGo and AlphaZero have significantly exceeded the performance of the best human players in Chess, Go, and other games. The most effective game engines to date employ convolutional neural networks (CNNs) to evaluate game boards, extract features, and predict the optimal next move. These engines are trained on billions of simulated games, wherein …


Continuum Modeling Of Active Nematics Via Data-Driven Equation Discovery, Connor Robertson May 2023

Dissertations

Data-driven modeling seeks to extract a parsimonious model for a physical system directly from measurement data. One of the most interpretable of these methods is Sparse Identification of Nonlinear Dynamics (SINDy), which selects a relatively sparse linear combination of model terms from a large set of (possibly nonlinear) candidates via optimization. This technique has shown promise for synthetic data generated by numerical simulations, but its application to real data is less developed. This dissertation applies SINDy to video data from a bio-inspired system of microtubule-motor protein assemblies, an example of nonequilibrium dynamics that has posed a significant …
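
As a rough illustration of the sparse selection step that SINDy performs, the sketch below uses sequentially thresholded least squares, one common optimizer for this kind of problem. It is not taken from the dissertation: the function name `stlsq`, the threshold value, the candidate library, and the synthetic system dx/dt = -2x + 0.5x^3 are all assumptions chosen for the example.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (hypothetical helper).

    Finds a sparse coefficient vector xi such that theta @ xi ~ dxdt.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0              # prune candidate terms with tiny weights
        big = ~small
        if not big.any():
            break
        # Refit using only the surviving candidate terms.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Synthetic, noise-free data from the assumed system dx/dt = -2x + 0.5x^3.
x = np.linspace(-1.0, 1.0, 200)
dxdt = -2.0 * x + 0.5 * x**3

# Candidate library with columns [1, x, x^2, x^3].
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
xi = stlsq(theta, dxdt)
```

On this clean synthetic data the nonzero entries of `xi` should land on the `x` and `x^3` columns. Real measurement data, such as the video data described above, would additionally require estimating derivatives from noisy observations, which this sketch does not attempt.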


U-NO: U-Shaped Neural Operators, Md Ashiqur Rahman, Zachary E Ross, Kamyar Azizzadenesheli May 2023

Department of Computer Science Faculty Publications

Neural operators generalize classical neural networks to maps between infinite-dimensional spaces, e.g., function spaces. Prior works on neural operators proposed a series of novel methods to learn such maps and demonstrated unprecedented success in learning solution operators of partial differential equations. Due to their close proximity to fully connected architectures, these models mainly suffer from high memory usage and are generally limited to shallow deep learning models. In this paper, we propose U-shaped Neural Operator (U-NO), a U-shaped memory enhanced architecture that allows for deeper neural operators. U-NOs exploit the problem structures in function predictions and demonstrate fast training, data …


Multilevel Optimization With Dropout For Neural Networks, Gary Joseph Saavedra Apr 2023

Mathematics & Statistics ETDs

Large neural networks have become ubiquitous in machine learning. Despite their widespread use, the optimization process for training a neural network remains computationally expensive and does not necessarily create networks that generalize well to unseen data. In addition, the difficulty of training increases as the size of the neural network grows. In this thesis, we introduce the novel MGDrop and SMGDrop algorithms which use a multigrid optimization scheme with a dropout coarsening operator to train neural networks. In contrast to other standard neural network training schemes, MGDrop explicitly utilizes information from smaller sub-networks which act as approximations of the full …


Quantum Computing And Its Applications In Healthcare, Vu Giang Jan 2023

OUR Journal: ODU Undergraduate Research Journal

This paper serves as a review of the state of quantum computing and its application in healthcare. The various avenues through which quantum computing can be applied to healthcare are discussed here, along with the limitations of the technology. With more and more effort being put into the development of these computers, the technology has a promising future in advancing healthcare and various other industries.


Graph-Based Acoustic Clustering And Classification, Justin Youngho Sunu Jan 2023

CGU Theses & Dissertations

The rapid growth of audio data collection in various domains necessitates advanced techniques for efficient analysis and classification. This dissertation proposes new approaches for categorizing acoustic data, using both unsupervised and semi-supervised learning methods. Starting with raw audio, we preprocess the signal to segment it into time windows, each of which we consider as an independent data point. We use the short-time Fourier transform to describe the signal in a given time window as a set of Fourier coefficients. We interpret the resulting frequency signature as a high-dimensional feature description of each data point. We then develop a graph-based approach for …
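
The windowing and short-time Fourier transform steps described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `stft_features`, the Hann window, the window and hop sizes, the use of magnitudes rather than complex coefficients, and the synthetic 1000 Hz test tone are all choices made for the example, not details from the dissertation.

```python
import numpy as np

def stft_features(signal, window_size=256, hop=128):
    """Split a 1-D signal into overlapping windows and return one
    magnitude spectrum per window (hypothetical helper)."""
    window = np.hanning(window_size)
    starts = range(0, len(signal) - window_size + 1, hop)
    frames = np.stack([signal[s:s + window_size] * window for s in starts])
    # Each row is the frequency signature of one time window.
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of a synthetic 1000 Hz tone sampled at 8000 Hz (assumed values).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

features = stft_features(tone)
```

Each row of `features` can then be treated as one high-dimensional data point. With these assumed settings the frequency resolution is 8000 / 256 = 31.25 Hz per bin, so the tone's energy concentrates in bin 32.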


Leveraging Subject Matter Expertise To Optimize Machine Learning Techniques For Air And Space Applications, Philip Y. Cho Sep 2022

Theses and Dissertations

We develop new machine learning and statistical methods that are tailored for Air and Space applications through the incorporation of subject matter expertise. In particular, we focus on three separate research thrusts that each represents a different type of subject matter knowledge, modeling approach, and application. In our first thrust, we incorporate knowledge of natural phenomena to design a neural network algorithm for localizing point defects in transmission electron microscopy (TEM) images of crystalline materials. In our second research thrust, we use Bayesian feature selection and regression to analyze the relationship between fighter pilot attributes and flight mishap rates. We …


A Kuramoto Model Approach To Predicting Chaotic Systems With Echo State Networks, Sophie Wu, Jackson Howe Aug 2022

Undergraduate Student Research Internships Conference

An Echo State Network (ESN) with an activation function based on the Kuramoto model (Kuramoto ESN) is implemented, which can successfully predict the logistic map for a non-trivial number of time steps. The reservoir in the prediction stage exhibits binary dynamics when a good prediction is made, but the oscillators in the reservoir display a larger variability in states as the ESN’s prediction becomes worse. Analytical approaches to quantify how the Kuramoto ESN’s dynamics relate to its prediction are explored, as well as how the dynamics of the Kuramoto ESN relate to another widely studied physical model, the Ising model.


Machine Learning Model Comparison And ARMA Simulation Of Exhaled Breath Signals Classifying COVID-19 Patients, Aaron Christopher Segura Aug 2022

Mathematics & Statistics ETDs

This study compared the performance of machine learning models in classifying COVID-19 patients using exhaled breath signals and simulated datasets. Ground truth classification was determined by the gold standard Polymerase Chain Reaction (PCR) test results. A residual bootstrapped method generated the simulated datasets by fitting signal data to Autoregressive Moving Average (ARMA) models. Classification models included neural networks, k-nearest neighbors, naïve Bayes, random forest, and support vector machines. A Recursive Feature Elimination (RFE) study was performed to determine if reducing signal features would improve the classification models' performance using Gini Importance scoring for the two classes. The top 25% of …


Stability And Differential Privacy Of Stochastic Gradient Methods, Zhenhuan Yang Aug 2022

Legacy Theses & Dissertations (2009 - 2024)

Recently, a considerable amount of work has been devoted to the study of algorithmic stability as well as differential privacy (DP) for stochastic gradient methods (SGM). However, most of the existing work focuses on empirical risk minimization (ERM) and population risk minimization problems. In this paper, we study two types of optimization problems that enjoy wide applications in modern machine learning, namely the minimax problem and the pairwise learning problem.


Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi Jun 2022

Mathematics & Statistics ETDs

The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. Piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of its wide range of device applicability. In this research using rigorous DFT calculations, high throughput ab-initio simulations for 23 AlN alloys are generated.

This research is the first to report strong enhancements of piezoelectric properties …


Fine-Tuning A 𝑘-Nearest Neighbors Machine Learning Model For The Detection Of Insurance Fraud, Alliyah Stout Jun 2022

Honors Theses

Billions of dollars are lost within insurance companies due to fraud. Large money losses force insurance companies to increase premium costs and/or restrict policies. This negatively affects a company’s loyal customers. Although this is a prevalent problem, companies are not urgently working toward bettering their machine learning algorithms. Underskilled workers paired with inefficient computer algorithms make it difficult to accurately and reliably detect fraud.

The goal of this study is to understand the idea of 𝑘-Nearest Neighbors (𝑘-NN) and to use this classification technique to accurately detect fraudulent auto insurance claims. Using 𝑘-NN requires choosing a 𝑘 value and a …
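
For readers unfamiliar with the technique, a minimal 𝑘-NN classifier might look like the sketch below. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict`, the two-feature toy points, and the labels are hypothetical and are not taken from the thesis or from any real insurance data.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    points, using Euclidean distance (hypothetical helper)."""
    dists = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated groups of made-up, already-scaled claim features.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["legitimate"] * 3 + ["fraud"] * 3

near_fraud = knn_predict(train_points, train_labels, (5.1, 5.0), k=3)
near_legit = knn_predict(train_points, train_labels, (1.0, 0.9), k=3)
```

The choice of `k` trades off sensitivity to noise (small `k`) against blurring of class boundaries (large `k`), which is why tuning it is central to studies like this one.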


Generating A Dataset For Comparing Linear Vs. Non-Linear Prediction Methods In Education Research, Jack Mauro, Elena Martinez, Anna Bargagliotti May 2022

Honors Thesis

Machine learning is often used to build predictive models by extracting patterns from large data sets. Such techniques are increasingly being utilized to predict outcomes in the social sciences. One such application is predicting student success. Machine learning can be applied to predicting student acceptance and success in academia. Using these tools for education-related data analysis may enable the evaluation of programs, resources, and curriculum. Currently, research is needed to examine application, admissions, and retention data in order to address equity in college computer science programs. However, most student-level data sets contain sensitive data that cannot be made public. To …


Evaluating The Behaviour Of Centrally Perforated Unreinforced Masonry Walls: Applications Of Numerical Analysis, Machine Learning, And Stochastic Methods, Mohsen Khaleghi, Javid Salimi, Visar Farhangi, Mohammad Javad Moradi, Moses Karakouzian May 2022

Civil and Environmental Engineering and Construction Faculty Research

The presence of openings greatly affects the response of unreinforced masonry (URM) walls, a topic that has attracted the attention of many researchers. Perforated unreinforced masonry (PURM) walls under in-plane loads through the truss discretization method (TDM) along with several machine learning approaches such as Multilayer perceptron (MLP), Group Method of Data Handling (GMDH), and Radial basis function (RBF) are described in this paper. A new method named Multi-pier (MP), which is fast and accurate, is used to determine the behavior of PURM walls. The results of the MP method are expressed as a ratio of lateral load-bearing capacity and initial …


Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano Apr 2022

Electrical and Computer Engineering ETDs

Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …


Toward Suicidal Ideation Detection With Lexical Network Features And Machine Learning, Ulya Bayram, William Lee, Daniel Santel, Ali Minai, Peggy Clark, Tracy Glauser, John Pestian Apr 2022

Northeast Journal of Complex Systems (NEJCS)

In this study, we introduce a new network feature for detecting suicidal ideation from clinical texts and conduct various additional experiments to enrich the state of knowledge. We evaluate statistical features with and without stopwords, use lexical networks for feature extraction and classification, and compare the results with standard machine learning methods using a logistic classifier, a neural network, and a deep learning method. We utilize three text collections. The first two contain transcriptions of interviews conducted by experts with suicidal (n=161 patients that experienced severe ideation) and control subjects (n=153). The third collection consists of interviews conducted by experts …


Camouflaged Poisoning Attack On Graph Neural Networks, Chao Jiang, Yi He, Richard Chapman, Hongyi Wu Jan 2022

Computer Science Faculty Publications

Graph neural networks (GNNs) have enabled the automation of many web applications that entail node classification on graphs, such as scam detection in social media and event prediction in service networks. Nevertheless, recent studies revealed that the GNNs are vulnerable to adversarial attacks, where feeding GNNs with poisoned data at training time can lead them to yield catastrophically devastative test accuracy. This finding heats up the frontier of attacks and defenses against GNNs. However, the prior studies mainly posit that the adversaries can enjoy free access to manipulate the original graph, while obtaining such access could be too costly in …


From MDP To AlphaZero, David Robert Sewell Nov 2021

Dissertations and Theses

In this paper I will explain the AlphaGo family of algorithms starting from first principles and requiring little previous knowledge from the reader. The focus will be upon one of the more recent versions AlphaZero but I hope to explain the core principles that allowed these algorithms to be so successful. I will generally refer to AlphaZero as theses [sic] core set of principles and will make it clear when I am referring to a specific algorithm of the AlphaGo family. AlphaZero in short combines Monte Carlo Tree Search (MCTS) with Deep learning and self-play. We will see how these …


Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu Aug 2021

Electronic Theses and Dissertations

The Newsvendor problem is a classical supply chain problem used to develop strategies for inventory optimization. The goal of the newsvendor problem is to predict the optimal order quantity of a product to meet an uncertain demand in the future, given that the demand distribution itself is known. The Ice Cream Vendor Problem extends the classical newsvendor problem to an uncertain demand with unknown distribution, albeit a distribution that is known to depend on exogenous features. The goal is thus to estimate the order quantity that minimizes the total cost when demand does not follow any known statistical distribution. The …
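
For the classical case where the demand distribution is known, the optimal order quantity is the demand quantile at the critical ratio c_u / (c_u + c_o), where c_u is the per-unit cost of ordering too little and c_o the per-unit cost of ordering too much. The sketch below illustrates only that classical case, assuming normally distributed demand; the function name `newsvendor_quantity` and all numbers are hypothetical, and this is not the deep learning approach developed in the thesis.

```python
from statistics import NormalDist

def newsvendor_quantity(mean, std, underage_cost, overage_cost):
    """Classical newsvendor order quantity under normally distributed
    demand (hypothetical helper; the normality assumption is ours)."""
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    # Order up to the demand quantile at the critical ratio.
    return NormalDist(mean, std).inv_cdf(critical_ratio)

# Made-up example: mean demand 100 units, standard deviation 20.
balanced = newsvendor_quantity(100, 20, underage_cost=3, overage_cost=3)
cautious = newsvendor_quantity(100, 20, underage_cost=1, overage_cost=4)
eager = newsvendor_quantity(100, 20, underage_cost=4, overage_cost=1)
```

When the two costs are equal the rule orders the median demand; when lost sales cost more than leftovers it orders more than the mean, and vice versa. The Ice Cream Vendor setting described above removes the known-distribution assumption, so this closed form no longer applies directly.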


Gene Selection And Classification In High-Throughput Biological Data With Integrated Machine Learning Algorithms And Bioinformatics Approaches, Abhijeet R Patil May 2021

Open Access Theses & Dissertations

With the rise of high-throughput technologies in biomedical research, large volumes of expression profiling, methylation profiling, and RNA-sequencing data are being generated. These high-dimensional data have a large number of features with a small number of samples, a characteristic called the "curse of dimensionality." The selection of optimal features, which largely affects the performance of classification algorithms in machine learning models, has led to challenging problems in bioinformatics analyses of such high-dimensional datasets. In this work, I focus on the design of two-stage frameworks of feature selection and classification and their applications in multiple sets of colorectal cancer data. The first …


Inference Of Surface Velocities From Oblique Time Lapse Photos And Terrestrial Based LiDAR At The Helheim Glacier, Franklyn T. Dunbar II Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Using time dependent observations derived from terrestrial LiDAR and oblique time-lapse imagery, we demonstrate that a Bayesian approach to glacial motion estimation provides a concise way to incorporate multiple data products into a single motion estimation procedure effectively producing surface velocity estimates with an associated uncertainty. This approach brings both improved computational efficiency, and greater scalability across observational time-frames when compared to existing methods. To gauge efficacy, we apply these methods to a set of observations from the Helheim Glacier, a critical actor in contemporary mass loss trends observed in the Greenland Ice Sheet. We find that …