Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons

Machine learning

Articles 1 - 30 of 64

Full-Text Articles in Mathematics

Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Records (EHR) systems takes a lot of time from clinicians. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We will see how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Explainable Machine Learning Reveals The Relationship Between Hearing Thresholds And Speech-In-Noise Recognition In Listeners With Normal Audiograms, Jithin Raj Balan, Hansapani Rodrigo, Udit Saxena, Srikanta K. Mishra Oct 2023

School of Mathematical and Statistical Sciences Faculty Publications and Presentations

Some individuals complain of listening-in-noise difficulty despite having a normal audiogram. In this study, machine learning is applied to examine the extent to which hearing thresholds can predict speech-in-noise recognition among normal-hearing individuals. The specific goals were to (1) compare the performance of one standard (GAM, generalized additive model) and four machine learning models (ANN, artificial neural network; DNN, deep neural network; RF, random forest; XGBoost, eXtreme gradient boosting), and (2) examine the relative contribution of individual audiometric frequencies and demographic variables in predicting speech-in-noise recognition. Archival data included thresholds (0.25–16 kHz) and speech recognition thresholds (SRTs) from listeners with …


Longboard Classification Using Machine Learning, Tuan (Kevin) Le, Evans Sajtar, Mckenzie Lamb Oct 2023

Annual Student Research Poster Session

A rider can choose from several techniques, which are distributed along the course of a longboard ride. This research aims to create a machine-learning model that can efficiently classify these techniques at different periods of time using raw acceleration data. This paper presents the complete workflow of the application. This application involves analytical geometry, multidimensional calculus, and linear algebra and can be used to visualize and normalize time-invariant object paths. This model focuses on displacement data calculated from raw acceleration data and gyro sensor data from a smartphone application called "Physics Toolbox Sensor Suite". We extracted features from …


Compatibility Of Clique Clustering Algorithm With Dimensionality Reduction, Uğur Madran, Duygu Soyoğlu Sep 2023

Applied Mathematics & Information Sciences

In our previous work, we introduced a clustering algorithm based on clique formation. Cliques, the obtained clusters, are constructed by choosing the most dense complete subgraphs by using similarity values between instances. The clique algorithm successfully reduces the number of instances in a data set without substantially changing the accuracy rate. In this current work, we focused on reducing the number of features. For this purpose, the effect of the clique clustering algorithm on dimensionality reduction has been analyzed. We propose a novel algorithm for support vector machine classification by combining these two techniques and applying different strategies by differentiating …


Numerical Simulation Of The Korteweg–De Vries Equation With Machine Learning, Kristina O. F. Williams, Benjamin F. Akers Jun 2023

Faculty Publications

A machine learning procedure is proposed to create numerical schemes for solutions of nonlinear wave equations on coarse grids. This method trains stencil weights of a discretization of the equation, with the truncation error of the scheme as the objective function for training. The method uses centered finite differences to initialize the optimization routine and a second-order implicit-explicit time solver as a framework. Symmetry conditions are enforced on the learned operator to ensure a stable method. The procedure is applied to the Korteweg–de Vries equation. It is observed to be more accurate than finite difference or spectral methods on coarse …


Why Softmax? Because It Is The Only Consistent Approach To Probability-Based Classification, Anatole Lokshin, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

In many practical problems, the most effective classification techniques are based on deep learning. In this approach, once the neural network generates values corresponding to different classes, these values are transformed into probabilities by using the softmax formula. Researchers tried other transformations, but they did not work as well as softmax. A natural question is: why is softmax so effective? In this paper, we provide a possible explanation for this effectiveness: namely, we prove that softmax is the only consistent approach to probability-based classification. In precise terms, it is the only approach for which two reasonable probability-based ideas -- Least …
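For reference, the softmax formula mentioned in this abstract maps class scores z_1, ..., z_n to probabilities p_i = exp(z_i) / (exp(z_1) + ... + exp(z_n)). The following is a minimal Python sketch of that standard formula; it is not taken from the paper, and the function name `softmax` and the max-subtraction step are illustrative choices made here.

```python
import math


def softmax(scores):
    """Map real-valued class scores to probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged mathematically
    # but avoids overflow in exp() for large scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Larger scores receive larger probabilities.
    print(softmax([1.0, 2.0, 3.0]))
```

Shifting every score by the same constant does not change the output, which is why the max-subtraction trick is safe.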


Using Deep Neural Networks To Classify Astronomical Images, Andrew D. Macpherson May 2023

Honors Projects

As the quantity of astronomical data available continues to exceed the resources available for analysis, recent advances in artificial intelligence encourage the development of automated classification tools. This paper lays out a framework for constructing a deep neural network capable of classifying individual astronomical images by describing techniques to extract and label these objects from large images.


Fast -- Asymptotically Optimal -- Methods For Determining The Optimal Number Of Features, Saied Tizpaz-Niari, Luc Longpré, Olga Kosheleva, Vladik Kreinovich May 2023

Departmental Technical Reports (CS)

In machine learning -- and in data processing in general -- it is very important to select the proper number of features. If we select too few, we miss important information and do not get good results, but if we select too many, this will include many irrelevant ones that only bring noise and thus again worsen the results. The usual method of selecting the proper number of features is to add features one by one until the quality stops improving and starts deteriorating again. This method works, but it often takes too much time. In this paper, we propose …


Nviz: Unraveling Neural Networks Through Visualization, Kevin Hoffman Apr 2023

Mathematics and Computer Science Presentations

The growing utility of artificial intelligence (AI) is attributed to the development of neural networks. These networks are a class of models that make predictions based on previously observed data. While the inferential power of neural networks is great, explaining their results is difficult because the underlying model is automatically generated. The AI community commonly refers to neural networks as black boxes because the patterns they learn from the data are not easily understood. This project aims to improve the visibility of patterns that neural networks identify in data. Through an interactive web application, NVIZ affords the …


Multilevel Optimization With Dropout For Neural Networks, Gary Joseph Saavedra Apr 2023

Mathematics & Statistics ETDs

Large neural networks have become ubiquitous in machine learning. Despite their widespread use, the optimization process for training a neural network remains computationally expensive and does not necessarily create networks that generalize well to unseen data. In addition, the difficulty of training increases as the size of the neural network grows. In this thesis, we introduce the novel MGDrop and SMGDrop algorithms which use a multigrid optimization scheme with a dropout coarsening operator to train neural networks. In contrast to other standard neural network training schemes, MGDrop explicitly utilizes information from smaller sub-networks which act as approximations of the full …


Predicting The Outcomes Of Internet-Based Cognitive Behavioral Therapy For Tinnitus: Applications Of Artificial Neural Network And Support Vector Machine, Hansapani Rodrigo, Eldré W. Beukes, Gerhard Andersson, Vinaya Manchaiah Dec 2022

School of Mathematical and Statistical Sciences Faculty Publications and Presentations

Purpose:

Internet-based cognitive behavioral therapy (ICBT) has been found to be effective for tinnitus management, although there is limited understanding about who will benefit the most from ICBT. Traditional statistical models have largely failed to identify the nonlinear associations and hence find strong predictors of success with ICBT. This study aimed at examining the use of an artificial neural network (ANN) and support vector machine (SVM) to identify variables associated with treatment success in ICBT for tinnitus.

Method:

The study involved a secondary analysis of data from 228 individuals who had completed ICBT in previous intervention studies. A 13-point reduction …


Machine Learning Model Comparison And Arma Simulation Of Exhaled Breath Signals Classifying Covid-19 Patients, Aaron Christopher Segura Aug 2022

Mathematics & Statistics ETDs

This study compared the performance of machine learning models in classifying COVID-19 patients using exhaled breath signals and simulated datasets. Ground truth classification was determined by the gold standard Polymerase Chain Reaction (PCR) test results. A residual bootstrapped method generated the simulated datasets by fitting signal data to Autoregressive Moving Average (ARMA) models. Classification models included neural networks, k-nearest neighbors, naïve Bayes, random forest, and support vector machines. A Recursive Feature Elimination (RFE) study was performed to determine if reducing signal features would improve the classification models' performance using Gini Importance scoring for the two classes. The top 25% of …


Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.

The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …


Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi Jun 2022

Mathematics & Statistics ETDs

The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. Piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of its wide range of device applicability. In this research using rigorous DFT calculations, high throughput ab-initio simulations for 23 AlN alloys are generated.

This research is the first to report strong enhancements of piezoelectric properties …


Machine Learning Classification Of Digitally Modulated Signals, James A. Latshaw May 2022

Electrical & Computer Engineering Theses & Dissertations

Automatic classification of digitally modulated signals is a challenging problem that has traditionally been approached using signal processing tools such as log-likelihood algorithms for signal classification or cyclostationary signal analysis. These approaches are computationally intensive and cumbersome in general, and in recent years alternative approaches that use machine learning have been presented in the literature for automatic classification of digitally modulated signals. This thesis studies deep learning approaches for classifying digitally modulated signals that use deep artificial neural networks in conjunction with the canonical representation of digitally modulated signals in terms of in-phase and quadrature components. Specifically, capsule networks are …


Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano Apr 2022

Electrical and Computer Engineering ETDs

Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …


In Search Of Star Clusters: An Introduction To The K-Means Algorithm, Marcio Nascimento Jan 2022

Journal of Humanistic Mathematics

This article is a gentle introduction to K-means, a mathematical technique of processing data for further classification. We begin with a brief historical introduction, where we find connections with Plato’s Timæus, von Linné’s binomial classification, and the star clustering concept of Mary Sommerville and collaborators. Artificial intelligence algorithms use K-means as a classification methodology to learn about data in a very accurate way, because it is a quantitative procedure based on similarities.
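As an illustration of the technique this article introduces, here is a minimal Python sketch of the standard K-means iteration (often called Lloyd's algorithm): assign each point to its nearest center, then move each center to the mean of its assigned points, and repeat until nothing changes. This is a generic sketch, not the article's own presentation; the function name `kmeans`, the random initialization from the data points, and the toy data are assumptions made here.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Cluster a list of equal-length tuples into k groups (Lloyd's iteration)."""
    rng = random.Random(seed)
    # Assumption: initialize centers with k distinct data points chosen at random.
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    # clusters holds the result of the last assignment step.
    return centers, clusters


if __name__ == "__main__":
    stars = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, clusters = kmeans(stars, k=2)
    print(centers)
    print(clusters)
```

The result depends on the initial centers in general, so practical implementations usually run several random restarts and keep the best clustering.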


A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo Jan 2022

Theses, Dissertations and Capstones

Cyberattack is a never-ending war that has greatly threatened secured information systems. The development of automated and intelligent systems provides more computing power to hackers to steal information, destroy data or system resources, and has raised global security issues. Statistical and data mining tools have received continuous research and improvements. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, the advancement in technology and accessibility of information makes more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …


Finding The Best Predictors For Foot Traffic In Us Seafood Restaurants, Isabel Paige Beaulieu Jan 2022

Honors Theses and Capstones

COVID-19 caused state and nation-wide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …


Exploratory Data Mining Techniques (Decision Tree Models) For Examining The Impact Of Internet-Based Cognitive Behavioral Therapy For Tinnitus: Machine Learning Approach, Hansapani Rodrigo, Eldré W. Beukes, Gerhard Andersson, Vinaya Manchaiah Nov 2021

School of Mathematical and Statistical Sciences Faculty Publications and Presentations

Background: There is huge variability in the way that individuals with tinnitus respond to interventions. These experiential variations, together with a range of associated etiologies, contribute to tinnitus being a highly heterogeneous condition. Despite this heterogeneity, a “one size fits all” approach is taken when making management recommendations. Although there are various management approaches, not all are equally effective. Psychological approaches such as cognitive behavioral therapy have the most evidence base. Managing tinnitus is challenging due to the significant variations in tinnitus experiences and treatment successes. Tailored interventions based on individual tinnitus profiles may improve outcomes. Predictive models of treatment …


System Identification Through Lipschitz Regularized Deep Neural Networks, Elisa Negrini, Giovanna Citti, Luca Capogna Nov 2021

Mathematics and Statistics: Faculty Publications

In this paper we use neural networks to learn governing equations from data. Specifically, we reconstruct the right-hand side of a system of ODEs $\dot{x}(t) = f(t, x(t))$ directly from observed uniformly time-sampled data using a neural network. In contrast with other neural network-based approaches to this problem, we add a Lipschitz regularization term to our loss function. In the synthetic examples we observed empirically that this regularization results in a smoother approximating function and better generalization properties when compared with non-regularized models, both on trajectory and non-trajectory data, especially in the presence of noise. In contrast with sparse regression approaches, since neural networks …


Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu Aug 2021

Electronic Theses and Dissertations

The Newsvendor problem is a classical supply chain problem used to develop strategies for inventory optimization. The goal of the newsvendor problem is to predict the optimal order quantity of a product to meet an uncertain demand in the future, given that the demand distribution itself is known. The Ice Cream Vendor Problem extends the classical newsvendor problem to an uncertain demand with unknown distribution, albeit a distribution that is known to depend on exogenous features. The goal is thus to estimate the order quantity that minimizes the total cost when demand does not follow any known statistical distribution. The …
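For context, the classical newsvendor solution orders at the critical-ratio quantile of demand: with a per-unit underage cost c_u (lost margin on unmet demand) and overage cost c_o (loss on unsold stock), the optimal quantity q* satisfies F(q*) = c_u / (c_u + c_o), where F is the demand distribution function. The Python sketch below applies that rule to observed demand samples, using the empirical distribution as a stand-in when F is unknown. It is only a simple baseline for illustration, not the deep learning method of the thesis, and it ignores exogenous features; the function and parameter names are assumptions made here.

```python
import math


def newsvendor_quantity(demand_samples, underage_cost, overage_cost):
    """Return the order quantity at the critical-ratio quantile of past demand."""
    # Critical ratio: the service level that balances the two per-unit costs.
    ratio = underage_cost / (underage_cost + overage_cost)
    ordered = sorted(demand_samples)
    # Smallest observed demand q whose empirical CDF value is at least `ratio`.
    index = max(0, math.ceil(ratio * len(ordered)) - 1)
    return ordered[index]


if __name__ == "__main__":
    past_demand = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # Running out costs three times as much per unit as over-ordering,
    # so the rule orders toward the upper end of observed demand.
    print(newsvendor_quantity(past_demand, underage_cost=3.0, overage_cost=1.0))
```

When the two costs are equal the rule reduces to ordering the median demand.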


Algebraic Graph-Assisted Bidirectional Transformers For Molecular Property Prediction, Dong Chen, Kaifu Gao, Duc Duy Nguyen, Xin Chen, Yi Jiang, Guo-Wei Wei, Feng Pan Jun 2021

Mathematics Faculty Publications

The ability to predict molecular properties is of great significance to drug discovery, human health, and environmental protection. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Although some machine learning models, such as bidirectional encoder from transformer, can incorporate massive unlabeled molecular data into molecular representations via a self-supervised learning strategy, they neglect three-dimensional (3D) stereochemical information. Algebraic graph, specifically, element-specific multiscale weighted colored algebraic graph, embeds complementary 3D molecular information into graph invariants. We propose an algebraic graph-assisted bidirectional transformer (AGBT) framework by fusing representations generated by algebraic graph and bidirectional transformer, as well as …


Stationary Probability Distributions Of Stochastic Gradient Descent And The Success And Failure Of The Diffusion Approximation, William Joseph Mccann May 2021

Theses

In this thesis, Stochastic Gradient Descent (SGD), an optimization method originally popular due to its computational efficiency, is analyzed using Markov chain methods. We compute both numerically, and in some cases analytically, the stationary probability distributions (invariant measures) for the SGD Markov operator over all step sizes or learning rates. The stationary probability distributions provide insight into how the long-time behavior of SGD samples the objective function minimum.
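To make the idea of a stationary distribution of SGD concrete, here is a toy one-dimensional Python sketch. It is not the thesis's method or code. It assumes the objective f(x) = average over i of 0.5 * (x - a_i)^2, whose minimum is the mean of the a_i, and runs SGD with a constant step size, sampling one a_i per step. The iterates form a Markov chain that keeps fluctuating around the minimum rather than converging to it. For this particular toy problem the long-run variance works out to step / (2 - step) times the variance of the a_i. All names below are hypothetical.

```python
import random


def sgd_samples(data, step, n_steps, seed=0):
    """Run constant-step SGD on f(x) = mean_i 0.5 * (x - a_i)**2 and record the iterates."""
    rng = random.Random(seed)
    x = 0.0
    iterates = []
    for _ in range(n_steps):
        a = rng.choice(data)   # draw one data point uniformly at random
        x -= step * (x - a)    # stochastic gradient of 0.5 * (x - a)**2 is (x - a)
        iterates.append(x)
    return iterates


if __name__ == "__main__":
    xs = sgd_samples([-1.0, 1.0], step=0.1, n_steps=100_000)
    tail = xs[1000:]  # discard an initial transient
    mean = sum(tail) / len(tail)
    var = sum((v - mean) ** 2 for v in tail) / len(tail)
    print(mean, var)
```

A histogram of the recorded iterates gives an empirical picture of the stationary distribution, and a smaller step size concentrates it more tightly around the minimum.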

A key focus of this thesis is to provide a systematic study in one dimension comparing the exact SGD stationary distributions to the Fokker-Planck diffusion approximation equations, which are commonly used in …


Defect Detection In Atomic Resolution Transmission Electron Microscopy Images Using Machine Learning, Philip Cho, Aihua W. Wood, Krishnamurthy Mahalingam, Kurt Eyink May 2021

Faculty Publications

Point defects play a fundamental role in the discovery of new materials due to their strong influence on material properties and behavior. At present, imaging techniques based on transmission electron microscopy (TEM) are widely employed for characterizing point defects in materials. However, current methods for defect detection predominantly involve visual inspection of TEM images, which is laborious and poses difficulties in materials where defect related contrast is weak or ambiguous. Recent efforts to develop machine learning methods for the detection of point defects in TEM images have focused on supervised methods that require labeled training data that is generated via …


Acceleration Of Boltzmann Collision Integral Calculation Using Machine Learning, Ian Holloway, Aihua W. Wood, Alexander Alekseenko Jan 2021

Faculty Publications

The Boltzmann equation is essential to the accurate modeling of rarefied gases. Unfortunately, traditional numerical solvers for this equation are too computationally expensive for many practical applications. With modern interest in hypersonic flight and plasma flows, to which the Boltzmann equation is relevant, there would be immediate value in an efficient simulation method. The collision integral component of the equation is the main contributor of the large complexity. A plethora of new mathematical and numerical approaches have been proposed in an effort to reduce the computational cost of solving the Boltzmann collision integral, yet it still remains prohibitively expensive for …


A Review Of The Fractal Market Hypothesis For Trading And Market Price Prediction, Jonathan Blackledge, Marc Lamphiere Jan 2021

Articles

This paper provides a review of the Fractal Market Hypothesis (FMH) focusing on financial time series analysis. In order to put the FMH into a broader perspective, the Random Walk and Efficient Market Hypotheses are considered together with the basic principles of fractal geometry. After exploring the historical developments associated with different financial hypotheses, an overview of the basic mathematical modelling is provided. The principal goal of this paper is to consider the intrinsic scaling properties that are characteristic for each hypothesis. In regard to the FMH, it is explained why a financial time series can be taken to be …


Hybrid Deep Neural Networks For Mining Heterogeneous Data, Xiurui Hou Aug 2020

Dissertations

In the era of big data, the rapidly growing flood of data represents an immense opportunity. New computational methods are desired to fully leverage the potential that exists within massive structured and unstructured data. However, decision-makers are often confronted with multiple diverse heterogeneous data sources. The heterogeneity includes different data types, different granularities, and different dimensions, posing a fundamental challenge in many applications. This dissertation focuses on designing hybrid deep neural networks for modeling various kinds of data heterogeneity.

The first part of this dissertation concerns modeling diverse data types, the first kind of data heterogeneity. Specifically, image data and …


Analyzing The Fractal Dimension Of Various Musical Pieces, Nathan Clark Aug 2020

Industrial Engineering Undergraduate Honors Theses

One of the most common tools for evaluating data is regression. This technique, widely used by industrial engineers, explores linear relationships between predictors and the response. Each observation of the response is a fixed linear combination of the predictors with an added error element. The method is built on the assumption that this error is normally distributed across all observations and has a mean of zero. In some cases, it has been found that the inherent variation is not the result of a random variable, but is instead the result of self-symmetric properties of the observations. For data with these …


Bayesian Topological Machine Learning, Christopher A. Oballe Aug 2020

Doctoral Dissertations

Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) how to use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …