Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons


Statistics and Probability

Theses/Dissertations

2023


Articles 1 - 30 of 32

Full-Text Articles in Mathematics

Integrating Machine Learning Methods For Medical Diagnosis, Jazmin Quezada Dec 2023


Open Access Theses & Dissertations

Abstract: The rapid advancement of machine learning techniques has revolutionized the field of medical diagnosis by offering powerful tools to analyze complex data sets and make accurate predictions. In this proposed method, we present a novel approach that integrates machine learning and optimization models to enhance the accuracy of medical diagnoses. Our method focuses on fine-tuning and optimizing the parameters of machine learning algorithms commonly used in medical diagnosis, such as logistic regression, support vector machines, and neural networks. By employing optimization techniques, we systematically explore the parameter space of these algorithms to discover the optimal configurations. Moreover, by representing …


Wavelet Compression As An Observational Operator In Data Assimilation Systems For Sea Surface Temperature, Bradley J. Sciacca Dec 2023


University of New Orleans Theses and Dissertations

The ocean remains severely under-observed, in part due to its sheer size. Containing nearly billion of water with most of the subsurface being invisible because water is extremely difficult to penetrate using electromagnetic radiation, as is typically used by satellite measuring instruments. For this reason, most observations of the ocean have very low spatial-temporal coverage to get a broad capture of the ocean’s features. However, recent “dense but patchy” data have increased the availability of high-resolution – low spatial coverage observations. These novel data sets have motivated research into multi-scale data assimilation methods. Here, we demonstrate a new assimilation approach …


Stochastic Optimal Control Of Conditional Mckean-Vlasov Equations With Jump And Markovian Switching, Charles Samuel Conly Sharp Dec 2023


Theses and Dissertations

This thesis obtains a number of results in stochastic optimal control for conditional McKean-Vlasov equations with jump and Markovian switching. First, we prove the uniqueness of the solutions and derive a relevant version of Itô's formula. We provide the dynamic programming principle and prove the associated verification theorem. A stochastic maximum principle is established. Further, we derive the relationship between dynamic programming and the stochastic maximum principle. Additionally, we utilize our stochastic maximum principle result for a mean-variance portfolio selection problem.


Aspects Of Stochastic Geometric Mechanics In Molecular Biophysics, David Frost Dec 2023


All Dissertations

In confocal single-molecule FRET experiments, the joint distribution of FRET efficiency and donor lifetime distribution can reveal underlying molecular conformational dynamics via deviation from their theoretical Förster relationship. This shift is referred to as a dynamic shift. In this study, we investigate the influence of the free energy landscape in protein conformational dynamics on the dynamic shift by simulation of the associated continuum reaction coordinate Langevin dynamics, yielding a deeper understanding of the dynamic and structural information in the joint FRET efficiency and donor lifetime distribution. We develop novel Langevin models for the dye linker dynamics, including rotational dynamics, based …


Using Gamification To Foster Student Resilience And Motivation To Learn, And Using Games To Teach Significance Testing Concepts In The Statistics Classroom, Todd Partridge Dec 2023


All Graduate Theses and Dissertations, Fall 2023 to Present

Two studies are outlined in this dissertation.

In the first study, elements of Super Mario Bros. video games were used to change the way college students in a beginners’ statistics course were graded on their work. This was part of an effort to help students remain optimistic in the face of challenging coursework and even failure on assignments and tests. The study shows that the changes made to the grading structure did help students to keep trying and to use the materials given to them by their professor until they achieved their desired grade in the course, and suggests ways …


Foundations Of Memory Capacity In Models Of Neural Cognition, Chandradeep Chowdhury Dec 2023


Master's Theses

A central problem in neuroscience is to understand how memories are formed as a result of the activities of neurons. Valiant’s neuroidal model attempted to address this question by modeling the brain as a random graph and memories as subgraphs within that graph. However, the question of memory capacity within that model has not been explored: how many memories can the brain hold? Valiant introduced the concept of interference between memories as the defining factor for capacity; excessive interference signals that the model has reached capacity. Since then, exploration of capacity has been limited, but recent investigations have delved into the …


Probabilistic Modeling Of Social Media Networks, Distinguishing Phylogenetic Networks From Trees, And Fairness In Service Queues, Md Rashidul Hasan Aug 2023


Mathematics & Statistics ETDs

In this dissertation, three primary issues are explored. The first subject exposes who-saw-from-whom pathways in post-specific dissemination networks in social media platforms. We describe a network-based approach for temporal, textual, and post-diffusion network inference. The conditional point process method discovers the most probable diffusion network. The tool is capable of meaningful analysis of hundreds of post shares. Inferred diffusion networks demonstrate disparities in information distribution between user groups (confirmed versus unverified, conservative versus liberal) and local communities (political, entrepreneurial, etc.). A promising approach for quantifying post-impact, we observe discrepancies in inferred networks that indicate the disproportionate amount of automated bots. …


Stressor: An R Package For Benchmarking Machine Learning Models, Samuel A. Haycock Aug 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many discipline specific researchers need a way to quickly compare the accuracy of their predictive models to other alternatives. However, many of these researchers are not experienced with multiple programming languages. Python has recently been the leader in machine learning functionality, which includes the PyCaret library that allows users to develop high-performing machine learning models with only a few lines of code. The goal of the stressor package is to help users of the R programming language access the advantages of PyCaret without having to learn Python. This allows the user to leverage R’s powerful data analysis workflows, while simultaneously …


An Interval-Valued Random Forests, Paul Gaona Partida Aug 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

There is a growing demand for the development of new statistical models and the refinement of established methods to accommodate different data structures. This need arises from the recognition that traditional statistics often assume the value of each observation to be precise, which may not hold true in many real-world scenarios. Factors such as the collection process and technological advancements can introduce imprecision and uncertainty into the data.

For example, consider data collected over a long period of time, where newer measurement tools may offer greater accuracy and provide more information than previous methods. In such cases, it becomes crucial …


Modified Geometries, Clifford Algebras And Graphs: Their Impact On Discreteness, Locality And Symmetr, Roma Sverdlov Jul 2023


Mathematics & Statistics ETDs

In this dissertation I will explore the question whether various entities commonly used in quantum field theory can be “constructed". In particular, can spacetime be “constructed" out of building blocks, and can Berezin integral be “constructed" in terms of Riemann integrals.

As far as “constructing" spacetime out of building blocks, it has been attempted by multiple scientific communities and various models were proposed. But the common downfall is they break the principles of relativity. I will explore the ways of doing so in such a way that principles of relativity are respected. One of my approaches is to replace points …


On Maximum Likelihood Estimators For A Jump-Type Affine Diffusion Two-Factor Model, Jiaming Yin Mr. Jun 2023


Major Papers

We consider a jump-type two-factor affine diffusion model driven by a subordinator in the context of continuous time observations. We study the asymptotic properties of the maximum likelihood estimator (MLE) for the drift parameters. In particular, we prove the strong consistency and the asymptotic normality of MLE in the subcritical case. We also present some numerical illustrations to confirm the theoretical results. The main difficulty of this major paper consists in proving the ergodicity of the model in the subcritical case and deriving the limiting behavior of the process.


Survival Times And Investment Analysis With Dynamic Learning, Zhenzhen Li Jun 2023


Dissertations and Theses

The central statistical problem of survival analysis is to determine and characterize the conditional distribution of a survival time given a history of some observed health markers.

This dissertation contributes to the modeling of such conditional distributions in a setup where the health markers evolve randomly over time in a manner that can be represented by an Ito stochastic process, that is, a stochastic process that can be written as a sum of a time integral of some stochastic process and an Ito integral of some stochastic process, with both integrands subject to certain restrictions.

The random survival time is …


A Survey On Online Matching And Ad Allocation, Ryan Lee May 2023


Theses

One of the classical problems in graph theory is matching. Given an undirected graph, the task is to find a matching, that is, a set of edges without common vertices. In the 1990s, Richard Karp, Umesh Vazirani, and Vijay Vazirani were the first computer scientists to use matchings for online algorithms [8]. In our domain, an online algorithm operates in the online setting where a bipartite graph is given. On one side of the graph there is a set of advertisers and on the other side we have a set of impressions. During the online phase, multiple impressions will arrive and the objective of …


An Application Of The Pagerank Algorithm To Ncaa Football Team Rankings, Morgan Majors May 2023


Honors Theses

We investigate the use of Google’s PageRank algorithm to rank sports teams. The PageRank algorithm is used in web searches to return a list of the websites that are of most interest to the user. The structure of the NCAA FBS football schedule is used to construct a network with a similar structure to the world wide web. Parallels are drawn between pages that are linked in the world wide web with the results of a contest between two sports teams. The teams under consideration here are the members of the 2021 Football Bowl Subdivision. We achieve a total ordering …
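As a rough illustration of the idea described in this abstract, the sketch below runs a PageRank-style power iteration on a small, made-up set of game results. It is not the thesis's method: the edge direction (each loss acts as a "link" from the loser to the winner), the damping factor of 0.85, the handling of undefeated teams, and the team names are all assumptions chosen for the example.

```python
def pagerank_teams(games, damping=0.85, iterations=100):
    """Rank teams from (winner, loser) pairs.

    Assumption: each loss is treated as a 'link' from the loser to the winner,
    so rank flows toward teams that win against highly ranked opponents.
    """
    teams = sorted({team for game in games for team in game})
    n = len(teams)
    # out_links[loser] lists the teams that beat it (one entry per loss).
    out_links = {team: [] for team in teams}
    for winner, loser in games:
        out_links[loser].append(winner)

    rank = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        # Undefeated teams have no out-links; spread their rank uniformly.
        dangling = sum(rank[t] for t in teams if not out_links[t])
        new_rank = {t: (1.0 - damping) / n + damping * dangling / n for t in teams}
        for loser, winners in out_links.items():
            for winner in winners:
                new_rank[winner] += damping * rank[loser] / len(winners)
        rank = new_rank

    ordering = sorted(teams, key=lambda t: rank[t], reverse=True)
    return ordering, rank


# Hypothetical results: (winner, loser)
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "D")]
ordering, scores = pagerank_teams(games)
print(ordering)
```

The scores sum to one after every iteration, so they can be read as shares of total "rank" rather than as win counts.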


Explorations In Baseball Analytics: Simulations, Predictions, And Evaluations For Games And Players, Katelyn Mongerson May 2023


Theses and Dissertations

From statistics being reported in newspapers in the 1840s, to present day, baseball has always been one of the most data-driven sports. We make use of the endless publicly available baseball data to build models in R and Python that answer various baseball-related questions regarding predicting and optimizing run production, evaluating player effectiveness, and forecasting the postseason. To predict and optimize run production, we present three models. The first builds a common tool in baseball analysis called a Run Expectancy Matrix which is used to give a value (in terms of runs) to various in-game decisions. The second uses the …


Investigating The Effect Of Greediness On The Coordinate Exchange Algorithm For Generating Optimal Experimental Designs, William Thomas Gullion May 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Design of Experiments (DoE) is the field of statistics concerned with helping researchers maximize the amount of information they gain from their experiments. Recently, researchers have been turning to optimal experimental designs instead of classical/catalog experimental designs. One of the most popular algorithms used today to generate optimal designs is the Coordinate Exchange (CEXCH) Algorithm. CEXCH is known to be a greedy algorithm, which means it tends to favor immediate, locally best designs instead of globally optimal designs. Previous research demonstrated that this tradeoff was efficacious in that it reduced the cost of a single run of CEXCH and allowed …


Flexible Models For The Estimation Of Treatment Effect, Habeeb Abolaji Bashir May 2023


Open Access Theses & Dissertations

Estimation of treatment effect is an important problem which is well studied in the literature. While the regression models are one of the most commonly used techniques for the estimation of treatment effect, they are prone to model misspecification. To minimize the model misspecification bias, flexible nonparametric models are introduced for the estimation. Continuing this line of research, we propose two flexible nonparametric models that allow the treatment effect to vary across different levels of covariates. We provide estimation algorithms for both these models. Using simulations and data analysis, we illustrate the usefulness of the proposed methods.


Machine Learning-Based Data And Model Driven Bayesian Uncertanity Quantification Of Inverse Problems For Suspended Non-Structural System, Zhiyuan Qin May 2023


All Dissertations

Inverse problems involve extracting the internal structure of a physical system from noisy measurement data. In many fields, the Bayesian inference is used to address the ill-conditioned nature of the inverse problem by incorporating prior information through an initial distribution. In the nonparametric Bayesian framework, surrogate models such as Gaussian Processes or Deep Neural Networks are used as flexible and effective probabilistic modeling tools to overcome the high-dimensional curse and reduce computational costs. In practical systems and computer models, uncertainties can be addressed through parameter calibration, sensitivity analysis, and uncertainty quantification, leading to improved reliability and robustness of decision and …


Large Deviations For Self Intersection Local Times Of Ornstein-Uhlenbeck Processes, Apostolos Gournaris May 2023


Doctoral Dissertations

The area of large deviations is concerned with the asymptotic computation of small probabilities on an exponential scale. The general form of large deviations can be roughly described as: $P\{Y_n \in A\} \approx \exp\{-b_n I(A)\}$ $(n \to \infty)$, for a random sequence $\{Y_n\}$, a positive sequence $b_n$ with $b_n \to \infty$, and a coefficient $I(A) \ge 0$. In applications, we are often concerned with the probability that the random variables take large values, that is, with $P\{Y_n \ge \lambda\}$, where $\lambda > 0$. Here, we consider the Ornstein-Uhlenbeck process, study the properties of the local times and self intersection …


Examining Model Complexity's Effects When Predicting Continuous Measures From Ordinal Labels, Mckade S. Thomas May 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many real world problems require the prediction of ordinal variables where the values are a set of categories with an ordering to them. However, in many of these cases the categorical nature of the ordinal data is not a desirable outcome. As such, regression models treat ordinal variables as continuous and do not bind their predictions to discrete categories. Prior research has found that these models are capable of learning useful information between the discrete levels of the ordinal labels they are trained on, but complex models may learn ordinal labels too closely, missing the information between levels. In this …


The 2015 Ncaa Cost-Of-Attendance Stipend And Its Effects On Institutional Financial Aid Packages, Sara Greene Apr 2023


Honors Theses

In 2015, the National Collegiate Athletic Association (NCAA) allowed “Cost of Attendance” (COA) stipends to be offered to athletic recruits for Division I schools. These stipends are intended to allow schools to grant aid to student-athletes beyond a full-ride scholarship to cover additional costs imposed on student-athletes. These stipends created an opportunity for the “Autonomy” Power 5 programs to utilize a competitive tactic to try to win over the top recruits. There is evidence that these COA stipends have caused an increase in the estimated cost of attendance reported by the university. This paper examines if the COA stipends have …


Using A Distributive Approach To Model Insurance Loss, Kayla Kippes Apr 2023


Student Research Submissions

Insurance loss is an unpredicted event that stands at the forefront of the insurance industry. Loss in insurance represents the costs or expenses incurred due to a claim. An insurance claim is a request for the insurance company to pay for damage caused to an individual’s property. Loss can be measured by how much money (the dollar amount) has been paid out by the insurance company to repair the damage or it can be measured by the number of claims (claim count) made to the insurance company. Insured events include property damage due to fire, theft, flood, a car accident, …


From Big Farm To Big Pharma: A Differential Equations Model Of Antibiotic-Resistant Salmonella In Industrial Poultry Populations, Rilyn Mckallip Apr 2023


Honors Theses

Antibiotics are used in poultry production as prophylaxis, curative treatment, and growth promotion. The first use is as prophylaxis, or prevention of common bacterial diseases. The crowded conditions in concentrated animal feeding operations necessitate management of infectious disease to ensure overall animal health and the profitability of such operations. In these farms, between 20,000 and 125,000 birds are raised in shed-like enclosures [3], with an average of less than one square foot of space per chicken [34]. Antibiotics are currently used in chicken farms to manage and prevent common bacterial diseases such as respiratory and digestive tract infections, as well …


Length Bias Estimation Of Small Businesses Lifetime, Simeng Li Apr 2023


Honors Theses

Small businesses, particularly restaurants, play a crucial role in the economy by generating employment opportunities, boosting tourism, and contributing to the local economy. However, accurately estimating their lifetimes can be challenging due to the presence of length bias, which occurs when the likelihood of sampling any particular restaurant's closure is influenced by its duration in operation. To address the issue, this study conducts goodness-of-fit tests on exponential/gamma family distributions and employs the Kaplan-Meier method to more accurately estimate the average lifetime of restaurants in Carytown. By providing insights into the challenges of estimating the lifetimes of small businesses, this study …
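For readers unfamiliar with the Kaplan-Meier method mentioned in this abstract, the following is a minimal sketch of the standard product-limit estimator with right-censoring. It does not include any length-bias correction, and the durations and closure flags are made-up values, not data from the study.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time in operation for each business.
    observed:  True if the closure was observed, False if right-censored
               (for example, still open when the data were collected).
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all records that share the same time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical lifetimes in years; False marks a censored record.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
print(curve)
```

Censored records leave the risk set without reducing the survival estimate, which is why the record censored at time 3 lowers the denominator used at time 5 but adds no step of its own.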


A Graphical User Interface Using Spatiotemporal Interpolation To Determine Fine Particulate Matter Values In The United States, Kelly M. Entrekin Apr 2023


Honors College Theses

Fine particulate matter or PM2.5 can be described as a pollution particle that has a diameter of 2.5 micrometers or smaller. These pollution particle values are measured by monitoring sites installed across the United States throughout the year. While these values are helpful, a lot of areas are not accounted for as scientists are not able to measure all of the United States. Some of these unmeasured regions could be reaching high PM2.5 values over time without being aware of it. These high values can be dangerous by causing or worsening health conditions, such as cardiovascular and lung diseases. Within …


Multilevel Optimization With Dropout For Neural Networks, Gary Joseph Saavedra Apr 2023


Mathematics & Statistics ETDs

Large neural networks have become ubiquitous in machine learning. Despite their widespread use, the optimization process for training a neural network remains computationally expensive and does not necessarily create networks that generalize well to unseen data. In addition, the difficulty of training increases as the size of the neural network grows. In this thesis, we introduce the novel MGDrop and SMGDrop algorithms which use a multigrid optimization scheme with a dropout coarsening operator to train neural networks. In contrast to other standard neural network training schemes, MGDrop explicitly utilizes information from smaller sub-networks which act as approximations of the full …


Using Physics-Informed Neural Networks For Multigrid In Time Coarse Grid Equations, Jonathan P. Gutierrez Mar 2023


Mathematics & Statistics ETDs

For parallel-in-time integration methods, the multigrid-reduction-in-time (MGRIT) method has shown promising results in both improved convergence and increased computational speeds when solving evolution problems. However, one problem the MGRIT algorithm currently faces is it struggles solving hyperbolic problems efficiently. In particular, hyperbolic problems are generally solved using explicit methods and this causes issues on the coarser multigrid levels, where larger (coarser) time step sizes can violate the stability condition. In this thesis, physics-informed neural networks (PINNs) are used to evaluate the coarse grid equations in the MGRIT algorithm with the goal to improve convergence for problems with hyperbolic behavior, as …


Beginner's Analysis Of Financial Stochastic Process Models, David Garcia Jan 2023


HMC Senior Theses

This thesis explores the use of geometric Brownian motion (GBM) as a financial model for predicting stock prices. The model is first introduced and its assumptions and limitations are discussed. Then, it is shown how to simulate GBM in order to predict stock price values. The performance of the GBM model is then evaluated in two different periods of time to determine whether its accuracy has changed before and after March 23, 2020.
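To make the simulation step mentioned in this abstract concrete, here is a minimal sketch of one common way to simulate a GBM path, using the exact log-normal update S(t+dt) = S(t) exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z) with Z standard normal. The starting price, drift, volatility, and step count are illustrative assumptions and are not taken from the thesis.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, horizon, steps, seed=None):
    """Simulate one geometric Brownian motion path.

    s0:      initial price
    mu:      annualized drift
    sigma:   annualized volatility
    horizon: length of the simulation in years
    steps:   number of time steps
    """
    rng = random.Random(seed)
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        growth = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path


# Hypothetical parameters: one year of daily steps.
path = simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, steps=252, seed=0)
print(len(path), path[-1] > 0)
```

Because each step multiplies the previous price by an exponential, simulated prices stay strictly positive, which is one of the modeling assumptions that makes GBM a convenient baseline for stock prices.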


Efficient High Order Ensemble For Fluid Flow, John Carter Jan 2023


Doctoral Dissertations

This thesis proposes efficient ensemble-based algorithms for solving the full and reduced Magnetohydrodynamics (MHD) equations. The proposed ensemble methods require solving only one linear system with multiple right-hand sides for different realizations, reducing computational cost and simulation time. Four algorithms utilize a Generalized Positive Auxiliary Variable (GPAV) approach and are demonstrated to be second-order accurate and unconditionally stable with respect to the system energy through comprehensive stability analyses and error tests. Two algorithms make use of Artificial Compressibility (AC) to update pressure and a solenoidal constraint for the magnetic field. Numerical simulations are provided to illustrate theoretical results and demonstrate …


Advances In Differentially Methylated Region Detection And Cure Survival Models, Daniel Ahmed Alhassan Jan 2023


Doctoral Dissertations

This dissertation focuses on two areas of statistics: DNA methylation and survival analysis. The first part of the dissertation pertains to the detection of differentially methylated regions in the human genome. The varying distribution of gaps between succeeding genomic locations, which are represented on the microarray used to quantify methylation, makes it challenging to identify regions that have differential methylation. This emphasizes the need to properly account for the correlation in methylation shared by nearby locations within a specific genomic distance. In this work, a normalized kernel-weighted statistic is proposed to obtain an optimal amount of "information" from neighboring locations …