Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons

Statistics and Probability

2023

Articles 31 - 60 of 81

Full-Text Articles in Mathematics

Movie Recommender System Using Matrix Factorization, Roland Fiagbe May 2023

Data Science and Data Mining

Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. However, our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …
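As a rough illustration of the matrix factorization idea mentioned in this abstract, the sketch below learns user and item factor vectors by stochastic gradient descent on a handful of known ratings. It is not the author's implementation: the toy ratings, the function names, and all hyperparameters (`k`, `lr`, `reg`, `epochs`) are assumptions chosen for the example.

```python
import random

def rmse(ratings, P, Q):
    """Root mean squared error of predicted vs. known ratings."""
    se = sum((r - sum(p * q for p, q in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Fit rating ~ dot(P[user], Q[item]) with regularized SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.5, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.5, 1.0) for _ in range(k)] for _ in range(n_items)]
    start = rmse(ratings, P, Q)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q, start, rmse(ratings, P, Q)

# Hypothetical (user, movie, rating) triples, not MovieLens data.
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
               (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]
P, Q, rmse_before, rmse_after = factorize(toy_ratings, n_users=4, n_items=3)

# Predicted rating for an unwatched movie: user 0, movie 2.
prediction = sum(p * q for p, q in zip(P[0], Q[2]))
```

A prediction for an unrated user and movie pair is simply the dot product of the two learned factor vectors.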


Formula 101 Using 2022 Formula One Season Data To Understand The Race Results, Christopher Garcia, Oliver Lopez May 2023

Student Scholar Symposium Abstracts and Posters

I became interested in Formula One after a friend showed me what the sport was all about. It was exciting to see the action, including the battles between drivers during the race and how fast they go through a corner. Also, when qualifying comes around, drivers push their cars to the absolute limit to gain a few seconds on their opponents. Only the top 10 drivers receive points: the winner gets 25 points, the tenth-place driver gets 1 point, and those below the top ten …
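The scoring rule described in this abstract can be sketched in a few lines. The abstract only states the 25 points for first place and 1 point for tenth; the intermediate values below come from the standard Grand Prix points table and are an assumption here, as are the function names. Sprint and fastest-lap points are left out.

```python
# Points for finishing positions 1 through 10 (standard Grand Prix table;
# only the first and last values are stated in the abstract).
POINTS = [25, 18, 15, 12, 10, 8, 6, 4, 2, 1]

def race_points(position):
    """Return points for a 1-based finishing position; 0 outside the top ten."""
    if 1 <= position <= len(POINTS):
        return POINTS[position - 1]
    return 0

def season_total(finishes):
    """Sum the points over a list of finishing positions."""
    return sum(race_points(p) for p in finishes)
```

For example, `season_total([1, 2, 11])` adds a win, a second place, and a finish outside the points.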


Fractal Newton Methods, Ali Akgül, David E. Grow May 2023

Mathematics and Statistics Faculty Research & Creative Works

We introduce fractal Newton methods for solving (Formula presented.) that generalize and improve the classical Newton method. We compare the theoretical efficacy of the classical and fractal Newton methods and illustrate the theory with examples.


Explorations In Baseball Analytics: Simulations, Predictions, And Evaluations For Games And Players, Katelyn Mongerson May 2023

Theses and Dissertations

From statistics being reported in newspapers in the 1840s to the present day, baseball has always been one of the most data-driven sports. We make use of the endless publicly available baseball data to build models in R and Python that answer various baseball-related questions regarding predicting and optimizing run production, evaluating player effectiveness, and forecasting the postseason. To predict and optimize run production, we present three models. The first builds a common tool in baseball analysis called a Run Expectancy Matrix, which is used to give a value (in terms of runs) to various in-game decisions. The second uses the …


An Integer Garch Model For A Poisson Process With Time-Varying Zero-Inflation, Isuru Panduka Ratnayake, V. A. Samaranayake May 2023

Mathematics and Statistics Faculty Research & Creative Works

A serially dependent Poisson process with time-varying zero-inflation is proposed. Such formulations have the potential to model count data time series arising from phenomena such as infectious diseases that ebb and flow over time. The model assumes that the intensity of the Poisson process evolves according to a generalized autoregressive conditional heteroscedastic (GARCH) formulation and allows the zero-inflation parameter to vary over time and be governed by a deterministic function or by an exogenous variable. Both the expectation maximization (EM) and the maximum likelihood estimation (MLE) approaches are presented as possible estimation methods. A simulation study shows that both parameter …


Fully Decoupled Energy-Stable Numerical Schemes For Two-Phase Coupled Porous Media And Free Flow With Different Densities And Viscosities, Yali Gao, Xiaoming He, Tao Lin, Yanping Lin May 2023

Mathematics and Statistics Faculty Research & Creative Works

In this article, we consider a phase field model with different densities and viscosities for the coupled two-phase porous media flow and two-phase free flow, as well as the corresponding numerical simulation. This model consists of three parts: a Cahn-Hilliard-Darcy system with different densities/viscosities describing the porous media flow in the matrix, a Cahn-Hilliard-Navier-Stokes system with different densities/viscosities describing the free fluid in the conduit, and seven interface conditions coupling the flows in the matrix and the conduit. Based on the separate Cahn-Hilliard equations in the porous media region and the free flow region, a weak formulation is proposed to incorporate the …


Flexible Models For The Estimation Of Treatment Effect, Habeeb Abolaji Bashir May 2023

Open Access Theses & Dissertations

Estimation of treatment effect is an important problem that is well studied in the literature. While regression models are among the most commonly used techniques for the estimation of treatment effect, they are prone to model misspecification. To minimize the model misspecification bias, flexible nonparametric models are introduced for the estimation. Continuing this line of research, we propose two flexible nonparametric models that allow the treatment effect to vary across different levels of covariates. We provide estimation algorithms for both these models. Using simulations and data analysis, we illustrate the usefulness of the proposed methods.


Machine Learning-Based Data And Model Driven Bayesian Uncertanity Quantification Of Inverse Problems For Suspended Non-Structural System, Zhiyuan Qin May 2023

All Dissertations

Inverse problems involve extracting the internal structure of a physical system from noisy measurement data. In many fields, Bayesian inference is used to address the ill-conditioned nature of the inverse problem by incorporating prior information through an initial distribution. In the nonparametric Bayesian framework, surrogate models such as Gaussian Processes or Deep Neural Networks are used as flexible and effective probabilistic modeling tools to overcome the curse of dimensionality and reduce computational costs. In practical systems and computer models, uncertainties can be addressed through parameter calibration, sensitivity analysis, and uncertainty quantification, leading to improved reliability and robustness of decision and …


Investigating The Effect Of Greediness On The Coordinate Exchange Algorithm For Generating Optimal Experimental Designs, William Thomas Gullion May 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Design of Experiments (DoE) is the field of statistics concerned with helping researchers maximize the amount of information they gain from their experiments. Recently, researchers have been turning to optimal experimental designs instead of classical/catalog experimental designs. One of the most popular algorithms used today to generate optimal designs is the Coordinate Exchange (CEXCH) Algorithm. CEXCH is known to be a greedy algorithm, which means it tends to favor immediate, locally best designs instead of globally optimal designs. Previous research demonstrated that this tradeoff was efficacious in that it reduced the cost of a single run of CEXCH and allowed …


Examining Model Complexity's Effects When Predicting Continuous Measures From Ordinal Labels, Mckade S. Thomas May 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many real world problems require the prediction of ordinal variables where the values are a set of categories with an ordering to them. However, in many of these cases the categorical nature of the ordinal data is not a desirable outcome. As such, regression models treat ordinal variables as continuous and do not bind their predictions to discrete categories. Prior research has found that these models are capable of learning useful information between the discrete levels of the ordinal labels they are trained on, but complex models may learn ordinal labels too closely, missing the information between levels. In this …


Uconn Baseball Batting Order Optimization, Gavin Rublewski May 2023

Honors Scholar Theses

Challenging conventional wisdom is at the very core of baseball analytics. Using data and statistical analysis, the sets of rules by which coaches make decisions can be justified, or possibly refuted. One of those sets of rules relates to the construction of a batting order. Through data collection, data adjustment, the construction of a baseball simulator, and the use of a Monte Carlo Simulation, I have assessed thousands of possible batting orders to determine the roster-specific strategies that lead to optimal run production for the 2023 UConn baseball team. This paper details a repeatable process in which basic player statistics …


A Brascamp-Lieb–Rary Of Examples, Anina Peersen May 2023

Mathematics, Statistics, and Computer Science Honors Projects

This paper focuses on the Brascamp-Lieb inequality and its applications in analysis, fractal geometry, computer science, and more. It provides a beginner-level introduction to the Brascamp-Lieb inequality alongside related inequalities in analysis and explores specific cases of extremizable, simple, and equivalent Brascamp-Lieb data. Connections to computer science and geometric measure theory are introduced and explained. Finally, the Brascamp-Lieb constant is calculated for a chosen family of linear maps.


Mixing Measures For Trees Of Fixed Diameter, Ari Holcombe Pomerance May 2023

Mathematics, Statistics, and Computer Science Honors Projects

A mixing measure is the expected length of a random walk in a graph given a set of starting and stopping conditions. We determine the tree structures of order n with diameter d that minimize and maximize a few mixing measures. We show that the maximizing tree is usually a broom graph or a double broom graph and that the minimizing tree is usually a seesaw graph or a double seesaw graph.


Large Deviations For Self Intersection Local Times Of Ornstein-Uhlenbeck Processes, Apostolos Gournaris May 2023

Doctoral Dissertations

In the area of large deviations, one is concerned with the asymptotic computation of small probabilities on an exponential scale. The general form of large deviations can be roughly described as: P{Y_n ∈ A} ≈ exp{−b_n I(A)} (n → ∞), for a random sequence {Y_n}, a positive sequence b_n with b_n → ∞, and a coefficient I(A) ≥ 0. In applications, we are often concerned with the probability that the random variables take large values, that is, with P{Y_n ≥ λ}, where λ > 0. Here, we consider the Ornstein-Uhlenbeck process, study the properties of the local times and self intersection …


The 2015 Ncaa Cost-Of-Attendance Stipend And Its Effects On Institutional Financial Aid Packages, Sara Greene Apr 2023

Honors Theses

In 2015, the National Collegiate Athletic Association (NCAA) allowed “Cost of Attendance” (COA) stipends to be offered to athletic recruits for Division I schools. These stipends are intended to allow schools to grant aid to student-athletes beyond a full-ride scholarship to cover additional costs imposed on student-athletes. These stipends created an opportunity for the “Autonomy” Power 5 programs to utilize a competitive tactic to try to win over the top recruits. There is evidence that these COA stipends have caused an increase in the estimated cost of attendance reported by the university. This paper examines if the COA stipends have …


Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of Ibm Data Scientists, Graham Nash Apr 2023

Symposium of Student Scholars

Employee attrition is a relevant issue that every business employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can come from a multitude of factors. As a result, employers need to develop methods by which they can measure attrition by evaluating several qualities of their employees. Factors like their age, years with the company, which department they work in, their level of education, their job role, and even their marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …


Defining Characteristics That Lead To Cost-Efficient Veteran Nba Free Agent Signings, David Mccain Apr 2023

Honors Projects in Mathematics

Throughout the history of the NBA, decisions regarding the signing of free agents have been riddled with complexity. Franchises are tasked with finding out which players will serve as optimal free agent signings prior to seeing them perform within the framework of their team. This study hypothesizes that the adequacy of an NBA free agent signing can be modeled and predicted through the implementation of a machine learning model. The model will learn the necessary information using training and testing data sets that include various player biometrics, game statistics, and financial information. The application of this machine learning model will …


Using A Distributive Approach To Model Insurance Loss, Kayla Kippes Apr 2023

Student Research Submissions

Insurance loss is an unpredicted event that stands at the forefront of the insurance industry. Loss in insurance represents the costs or expenses incurred due to a claim. An insurance claim is a request for the insurance company to pay for damage caused to an individual’s property. Loss can be measured by how much money (the dollar amount) has been paid out by the insurance company to repair the damage or it can be measured by the number of claims (claim count) made to the insurance company. Insured events include property damage due to fire, theft, flood, a car accident, …


From Big Farm To Big Pharma: A Differential Equations Model Of Antibiotic-Resistant Salmonella In Industrial Poultry Populations, Rilyn Mckallip Apr 2023

Honors Theses

Antibiotics are used in poultry production as prophylaxis, curative treatment, and growth promotion. The first use is as prophylaxis, or prevention of common bacterial diseases. The crowded conditions in concentrated animal feeding operations necessitate management of infectious disease to ensure overall animal health and the profitability of such operations. In these farms, between 20,000 and 125,000 birds are raised in shed-like enclosures [3], with an average of less than one square foot of space per chicken [34]. Antibiotics are currently used in chicken farms to manage and prevent common bacterial diseases such as respiratory and digestive tract infections, as well …


Length Bias Estimation Of Small Businesses Lifetime, Simeng Li Apr 2023

Honors Theses

Small businesses, particularly restaurants, play a crucial role in the economy by generating employment opportunities, boosting tourism, and contributing to the local economy. However, accurately estimating their lifetimes can be challenging due to the presence of length bias, which occurs when the likelihood of sampling any particular restaurant's closure is influenced by its duration in operation. To address the issue, this study conducts goodness-of-fit tests on exponential/gamma family distributions and employs the Kaplan-Meier method to more accurately estimate the average lifetime of restaurants in Carytown. By providing insights into the challenges of estimating the lifetimes of small businesses, this study …
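For readers unfamiliar with the Kaplan-Meier method named in this abstract, the following is a minimal sketch of the product-limit estimator. It is a plain textbook version, not the study's code: the function name and the example durations are hypothetical, and no length-bias correction is applied.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time with at least one observed closure.

    durations: time in operation for each business
    observed:  1 if the closure was observed, 0 if the record is censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    idx = 0
    while idx < len(data):
        t = data[idx][0]
        closures = 0
        removed = 0
        # Group all records that share the same time t.
        while idx < len(data) and data[idx][0] == t:
            closures += data[idx][1]
            removed += 1
            idx += 1
        if closures:
            # Product-limit step: multiply by the share that survive past t.
            survival *= 1 - closures / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical lifetimes in years; the third record is censored.
curve = kaplan_meier([2, 3, 3, 5], [1, 1, 0, 1])
```

Censored records leave the risk set without lowering the survival estimate, which is what distinguishes this estimator from a simple empirical survival fraction.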


Bridging The Chasm Between Fundamental, Momentum, And Quantitative Investing, Allen Hoskins, Jeff Reed, Robert Slater Apr 2023

SMU Data Science Review

A chasm exists between the active public equity investment management industry's fundamental, momentum, and quantitative styles. In this study, the researchers explore ways to bridge this gap by leveraging domain knowledge, fundamental analysis, momentum, crowdsourcing, and data science methods. This research also seeks to test the developed tools and strategies during the volatile time period of 2020 and 2021.


Mlb 2023 Season Attendance Predictions, Sophia Andersen, Anna Tollette, Hannah Clinton Apr 2023

Research and Scholarship Symposium Posters

The goal of this project was to predict home game attendance for all 30 Major League Baseball (MLB) teams in their 2023 season. Researching and understanding that data as well as identifying influential factors of attendance were key steps before building a predictive model. Both the given material and data sets from MinneMUDAC, the competition organizer, were used as well as some outside sources. Finally, a predictive model was coded in Python which gave attendance predictions for every MLB game scheduled in 2023. From these results, insights could be offered to Major League Baseball or each team individually, to help …


El Final Report: Undergraduate Summer Research Internships, Sophie Wu Apr 2023

SASAH 4th Year Capstone and Other Projects: Publications

In her final report, Sophie Wu discusses her two Undergraduate Summer Research Internships at Western University: the first in the Statistics and Actuarial Science department, concerning microinsurance, and the second, in the Mathematics department, concerning computational neuroscience.


Classification Of Land Cover On Sand Dunes, Heleyna Tucker, Micah Sterk Apr 2023

22nd Annual Celebration of Undergraduate Research and Creative Activity (2023)

As members of the Hope College Coastal Research Group, we have studied the mechanisms for and effects of sand transport. In particular, we have worked to model vegetation coverage in West Michigan sand dune complexes in order to better understand how sand movement and resident vegetation affect one another. We use aerial drone imagery to develop machine learning algorithms for creating ground cover classification mappings in an automated way. Our team collected drone imagery ranging from high-resolution, low-altitude photographs to high-altitude stitched and rectified orthomosaics. We developed accurate ground cover classification methods for the low-altitude imagery and then explored ways …


Generating Optimal Space-Filling Designs With Particle Swarm Optimization, Rebekah Scott Apr 2023

Student Research Symposium

In 1935, Ronald Fisher published The Design of Experiments, establishing classical designs for various types of experiments. With the rise of computing power came optimal design, where statisticians can better customize designs according to the needs of the researchers running the experiment. This research focuses on generating optimal MaxMin space-filling designs with particle swarm optimization using various distance metrics (Manhattan, Euclidean, etc.). Interestingly, changing the distance metric in the objective function had a minimal effect on the design, except for Aitchison geometry on the simplex. Space-filling designs are optimal for supporting high-order models with only a small sacrifice in prediction …
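The MaxMin objective mentioned in this abstract can be illustrated with a short sketch: a design scores higher when its two closest points are farther apart, and the distance metric is a pluggable argument. Only the objective is shown, not the particle swarm search; the function names and the two example designs are assumptions made for illustration.

```python
from itertools import combinations

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    """Straight-line distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def maximin_criterion(design, dist):
    """Smallest pairwise distance in the design; larger means better spread."""
    return min(dist(p, q) for p, q in combinations(design, 2))

# Two hypothetical 4-point designs on the unit square.
corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]

scores = {
    name: (maximin_criterion(d, manhattan), maximin_criterion(d, euclidean))
    for name, d in [("corners", corners), ("clustered", clustered)]
}
```

An optimizer such as particle swarm optimization would search over point coordinates to maximize `maximin_criterion` for the chosen metric.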


A Graphical User Interface Using Spatiotemporal Interpolation To Determine Fine Particulate Matter Values In The United States, Kelly M. Entrekin Apr 2023

Honors College Theses

Fine particulate matter, or PM2.5, can be described as a pollution particle that has a diameter of 2.5 micrometers or smaller. These pollution particle values are measured by monitoring sites installed across the United States throughout the year. While these values are helpful, many areas are not accounted for, as scientists are not able to measure all of the United States. Some of these unmeasured regions could be reaching high PM2.5 values over time without anyone being aware of it. These high values can be dangerous by causing or worsening health conditions, such as cardiovascular and lung diseases. Within …


Multilevel Optimization With Dropout For Neural Networks, Gary Joseph Saavedra Apr 2023

Mathematics & Statistics ETDs

Large neural networks have become ubiquitous in machine learning. Despite their widespread use, the optimization process for training a neural network remains computationally expensive and does not necessarily create networks that generalize well to unseen data. In addition, the difficulty of training increases as the size of the neural network grows. In this thesis, we introduce the novel MGDrop and SMGDrop algorithms which use a multigrid optimization scheme with a dropout coarsening operator to train neural networks. In contrast to other standard neural network training schemes, MGDrop explicitly utilizes information from smaller sub-networks which act as approximations of the full …


A New Approach To Proper Orthogonal Decomposition With Difference Quotients, Sarah Locke Eskew, John R. Singler Apr 2023

Mathematics and Statistics Faculty Research & Creative Works

In a recent work (Koc et al., SIAM J. Numer. Anal. 59(4), 2163–2196, 2021), the authors showed that including difference quotients (DQs) is necessary in order to prove optimal pointwise in time error bounds for proper orthogonal decomposition (POD) reduced order models of the heat equation. In this work, we introduce a new approach to including DQs in the POD procedure. Instead of computing the POD modes using all of the snapshot data and DQs, we only use the first snapshot along with all of the DQs and special POD weights. We show that this approach retains all of the …


Rank-Based Inference For Survey Sampling Data, Akim Adekpedjou, Huybrechts F. Bindele Apr 2023

Mathematics and Statistics Faculty Research & Creative Works

For regression models where data are obtained from sampling surveys, the statistical analysis is often based on approaches that are either non-robust or inefficient. The handling of survey data requires more appropriate techniques, as the classical methods usually result in biased and inefficient estimates of the underlying model parameters. This article is concerned with the development of a new approach for obtaining robust and efficient estimates of regression model parameters when dealing with survey sampling data. Asymptotic properties of such estimators are established under mild regularity conditions. To demonstrate the performance of the proposed method, Monte Carlo simulation experiments are …


Changing Nfl Playoff Overtime Rules To Create Equal Opportunities To Win A Game, Matthew Silvia Apr 2023

Honors Projects in Mathematics

The NFL has attempted to create fair overtime rules over the course of the past decade; however, this study is interested in determining which playoff overtime rule (or rules) the NFL could implement so that both teams have a relatively equal chance of winning a game. This study aims to find which overtime rules work best at minimizing the differences between teams who possess the ball first versus teams that kick the ball off to start an overtime period. By collecting various NFL statistics from ESPN.com and FantasyOutsiders.com, this study hopes to run multiple simulations of different …