Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons


Articles 1 - 25 of 25

Full-Text Articles in Mathematics

Stressor: An R Package For Benchmarking Machine Learning Models, Samuel A. Haycock Aug 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many discipline-specific researchers need a way to quickly compare the accuracy of their predictive models to other alternatives. However, many of these researchers are not experienced with multiple programming languages. Python has recently been the leader in machine learning functionality, which includes the PyCaret library that allows users to develop high-performing machine learning models with only a few lines of code. The goal of the stressor package is to help users of the R programming language access the advantages of PyCaret without having to learn Python. This allows the user to leverage R’s powerful data analysis workflows, while simultaneously …


An Interval-Valued Random Forests, Paul Gaona Partida Aug 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

There is a growing demand for the development of new statistical models and the refinement of established methods to accommodate different data structures. This need arises from the recognition that traditional statistics often assume the value of each observation to be precise, which may not hold true in many real-world scenarios. Factors such as the collection process and technological advancements can introduce imprecision and uncertainty into the data.

For example, consider data collected over a long period of time, where newer measurement tools may offer greater accuracy and provide more information than previous methods. In such cases, it becomes crucial …


Investigating The Effect Of Greediness On The Coordinate Exchange Algorithm For Generating Optimal Experimental Designs, William Thomas Gullion May 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Design of Experiments (DoE) is the field of statistics concerned with helping researchers maximize the amount of information they gain from their experiments. Recently, researchers have been turning to optimal experimental designs instead of classical/catalog experimental designs. One of the most popular algorithms used today to generate optimal designs is the Coordinate Exchange (CEXCH) Algorithm. CEXCH is known to be a greedy algorithm, which means it tends to favor immediate, locally best designs instead of globally optimal designs. Previous research demonstrated that this tradeoff was efficacious in that it reduced the cost of a single run of CEXCH and allowed …


Examining Model Complexity's Effects When Predicting Continuous Measures From Ordinal Labels, Mckade S. Thomas May 2023


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many real-world problems require the prediction of ordinal variables, whose values are a set of ordered categories. However, in many of these cases the categorical nature of the ordinal data is not a desirable outcome. As such, regression models treat ordinal variables as continuous and do not bind their predictions to discrete categories. Prior research has found that these models are capable of learning useful information between the discrete levels of the ordinal labels they are trained on, but complex models may learn ordinal labels too closely, missing the information between levels. In this …


Dynamic System Discovery With Recursive Physics-Informed Neural Networks, Jarrod Mau Aug 2022


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

This thesis presents a novel method, the recursive physics-informed neural network, to learn the right-hand side of differential equations. The neural network takes in data, trains on it, and then acts as a proxy for the differential equation that can be used for modeling. We show the theoretical superiority of the recursive approach and use computer simulations to demonstrate the proved properties.


Defining Areas Of Interest Using Voronoi And Modified Voronoi Tesselations To Analyze Eye-Tracking Data, Joanna D. Coltrin Aug 2022


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Eye tracking is a technology used to track where someone is looking. Eye-tracking technology is often used to study what people focus on when looking at a photo of another person. The eye-tracking technology records points on a photo that a person is looking at. When the photo being looked at shows a person, the points can be categorized by body part such as head, right hand, left hand, and torso. This thesis presents the use of partially circular areas to define the body parts of the person in the photo and therefore categorize the points collected by the eye-tracker. …


Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen Aug 2022


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.

The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …


Delta Hedging Of Financial Options Using Reinforcement Learning And An Impossibility Hypothesis, Ronak Tali Dec 2020


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

In this thesis we take a fresh perspective on delta hedging of financial options as undertaken by market makers. The current industry standard of delta hedging relies on the famous Black-Scholes formulation, which prescribes continuous-time hedging in a way that allows the market maker to remain risk neutral at all times. But the Black-Scholes formulation is a deterministic model that comes with several strict assumptions, such as zero transaction costs and a log-normal distribution of the underlying stock prices. In this paper we employ Reinforcement Learning to redesign the delta hedging problem in a way that allows us …


Novel Statistical Models For Quantitative Shape-Gene Association Selection, Xiaotian Dai Dec 2017


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Prior research has reported that genetic mechanisms play a major role in the development of biological shapes. The primary goal of this dissertation is to develop novel statistical models to investigate the quantitative relationships between biological shapes and genetic variants. However, these problems can be extremely challenging for traditional statistical models for a number of reasons: 1) the biological phenotypes cannot be effectively represented by single-valued traits, while traditional regression only handles one dependent variable; 2) in real-life genetic data, the number of candidate genes to be investigated is extremely large, and the signal-to-noise ratio of candidate genes is expected …


Exact Approaches For Bias Detection And Avoidance With Small, Sparse, Or Correlated Categorical Data, Sarah E. Schwartz Dec 2017


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Every day, traditional statistical methodologies are used worldwide to study a variety of topics and provide insight regarding countless subjects. Each technique is based on a distinct set of assumptions to ensure valid results. Additionally, many statistical approaches rely on large sample behavior and may collapse or degenerate in the presence of small, sparse, or correlated data. This dissertation details several advancements to detect these conditions, avoid their consequences, and analyze data in a different way to yield trustworthy results.

One of the most commonly used modeling techniques for outcomes with only two possible categorical values (e.g., live/die, pass/fail, …


Extracting And Visualizing Data From Mobile And Static Eye Trackers In R And Matlab, Chunyang Li Dec 2017


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Eye tracking is the process of measuring where people are looking with an eye tracker device. Eye tracking has been used in many scientific fields, such as education, usability research, sports, psychology, and marketing. Eye tracking data are often obtained from a static eye tracker or are manually extracted from a mobile eye tracker. Visualization usually plays an important role in the analysis of eye tracking data. So far, there has been no software package that contains a whole collection of eye tracking data processing and visualization tools. In this dissertation, we review the eye tracking technology, the eye tracking …


A Comparison Of Five Statistical Methods For Predicting Stream Temperature Across Stream Networks, Maike F. Holthuijzen Aug 2017


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The health of freshwater aquatic systems, particularly stream networks, is mainly influenced by water temperature, which controls biological processes and influences species distributions and aquatic biodiversity. Thermal regimes of rivers are likely to change in the future, due to climate change and other anthropogenic impacts, and our ability to predict stream temperatures will be critical in understanding distribution shifts of aquatic biota. Spatial statistical network (SSN) models take into account spatial relationships but have drawbacks, including high computation times and data pre-processing requirements. Machine learning techniques and generalized additive models (GAM) are promising alternatives to the SSN model. Two machine learning …


Combinatorial Games On Graphs, Trevor K. Williams May 2017


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Combinatorial Games are intriguing and have a tendency to engross students and lead them into a serious study of mathematics. The engaging nature of games is the basis for this thesis. Two combinatorial games and some educational tools are presented which were developed by the author in the pursuit of the solution of these games.


Bayesian Models For Repeated Measures Data Using Markov Chain Monte Carlo Methods, Yuanzhi Li May 2016


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Bayesian models for repeated measures data are fitted to three different data analysis projects. Markov Chain Monte Carlo (MCMC) methodology is applied to each case, with Gibbs sampling and/or an adaptive Metropolis-Hastings (MH) algorithm used to simulate the posterior distribution of parameters. We apply a Bayesian model with different variance-covariance structures to an audit fee data set. Block structures and linear models for variances are used to examine the linear trend and different behaviors before and after regulatory change during the years 2004-2005. We propose a Bayesian hierarchical model with latent teacher effects to determine whether teacher professional development (PD) …


To Dot Product Graphs And Beyond, Sean Bailey May 2016


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

We will introduce three new vector representations of graphs. These representations are based on relationships between the vectors that are used. Specifically, we will examine scenarios where we ignore specific relationships, where we consider if information is missing, and where we look for when the information in common is not of a specified amount.


Modeling Seed Dispersal And Population Migration Given A Distribution Of Seed Handling Times And Variable Dispersal Motility: Case Study For Pinyon And Juniper In Utah, Ram C. Neupane May 2015


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The spread of fruiting tree species is strongly determined by the behavior and range of fruit-eating animals, particularly birds. Birds either consume and digest seeds or carry and cache them at some distance from the source tree. These carried and settled seeds provide some form of distribution which generates tree spread to the new location. Firstly, we model seed dispersal by birds and introduce it in a dispersal model to estimate seed distribution. Using this distribution, we create a population model to estimate the speed at which juniper and pinyon forest boundaries move.

Secondly, we introduce a fact that bird …


Statistical Modeling, Exploration, And Visualization Of Snow Water Equivalent Data, James Beguah Odei May 2014


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Due to a continual increase in the demand for water as well as an ongoing regional drought, there is an imminent need to monitor and forecast water resources in the Western United States. In particular, water resources in the Intermountain West rely heavily on snow water storage. Thus, improving seasonal forecasts of snowpack and considering new techniques would allow water resources to be managed more effectively throughout the entire water-year. Many available models used in forecasting snow water equivalent (SWE) measurements require delicate calibrations.

In contrast to the physical SWE models most commonly used for forecasting, we …


Family-Wise Error Rate Control In Quantitative Trait Loci (Qtl) Mapping And Gene Ontology Graphs With Remarks On Family Selection, Garrett Saunders May 2014


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

One of the great aims of statistics, the science of collecting, analyzing, and interpreting data, is to protect against the probability of falsely rejecting an accepted claim, or hypothesis, given observed data stemming from some experiment. This is generally known as protecting against a Type I Error, or controlling the Type I Error rate. The extension of this protection against Type I Errors to the situation where thousands upon thousands of hypotheses are examined simultaneously is known as multiple hypothesis testing. This dissertation presents an improvement to an existing multiple hypothesis testing approach, the Focus Level method, specific to gene …


Generalized Minimum Penalized Hellinger Distance Estimation And Generalized Penalized Hellinger Deviance Testing For Generalized Linear Models: The Discrete Case, Huey Yan May 2001


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

In this dissertation, robust and efficient alternatives to quasi-likelihood estimation and likelihood ratio tests are developed for discrete generalized linear models. The estimation method considered is a penalized minimum Hellinger distance procedure that generalizes a procedure developed by Harris and Basu for estimating parameters of a single discrete probability distribution from a random sample. A bootstrap algorithm is proposed to select the weight of the penalty term. Simulations are carried out to compare the new estimators with quasi-likelihood estimation. The robustness of the estimation procedure is demonstrated by simulation work and by Hampel's α-influence curve. Penalized minimum Hellinger deviance tests …


Linear Operators Strongly Preserving Polynomial Equations Over Antinegative Semirings, Sang-Gu Lee May 1991


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

We characterized the group of linear operators that strongly preserve r-potent matrices over the binary Boolean semiring, nonbinary Boolean semirings, and zero-divisor free antinegative semirings. We extended these results to show that linear operators that strongly preserve r-potent matrices are equivalent to those linear operators that strongly preserve the matrix polynomial equation $p(X) = X$, where $p(X) = X^{r_1} + X^{r_2} + \cdots + X^{r_t}$ and $r_1 > r_2 > \cdots > r_t \geq 2$.
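
As a small illustration of the objects involved, the sketch below checks whether a 0/1 matrix over the binary Boolean semiring is r-potent and evaluates a polynomial of the form p(X) described above. It assumes the usual definition that X is r-potent when X^r = X, with OR as addition and AND as multiplication; the function names are hypothetical and are not taken from the dissertation.

```python
from typing import List

Matrix = List[List[int]]


def bool_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply square 0/1 matrices over the binary Boolean semiring (OR as +, AND as *)."""
    n = len(a)
    return [
        [int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
        for i in range(n)
    ]


def bool_power(x: Matrix, r: int) -> Matrix:
    """Return x^r for r >= 1 by repeated Boolean multiplication."""
    result = x
    for _ in range(r - 1):
        result = bool_matmul(result, x)
    return result


def is_r_potent(x: Matrix, r: int) -> bool:
    """Assumed definition: x is r-potent when x^r equals x."""
    return bool_power(x, r) == x


def bool_poly(x: Matrix, exponents: List[int]) -> Matrix:
    """Evaluate p(X) = X^{r_1} + ... + X^{r_t}, where + is entrywise OR."""
    n = len(x)
    total = [[0] * n for _ in range(n)]
    for r in exponents:
        term = bool_power(x, r)
        total = [[total[i][j] | term[i][j] for j in range(n)] for i in range(n)]
    return total


swap = [[0, 1], [1, 0]]          # permutation matrix with swap^2 = identity
print(is_r_potent(swap, 3))      # True: swap^3 = swap
print(is_r_potent(swap, 2))      # False: swap^2 is the identity
print(bool_poly(swap, [5, 3]) == swap)  # True: both terms equal swap
```

The example only tests individual matrices; characterizing the linear operators that preserve such matrices is the subject of the dissertation itself.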

In addition, we characterized the group of linear operators that strongly preserve r-cyclic matrices over the same semirings. We …


An Evaluation Of Truncated Sequential Test, Ryh-Thinn Chang May 1975


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The development of sequential analysis has led to the proposal of tests that are more economical in that the Average Sample Number (A. S. N.) of the sequential test is smaller than the sample size of the fixed sample test. Although these tests usually have a smaller A. S. N. than the equivalent fixed sample procedure, there still remains the possibility that an extremely large sample size will be necessary to make a decision. To remedy this, truncated sequential tests have been developed.

A method of truncation for testing composite hypotheses is studied. This method is formed by mixing …
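
The idea of truncation can be sketched with Wald's sequential probability ratio test for a Bernoulli proportion. This is only a minimal illustration under stated assumptions: it tests two simple hypotheses (p0 against p1) rather than the composite hypotheses studied in the thesis, and the rule applied at the truncation point (decide by the sign of the log-likelihood ratio) is a hypothetical choice, not the thesis's method.

```python
import math
from typing import Iterable, Tuple


def truncated_sprt(
    samples: Iterable[int],
    p0: float,
    p1: float,
    alpha: float = 0.05,
    beta: float = 0.05,
    max_n: int = 100,
) -> Tuple[str, int]:
    """Run a Bernoulli SPRT of H0: p = p0 against H1: p = p1, stopping by max_n."""
    upper = math.log((1 - beta) / alpha)   # Wald's approximate boundary for accepting H1
    lower = math.log(beta / (1 - alpha))   # Wald's approximate boundary for accepting H0
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for x in samples:
        n += 1
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
        if n >= max_n:
            break
    # Assumed truncation rule: choose the hypothesis the data currently favor.
    return ("H1" if llr > 0 else "H0"), n


print(truncated_sprt([1] * 50, p0=0.3, p1=0.7))   # ('H1', 4)
print(truncated_sprt([0] * 50, p0=0.3, p1=0.7))   # ('H0', 4)
```

With uninformative data the untruncated test can run for a long time, which is exactly the case the `max_n` cap guards against.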


A Fortran List Processor (Flip), Karl A. Fugal May 1970


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A series of Basic Assembler Language subroutines was developed and made available to the FORTRAN IV language processor, making list processing possible in a flexible and easily understood way.

The subroutines create and maintain list structures in the computer's core storage. The subroutines are sufficiently general to permit FORTRAN programmers to tailor list processing routines to their own individual requirements. List structure sizes are limited only by the amount of core storage available.


Analysis Of Contingency Tables, James Joseph Biundo May 1969


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Two methods of analyzing multi-dimensional frequency data are detailed.

The Second Order Exponential (SOE) model is applicable for dichotomous classifications. The distribution has two sets of parameters, the $\theta_i$'s and the $\theta_{ij}$'s. The $\theta_i$'s are interpreted as the log of the odds of the marginal probabilities if no two-factor relationships exist. Or, if all $\theta_{ij}$ are not zero, then the $\theta_i$'s are analogous to a main effect in a $2^m$ factorial analysis (m = number of factors or classifications). The $\theta_{ij}$'s may be interpreted as a measure and direction …


Generation Of Random Numbers, Keith H. Eberhard May 1969


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Subroutines are written to generate random numbers on the computer. Depending on the subroutine used, the generated random numbers follow the uniform, binomial, normal, chi-square, t, F, or gamma distribution. Each subroutine is tested using the chi-square goodness of fit test to verify that the random numbers it generates follow the statistical distribution for which it is written. The interpretation of the test results indicates that each subroutine generates random numbers which closely approximate the theoretical distribution for which it is designed.
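
The testing step described above can be sketched for the uniform case. The code below is a minimal illustration, not the thesis's subroutines: it bins draws from a generator into equal-width cells on [0, 1), computes the chi-square goodness of fit statistic, and compares it with the 5% critical value for 9 degrees of freedom (16.919). The function name, the seed, the sample size, and the use of Python's built-in generator are all assumptions made for the example.

```python
import random
from typing import Callable


def chi_square_uniform(draw: Callable[[], float], n: int = 10_000, bins: int = 10) -> float:
    """Chi-square goodness of fit statistic for draws assumed uniform on [0, 1)."""
    counts = [0] * bins
    for _ in range(n):
        u = draw()
        # Clamp so that a value of exactly 1.0 still lands in the last cell.
        counts[min(int(u * bins), bins - 1)] += 1
    expected = n / bins
    return sum((c - expected) ** 2 / expected for c in counts)


# Upper 5% point of the chi-square distribution with bins - 1 = 9 degrees of freedom.
CRITICAL_95_DF9 = 16.919

rng = random.Random(1969)  # arbitrary seed, fixed only for reproducibility
stat = chi_square_uniform(rng.random)
print(f"statistic = {stat:.3f}, consistent with uniform at 5% level: {stat < CRITICAL_95_DF9}")

# A deliberately broken generator that always returns the same value fails badly.
stuck = chi_square_uniform(lambda: 0.05, n=1000)
print(stuck)  # 9000.0
```

The same pattern extends to the other distributions by replacing the equal-width cells with cells of equal probability under the target distribution.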

The approach used in the subroutine which generates gamma distributed random numbers involves the use of …


Simulation Of Mathematical Models In Genetic Analysis, Dinesh Govindal Patel May 1964


All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

In recent years a new field of statistics has become of importance in many branches of experimental science. This is the Monte Carlo Method, so called because it is based on simulation of stochastic processes. By stochastic process, it is meant some possible physical process in the real world that has some random or stochastic element in its structure. This is the subject which may appropriately be called the dynamic part of statistics or the statistics of "change," in contrast with the static statistical problems which have so far been the more systematically studied. Many obvious examples of such processes …