Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons


Articles 1 - 20 of 20

Full-Text Articles in Mathematics

Explorations In Baseball Analytics: Simulations, Predictions, And Evaluations For Games And Players, Katelyn Mongerson May 2023


Theses and Dissertations

From statistics reported in newspapers in the 1840s to the present day, baseball has always been one of the most data-driven sports. We make use of the endless publicly available baseball data to build models in R and Python that answer various baseball-related questions regarding predicting and optimizing run production, evaluating player effectiveness, and forecasting the postseason. To predict and optimize run production, we present three models. The first builds a common tool in baseball analysis called a Run Expectancy Matrix, which is used to give a value (in terms of runs) to various in-game decisions. The second uses the …


Spline Modeling And Localized Mutual Information Monitoring Of Pairwise Associations In Animal Movement, Andrew Benjamin Whetten May 2022


Theses and Dissertations

to a new era of remote sensing and geospatial analysis. In environmental science and conservation ecology, biotelemetric data recorded is often high-dimensional, spatially and/or temporally, and functional in nature, meaning that there is an underlying continuity to the biological process of interest. GPS-tracking of animal movement is commonly characterized by irregular time-recording of animal position, and the movement relationships between animals are prone to sudden change. In this dissertation, I propose a spline modeling approach for exploring interactions and time-dependent correlation between the movement of apex predators exhibiting territorial and territory-sharing behavior. A measure of localized mutual information (LMI) is …


Smoothed Quantiles For Claim Frequency Models, With Applications To Risk Measurement, Ponmalar Suruliraj Ratnam May 2020


Theses and Dissertations

Statistical models for the claim severity and claim frequency variables are routinely constructed and utilized by actuaries. Typical applications of such models include identification of optimal deductibles for selected loss elimination ratios, pricing of contract layers, determining credibility factors, risk and economic capital measures, and evaluation of effects of inflation, market trends and other quantities arising in insurance. While the actuarial literature on the severity models is extensive and rapidly growing, that for the claim frequency models lags behind. One of the reasons for such a gap is that various actuarial metrics do not possess “nice” statistical properties for the …


Fitting Of Lotka-Volterra Model For Coupled Population Growth Data Through Least-Squares Estimation Of Parameters, Jessica Ann Harter May 2020


Theses and Dissertations

The populations of two types of bacteria found in the Gulf Coast of Florida, V. chagasii and V. harveyi, can be described by the Lotka-Volterra competition model. Using data gathered in experiments conducted by Bury and Pickett (2015), we take a different approach to find parameter estimates using numerical methods in R. In particular, we find a numerical solution to the coupled set of ODEs and minimize the sum of squared errors in order to obtain the optimal parameter estimates that will fit the data best. In order to get a sense of accuracy of these parameter estimates, we use bootstrap …
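The fitting idea in this abstract (solve the coupled ODEs numerically, then minimize the sum of squared errors) can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python rather than the R used in the thesis, it assumes the textbook form of the competition equations, it uses synthetic data with made-up parameter values, and it replaces a proper optimizer with a grid search over a single parameter.

```python
def lv_rhs(x, y, r1, r2, k1, k2, a12, a21):
    """Right-hand side of the (assumed) Lotka-Volterra competition model."""
    dx = r1 * x * (1.0 - (x + a12 * y) / k1)
    dy = r2 * y * (1.0 - (y + a21 * x) / k2)
    return dx, dy


def simulate(params, x0, y0, dt, steps):
    """Integrate the ODEs with classical RK4; return (x, y) at every step."""
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(steps):
        s1x, s1y = lv_rhs(x, y, *params)
        s2x, s2y = lv_rhs(x + 0.5 * dt * s1x, y + 0.5 * dt * s1y, *params)
        s3x, s3y = lv_rhs(x + 0.5 * dt * s2x, y + 0.5 * dt * s2y, *params)
        s4x, s4y = lv_rhs(x + dt * s3x, y + dt * s3y, *params)
        x += dt * (s1x + 2 * s2x + 2 * s3x + s4x) / 6.0
        y += dt * (s1y + 2 * s2y + 2 * s3y + s4y) / 6.0
        path.append((x, y))
    return path


def sse(params, data, x0, y0, dt):
    """Sum of squared errors between the model path and the observations."""
    fit = simulate(params, x0, y0, dt, len(data) - 1)
    return sum((fx - ox) ** 2 + (fy - oy) ** 2
               for (fx, fy), (ox, oy) in zip(fit, data))


# Synthetic "observations" from hypothetical parameters (not the thesis data).
true_params = (1.0, 0.8, 100.0, 80.0, 0.5, 0.6)
data = simulate(true_params, 5.0, 5.0, 0.1, 100)

# Grid search over the competition coefficient a12, all else held fixed.
grid = [i / 10 for i in range(11)]
best_a12 = min(
    grid,
    key=lambda a: sse((1.0, 0.8, 100.0, 80.0, a, 0.6), data, 5.0, 5.0, 0.1),
)
print(best_a12)
```

With real, noisy data all six parameters would be estimated jointly by a numerical optimizer, and resampling the observations and refitting would give a bootstrap sense of their variability.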


The Strong Law Of Large Numbers For U-Statistics Under Random Censorship, Jan Höft Dec 2018


Theses and Dissertations

We introduce a semi-parametric U-statistics estimator for randomly right censored data. We will study the strong law of large numbers for this estimator under proper assumptions about the conditional expectation of the censoring indicator with respect to the observed lifetimes. Moreover, we will conduct simulation studies, where the semi-parametric estimator is compared to a U-statistic based on the Kaplan-Meier product limit estimator in terms of bias, variance and mean squared error, under different censoring models.
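For readers unfamiliar with the Kaplan-Meier product limit estimator used as the comparison baseline above, a minimal sketch of the classical survival estimate is given here. It is not the U-statistic or the semi-parametric estimator from the thesis; the function name and the sample data are illustrative assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for right-censored data.

    times  -- observed times (minimum of lifetime and censoring time)
    events -- 1 if the lifetime was observed, 0 if it was censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    n = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i          # subjects still under observation at time t
        deaths = 0
        j = i
        while j < n and data[j][0] == t:
            deaths += data[j][1]  # count uncensored observations tied at t
            j += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk  # product-limit update
            curve.append((t, surv))
        i = j
    return curve


# Hypothetical sample: the second and fifth observations are censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 4))
```

Censored observations never trigger a drop in the curve, but they do reduce the number at risk for later event times, which is how the estimator accounts for incomplete lifetimes.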


Robust Estimation Of Parametric Models For Insurance Loss Data, Chudamani Poudyal Jun 2018


Theses and Dissertations

Parametric statistical models for insurance claims severity are continuous, right-skewed, and frequently heavy-tailed. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. Moreover, due to commonly used actuarial “loss control strategies,” the random variables we observe and wish to model are affected by truncation (due to deductibles), censoring (due to policy limits), scaling (due to coinsurance proportions) and other transformations. In the current practice, statistical inference for loss models is almost exclusively likelihood (MLE) based, which typically results in non-robust parameter estimators, pricing models, and risk measures. …


Fitting A Complex Markov Chain Model For Firm And Market Productivity, Julia Ruth Valder May 2018


Theses and Dissertations

This thesis develops a methodology for estimating parameters of a complex Markov chain model for firm productivity. The model consists of two Markov chains, one describing firm-level productivity and the other modeling the productivity of the whole market. If applicable, the model can be used to help with optimal decision-making problems for labor demand. The need for such a model is motivated and the economic background of this research is shown. A brief introduction to the concept of Markov chains and their application in this context is given. The simulated data that is being used for the estimation is …


Infinite-Dimensional Traits: Estimation Of Mean, Covariance, And Selection Gradient Of Tribolium Castaneum Growth Curves, Ly Viet Hoang May 2017


Theses and Dissertations

In evolutionary biology, traits like growth curves, reaction norms or morphological shapes cannot be described by a finite vector of components alone. Instead, continuous functions represent a more useful structure. Such traits are called function-valued or infinite-dimensional traits. Kirkpatrick and Heckman outlined the first quantitative genetic model for these traits. Beder and Gomulkiewicz extended the theory on the selection gradient and the evolutionary response from finite- to infinite-dimensional traits.

Rigorous methods for the estimation of these quantities were developed throughout the years. In his dissertation, Baur defines estimators for the mean and covariance function, as well as for the selection …


Robust And Computationally Efficient Methods For Fitting Loss Models And Pricing Insurance Risks, Qian Zhao May 2017


Theses and Dissertations

Continuous parametric distributions are useful tools for modeling and pricing insurance risks, measuring income inequality in economics, investigating reliability of engineering systems, and in many other areas of application. In this dissertation, we propose and develop a new method for estimation of their parameters—the method of Winsorized moments (MWM)—which is conceptually similar to the method of trimmed moments (MTM) and thus is robust and computationally efficient. Both approaches yield explicit formulas of parameter estimators for location-scale and log-location-scale families, which are commonly used to model claim severity. Large-sample properties of the new estimators are provided and corroborated through simulations. Their …


Associated Hypothesis In Linear Models With Unbalanced Data, Rica Katharina Wedowski May 2017


Theses and Dissertations

In a two-way linear model one can test six different hypotheses regarding the effects in this model. Those hypotheses can be ranked from less specific to more specific, so the more specific hypotheses are nested in the less specific ones. To test those nested hypotheses, sequential sums of squares are used. Searle sees a problem with these, since they test an associated hypothesis that has the same sums of squares but involves the sample sizes. Hypotheses should be generic and not dependent on the data. The proof he uses in his book Linear Models for Unbalanced Data is not easy …


Black-Scholes Model: An Analysis Of The Influence Of Volatility, Cornelia Krome May 2017


Theses and Dissertations

In this thesis the influence of volatility in the Black-Scholes model is analyzed. The deduced Black-Scholes formula estimates the price of European options. Contrary to the other parameters of the formula, the future volatility of the underlying asset cannot be observed in the market. The parameter needs to be assumed in order to calculate the option price. An inaccurate assumption may lead to an erroneous volatility. We study how a falsely assumed volatility affects the option price. Empirical simulations will be carried out to get an impression of possible errors in the computations. Afterwards, those results will be …
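The effect of a mis-specified volatility on the option price can be illustrated with the standard Black-Scholes formula for a European call on a non-dividend-paying asset. The sketch below is not taken from the thesis; the spot price, strike, interest rate, maturity and the "true" volatility are hypothetical inputs chosen only for illustration.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, sigma, maturity):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    d2 = d1 - sigma * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


# Hypothetical market: at-the-money call, one year to maturity, 5% rate.
true_sigma = 0.20
reference = bs_call(100.0, 100.0, 0.05, true_sigma, 1.0)

# Pricing error caused by assuming a volatility other than the true one.
for assumed in (0.10, 0.15, 0.20, 0.25, 0.30):
    price = bs_call(100.0, 100.0, 0.05, assumed, 1.0)
    print(f"assumed sigma={assumed:.2f}  price={price:.4f}  error={price - reference:+.4f}")
```

Because the call price increases with volatility, underestimating the volatility underprices the option and overestimating it overprices the option.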


Estimating The Selection Gradient Of A Function-Valued Trait, Tyler John Baur Dec 2016


Theses and Dissertations

Kirkpatrick and Heckman initiated the study of function-valued traits in 1989. How to estimate the selection gradient of a function-valued trait is a major question asked by evolutionary biologists. In this dissertation, we give an explicit expansion of the selection gradient and construct estimators based on two different samples: one consisting of independent organisms (the independent case), and the other consisting of independent families of equally related organisms (the dependent case).

In the independent case we first construct and prove the joint consistency of sieve estimators of the mean and covariance functions of a Gaussian process, based on previous developments …


Density Estimation For Lifetime Distributions Under Semi-Parametric Random Censorship Models, Carsten Harlass Dec 2016


Theses and Dissertations

We derive product limit estimators of survival times and failure rates for randomly right censored data as the numerical solution of identifying Volterra integral equations by employing explicit and implicit Euler schemes. While the first approach results in some known estimators, the latter leads to a new general type of product limit estimator. Plugging in established methods to approximate the conditional probability of the censoring indicator given the observation, we introduce new semi-parametric and presmoothed Kaplan-Meier type estimators. In the case of the semi-parametric random censorship model, i.e. the latter probability belonging to some parametric family, we study the strong …


Statistical Contributions To Operational Risk Modeling, Daoping Yu May 2016


Theses and Dissertations

In this dissertation, we focus on statistical aspects of operational risk modeling. Specifically, we are interested in understanding the effects of model uncertainty on capital reserves due to data truncation and in developing better model selection tools for truncated and shifted parametric distributions. We first investigate the model uncertainty question which has been unanswered for many years because researchers, practitioners, and regulators could not agree on how to treat the data collection threshold in operational risk modeling. There are several approaches under consideration—the empirical approach, the “naive” approach, the shifted approach, and the truncated approach—for fitting the loss severity distribution. …


Parameter Estimation For The Spatial Ornstein-Uhlenbeck Process With Missing Observations, Sami Cheong May 2016


Theses and Dissertations

Suppose we are collecting a set of data on a rectangular sampling grid. It is reasonable to assume that observations (e.g. data that arise in weather forecasting, public health and agriculture) made at each sampling site are spatially correlated. Therefore, when building a model for this type of data, we often pair it with an underlying Gaussian process that contains different spatially dependent parameters. Here, we assume that the Gaussian process is characterized by the Ornstein-Uhlenbeck covariance function, which has the property of being both stationary and Markov under the assumption that no observations are missing. However, in reality, the …


Longitudinal Data Models With Nonparametric Random Effect Distributions, Hartmut Jakob Stenz May 2016


Theses and Dissertations

There is a saying that you cannot see the woods for the trees. This thesis aims to circumvent this unfortunate situation: Longitudinal data on tree growth, as an example of multiple observations of similar individuals pooled together in one data set, are modeled simultaneously rather than each individual separately. This is done under the assumption that one model is suitable for all individuals but its parameters vary following unknown nonparametric random effect distributions. The goal is a maximum likelihood estimation of these distributions considering all provided data and using basis-spline-approximations for the densities of each distribution function …


Associated Hypotheses In Linear Models For Unbalanced Data, Carlos J. Soto May 2015


Theses and Dissertations

When looking at factorial experiments there are several natural hypotheses that can be tested. In a two-factor or a by b design, the three null hypotheses of greatest interest are the absence of each main effect and the absence of interaction. There are two ways to construct the numerator sum of squares for testing these, namely either adjusted or sequential sums of squares (also known as type III and type I in SAS, respectively). Searle has pointed out that, for unbalanced data, a sequential sum of squares for one of these hypotheses is equal (with probability 1) to an adjusted sum …


A Markov Model For Baseball With Applications, Daniel Joseph Ursin Dec 2014


Theses and Dissertations

In this work we confirm a Markov chain model of baseball for 2013 Major League Baseball batting data. We describe the transition matrices for individual player data and their use in generating single and nine-inning run distributions for a given lineup. The run distribution is used to calculate the expected number of runs produced by a lineup over nine innings. We discuss batting order optimization heuristics to avoid computation of distributions for the 9! = 362,880 distinct lineups for 9 players. Finally, we describe an implementation of the algorithms and review their performance against actual game data.
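The idea of computing expected runs from a Markov chain can be sketched with a deliberately simplified toy chain. This is an assumption-laden illustration, not the model from the thesis: the state is only the number of outs, every batter is identical, and each plate appearance is either a home run or an out. A realistic model would track base-out states with player-specific transition matrices.

```python
import math


def expected_runs_half_inning(p_hr: float) -> float:
    """Expected runs in a half-inning for a toy chain on the number of outs.

    Each plate appearance is a home run (probability p_hr, scores one run,
    outs unchanged) or an out (probability 1 - p_hr). Three outs is the
    absorbing state with expected future runs equal to zero.
    """
    expected = 0.0  # value at the absorbing state (3 outs)
    for _ in range(3):  # step back through 2, 1 and 0 outs
        # E[k] = p*(1 + E[k]) + (1 - p)*E[k+1]  =>  E[k] = p/(1 - p) + E[k+1]
        expected = p_hr / (1.0 - p_hr) + expected
    return expected


# Number of distinct batting orders for nine players.
lineups = math.factorial(9)
print(lineups)

# Expected runs over nine innings for a hypothetical home-run probability.
per_inning = expected_runs_half_inning(0.1)
print(round(9 * per_inning, 4))
```

Even in this toy setting, evaluating a run distribution for each of the 362,880 orderings would multiply the cost of a single evaluation by that factor, which is why heuristics for searching batting orders are attractive.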


Optimal Reinsurance Strategy With Bivariate Pareto Risks, Evelyn Susanne Gaus May 2014


Theses and Dissertations

In insurance, one is often concerned with risks and extreme events which can cause large losses. The Pareto distribution is often used in actuarial science for modeling large losses. This thesis extends the study of Cai and Wei (2011) by considering a two-line business model with positive dependence through stochastic ordering (PDS) risks, where the risks are bivariate Pareto distributed. Cai and Wei (2011) showed that in individual reinsurance treaties the excess-of-loss treaty is the optimal reinsurance form for an insurer with PDS risks. We derive explicit expressions for the optimal retention levels in the excess-of-loss treaty by considering …


An Efficient Methodology For Learning Bayesian Networks, Emmanuel Owusu Asante-Asamani Aug 2012


Theses and Dissertations

Statistics from the National Cancer Institute indicate that 1 in 8 women will develop breast cancer in their lifetime. Researchers have developed numerous statistical models to predict breast cancer risk; however, physicians are hesitant to use these models because of disparities in the predictions they produce. In an effort to reduce these disparities, we use Bayesian networks to capture the joint distribution of risk factors, and simulate artificial patient populations (clinical avatars) for interrogating the existing risk prediction models. The challenge in this effort has been to produce a Bayesian network whose dependencies agree with literature and are good estimates …