Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 46

Full-Text Articles in Physical Sciences and Mathematics

Explorations In Baseball Analytics: Simulations, Predictions, And Evaluations For Games And Players, Katelyn Mongerson May 2023

Explorations In Baseball Analytics: Simulations, Predictions, And Evaluations For Games And Players, Katelyn Mongerson

Theses and Dissertations

From statistics being reported in newspapers in the 1840s to the present day, baseball has always been one of the most data-driven sports. We make use of the endless publicly available baseball data to build models in R and Python that answer various baseball-related questions regarding predicting and optimizing run production, evaluating player effectiveness, and forecasting the postseason. To predict and optimize run production, we present three models. The first builds a common tool in baseball analysis called a Run Expectancy Matrix, which is used to give a value (in terms of runs) to various in-game decisions. The second uses the …
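As a rough illustration of the Run Expectancy Matrix idea, the sketch below averages the runs scored from each base-out state to the end of the half-inning. The input layout (a list of half-innings, each a list of `(state, runs_on_play)` pairs) and the function name `run_expectancy` are assumptions made for this example, not the data format or code used in the thesis.

```python
from collections import defaultdict

def run_expectancy(half_innings):
    """Average runs scored from each (outs, bases) state to the end of the half-inning.

    half_innings: list of half-innings; each is a list of (state, runs_on_play)
    pairs, where state is the base-out situation BEFORE the play.
    This layout is hypothetical and chosen only for illustration.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for plays in half_innings:
        inning_runs = sum(runs for _, runs in plays)
        scored_so_far = 0
        for state, runs_on_play in plays:
            # Runs still to come from this state until the half-inning ends.
            totals[state] += inning_runs - scored_so_far
            counts[state] += 1
            scored_so_far += runs_on_play
    return {state: totals[state] / counts[state] for state in totals}

# Two toy half-innings; states are (outs, runners) with "_" for an empty base.
toy_data = [
    [((0, "___"), 0), ((0, "1__"), 2), ((0, "___"), 0), ((1, "___"), 0), ((2, "___"), 0)],
    [((0, "___"), 0), ((1, "___"), 0), ((2, "___"), 0)],
]
matrix = run_expectancy(toy_data)
```

The value of an in-game decision can then be read off as the difference in run expectancy between the states before and after it, plus any runs scored on the play.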


Change Point Detection For A Process Having Several Regimes, Oliver Gerd Meister May 2023

Change Point Detection For A Process Having Several Regimes, Oliver Gerd Meister

Theses and Dissertations

In this dissertation, possible methods for multiple change point detection on Markov chain processes are studied. Related works for offline and online change point detection are discussed and their applicability to sequential multiple change point detection for several regimes is evaluated. We develop a method for multiple change point detection for a process having three regimes. Its efficiency is then evaluated on simulated Markov chain data by looking into different scenarios such as processes that significantly differ from each other or probability distributions that are slightly similar. This approach is then applied to Covid-19 hospital data. Therefore, the data …


Spline Modeling And Localized Mutual Information Monitoring Of Pairwise Associations In Animal Movement, Andrew Benjamin Whetten May 2022

Spline Modeling And Localized Mutual Information Monitoring Of Pairwise Associations In Animal Movement, Andrew Benjamin Whetten

Theses and Dissertations

to a new era of remote sensing and geospatial analysis. In environmental science and conservation ecology, biotelemetric data recorded is often high-dimensional, spatially and/or temporally, and functional in nature, meaning that there is an underlying continuity to the biological process of interest. GPS-tracking of animal movement is commonly characterized by irregular time-recording of animal position, and the movement relationships between animals are prone to sudden change. In this dissertation, I propose a spline modeling approach for exploring interactions and time-dependent correlation between the movement of apex predators exhibiting territorial and territory-sharing behavior. A measure of localized mutual information (LMI) is …


Functional Multidimensional Scaling, Liting Li May 2022

Functional Multidimensional Scaling, Liting Li

Theses and Dissertations

Multidimensional scaling is an important component in analyzing proximity (similarity or dissimilarity) between objects and plays a key role in creating low-dimensional visualizations of objects. Despite the progress in this area, traditional solutions of multidimensional scaling problems are inapplicable to proximities that change over time. In this dissertation, we focus on dissimilarity instead of similarity. Motivated by the studies of functional data analysis, we extend the current multidimensional scaling techniques and propose a functional method to obtain lower-dimensional smooth representations in terms of time-varying dissimilarities. This method incorporates the smoothness approach of functional data analysis by using cubic …


Machine-Learning-Based Prediction Of Sepsis Events From Vertical Clinical Trial Data: A Naïve Approach, Tyler Michael Gaddis Aug 2020

Machine-Learning-Based Prediction Of Sepsis Events From Vertical Clinical Trial Data: A Naïve Approach, Tyler Michael Gaddis

Theses and Dissertations

Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection by which the afflicted body attacks its own tissues, sometimes to the point of organ failure, and in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC), sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the currently accepted diagnostic framework of the disease.

This study attempted to first establish an understanding of past definitions of sepsis, and to then recommend use of machine learning as integral in an …


Estimating Distortion Risk Measures Under Truncated And Censored Data Scenarios, Sahadeb Upretee Aug 2020

Estimating Distortion Risk Measures Under Truncated And Censored Data Scenarios, Sahadeb Upretee

Theses and Dissertations

In insurance data analytics and actuarial practice, a broad class of risk measures (distortion risk measures) are used to capture the riskiness of the distribution tail. Point and interval estimates of the risk measures are then employed to price extreme events, to develop reserves, to design risk transfer strategies, and to allocate capital. When solving such problems, the main statistical challenge is to choose an appropriate estimate of a risk measure and to assess its variability. In this context, the empirical …


Biomarker Development For Use In Regression Calibration, Yiwen Zhang May 2020

Biomarker Development For Use In Regression Calibration, Yiwen Zhang

Theses and Dissertations

It is challenging to alleviate systematic measurement error in self-reported data when studying the associations between dietary intakes and chronic disease risk. The regression calibration method has been used for this purpose when an objectively measured biomarker that satisfies a classical measurement error assumption is available. The requirement for the biomarkers needs to be quite strong and very few dietary intake biomarkers as such have been developed. Feeding studies provide opportunities to develop such potential biomarkers using regression methods with a much larger variety of dietary variables. However, the measurement error for the resulting biomarkers will be of Berkson type …


Infant Mortality In The United States: Socioeconomic Factors Predicting Infant Survival In Late Neo-Natal And Post Neo-Natal Infants From Birth Certificate Data, Mark Brunk-Grady May 2020

Infant Mortality In The United States: Socioeconomic Factors Predicting Infant Survival In Late Neo-Natal And Post Neo-Natal Infants From Birth Certificate Data, Mark Brunk-Grady

Theses and Dissertations

According to the Centers for Disease Control and Prevention, the infant mortality rate in the United States in 2018 was 5.6 deaths per 1,000 live births. Infant mortality is defined as a child being born alive but dying before their first birthday. This study aimed to determine if adding socioeconomic factors to traditional predictive survival models improved the predictive power in terms of survival for late and post neonatal infants. Secondly, this study looked to develop a risk score to predict which mothers would be classified as “High” or “Low” risk for infant death.

Data were analyzed from a …


Smoothed Quantiles For Claim Frequency Models, With Applications To Risk Measurement, Ponmalar Suruliraj Ratnam May 2020

Smoothed Quantiles For Claim Frequency Models, With Applications To Risk Measurement, Ponmalar Suruliraj Ratnam

Theses and Dissertations

Statistical models for the claim severity and claim frequency variables are routinely constructed and utilized by actuaries. Typical applications of such models include identification of optimal deductibles for selected loss elimination ratios, pricing of contract layers, determining credibility factors, risk and economic capital measures, and evaluation of effects of inflation, market trends and other quantities arising in insurance. While the actuarial literature on the severity models is extensive and rapidly growing, that for the claim frequency models lags behind. One of the reasons for such a gap is that various actuarial metrics do not possess “nice” statistical properties for the …


Fitting Of Lotka-Volterra Model For Coupled Population Growth Data Through Least-Squares Estimation Of Parameters, Jessica Ann Harter May 2020

Fitting Of Lotka-Volterra Model For Coupled Population Growth Data Through Least-Squares Estimation Of Parameters, Jessica Ann Harter

Theses and Dissertations

The populations of two types of bacteria found in the Gulf Coast of Florida, V. chagasii and V. harveyi, can be described by the Lotka-Volterra competition model. Using data gathered in experiments conducted by Bury and Pickett (2015), we take a different approach to find parameter estimates using numerical methods in R. In particular, we find a numerical solution to the coupled set of ODEs and minimize the sum of squared errors in order to obtain the optimal parameter estimates that will fit the data best. In order to get a sense of accuracy of these parameter estimates, we use bootstrap …
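The least-squares idea can be sketched as follows, in Python rather than the R used in the thesis. The parameterization of the competition model (growth rates `r1`, `r2`, carrying capacities `cap1`, `cap2`, competition coefficients `a12`, `a21`), the fixed-step Runge-Kutta solver, and all numerical values are assumptions chosen for illustration. The sketch only builds the objective function; the minimization step and the bootstrap are omitted.

```python
def competition_rhs(state, params):
    """Right-hand side of a two-species Lotka-Volterra competition model."""
    x, y = state
    r1, r2, cap1, cap2, a12, a21 = params
    dx = r1 * x * (1.0 - (x + a12 * y) / cap1)
    dy = r2 * y * (1.0 - (y + a21 * x) / cap2)
    return (dx, dy)

def rk4_path(initial, params, dt, n_steps):
    """Integrate the ODE system with the classical fourth-order Runge-Kutta scheme."""
    state = tuple(initial)
    path = [state]
    for _ in range(n_steps):
        s1 = competition_rhs(state, params)
        s2 = competition_rhs(tuple(v + 0.5 * dt * k for v, k in zip(state, s1)), params)
        s3 = competition_rhs(tuple(v + 0.5 * dt * k for v, k in zip(state, s2)), params)
        s4 = competition_rhs(tuple(v + dt * k for v, k in zip(state, s3)), params)
        state = tuple(
            v + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for v, a, b, c, d in zip(state, s1, s2, s3, s4)
        )
        path.append(state)
    return path

def sum_squared_errors(params, observed, initial, dt):
    """Objective to minimize: squared distance between model path and observations."""
    fitted = rk4_path(initial, params, dt, len(observed) - 1)
    return sum(
        (fx - ox) ** 2 + (fy - oy) ** 2
        for (fx, fy), (ox, oy) in zip(fitted, observed)
    )

# Synthetic "observations" generated from known (made-up) parameters.
true_params = (0.8, 0.6, 100.0, 80.0, 0.5, 0.7)
observed = rk4_path((5.0, 5.0), true_params, 0.1, 50)
sse_at_truth = sum_squared_errors(true_params, observed, (5.0, 5.0), 0.1)
sse_perturbed = sum_squared_errors((0.9, 0.6, 100.0, 80.0, 0.5, 0.7), observed, (5.0, 5.0), 0.1)
```

Passing `sum_squared_errors` to a general-purpose numerical optimizer would then give the least-squares parameter estimates.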


Outlier-Resistant Models For Doubly Stochastic Point Processes, Leo Stephan Elsaesser May 2019

Outlier-Resistant Models For Doubly Stochastic Point Processes, Leo Stephan Elsaesser

Theses and Dissertations

This thesis proposes an outlier-resistant multiplicative component model for doubly stochastic point processes. The model is based on a principal component decomposition of the log-intensity functions, using heavy-tailed t-distributions for the component scores. As an example of application, the temporal distribution of bike check-out times in the Divvy bike sharing system of Chicago is analyzed using the t-model.


A Statistical Model For The Influence Of Temperature On Bike Demand In Bike-Sharing Systems, Tobias Tietze May 2019

A Statistical Model For The Influence Of Temperature On Bike Demand In Bike-Sharing Systems, Tobias Tietze

Theses and Dissertations

Efficient fleet management is essential for bike-sharing systems. Thus, it is important to understand the impact of environmental factors on bike demand. This thesis proposes a method to analyze the influence of temperature on bike demand. Hourly temperature data are approximated by smoothed curves and modeled by functional principal components. Bike check-out times, which can be seen as realizations of a doubly stochastic process, are modeled using multiplicative component models on the underlying intensity functions. The respective component scores are then related via a multivariate regression model. An analysis of data from the Divvy system of the City of Chicago …


Identifying And Incorporating Driver Behavior Variables Into Crash Prediction Models, Mohammad Razaur Rahman Shaon May 2019

Identifying And Incorporating Driver Behavior Variables Into Crash Prediction Models, Mohammad Razaur Rahman Shaon

Theses and Dissertations

All travelers are exposed to the risk for crashes on the road, as none of the roadways are entirely safe. Under Vision Zero, improving traffic safety on our nation’s highways is and will continue to be one of the most pivotal tasks on the national transportation agenda. For decades, researchers and transportation professionals have strived to identify causal relationships between crash occurrence and roadway geometry, and traffic-related variables on the mission of creating a safe environment for the traveling public. Although great achievements have been witnessed such as the publication of the Highway Safety Manual (HSM), research is rather limited …


The Strong Law Of Large Numbers For U-Statistics Under Random Censorship, Jan Höft Dec 2018

The Strong Law Of Large Numbers For U-Statistics Under Random Censorship, Jan Höft

Theses and Dissertations

We introduce a semi-parametric U-statistics estimator for randomly right censored data. We will study the strong law of large numbers for this estimator under proper assumptions about the conditional expectation of the censoring indicator with respect to the observed lifetimes. Moreover, we will conduct simulation studies, where the semi-parametric estimator is compared to a U-statistic based on the Kaplan-Meier product limit estimator in terms of bias, variance and mean squared error, under different censoring models.
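For readers unfamiliar with the comparator, here is a minimal sketch of the classical Kaplan-Meier product limit estimator for right censored data. It does not implement the semi-parametric U-statistics estimator introduced in the thesis; the function name and the toy data are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at each distinct observed event time.

    times:  observed times (event or censoring, whichever came first)
    events: 1 if the event was observed, 0 if the observation was censored
    Returns a list of (time, estimated survival probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve

# Toy sample: the observation at time 2 is censored.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored observations do not create a step in the curve, but they shrink the risk set used for later steps.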


Network Analysis Of Scientific Collaboration And Co-Authorship Of The Trifecta Of Malaria, Tuberculosis And Hiv/Aids In Benin., Gbedegnon Roseric Azondekon Aug 2018

Network Analysis Of Scientific Collaboration And Co-Authorship Of The Trifecta Of Malaria, Tuberculosis And Hiv/Aids In Benin., Gbedegnon Roseric Azondekon

Theses and Dissertations

Despite the international mobilization and increase in research funding, Malaria, Tuberculosis and HIV/AIDS are three infectious diseases that have claimed more lives in sub-Saharan Africa than in any other place in the world. Consortia, research networks and research centers both in Africa and around the world team up in a multidisciplinary and transdisciplinary approach to boost efforts to curb these diseases. Despite the progress in research, very little is known about the dynamics of research collaboration in the fight against these infectious diseases in Africa, resulting in a lack of information on the relationship between African research collaborators. This dissertation …


Robust Estimation Of Parametric Models For Insurance Loss Data, Chudamani Poudyal Jun 2018

Robust Estimation Of Parametric Models For Insurance Loss Data, Chudamani Poudyal

Theses and Dissertations

Parametric statistical models for insurance claims severity are continuous, right-skewed, and frequently heavy-tailed. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. Moreover, due to commonly used actuarial “loss control strategies,” the random variables we observe and wish to model are affected by truncation (due to deductibles), censoring (due to policy limits), scaling (due to coinsurance proportions) and other transformations. In the current practice, statistical inference for loss models is almost exclusively likelihood (MLE) based, which typically results in non-robust parameter estimators, pricing models, and risk measures. …


Calibration Of A Stochastic Price Model For American Electricity Markets, Oliver G. Meister May 2018

Calibration Of A Stochastic Price Model For American Electricity Markets, Oliver G. Meister

Theses and Dissertations

This thesis discusses models for electricity spot prices from the Midwestern American and Manitoba markets. The models are based on experiences in European markets and rely on a superposition model with several jump components. The methodology of Bayesian inference solved with a Markov chain Monte Carlo algorithm has been applied to find estimators for the processes of the model. The specific Markov chain Monte Carlo algorithm applied is a Random Walk Metropolis combined with a Gibbs sampler. The different estimators of the models are evaluated with the posterior predictive value and simulations of the electricity spot prices.

We have modified this …
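The Random Walk Metropolis step named in this abstract can be illustrated on a toy target. The sketch below samples from a one-dimensional standard normal density; it is not the superposition price model of the thesis, it omits the Gibbs component, and the function name, step size and seed are assumptions for illustration.

```python
import math
import random

def random_walk_metropolis(log_target, start, step, n_samples, seed=0):
    """Draw samples with a Gaussian random-walk Metropolis sampler.

    log_target: function returning the log of the (unnormalized) target density
    step:       standard deviation of the Gaussian proposal increment
    """
    rng = random.Random(seed)
    current = start
    current_log_p = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_log_p = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_log_p - current_log_p:
            current, current_log_p = proposal, proposal_log_p
        samples.append(current)
    return samples

# Toy target: standard normal, log density up to an additive constant.
draws = random_walk_metropolis(lambda x: -0.5 * x * x, 0.0, 2.0, 20000, seed=1)
draws_mean = sum(draws) / len(draws)
draws_var = sum((d - draws_mean) ** 2 for d in draws) / len(draws)
```

In a Metropolis-within-Gibbs scheme, an update of this kind would typically be applied to parameters whose full conditional distribution cannot be sampled directly.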


Fitting A Complex Markov Chain Model For Firm And Market Productivity, Julia Ruth Valder May 2018

Fitting A Complex Markov Chain Model For Firm And Market Productivity, Julia Ruth Valder

Theses and Dissertations

This thesis develops a methodology of estimating parameters for a complex Markov chain model for firm productivity. The model consists of two Markov chains, one describing firm-level productivity and the other modeling the productivity of the whole market. If applicable, the model can be used to help with optimal decision making problems for labor demand. The need for such a model is motivated and the economical background of this research is shown. A brief introduction to the concept of Markov chains and their application in this context is given. The simulated data that is being used for the estimation is …


Social Network Analysis On Wisconsin Archival Facebook Community, Jennifer Stevenson Aug 2017

Social Network Analysis On Wisconsin Archival Facebook Community, Jennifer Stevenson

Theses and Dissertations

The purpose of this study was to understand how Wisconsin archives are using Facebook (Wisconsin archives Facebook community, WAFC). Few archive studies use quantitative measurements to draw conclusions from social media application use. Quantitative data is needed in order to identify the various ways that social media is being used in an archive. Without the data behind the assumptions, it is impossible to improve service and outreach to the archive users. This study proposed a mixed methods approach to aid in the process, using social network analysis, inferential statistics and thematic analysis. This study measured the effects of implementation of …


Robust Latent Ability Estimation Based On Item Response Information And Model Fit, Hotaka Maeda Aug 2017

Robust Latent Ability Estimation Based On Item Response Information And Model Fit, Hotaka Maeda

Theses and Dissertations

Aberrant testing behaviors may result in inaccurate person trait estimation. To counter its effects, a new robust ability estimation procedure called downweighting of aberrant responses estimation (DARE) is developed. This procedure downweights both uninformative items and model-misfitting response patterns. The purpose of this study is to present DARE and to evaluate its performance against other robust methods, including biweight (Mislevy & Bock, 1982) and biweight-MAP (BMAP; Maeda & Zhang, 2017b). The traditional maximum likelihood (MLE) and maximum a-posteriori (MAP) methods are also included as baseline methods. A Monte Carlo simulation is conducted with the design variables being test length, type …


Optimal Warranty Period For Free-Replacement Policy Of Agm Batteries, Jennifer Paola Garantiva Poveda Aug 2017

Optimal Warranty Period For Free-Replacement Policy Of Agm Batteries, Jennifer Paola Garantiva Poveda

Theses and Dissertations

The objective of this study is to analyze the suitability of an age-based warranty model and a mileage-based warranty model for absorbent glass mat (AGM) batteries for the automobile industry. The battery life expectancy can be assessed and described by a combination of different terms such as state of health (SOH), depth of discharge (DOD), state of energy (SOE) and state of charge (SOC). However, using actual data from the field, reliability engineering and statistical modeling, we aim to calculate optimal limits for warranty policies that minimize warranty costs. The outcomes of this research will enable …


Infinite-Dimensional Traits: Estimation Of Mean, Covariance, And Selection Gradient Of Tribolium Castaneum Growth Curves, Ly Viet Hoang May 2017

Infinite-Dimensional Traits: Estimation Of Mean, Covariance, And Selection Gradient Of Tribolium Castaneum Growth Curves, Ly Viet Hoang

Theses and Dissertations

In evolutionary biology, traits like growth curves, reaction norms or morphological shapes cannot be described by a finite vector of components alone. Instead, continuous functions represent a more useful structure. Such traits are called function-valued or infinite-dimensional traits. Kirkpatrick and Heckman outlined the first quantitative genetic model for these traits. Beder and Gomulkiewicz extended the theory on the selection gradient and the evolutionary response from finite- to infinite-dimensional traits.

Rigorous methods for the estimation of these quantities were developed throughout the years. In his dissertation, Baur defines estimators for the mean and covariance function, as well as for the selection …


Robust And Computationally Efficient Methods For Fitting Loss Models And Pricing Insurance Risks, Qian Zhao May 2017

Robust And Computationally Efficient Methods For Fitting Loss Models And Pricing Insurance Risks, Qian Zhao

Theses and Dissertations

Continuous parametric distributions are useful tools for modeling and pricing insurance risks, measuring income inequality in economics, investigating reliability of engineering systems, and in many other areas of application. In this dissertation, we propose and develop a new method for estimation of their parameters—the method of Winsorized moments (MWM)—which is conceptually similar to the method of trimmed moments (MTM) and thus is robust and computationally efficient. Both approaches yield explicit formulas of parameter estimators for location-scale and log-location-scale families, which are commonly used to model claim severity. Large-sample properties of the new estimators are provided and corroborated through simulations. Their …


Associated Hypothesis In Linear Models With Unbalanced Data, Rica Katharina Wedowski May 2017

Associated Hypothesis In Linear Models With Unbalanced Data, Rica Katharina Wedowski

Theses and Dissertations

In a two-way linear model one can test six different hypotheses regarding the effects in this model. Those hypotheses can be ranked from less specific to more specific. Therefore, the more specific hypotheses are nested in the less specific ones. To test those nested hypotheses, sequential sums of squares are used. Searle sees a problem with these since they test an associated hypothesis that has the same sums of squares but involves the sample sizes. Hypotheses should be generic and not dependent on the data. The proof he uses in his book Linear Models for Unbalanced Data is not easy …


Ethnic Party Bans And Civil Unrest: A Measurement Modeling Approach To Predicting Effects Of Constitutional Engineering, Kelly Gleason May 2017

Ethnic Party Bans And Civil Unrest: A Measurement Modeling Approach To Predicting Effects Of Constitutional Engineering, Kelly Gleason

Theses and Dissertations

Political representation through exclusively ethnic parties has long been thought to create, or enforce, social cleavages leading to conflict. To gain support and mobilize ethnic constituents, ethnic party leadership has incentive to exaggerate differences between, or even antagonize, members of other ethnic groups through the process of ethnic outbidding. Classic political theory cautions that the exclusive nature of ethnic parties can also produce a dangerous zero sum game between ethnic groups that cannot be solved by compromise via democratic institutions. Several institutional solutions have been proposed to counter the problem of instability ethnic divisions create for new democracies, encountering varying …


Black-Scholes Model: An Analysis Of The Influence Of Volatility, Cornelia Krome May 2017

Black-Scholes Model: An Analysis Of The Influence Of Volatility, Cornelia Krome

Theses and Dissertations

In this thesis the influence of volatility in the Black-Scholes model is analyzed. The deduced Black-Scholes formula estimates the price of European options. Contrary to the other parameters of the formula, the future volatility of the underlying asset cannot be observed in the market. The parameter needs to be assumed in order to calculate the option price. An inaccurate assumption may lead to an erroneous volatility. It is studied how a falsely assumed volatility affects the option price. Empirical simulations will be carried out to get an impression of possible errors in the computations. Afterwards, those results will be …
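To make the role of volatility concrete, the sketch below evaluates the standard Black-Scholes formula for a European call on a non-dividend-paying asset and compares prices under a "true" and a wrongly assumed volatility. The numerical inputs and the helper name `mispricing` are assumptions for illustration and are not taken from the thesis.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price(spot, strike, rate, sigma, maturity):
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def mispricing(true_sigma, assumed_sigma, spot=100.0, strike=100.0, rate=0.05, maturity=1.0):
    """Price error caused by plugging in an assumed instead of the true volatility."""
    return (bs_call_price(spot, strike, rate, assumed_sigma, maturity)
            - bs_call_price(spot, strike, rate, true_sigma, maturity))

# Hypothetical inputs: at-the-money call, one year to maturity, 5% rate.
reference_price = bs_call_price(100.0, 100.0, 0.05, 0.20, 1.0)
error_if_overestimated = mispricing(true_sigma=0.20, assumed_sigma=0.30)
error_if_underestimated = mispricing(true_sigma=0.20, assumed_sigma=0.10)
```

Because the call price is increasing in volatility, overestimating the volatility overprices the option and underestimating it underprices the option.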


Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour Dec 2016

Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour

Theses and Dissertations

Survivability of cancer strongly depends on the stage of cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, together on all stages and separately for every stage. We named these models joint and stage-specific models, respectively. The obtained results for the cancers which we investigated reveal that the best model to predict the survivability of the cancer for one specific stage is the model which is specifically built for that …


Estimating The Selection Gradient Of A Function-Valued Trait, Tyler John Baur Dec 2016

Estimating The Selection Gradient Of A Function-Valued Trait, Tyler John Baur

Theses and Dissertations

Kirkpatrick and Heckman initiated the study of function-valued traits in 1989. How to estimate the selection gradient of a function-valued trait is a major question asked by evolutionary biologists. In this dissertation, we give an explicit expansion of the selection gradient and construct estimators based on two different samples: one consisting of independent organisms (the independent case), and the other consisting of independent families of equally related organisms (the dependent case).

In the independent case we first construct and prove the joint consistency of sieve estimators of the mean and covariance functions of a Gaussian process, based on previous developments …