Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Full-Text Articles in Statistics and Probability

Modeling The Probability Of A Successful Stolen Base Attempt In Major League Baseball, Cade Stanley Apr 2023

Senior Theses

In Major League Baseball (MLB), the outcome of a stolen base attempt has important implications. Success moves the runner closer to scoring, while failure records an out and removes the runner from the basepaths altogether. Therefore, it is important that the decision by a coach or player to steal a base is well-informed. In this thesis, I explore a statistical approach to making this decision. I train logistic regression and random forest models, using data about the game situation and about the runner, pitcher, and catcher involved in the stolen base attempt, to estimate the probability that a stolen base …


The Classification Of Basket Neural Cells In The Mammalian Neocortex, Sreya Pudi Oct 2021

Senior Theses

Basket neuronal cells of the mammalian neocortex have classically been categorized into two or more groups. Originally, the large and small types were thought to be naturally occurring groups that arise for reasons related to neurobiological function and anatomical position. Later, a study based on anatomical and physiological features of these neurons introduced a third type, the net basket cell, which is intermediate in size between the large and small types. In this study, multivariate analysis was used to test the hypothesis that the large and small types are morphologically distinct groups. The results of …


Does Defense Actually Win Championships? Using Statistics To Examine One Of The Greatest Stereotypes In Sports, Thomas Burkett Apr 2021

Senior Theses

A common saying in sports is that “defense wins championships.” However, the past decade of play in the modern NBA has seen a growing focus on offensive efficiency and 3-pointers. This thesis tests whether defense can truly predict a championship-winning team in today’s NBA through two-sample hypothesis testing and multiple logistic regression models. The results show that both defensive and offensive statistics were significant predictors of championship teams, meaning that team balance, rather than specialization in defense alone, is the more accurate predictor of championship success.


Boom Or Bust: Examining The Relationship Between High School Recruiting Rankings And The NFL Draft, Nicholas E. Tice Apr 2020

Senior Theses

The goal of this thesis is to model the probability that a high school football player is drafted, based on information taken from their recruiting profile. The response variable is binary and defined as drafted (1) or undrafted (0). The independent variables, collected by scraping the recruiting websites, include height, weight, position, hometown, and recruiting grade, as well as socioeconomic factors based on the player’s high school. 247Sports and ESPN were the two recruiting services used and compared in this study. Because of the binary nature of the dependent variable, logistic regression and decision trees were chosen …
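As a rough illustration of the kind of model this abstract describes, the sketch below fits a one-predictor logistic regression to a binary drafted (1) / undrafted (0) response. It is not the author's model: the data are made up for illustration, the single predictor (a standardized recruiting grade) is an assumption, and the names `fit_logistic`, `grades`, and `drafted` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood. Returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Made-up toy data (NOT from the thesis): standardized recruiting grade
# and whether the player was drafted (1) or undrafted (0).
grades = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
drafted = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(grades, drafted)
p_low = sigmoid(b0 + b1 * -1.5)   # estimated draft probability, low grade
p_high = sigmoid(b0 + b1 * 2.0)   # estimated draft probability, high grade
print(f"b0={b0:.3f}, b1={b1:.3f}, p_low={p_low:.3f}, p_high={p_high:.3f}")
```

In practice a library routine with several predictors would replace this hand-rolled fit. The point here is only that a binary response leads to a model of a probability rather than of the 0/1 outcome itself.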


Statistical Design Of Experiment Techniques In Manufacturing, Caroline M. Kerfonta Oct 2018

Senior Theses

Many statistical techniques are used to design experiments, and they are applied in many different fields. This thesis focuses on the three techniques most commonly used to design statistical experiments in manufacturing.

The three techniques investigated are completely randomized design, randomized block design, and factorial design. These techniques will be compared, contrasted, and explained. Research examples will be presented along with sample R code for each technique. These examples will be accompanied by analysis of the techniques as well as an overview of the uses and history of experiments in manufacturing.


Bayesian Semi- And Non-Parametric Analysis For Spatially Correlated Survival Data, Haiming Zhou Jan 2015

Theses and Dissertations

Flexible incorporation of both geographical patterning and risk effects in cancer survival models is becoming increasingly important, due in part to the recent availability of large cancer registries. The analysis of spatial survival data is challenged by the presence of spatial dependence and censoring of survival times. Accurately modeling the risk factors and geographical pattern that explain the differences in survival is of particular interest. Within this dissertation, the first chapter reviews commonly used baseline priors, semiparametric and nonparametric Bayesian survival models, and recent approaches for accommodating spatial dependence, both conditional and marginal. The last three chapters contribute three flexible survival …


Semiparametric Regression Analysis Of Bivariate Interval-Censored Data, Naichen Wang Dec 2014

Theses and Dissertations

Survival analysis is a long-standing and popular research area with numerous applications in fields such as social science, engineering, economics, industry, and public health. Interval-censored data are a special type of survival data, in which the survival time of interest is never exactly observed but is known to fall within some observed interval. Interval-censored data arise commonly in real-life studies, in which subjects are examined at periodic or irregular follow-up visits. In this dissertation, we develop efficient statistical approaches for regression analysis of bivariate interval-censored data, in which the two survival times of interest are correlated and both …


Ranking World Class Chess Players Using Only Results From Head-To-Head Games, Sterling Swygert May 2014

Senior Theses

This honors thesis explores a method of ranking the world’s top ten chess grandmasters using only the outcomes of games containing only players in that very set. This method allows for players in a single era to be quickly ranked via algorithmic and numerical means, including very specific information, from a statistical standpoint. Furthermore, unlike the rating systems that are commonly used, the Elo and the Glicko systems, this method is Classicist in its statistical approach, rather than Bayesian. Finally, this ranking method also differs from others as it limits the information to games between the individuals …


Methods For Clustering Mixed Data, Jeanmarie L. Hendrickson Jan 2014

Theses and Dissertations

We give a brief introduction to cluster analysis and then propose and discuss a few methods for clustering mixed data. In particular, a model-based clustering method for mixed data based on Everitt's (1988) work is described, and we use a simulated annealing method to estimate the parameters for Everitt's model. A penalized log likelihood with the simulated annealing method is proposed as a remedy for the parameter estimates being drawn to extremes. Everitt's approach and the proposed method are compared based on their performance in clustering simulated data. We then use the penalized log likelihood method on a heart disease …


The Complete Plus-Minus: A Case Study Of The Columbus Blue Jackets, Nathan Spagnola Jan 2013

Theses and Dissertations

A new hockey statistic termed the Complete Plus-Minus (CPM) was created to calculate the abilities of hockey players in the National Hockey League (NHL). This new statistic was used to analyze the Columbus Blue Jackets for the 2011-2012 season. The CPM for the Blue Jackets was created using two logistic regressions that modeled a goal being scored for and against the Blue Jackets. The responses were whether a goal was scored for or against the team, while events on the ice were the predictors in the model. It was found that the team's poor performance was due to a weak …


Dynamic Modeling And Statistical Analysis Of Event Times, Edsel A. Pena Nov 2006

Faculty Publications

This review article provides an overview of recent work in the modeling and analysis of recurrent events arising in engineering, reliability, public health, biomedicine and other areas. Recurrent event modeling possesses unique facets making it different and more difficult to handle than single event settings. For instance, the impact of an increasing number of event occurrences needs to be taken into account, the effects of covariates should be considered, potential association among the interevent times within a unit cannot be ignored, and the effects of performed interventions after each event occurrence need to be factored in. A recent general class …


Large Deviations For Processes With Independent Increments, James Lynch, Jayaram Sethuraman Jan 1987

Faculty Publications

Let $X$ be a topological space and let $\mathcal{F}$ denote the Borel σ-field in $X$. A family of probability measures $\{P_\lambda\}$ is said to obey the large deviation principle (LDP) with rate function $I(\cdot)$ if $P_\lambda(A)$ can be suitably approximated by $\exp\{-\lambda \inf_{x \in A} I(x)\}$ for appropriate sets $A$ in $\mathcal{F}$. Here the LDP is studied for probability measures induced by stochastic processes with stationary and independent increments which have no Gaussian component. It is assumed that the moment generating function of the increments exists and thus the sample paths of such stochastic processes lie in the space of functions of bounded variation. The …
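For readers unfamiliar with the term, one common textbook way to make "suitably approximated" precise is the pair of bounds below, stated for closed sets $C$ and open sets $G$ in $X$. This is the standard formulation and an assumption here; the article itself may use a variant.

```latex
\[
\limsup_{\lambda \to \infty} \frac{1}{\lambda} \log P_\lambda(C)
  \le -\inf_{x \in C} I(x)
\quad \text{for every closed } C \subseteq X,
\]
\[
\liminf_{\lambda \to \infty} \frac{1}{\lambda} \log P_\lambda(G)
  \ge -\inf_{x \in G} I(x)
\quad \text{for every open } G \subseteq X.
\]
```

When the two infima agree for a set $A$, these bounds say that $P_\lambda(A)$ decays like $\exp\{-\lambda \inf_{x \in A} I(x)\}$ on the logarithmic scale.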


Some Comments On The Erdos-Renyi Law And A Theorem Of Shepp, James Lynch Jan 1982

Faculty Publications

We show that the finiteness of the moment generating function is necessary for the finiteness of the lim sup of the moving averages considered by Shepp (1964). This also implies that the same must be true for the Erdos-Renyi law of large numbers.