Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 23 of 23

Full-Text Articles in Statistics and Probability

A Comparison Of Confidence Intervals In State Space Models, Jinyu Du Jul 2023

Statistical Science Theses and Dissertations

This thesis develops general procedures for constructing confidence intervals (CIs) for the error disturbance parameters (standard deviations), and for transformations of those parameters, in time-invariant state space models (SSMs). Given only a set of observations, accurately estimating individual error disturbance parameters in the presence of other unknown parameters in an SSM is a very challenging problem. We attempted to construct four different types of confidence intervals (Wald, likelihood ratio, score, and higher-order asymptotic intervals) for both the simple local level model and general time-invariant SSMs. We show that for a simple local level model, both the …


High-Dimensional Variable Selection Via Knockoffs Using Gradient Boosting, Amr Essam Mohamed Apr 2023

Dissertations

As data continue to grow rapidly in size and complexity, efficient and effective statistical methods are needed to detect the important variables/features. Variable selection is one of the most crucial problems in statistical applications. This problem arises when one wants to model the relationship between the response and the predictors. The goal is to reduce the number of variables to a minimal set of explanatory variables that are truly associated with the response of interest to improve the model accuracy. Effectively choosing the true influential variables and controlling the False Discovery Rate (FDR) without sacrificing power has been a challenge …


Bayesian Estimation Of The Intensity Function Of A Non-Homogeneous Poisson Process, James Jensen Oct 2022

Theses

In this paper we explore Bayesian inference and its application to the problem of estimating the intensity function of a non-homogeneous Poisson process. These processes model the behavior of phenomena in which one or more events, known as arrivals, occur independently of one another over a certain period of time. We are concerned with the number of events occurring during particular time intervals across several realizations of the process. We show that given sufficient data, we are able to construct a piecewise-constant function which accurately estimates the mean rates on particular intervals. Further, we show that as we reduce these …
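The piecewise-constant estimate described above can be illustrated with a minimal sketch. It is not the thesis's method: it assumes a conjugate Gamma prior on each interval's rate, a hypothetical linear intensity 2 + t on [0, 4], and simulation by thinning. All names and parameter values below are illustrative assumptions.

```python
import random

def simulate_nhpp(intensity, t_end, lam_max, rng):
    """Simulate arrival times on [0, t_end] by thinning a homogeneous
    Poisson process of rate lam_max (requires intensity(t) <= lam_max)."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(lam_max)
        if t > t_end:
            return arrivals
        # Keep each candidate arrival with probability intensity(t) / lam_max.
        if rng.random() < intensity(t) / lam_max:
            arrivals.append(t)

def posterior_mean_rates(realizations, t_end, n_bins,
                         prior_shape=1.0, prior_rate=1.0):
    """Posterior mean of the rate on each of n_bins equal intervals,
    assuming independent Gamma(prior_shape, prior_rate) priors."""
    width = t_end / n_bins
    counts = [0] * n_bins
    for arrivals in realizations:
        for t in arrivals:
            counts[min(int(t / width), n_bins - 1)] += 1
    # Total observation time per interval across all realizations.
    exposure = len(realizations) * width
    return [(prior_shape + c) / (prior_rate + exposure) for c in counts]

# Hypothetical intensity used only for illustration.
def intensity(t):
    return 2.0 + t

rng = random.Random(0)
realizations = [simulate_nhpp(intensity, 4.0, 6.0, rng) for _ in range(400)]
rates = posterior_mean_rates(realizations, 4.0, 4)
```

With unit-width intervals the true mean rates of this assumed intensity are 2.5, 3.5, 4.5, and 5.5, so the entries of `rates` can be compared against those values as the number of realizations grows.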


On Misuses Of The Kolmogorov–Smirnov Test For One-Sample Goodness-Of-Fit, Anthony Zeimbekakis Apr 2022

Honors Scholar Theses

The Kolmogorov–Smirnov (KS) test is one of the most popular goodness-of-fit tests for comparing a sample with a hypothesized parametric distribution. Nevertheless, it has often been misused. The standard one-sample KS test applies to independent, continuous data with a hypothesized distribution that is completely specified. It is not uncommon, however, to see in the literature that it was applied to dependent, discrete, or rounded data, with hypothesized distributions containing estimated parameters. For example, it has been "discovered" multiple times that the test is too conservative when the parameters are estimated. We demonstrate misuses of the one-sample KS test in three …
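The conservativeness mentioned above can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not the thesis's code: it draws normal samples, applies the one-sample KS test once with the true parameters and once with parameters estimated from the same sample, and uses Stephens' finite-sample approximation (critical value 1.358 at the 5% level) in place of exact critical values. Sample size, replication count, and function names are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def ks_statistic(sample, cdf):
    """One-sample KS statistic D = sup |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def rejects(sample, cdf, crit=1.358):
    """Approximate 5%-level test for a fully specified distribution,
    using the modified statistic D * (sqrt(n) + 0.12 + 0.11 / sqrt(n))."""
    root_n = math.sqrt(len(sample))
    return ks_statistic(sample, cdf) * (root_n + 0.12 + 0.11 / root_n) > crit

def rejection_rates(n=50, reps=2000, seed=1):
    rng = random.Random(seed)
    known = estimated = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correct use: the hypothesized distribution is completely specified.
        known += rejects(sample, NormalDist(0.0, 1.0).cdf)
        # Misuse: parameters are estimated from the very sample being tested.
        fitted = NormalDist(fmean(sample), stdev(sample))
        estimated += rejects(sample, fitted.cdf)
    return known / reps, estimated / reps

rate_known, rate_estimated = rejection_rates()
```

Under the null hypothesis, `rate_known` should sit near the nominal 5% level while `rate_estimated` falls far below it, which is the sense in which the unadjusted test is too conservative.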


Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman Nov 2020

Access*: Interdisciplinary Journal of Student Research and Scholarship

The history of wagering predictions and their impact on wide-ranging disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …


Sensitivity Analysis For Incomplete Data And Causal Inference, Heng Chen May 2020

Statistical Science Theses and Dissertations

In this dissertation, we explore sensitivity analyses under three different types of incomplete data problems: missing outcomes, missing outcomes together with missing predictors, and potential outcomes in the Rubin causal model (RCM). The first sensitivity analysis is conducted for the missing completely at random (MCAR) assumption in frequentist inference; the second is conducted for the missing at random (MAR) assumption in likelihood inference; the third is conducted for a novel assumption, the "sixth assumption", proposed for the robustness of the instrumental variable estimand in causal inference.


Using Stability To Select A Shrinkage Method, Dean Dustin May 2020

Department of Statistics: Dissertations, Theses, and Student Work

Shrinkage methods are estimation techniques based on optimizing expressions to find which variables to include in an analysis, typically a linear regression. The general form of these expressions is the sum of an empirical risk plus a complexity penalty based on the number of parameters. Many shrinkage methods are known to satisfy an ‘oracle’ property meaning that asymptotically they select the correct variables and estimate their coefficients efficiently. In Section 1.2, we show oracle properties in two general settings. The first uses a log likelihood in place of the empirical risk and allows a general class of penalties. The second …


The Importance Of Type I Error Rates When Studying Bias In Monte Carlo Studies In Statistics, Michael Harwell Feb 2020

Journal of Modern Applied Statistical Methods

Two common outcomes of Monte Carlo studies in statistics are bias and Type I error rate. Several versions of bias statistics exist but all employ arbitrary cutoffs for deciding when bias is ignorable or non-ignorable. This article argues Type I error rates should be used when assessing bias.


Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane Jan 2019

Statistical Science Theses and Dissertations

If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?

We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …
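As a minimal illustration of intransitivity in paired comparison data, the sketch below counts circular triads (three items that beat each other in a cycle) in a complete round robin without ties. It compares a brute-force count with the standard identity C(n,3) minus the sum of C(w_i,2) over the win totals w_i. The three-team results used here are hypothetical, and the code is not taken from the dissertation.

```python
from itertools import combinations
from math import comb

def circular_triads_brute(beats):
    """Count cyclic triples directly. beats[i][j] is True if i beat j;
    every pair is assumed to have met exactly once with no ties."""
    count = 0
    for i, j, k in combinations(range(len(beats)), 3):
        forward = beats[i][j] and beats[j][k] and beats[k][i]
        backward = beats[j][i] and beats[k][j] and beats[i][k]
        if forward or backward:
            count += 1
    return count

def circular_triads_formula(beats):
    """Same count from win totals: every non-cyclic triple has exactly
    one item that beat the other two, so subtract those triples."""
    n = len(beats)
    wins = [sum(row) for row in beats]
    return comb(n, 3) - sum(comb(w, 2) for w in wins)

# Hypothetical results: Warriors beat Rockets, Rockets beat Spurs,
# and (the intransitive twist) Spurs beat Warriors.
cycle = [
    [False, True, False],   # Warriors
    [False, False, True],   # Rockets
    [True, False, False],   # Spurs
]
# Fully transitive alternative: Warriors also beat Spurs.
chain = [
    [False, True, True],
    [False, False, True],
    [False, False, False],
]
cycle_count = circular_triads_formula(cycle)
chain_count = circular_triads_formula(chain)
```

A count of zero means the results are consistent with a single ranking; any positive count signals at least one rock-scissors-paper cycle.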


Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John Aug 2018

SMU Data Science Review

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age, …


On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar Mar 2018

FIU Electronic Theses and Dissertations

Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …


Bayesian Model For Detection Of Outliers In Linear Regression With Application To Longitudinal Data, Zahraa Al-Sharea Dec 2017

Graduate Theses and Dissertations

Outlier detection is one of the most important challenges with many present-day applications. Outliers can occur due to uncertainty in data generating mechanisms or due to an error in data recording/processing. Outliers can drastically change the study's results and make predictions less reliable. Detecting outliers in longitudinal studies is quite challenging because this kind of study works with observations that change over time. Therefore, the same subject can produce an outlier at one point in time but regular observations at all other time points. A Bayesian hierarchical model assigns parameters that can quantify whether each observation is an outlier …


Gilmore Girls And Instagram: A Statistical Look At The Popularity Of The Television Show Through The Lens Of An Instagram Page, Brittany Simmons May 2017

Student Scholar Symposium Abstracts and Posters

After going on the Warner Brothers Tour in December of 2015, I created a Gilmore Girls Instagram account. This account, which started off as a way for me to create edits of the show and post my photos from the tour, turned into something bigger than I ever could have imagined. In just over a year I have gained over 55,000 followers. I post content including revival news, merchandise, and edits of the show that have been featured in Entertainment Weekly, Bustle, E! News, People Magazine, Yahoo News, & GilmoreNews.

I created a dataset of qualitative and quantitative outcomes from my …


P-Values Versus Significance Levels, Phillip I. Good May 2013

Journal of Modern Applied Statistical Methods

In this article Phillip Good responds to Richard Anderson's article Conceptual Distinction between the Critical p Value and the Type I Error Rate in Permutation Testing.


Conceptual Distinction Between The Critical P Value And The Type I Error Rate In Permutation Testing: Author Response To Peer Comments, Richard B. Anderson May 2013

Journal of Modern Applied Statistical Methods

Richard Anderson responds to comments regarding his target article Conceptual Distinction between the Critical p Value and the Type I Error Rate in Permutation Testing.


A Response To Anderson's (2013) Conceptual Distinction Between The Critical P Value And Type I Error Rate In Permutation Testing, Fortunato Pesarin, Stefano Bonnini May 2013

Journal of Modern Applied Statistical Methods

Pesarin and Bonnini respond to Anderson's (2013) Conceptual Distinction between the Critical p value and Type I Error Rate in Permutation Testing.


Conceptual Distinction Between The Critical P Value And The Type I Error Rate In Permutation Testing, Richard B. Anderson May 2013

Journal of Modern Applied Statistical Methods

To counter past assertions that permutation testing is not distribution-free, this article clarifies that the critical p value (alpha) in permutation testing is not a Type I error rate and that a test's validity is independent of the concept of Type I error.


A Method For Generating Realistic Correlation Matrices, Johanna S. Hardin, Stephan Ramon Garcia, David Golan Jan 2013

Pomona Faculty Publications and Research

Simulating sample correlation matrices is important in many areas of statistics. Approaches such as generating Gaussian data and finding their sample correlation matrix or generating random uniform $[-1,1]$ deviates as pairwise correlations both have drawbacks. We develop an algorithm for adding noise, in a highly controlled manner, to general correlation matrices. In many instances, our method yields results which are superior to those obtained by simply simulating Gaussian data. Moreover, we demonstrate how our general algorithm can be tailored to a number of different correlation models. Using our results with a few different applications, we show that simulating correlation matrices …
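One commonly cited drawback of drawing pairwise correlations independently from a uniform distribution on $[-1,1]$ is that the resulting matrix is often not positive semidefinite, so it is not a valid correlation matrix. Whether this is the drawback the authors have in mind is an assumption here. The sketch below checks the 3 x 3 case, where validity reduces to a nonnegative determinant, and estimates how often random uniform entries pass; the function names and number of draws are illustrative.

```python
import random

def is_valid_correlation_3x3(a, b, c):
    """True if [[1, a, b], [a, 1, c], [b, c, 1]] is positive semidefinite.
    With |a|, |b|, |c| <= 1 the 2x2 principal minors are already
    nonnegative, so only the determinant needs checking."""
    return 1.0 + 2.0 * a * b * c - a * a - b * b - c * c >= 0.0

def valid_fraction(draws=20000, seed=0):
    """Fraction of matrices with independent uniform[-1, 1] off-diagonal
    entries that turn out to be valid correlation matrices."""
    rng = random.Random(seed)
    valid = 0
    for _ in range(draws):
        a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
        valid += is_valid_correlation_3x3(a, b, c)
    return valid / draws

fraction = valid_fraction()
```

Even in three dimensions a sizable share of draws is rejected (for instance, correlations of 0.9, 0.9, and -0.9 cannot coexist), and the share of valid matrices shrinks as the dimension grows.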


Adaptive Randomization Designs, Jenna Colavincenzo Jun 2012

Statistics

Adaptive design methodologies use prior information to develop a clinical trial design. The goal of an adaptive design is to maintain the integrity and validity of the study while giving the researcher flexibility in identifying the optimal treatment. An example of an adaptive design can be seen in a basic pharmaceutical trial. There are three phases of the overall trial to compare treatments and experimenters use the information from the previous phase to make changes to the subsequent phase before it begins.

Adaptive design methods have been in practice since the 1970s, but have become increasingly complex ever since. One …


Using The R Library Rpanel For GUI-Based Simulations In Introductory Statistics Courses, Ryan M. Allison May 2012

Statistics

As a student, I noticed that the statistical package R (http://www.r-project.org) would offer several benefits in the classroom. One benefit of the package is its free and open-source nature. This would be a great benefit for instructors and students alike, since it costs nothing to use, unlike other statistical packages. Because of this, students could continue using the program after their statistics courses and into their professional careers. It would be good to expose students, while they are in school, to a tool that professionals use in industry. R also has powerful …


A Simulation Study Of The Impact Of Forecast Recovery For Control Charts Applied To ARMA Processes, John N. Dyer, B. Michael Adams, Michael D. Conerly Nov 2002

Journal of Modern Applied Statistical Methods

Forecast-based schemes are often used to monitor autocorrelated processes, but the resulting forecast recovery has a significant effect on the performance of control charts. This article describes forecast recovery for autocorrelated processes, and the resulting simulation study is used to explain the performance of control charts applied to forecast errors.


Measuring Hotel Service Quality: Tools For Gaining The Competitive Edge, Robert C. Ford, Susan A. Bach Jan 1997

Hospitality Review

As the hotel industry grows more competitive, quality guest service becomes an increasingly important part of managers' responsibility. Measuring the quality of service delivery is facilitated when managers know what types of assessment methods are available to them. The authors present and discuss the following available measurement techniques and describe the situations where they best meet the needs of hotel managers: management observation, employee feedback programs, comment cards, mailed surveys, personal and telephone interviews, focus groups, and mystery shopping.


Simulation Of Mathematical Models In Genetic Analysis, Dinesh Govindal Patel May 1964

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

In recent years a new field of statistics has become of importance in many branches of experimental science. This is the Monte Carlo Method, so called because it is based on simulation of stochastic processes. By stochastic process, it is meant some possible physical process in the real world that has some random or stochastic element in its structure. This is the subject which may appropriately be called the dynamic part of statistics or the statistics of "change," in contrast with the static statistical problems which have so far been the more systematically studied. Many obvious examples of such processes …