Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Bootstrap

Articles 1 - 30 of 65

Full-Text Articles in Physical Sciences and Mathematics

Finite Mixtures Of Mean-Parameterized Conway-Maxwell-Poisson Models, Dongying Zhan Jan 2023

Theses and Dissertations--Statistics

For modeling count data, the Conway-Maxwell-Poisson (CMP) distribution is a popular generalization of the Poisson distribution due to its ability to characterize data over- or under-dispersion. While the classic parameterization of the CMP has been well-studied, its main drawback is that it does not directly model the mean of the counts. This is mitigated by using a mean-parameterized version of the CMP distribution. In this work, we are concerned with the setting where count data may be composed of subpopulations, each possibly having varying degrees of data dispersion. Thus, we propose a finite mixture of mean-parameterized CMP distributions. An …
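To make the dispersion claim concrete, here is a minimal Python sketch of the classic (not mean-parameterized) CMP probability mass function, P(X = x) proportional to lam^x / (x!)^nu. The function names, the truncation point `max_terms`, and the example parameter values are assumptions for illustration; this is not the mixture model proposed in the thesis.

```python
import math


def cmp_pmf_table(lam, nu, max_terms=200):
    """Return P(X = 0), ..., P(X = max_terms - 1) for the classic CMP(lam, nu).

    The infinite normalizing sum is truncated at max_terms (an assumption
    that is reasonable only when the tail beyond that point is negligible).
    """
    # Work in log space to avoid overflow in lam**j and (j!)**nu.
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    peak = max(log_terms)
    log_z = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return [math.exp(t - log_z) for t in log_terms]


def mean_and_variance(probs):
    """Mean and variance of a distribution given as a table of probabilities."""
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(cmp_pmf_table(lam=1.5, nu=nu))
        print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

With nu = 1 the table reduces to the Poisson distribution; values of nu below 1 give variance above the mean (over-dispersion) and values above 1 give variance below the mean (under-dispersion).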


A Bootstrap Method For A Multiple-Imputation Variance Estimator In Survey Sampling, Lili Yu, Yichuan Zhao Nov 2022

Department of Biostatistics, Epidemiology, and Environmental Health Sciences Faculty Publications

Rubin’s variance estimator of the multiple imputation estimator for a domain mean is not asymptotically unbiased. Kim et al. derived the closed-form bias for Rubin’s variance estimator. In addition, they proposed an asymptotically unbiased variance estimator for the multiple imputation estimator when the imputed values can be written as a linear function of the observed values. However, this needs the assumption that the covariance of the imputed values in the same imputed dataset is twice that in the different imputed datasets. In this study, we proposed a bootstrap variance estimator that does not need this assumption. Both theoretical argument and …
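For reference, Rubin's combining rule adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/M), where M is the number of imputed datasets. The sketch below computes it in Python; the function name and its two input lists are illustrative assumptions, and it shows only the classical rule, not the bootstrap estimator proposed in the article.

```python
import statistics


def rubin_variance(estimates, within_variances):
    """Pool M point estimates and their variances with Rubin's rule.

    estimates        -- one point estimate per imputed dataset
    within_variances -- the estimated variance of each of those estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(within_variances)
    between = statistics.variance(estimates)  # divisor M - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Made-up numbers, purely to show the call.
    print(rubin_variance([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```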


Non-Inferiority Testing: Kernel Estimation And Overlap Measure, Larie C. Ward Jan 2022

Electronic Theses and Dissertations

In non-inferiority testing, the decision of whether a proposed treatment is non-inferior to a reference treatment depends on model assumptions and choices of acceptable tolerance limits. Here, we consider a method that employs kernels to estimate the probability density functions of both the experimental and reference populations from two independent samples. Based on these densities, we introduce a quantity called the overlap coefficient or overlap measure. A bootstrap technique is helpful in exploring the distribution and variance empirically. We derive the distribution of this measure and define a hypothesis test that can be applied to the non-inferiority setting under some …
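One way to picture the overlap measure is as the area under the pointwise minimum of two kernel density estimates. The Python sketch below assumes Gaussian kernels, a normal-reference rule-of-thumb bandwidth, and trapezoidal integration on a fixed grid; all of these choices and all function names are assumptions for illustration, not the estimator or test defined in the thesis.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function t -> Gaussian kernel density estimate at t."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-0.2)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(math.exp(-0.5 * ((t - x) / bandwidth) ** 2) for x in sample)

    return density


def overlap_coefficient(a, b, grid_points=2000):
    """Approximate the integral of min(f_a, f_b) with the trapezoidal rule."""
    f, g = gaussian_kde(a), gaussian_kde(b)
    spread = max(statistics.stdev(a), statistics.stdev(b))
    lo = min(min(a), min(b)) - 4.0 * spread
    hi = max(max(a), max(b)) + 4.0 * spread
    step = (hi - lo) / grid_points
    total = 0.0
    for i in range(grid_points + 1):
        t = lo + i * step
        weight = 0.5 if i in (0, grid_points) else 1.0
        total += weight * min(f(t), g(t))
    return total * step


if __name__ == "__main__":
    reference = [4.1, 5.0, 5.3, 4.7, 5.9, 5.2, 4.4]
    experimental = [4.6, 5.4, 5.8, 5.1, 6.2, 5.5, 4.9]
    print(round(overlap_coefficient(reference, experimental), 3))
```

Values near 1 indicate nearly identical estimated densities and values near 0 indicate almost no shared mass. Resampling each sample with replacement and recomputing the coefficient would give the kind of empirical distribution the abstract mentions.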


JMASM 52: Extremely Efficient Permutation And Bootstrap Hypothesis Tests Using R, Christina Chatzipantsiou, Marios Dimitriadis, Manos Papadakis, Michail Tsagris Jul 2020

Journal of Modern Applied Statistical Methods

Resampling-based statistical tests are known to be computationally heavy but reliable when only small samples are available. Despite their nice theoretical properties, not much effort has been put into making them efficient. A computationally efficient method for calculating permutation-based p-values for the Pearson correlation coefficient and the two-independent-samples t-test is proposed. The method is general and can be applied to other similar cases involving two sample means or two mean vectors.
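As a baseline for what such a test computes, the following Python sketch estimates a two-sided permutation p-value for the Pearson correlation by plain Monte Carlo shuffling. It is the straightforward, computationally heavy version, not the efficient method the article proposes (which is implemented in R); the function names, the default number of permutations, and the seed are assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for H0: no association."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any pairing between x and y
        if abs(pearson_r(x, y_perm)) >= observed:
            hits += 1
    # Counting the observed arrangement itself keeps the p-value above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = [1.2, 2.4, 3.1, 4.8, 5.0, 6.3, 7.7, 8.1]
    y = [2.0, 2.9, 3.7, 5.5, 4.9, 7.0, 7.2, 9.4]
    print(permutation_pvalue(x, y))
```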


Bayesian Analysis Of Extended Cox Model With Time-Varying Covariates Using Bootstrap Prior, Oyebayo R. Olaniran, Mohd Asrul A. Abdullah Jul 2020

Journal of Modern Applied Statistical Methods

A new Bayesian estimation procedure for the extended Cox model with time-varying covariates was presented. The prior was determined using a bootstrapping technique within the framework of parametric empirical Bayes. The efficiency of the proposed method was assessed using Monte Carlo simulation of the extended Cox model with time-varying covariates under varying scenarios. The validity of the proposed method was also ascertained using the real-life Stanford heart transplant data set. Comparison of the proposed method with its competitor showed that the proposed method performed appreciably better.


Quasi-Likelihood Ratio Tests For Homoscedasticity In Linear Regression, Lili Yu, Varadan Sevilimedu, Robert Vogel, Hani Samawi Apr 2020

Journal of Modern Applied Statistical Methods

Two quasi-likelihood ratio tests are proposed for the homoscedasticity assumption in linear regression models. They require fewer assumptions than the existing tests. The properties of the tests are investigated through simulation studies. An example is provided to illustrate the usefulness of the newly proposed tests.


A Flexible Zero-Inflated Poisson Regression Model, Eric S. Roemmele Jan 2019

Theses and Dissertations--Statistics

A practical problem often encountered with observed count data is the presence of excess zeros. Zero-inflation in count data can easily be handled by zero-inflated models, which are two-component mixtures of a point mass at zero and a discrete distribution for the count data. In the presence of predictors, zero-inflated Poisson (ZIP) regression models are, perhaps, the most commonly used. However, the fully parametric ZIP regression model could sometimes be restrictive, especially with respect to the mixing proportions. Taking inspiration from some of the recent literature on semiparametric mixtures of regressions models for flexible mixture modeling, we propose a …


Evaluation Of Using The Bootstrap Procedure To Estimate The Population Variance, Nghia Trong Nguyen May 2018

Electronic Theses and Dissertations

The bootstrap procedure is widely used in nonparametric statistics to generate an empirical sampling distribution from a given sample data set for a statistic of interest. Generally, the results are good for location parameters such as population mean, median, and even for estimating a population correlation. However, the results for a population variance, which is a spread parameter, are not as good due to the resampling nature of the bootstrap method. Bootstrap samples are constructed using sampling with replacement; consequently, groups of observations with zero variance manifest in these samples. As a result, a bootstrap variance estimator will carry a …
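The resampling step described here can be sketched in a few lines of Python. The sketch draws bootstrap samples with replacement, recomputes the sample variance on each, and compares the average of those replicates with the original sample variance. The function name, the number of replicates, and the example data are assumptions; the thesis's own evaluation design is not reproduced.

```python
import random
import statistics


def bootstrap_variance(data, n_boot=2000, seed=0):
    """Bootstrap the sample variance (divisor n - 1) of `data`.

    Returns (sample_variance, mean_of_replicates, sd_of_replicates).
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=n)  # sampling with replacement
        replicates.append(statistics.variance(resample))
    return (
        statistics.variance(data),
        statistics.fmean(replicates),
        statistics.stdev(replicates),
    )


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 4.0]
    s2, boot_mean, boot_sd = bootstrap_variance(sample)
    print(f"sample variance={s2:.3f}, bootstrap mean={boot_mean:.3f}, sd={boot_sd:.3f}")
```

Because each resample is drawn from the empirical distribution, whose variance is (n - 1)/n times the sample variance, the replicate average tends to fall below the original sample variance, most noticeably for small n.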


A Comparison Of Some Confidence Intervals For Estimating The Kurtosis Parameter, Guensley Jerome Jun 2017

FIU Electronic Theses and Dissertations

Several methods have been proposed to estimate the kurtosis of a distribution. The three common estimators are: g2, G2 and b2. This thesis addressed the performance of these estimators by comparing them under the same simulation environments and conditions. The performance of these estimators is compared through confidence intervals by determining the average width and probabilities of capturing the kurtosis parameter of a distribution. We considered and compared classical and non-parametric methods in constructing these intervals. The classical method assumes normality to construct the confidence intervals while the non-parametric methods rely on bootstrap techniques. The bootstrap …


JMASM44: Implementing Multiple Ratio Imputation By The EMB Algorithm (R), Masayoshi Takahashi May 2017

Journal of Modern Applied Statistical Methods

Although single ratio imputation is often used to deal with missing values in practice, there is a paucity of discussion regarding multiple ratio imputation. Code in the R statistical environment is presented to execute multiple ratio imputation by the Expectation-Maximization with Bootstrapping (EMB) algorithm.


Multiple Ratio Imputation By The EMB Algorithm: Theory And Simulation, Masayoshi Takahashi May 2017

Journal of Modern Applied Statistical Methods

Although multiple imputation is the gold standard of treating missing data, single ratio imputation is often used in practice. Based on Monte Carlo simulation, the Expectation-Maximization with Bootstrapping (EMB) algorithm to create multiple ratio imputation is used to fill in the gap between theory and practice.


On Some Test Statistics For Testing The Population Skewness And Kurtosis: An Empirical Study, Yawen Guo Aug 2016

FIU Electronic Theses and Dissertations

The purpose of this thesis is to propose some test statistics for testing the skewness and kurtosis parameters of a distribution, not limited to a normal distribution. Since a theoretical comparison is not possible, a simulation study has been conducted to compare the performance of the test statistics. We have compared both parametric methods (classical method with normality assumption) and non-parametric methods (bootstrap in Bias Corrected Standard Method, Efron’s Percentile Method, Hall’s Percentile Method and Bias Corrected Percentile Method). Our simulation results for testing the skewness parameter indicate that the power of the tests differs significantly across sample sizes, the …


Statistical Methodology For Data With Multiple Limits Of Detection, Robert M. Flikkema Jun 2016

Dissertations

Limitations of instruments used to collect continuous data sometimes lead to obtaining observations lower than a limit of detection. These observations are known as nondetects. They could be zeroes, or positive numbers, but they are too small to be recorded by a measuring device. Nondetects frequently occur in environmental data. Trace amounts of chemicals can exist in soil or groundwater and are undetectable by a machine reading. These observations pose a problem to researchers since the true values are unknown.

Simulations in the literature have led to inconsistent conclusions regarding what estimation technique to use with nondetect data when estimating …


Causal Inference In Observational Studies With Clustered Data, Meng Wu Jan 2016

Legacy Theses & Dissertations (2009 - 2024)

In this thesis, we study causal inference in observational studies with clustered data.


Combating Anti-Statistical Thinking Using Simulation-Based Methods Throughout The Undergraduate Curriculum, Nathan L. Tintle, Beth Chance, George Cobb, Soma Roy, Todd Swanson, Jill Vanderstoep Dec 2015

Faculty Work Comprehensive List

The use of simulation-based methods for introducing inference is growing in popularity for the Stat 101 course, due in part to increasing evidence of the methods' ability to improve students’ statistical thinking. This impact comes from simulation-based methods (a) clearly presenting the overarching logic of inference, (b) strengthening ties between statistics and probability/mathematical concepts, (c) encouraging a focus on the entire research process, (d) facilitating student thinking about advanced statistical concepts, (e) allowing more time to explore, do, and talk about real research and messy data, and (f) acting as a firmer foundation on which to build statistical intuition. Thus, …


Per-Contact Infectivity Of HCV Associated With Injection Exposures In A Prospective Cohort Of Young Injection Drug Users In San Francisco, CA (UFO Study), Yuridia Leyva Sep 2015

Mathematics & Statistics ETDs

Sharing needles and ancillary injection drug equipment places injection drug users (IDU) at risk for Hepatitis C Virus (HCV), a highly infectious blood-borne virus. A limited number of studies have analyzed the per-contact infectivity of HCV associated with the use of previously-used needles, but per-contact infectivity of ancillary injecting equipment has not been previously investigated. Our goal is to estimate the per-contact infectivity of HCV associated with (1) injecting with another person's previously-used needle, classified as receptive needle sharing (RNS), and (2) using another person's previously-used ancillary injecting equipment, such as cookers to melt drugs and cottons to strain impurities …


Testing Equality Of Locally Stationary Covariances With Application To Mortality Rate Modeling, Zahra Teimouri, Ali R. Taheriyoun Jun 2015

Applications and Applied Mathematics: An International Journal (AAM)

No abstract provided.


A Review Of Frequentist Tests For The 2x2 Binomial Trial, Chris Lloyd Dec 2014

Chris J. Lloyd

The 2x2 binomial trial is the simplest of data structures, yet its statistical analysis and the issues it raises have been debated and revisited for over 70 years. Which analysis should biomedical researchers use in applications? In this review, we consider frequentist tests only, specifically tests that control size either exactly or very close to exactly. These procedures can be classified as conditional and unconditional. Amongst tests motivated by a conditional model, Lancaster’s mid-p and Liebermeister’s test are less conservative than Fisher’s classical test, but do not control type 1 error. Within the conditional framework, only Fisher’s test can be …


The Use Of Bootstrapping When Using Propensity-Score Matching Without Replacement: A Simulation Study, Peter Austin, Dylan Small Jan 2014

Peter Austin

Propensity-score matching is frequently used to estimate the effect of treatments, exposures, and interventions when using observational data. An important issue when using propensity-score matching is how to estimate the standard error of the estimated treatment effect. Accurate variance estimation permits construction of confidence intervals that have the advertised coverage rates and tests of statistical significance that have the correct type I error rates. There is disagreement in the literature as to how standard errors should be estimated. The bootstrap is a commonly used resampling method that permits estimation of the sampling variability of estimated parameters. Bootstrap methods are rarely …


Constructing Confidence Intervals For Effect Sizes In ANOVA Designs, Li-Ting Chen, Chao-Ying Joanne Peng Nov 2013

Journal of Modern Applied Statistical Methods

A confidence interval for effect sizes provides a range of plausible population effect sizes (ES) that are consistent with data. This article defines an ES as a standardized linear contrast of means. The noncentral method, Bonett’s method, and the bias-corrected and accelerated bootstrap method are illustrated for constructing the confidence interval for such an effect size. Results obtained from the three methods are discussed and interpretations of results are offered.


Bootstrap Interval Estimation Of Reliability Via Coefficient Omega, Miguel A. Padilla, Jasmin Divers May 2013

Journal of Modern Applied Statistical Methods

Three different bootstrap confidence intervals (CIs) for coefficient omega were investigated. The CIs were assessed through a simulation study with conditions not previously investigated. All methods performed well; however, the normal theory bootstrap (NTB) CI had the best performance because it had more consistent acceptable coverage under the simulation conditions investigated.


Theory And Methods For Gini Coefficients Partitioned By Quantile Range, Chaitra Nagaraja Dec 2012

Chaitra H Nagaraja

The Gini coefficient is frequently used to measure inequality in populations. However, it is possible that inequality levels may change over time differently for disparate subgroups which cannot be detected with population-level estimates only. Therefore, it may be informative to examine inequality separately for these segments. The case where the population is split into two segments based on non-overlapping quantile ranges is examined. Asymptotic theory is derived and practical methods to estimate standard errors and construct confidence intervals using resampling methods are developed. An application to per capita income across census tracts using American Community Survey data is considered.
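A minimal Python sketch of the ingredients mentioned above: the ordinary sample Gini coefficient and a bootstrap standard error for it. It does not implement the quantile-range partition or the asymptotic theory developed in the paper; the function names, the replicate count, and the example figures are assumptions, and the inputs are assumed to be positive.

```python
import random
import statistics


def gini(values):
    """Sample Gini coefficient: sum((2i - n - 1) * x_(i)) / (n * sum(x))."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def bootstrap_se(values, stat, n_boot=1000, seed=0):
    """Bootstrap standard error of `stat`, resampling values with replacement."""
    rng = random.Random(seed)
    n = len(values)
    replicates = [stat(rng.choices(values, k=n)) for _ in range(n_boot)]
    return statistics.stdev(replicates)


if __name__ == "__main__":
    # Hypothetical per capita incomes, in thousands.
    incomes = [12.0, 18.5, 22.0, 25.3, 31.0, 40.2, 55.0, 78.4, 120.0]
    print(round(gini(incomes), 3), round(bootstrap_se(incomes, gini), 3))
```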


The Length-Biased Lognormal Distribution And Its Application In The Analysis Of Data From Oil Field Exploration Studies, Makarand V. Ratnaparkhi, Uttara V. Naik-Nimbalkar May 2012

Journal of Modern Applied Statistical Methods

The length-biased version of the lognormal distribution and related estimation problems are considered, and size-biased data arising in the exploration of oil fields are analyzed. The properties of the estimators are studied using simulations, and the use of the sample mode as an estimate of the lognormal parameter is discussed.


Empirical Sampling From Permutation Space With Unique Patterns, Justice I. Odiase May 2012

Journal of Modern Applied Statistical Methods

The exact distribution of a test statistic ultimately guarantees that the probability of a Type I error is exactly α. Several methods for estimating the exact distribution of a test statistic have evolved over the years with inherent computational problems and varying degrees of accuracy. The unique pattern of permutations resulting from using experimental data to sample within the permutation space without the risk of repeating permutations is identified. The method presented circumvents the theoretical requirements of asymptotic procedures and the computational difficulties associated with an exhaustive enumeration of permutations. Results show that time and space complexities are drastically reduced …


A Systematic Selection Method For The Development Of Cancer Staging Systems, Yunzhi Lin, Richard Chappell, Mithat Gonen Jan 2012

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

The tumor-node-metastasis (TNM) staging system has been the anchor of cancer diagnosis, treatment, and prognosis for many years. For meaningful clinical use, an orderly, progressive condensation of the T and N categories into an overall staging system needs to be defined, usually with respect to a time-to-event outcome. This can be considered as a cutpoint selection problem for a censored response partitioned with respect to two ordered categorical covariates and their interaction. The aim is to select the best grouping of the TN categories. A novel bootstrap cutpoint/model selection method is proposed for this task by maximizing bootstrap estimates of …


A Stochastic Version Of The EM Algorithm To Analyze Multivariate Skew-Normal Data With Missing Responses, M. Khounsiavash, M. Ganjali, T. Baghfalaki Dec 2011

Applications and Applied Mathematics: An International Journal (AAM)

In this paper, an algorithm called SEM, which is a stochastic version of the EM algorithm, is used to analyze multivariate skew-normal data with intermittent missing values. Also, a multivariate selection model framework for modeling both the missing and response mechanisms is formulated. In the SEM algorithm, missing values of responses are imputed from the conditional distribution of missing values given observed data, and then the log-likelihood of the pseudocomplete data is maximized. The algorithm is iterated until convergence of parameter estimates. Results of an application are also reported where a bootstrap approach is used to compute the standard error …


Modeling Repairable System Failures With Interval Failure Data And Time Dependent Covariate, Jayanthi Arasan, Samira Ehsani Nov 2011

Journal of Modern Applied Statistical Methods

An application of a repairable system model for interval failure data with a time-dependent covariate is examined. The performance of several models based on the NHPP when applied to real data on ball bearing failures is also explored. The best model for the data was selected based on results of the likelihood ratio test. The bootstrapping technique was applied to obtain the variance estimate for the estimated expected number of failures. Results demonstrate that the proposed model works well and is easy to implement; in addition, the bootstrap variance estimate provides a simple substitute for the traditional estimate.


Weighting Large Datasets With Complex Sampling Designs: Choosing The Appropriate Variance Estimation Method, Sara Mann, James Chowhan May 2011

Journal of Modern Applied Statistical Methods

Using the Canadian Workplace and Employee Survey (WES), three variance estimation methods for weighting large datasets with complex sampling designs are compared: simple final weighting, standard bootstrapping and mean bootstrapping. Using a logit analysis, it is shown that, depending on which weighting method is used, different predictor variables are significant. The potential lack of independence inherent in a multi-stage cluster sample design, as in the WES, results in a downward bias in the variance when conducting statistical inference (using the simple final weight), which in turn results in increased Type I errors. Bootstrap methods can account for the survey’s …


Generalized Variances Ratio Test For Comparing K Covariance Matrices From Dependent Normal Populations, Marcelo Angelo Cirillo, Daniel Furtado Ferreira, Thelma Sáfadi, Eric Batista Ferreira Nov 2010

Journal of Modern Applied Statistical Methods

New tests based on the ratio of generalized variances are presented to compare covariance matrices from dependent normal populations. Monte Carlo simulation showed that the tests considered controlled the Type I error, providing empirical probabilities that were consistent with the stipulated nominal level.


Another Look At Resampling: Replenishing Small Samples With Virtual Data Through S-Smart, Haiyan Bai, Wei Pan, Leigh Lihshing Wang, Phillip Neal Ritchey May 2010

Journal of Modern Applied Statistical Methods

A new resampling method is introduced to generate virtual data through a smoothing technique for replenishing small samples. The replenished analyzable sample retains the statistical properties of the original small sample, has small standard errors and possesses adequate statistical power.