Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Statistical Theory

2019

Articles 1 - 30 of 39

Full-Text Articles in Statistics and Probability

Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma Dec 2019

UW Biostatistics Working Paper Series

Analysis of two-way structured data, i.e., data with structures among both variables and samples, is becoming increasingly common in ecology, biology, and neuroscience. Classical dimension-reduction tools, such as the singular value decomposition (SVD), may perform poorly for two-way structured data. The generalized matrix decomposition (GMD, Allen et al., 2014) extends the SVD to two-way structured data and thus constructs singular vectors that account for both structures. While the GMD is a useful dimension-reduction tool for exploratory analysis of two-way structured data, it is unsupervised and cannot be used to assess the association between such data and an outcome of interest. …
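
As a point of reference, the classical SVD-based reduction that the GMD generalizes can be sketched in a few lines of base R; the data matrix X (samples by variables) is a hypothetical stand-in, and the GMD's structure-aware weighting is not implemented here.

# Classical SVD-based dimension reduction (the unweighted baseline)
X <- matrix(rnorm(50 * 20), nrow = 50)            # hypothetical samples x variables
s <- svd(scale(X, center = TRUE, scale = FALSE))  # SVD of the column-centered data
k <- 2                                            # number of components retained
scores   <- s$u[, 1:k] %*% diag(s$d[1:k])         # low-dimensional sample scores
loadings <- s$v[, 1:k]                            # corresponding variable loadings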


Statistical Inference For Networks Of High-Dimensional Point Processes, Xu Wang, Mladen Kolar, Ali Shojaie Dec 2019

UW Biostatistics Working Paper Series

Fueled in part by recent applications in neuroscience, high-dimensional Hawkes processes have become a popular tool for modeling the network of interactions among multivariate point process data. While evaluating the uncertainty of the network estimates is critical in scientific applications, existing methodological and theoretical work has focused only on estimation. To bridge this gap, this paper proposes a high-dimensional statistical inference procedure with theoretical guarantees for multivariate Hawkes processes. Key to this inference procedure is a new concentration inequality on the first- and second-order statistics for integrated stochastic processes, which summarize the entire history of the process. We apply this …
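
For intuition, a univariate Hawkes process with an exponential kernel can be simulated by Ogata-style thinning in a few lines of base R; the intensity form and the parameter values below are illustrative assumptions, and the article's multivariate high-dimensional setting and inference procedure are not reproduced.

# Intensity: lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i))
simulate_hawkes <- function(mu, alpha, beta, horizon) {
  times <- numeric(0); t <- 0
  while (t < horizon) {
    lambda_bar <- mu + alpha * sum(exp(-beta * (t - times)))     # bound valid until the next event
    t <- t + rexp(1, rate = lambda_bar)                          # candidate event time
    if (t >= horizon) break
    lambda_t <- mu + alpha * sum(exp(-beta * (t - times)))       # true intensity at the candidate
    if (runif(1) <= lambda_t / lambda_bar) times <- c(times, t)  # thinning (accept/reject) step
  }
  times
}
set.seed(1)
events <- simulate_hawkes(mu = 0.5, alpha = 0.8, beta = 1.2, horizon = 100)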


Economic Design Of Acceptance Sampling Plans For Truncated Life Tests Using Three-Parameter Lindley Distribution, Amer Ibrahim Al-Omari, Enrico Ciavolino, Amjad D. Al-Nasser Nov 2019

Journal of Modern Applied Statistical Methods

A single acceptance sampling plan for the three-parameter Lindley distribution under a truncated life test is developed. For various consumer’s confidence levels, acceptance numbers, and values of the ratio of the experimental time to the specified average lifetime, the minimum sample size required to assert a certain average lifetime is calculated. The operating characteristic (OC) function values as well as the associated producer’s risks are also provided. A numerical example is presented to illustrate the suggested acceptance sampling plans.
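
The underlying construction can be sketched generically: accept the lot if at most c failures occur among n items tested up to the truncation time, and take the smallest n whose acceptance probability at the specified average lifetime does not exceed 1 − P*. The failure probability p below is a placeholder; in the article it is evaluated from the three-parameter Lindley CDF, which is not reproduced here.

# Smallest n with P(at most acc failures out of n) <= 1 - pstar
min_sample_size <- function(p, acc, pstar) {
  n <- acc + 1
  while (pbinom(acc, n, p) > 1 - pstar) n <- n + 1
  n
}
oc <- function(p, n, acc) pbinom(acc, n, p)      # operating characteristic function
min_sample_size(p = 0.25, acc = 2, pstar = 0.95) # e.g., P* = 0.95, acceptance number 2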


Joint Asymptotics For Smoothing Spline Semiparametric Nonlinear Models, Jiahui Yu Oct 2019

Doctoral Dissertations

We study the joint asymptotics of general smoothing spline semiparametric models in the settings of density estimation and regression. We provide a systematic framework which incorporates many existing models as special cases, and further allows for nonlinear relationships between the finite-dimensional Euclidean parameter and the infinite-dimensional functional parameter. For both density estimation and regression, we establish the local existence and uniqueness of the penalized likelihood estimators for our proposed models. In the density estimation setting, we prove joint consistency and obtain the rates of convergence of the joint estimator in an appropriate norm. The convergence rate of the parametric component …


The Estimation Of Missing Values In Rectangular Lattice Designs, Emmanuel Ogochukwu Ossai, Abimibola Victoria Oladugba Sep 2019

Journal of Modern Applied Statistical Methods

Algebraic expressions for estimating missing data when one or more observations are missing in rectangular lattice designs with repetition were derived using the method of minimizing the residual sum of squares. Results showed that the estimated values closely approximated the actual values.
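
As a minimal numerical illustration of the minimization idea (not the article's closed-form expressions), a single missing observation in a generic two-factor layout can be estimated by treating it as a free value and minimizing the residual sum of squares of the fitted model; the vectors y, block, and treat below are hypothetical, with block and treat taken to be factors.

estimate_missing <- function(y, block, treat, miss_index) {
  rss <- function(m) {
    y[miss_index] <- m                     # plug the candidate value into the data
    sum(resid(lm(y ~ block + treat))^2)    # residual sum of squares of the additive model
  }
  optimize(rss, interval = extendrange(y[-miss_index], f = 1))$minimum
}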


Prediction Of High School Graduation With Decision Trees, Andrea M. Lee Aug 2019

MSU Graduate Theses

Having worked as an educator for the past fourteen years, I am always looking at data and determining ways to help our students. Graduation status is one area of interest. I wanted to apply statistical methods to try to find early indicators of students who may drop out, so that early intervention could be provided to those students. With early intervention, we may be able to lower our dropout rate. While studying different methods of pattern recognition, I found that the decision tree method in machine learning was the best suited to the data that I had collected. Decision trees …


Optimal Design For A Causal Structure, Zaher Kmail Aug 2019

Department of Statistics: Dissertations, Theses, and Student Work

Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.

Historically, traditional optimal design theory has focused on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …


Interpreting Patient Reported Outcomes In Orthopaedic Surgery: A Systematic Review, Shgufta Docter, Zina Fathalla, Michael Lukacs, Michaela Khan, Morgan Jennings, Shu-Hsuan Liu, Dong Zi, Dianne Bryant Jun 2019

Western Research Forum

Background: Reporting methods of patient reported outcome measures (PROMs) vary in the orthopaedic surgery literature. While most studies report statistical significance, the interpretation of results would be improved if authors reported confidence intervals (CIs), the minimal clinically important difference (MCID), and the number needed to treat (NNT).

Objective: To assess the quality and interpretability of reporting the results of PROMs. To evaluate reporting, we will assess the proportion of studies that reported (1) 95% CIs, (2) MCID, and (3) NNT. To evaluate interpretation, we will assess the proportion of studies that discussed results using the MCID or the effect sizes and how …
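
For reference, the number needed to treat is the reciprocal of the absolute risk reduction,

\mathrm{NNT} = \frac{1}{\mathrm{ARR}} = \frac{1}{p_{\mathrm{control}} - p_{\mathrm{treatment}}},

where p_control and p_treatment are the proportions experiencing the unfavorable outcome in each group.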


Measure Of Departure From Marginal Average Point-Symmetry For Two-Way Contingency Tables, Kiyotaka Iki, Sadao Tomizawa Jun 2019

Journal of Modern Applied Statistical Methods

For the analysis of two-way contingency tables with ordered categories, Yamamoto, Tahata, Suzuki, and Tomizawa (2011) considered a measure to represent the degree of departure from marginal point-symmetry. The maximum value of that measure cannot distinguish two kinds of marginal complete asymmetry with respect to the midpoint. A measure is proposed which can distinguish two kinds of marginal asymmetry with respect to the midpoint. A large-sample confidence interval for the proposed measure is also given.


The Impact Of Equating On Detection Of Treatment Effects, Youn-Jeng Choi, Seohyun Kim, Allan S. Cohen, Zhenqiu Lu Jun 2019

Journal of Modern Applied Statistical Methods

Equating makes it possible to compare performances on different forms of a test. Three different equating methods (baseline selection, subgroup, and subscore equating) using common-item item response theory equating were examined for their impact on detection of treatment effects in multilevel models.


Upper Record Values From Extended Exponential Distribution, Devendra Kumar, Sanku Dey May 2019

Journal of Modern Applied Statistical Methods

Some recurrence relations are established for the single and product moments of upper record values from the extended exponential distribution of Nadarajah and Haghighi (2011), proposed as an alternative to the gamma, Weibull, and exponentiated exponential distributions. Recurrence relations for negative moments and quotient moments of upper record values are also obtained. Using the relations for single and product moments, the means, variances, and covariances of upper record values from samples of sizes up to 10 are tabulated for various values of the shape parameter and scale parameter. A characterization of this distribution based on conditional moments of record …
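
For context, the nth upper record value X_{U(n)} from a continuous distribution with CDF F and density f has density

f_{X_{U(n)}}(x) = \frac{[-\ln\{1 - F(x)\}]^{n-1}}{(n-1)!}\, f(x),

and the extended exponential distribution of Nadarajah and Haghighi (2011) is assumed here to have CDF F(x) = 1 - \exp\{1 - (1 + \lambda x)^{\alpha}\} for x > 0, with shape \alpha > 0 and scale \lambda > 0. The article's recurrence relations are built from these ingredients but are not reproduced here.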


Generalizations Of The Arcsine Distribution, Rebecca Rasnick May 2019

Electronic Theses and Dissertations

The arcsine distribution describes the fraction of time one player is winning in a fair coin-toss game and has been studied for over a hundred years. There has been little further work on how the distribution changes when the coin tosses are not fair or when a player has already won the initial coin tosses or, equivalently, starts with a lead. This thesis will first cover a proof of the arcsine distribution. Then, we explore how the distribution changes when the coin is unfair. Finally, we will explore the distribution when one person has won the first …
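
For reference, the classical (first) arcsine law states that if L_n is the amount of time one player is in the lead during the first n tosses of a fair coin, then

P\!\left(\frac{L_n}{n} \le x\right) \;\longrightarrow\; \frac{2}{\pi}\arcsin\sqrt{x}, \qquad 0 \le x \le 1,

with limiting density \frac{1}{\pi\sqrt{x(1-x)}} on (0, 1); the generalizations studied in the thesis modify this baseline for biased coins and head starts.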


The Andersen Likelihood Ratio Test With A Random Split Criterion Lacks Power, Georg Krammer Apr 2019

Journal of Modern Applied Statistical Methods

The Andersen LRT uses sample characteristics as split criteria to evaluate Rasch model fit or to conduct theory-driven hypothesis testing for a test. The power and Type I error of a random split criterion were evaluated with a simulation study. Results consistently show that a random split criterion lacks power.
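
A minimal sketch of the design being evaluated, assuming the eRm package and a hypothetical dichotomous response matrix resp:

library(eRm)
set.seed(1)
fit   <- RM(resp)                                  # Rasch model fit
split <- sample(0:1, nrow(resp), replace = TRUE)   # random split criterion
LRtest(fit, splitcr = split)                       # Andersen LR test under the random split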


Weighted Version Of Generalized Inverse Weibull Distribution, Sofi Mudiasir, S. P. Ahmad Apr 2019

Journal of Modern Applied Statistical Methods

Weighted distributions are used in many fields, such as medicine, ecology, and reliability. A weighted version of the generalized inverse Weibull distribution, known as the weighted generalized inverse Weibull distribution (WGIWD), is proposed. Basic properties including the mode, moments, moment generating function, skewness, kurtosis, and Shannon’s entropy are studied. The usefulness of the new model was demonstrated by applying it to a real-life data set. The WGIWD fits better than its submodels, such as the length-biased generalized inverse Weibull (LGIW), generalized inverse Weibull (GIW), inverse Weibull (IW), and inverse exponential (IE) distributions.
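
For reference, a weighted version of a baseline density f with non-negative weight function w is defined by

f_w(x) = \frac{w(x)\, f(x)}{E[w(X)]},

so that, for example, w(x) = x yields the length-biased case mentioned above; the particular weight and the generalized inverse Weibull baseline used to construct the WGIWD are given in the article.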


Calibration Of Measurements, Edward Kroc, Bruno D. Zumbo Apr 2019

Journal of Modern Applied Statistical Methods

Traditional notions of measurement error typically rely on a strong mean-zero assumption on the expectation of the errors conditional on an unobservable “true score” (classical measurement error) or on the data themselves (Berkson measurement error). Weakly calibrated measurements for an unobservable true quantity are defined based on a weaker mean-zero assumption, giving rise to a measurement model of differential error. Applications show it retains many attractive features of estimation and inference when performing a naive data analysis (i.e. when performing an analysis on the error-prone measurements themselves), and other interesting properties not present in the classical or Berkson cases. Applied …
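
For reference, with true quantity T, observed measurement X, and error U, the two traditional models impose

\text{classical: } X = T + U,\ E[U \mid T] = 0; \qquad \text{Berkson: } T = X + U,\ E[U \mid X] = 0.

Weak calibration, as described above, replaces these conditional mean-zero assumptions with a weaker mean-zero requirement, yielding a model of differential error.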


Estimation Of Mean With Two-Parameter Ratio-Product-Ratio Estimator In Double Sampling Using Ancillary Information Under Non-Response, Surya K. Pal, Housila P. Singh Apr 2019

Journal of Modern Applied Statistical Methods

Ratio-product-ratio estimators with two parameters in double sampling under non-response are considered along with their properties. Practical conditions are obtained under which the suggested estimators are more efficient than other existing estimators. An example is given.


Efficient Class Of Estimators For Finite Population Mean Using Auxiliary Information In Two-Occasion Successive Sampling, G. N. Singh, Mohd Khalid Apr 2019

Journal of Modern Applied Statistical Methods

In the case of sampling on two occasions, a class of estimators is considered which uses information from the first occasion as well as the second occasion in order to estimate the population mean on the current (second) occasion. The usefulness of auxiliary information in enhancing the efficiency of this estimation is examined through the proposed class of estimators. Some properties of the class of estimators and a strategy of optimum replacement are discussed. The proposed class of estimators was empirically compared with the sample mean estimator in the case of no matching. The established optimum estimator, which is a …


Jmasm 51: Bayesian Reliability Analysis Of Binomial Model – Application To Success/Failure Data, M. Tanwir Akhtar, Athar Ali Khan Mar 2019

Journal of Modern Applied Statistical Methods

Reliability data are generated in the form of success/failure outcomes. An attempt was made to model such data using the binomial distribution in the Bayesian paradigm. Both analytic and simulation techniques were used to fit the Bayesian model. The Laplace approximation was implemented to approximate the posterior densities of the model parameters, and parallel simulation tools were implemented with extensive use of R and JAGS. R and JAGS code are developed and provided. Real data sets are used for the purpose of illustration.
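
A minimal base-R sketch of the Laplace (normal) approximation for a binomial success probability on the logit scale, with a vague normal prior; the data values and prior settings are hypothetical stand-ins for the article's R/JAGS analyses.

x <- 17; n <- 20                                 # hypothetical successes / trials
log_post <- function(theta) {                    # theta = logit of the success probability
  p <- plogis(theta)
  dbinom(x, n, p, log = TRUE) + dnorm(theta, 0, 10, log = TRUE)
}
fit <- optim(0, function(th) -log_post(th), method = "BFGS", hessian = TRUE)
post_mode <- fit$par
post_sd   <- sqrt(1 / fit$hessian[1, 1])         # curvature of the negative log posterior at the mode
plogis(post_mode + c(-1.96, 0, 1.96) * post_sd)  # approximate posterior interval for the reliability p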


A Random Forests Approach To Assess Determinants Of Central Bank Independence, Maddalena Cavicchioli, Angeliki Papana, Ariadni Papana Dagiasis, Barbara Pistoresi Mar 2019

Journal of Modern Applied Statistical Methods

An efficient non-parametric statistical method, Random Forests, is implemented for selecting the determinants of Central Bank Independence (CBI) from a large database of economic, political, and institutional variables for OECD countries. It permits ranking all the determinants by their importance with respect to CBI and does not impose a priori assumptions on potential nonlinear relationships in the data. Collinearity issues are resolved because correlated variables can be considered simultaneously.
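
A minimal sketch of ranking candidate determinants by random-forest variable importance, assuming the randomForest package; the data frame cbi_data and the response cbi_index are hypothetical.

library(randomForest)
set.seed(42)
rf  <- randomForest(cbi_index ~ ., data = cbi_data, importance = TRUE)
imp <- importance(rf)                              # permutation and node-purity importance measures
imp[order(imp[, "%IncMSE"], decreasing = TRUE), ]  # determinants ranked by importance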


Maximum Likelihood Estimation For The Generalized Pareto Distribution And Goodness-Of-Fit Test With Censored Data, Minh H. Pham, Chris Tsokos, Bong-Jin Choi Mar 2019

Journal of Modern Applied Statistical Methods

The generalized Pareto distribution (GPD) is a flexible parametric model commonly used in financial modeling. Maximum likelihood estimation (MLE) of the GPD was proposed by Grimshaw (1993). Maximum likelihood estimation of the GPD for censored data is developed, and a goodness-of-fit test is constructed to verify the MLE algorithm and to support the model-validation step. The algorithms were implemented in R, and Grimshaw’s algorithm outperforms the functions available in the R package ‘gPdtest’. A simulation study showed that the MLE method for censored data and the goodness-of-fit test are both reliable.
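
A minimal sketch of the censored-data likelihood idea (a direct numerical optimization, not Grimshaw's algorithm): right-censored observations contribute through the GPD survival function. The vectors x (observed values) and delta (1 = fully observed, 0 = right-censored) are hypothetical, and the sketch assumes a nonzero shape parameter.

gpd_negloglik <- function(par, x, delta) {
  sigma <- exp(par[1]); xi <- par[2]               # scale (on the log scale) and shape
  z <- 1 + xi * x / sigma
  if (any(z <= 0) || xi == 0) return(Inf)          # outside the support; xi = 0 not handled here
  ll_obs  <- -log(sigma) - (1 / xi + 1) * log(z)   # log density of the GPD
  ll_cens <- -(1 / xi) * log(z)                    # log survival function of the GPD
  -sum(delta * ll_obs + (1 - delta) * ll_cens)
}
fit <- optim(c(log(1), 0.1), gpd_negloglik, x = x, delta = delta)
c(sigma = exp(fit$par[1]), xi = fit$par[2])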


Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan Mar 2019

COBRA Preprint Series

One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes, which provides insight into the disease process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins, resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …
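
A minimal sketch of the univariate Cox screening referred to above, assuming the survival package; the expression matrix expr (samples by genes) and the vectors time and status are hypothetical.

library(survival)
pvals <- apply(expr, 2, function(g) {
  fit <- coxph(Surv(time, status) ~ g)        # one univariate Cox model per feature
  summary(fit)$coefficients[1, "Pr(>|z|)"]
})
head(sort(p.adjust(pvals, method = "BH")))    # top features after FDR adjustment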


Bayesian Approximation Techniques For Scale Parameter Of Laplace Distribution, Uzma Jan, S. P. Ahmad Mar 2019

Journal of Modern Applied Statistical Methods

Bayesian estimation of the scale parameter of the Laplace distribution is obtained using two approximation techniques, namely the normal approximation and the Tierney and Kadane (T-K) approximation, under different informative priors.
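
For reference, the normal approximation referred to above replaces the posterior with a Gaussian centered at the posterior mode \hat{\theta}:

\pi(\theta \mid x) \;\approx\; N\!\left(\hat{\theta},\ \left[-\frac{d^{2}}{d\theta^{2}}\log \pi(\theta \mid x)\Big|_{\theta=\hat{\theta}}\right]^{-1}\right),

applied in the article to the Laplace scale parameter under the chosen informative priors; the Tierney–Kadane approach instead approximates posterior expectations as a ratio of two such Laplace-type expansions.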


Can One Test Fit All? Responses To The Article “Striving For Simple But Effective Advice For Comparing The Central Tendency Of Two Populations” (Ruxton & Neuhäuser, 2018), Diep Nguyen, Eun Sook Kim, Yi-Hsin Chen Mar 2019

Journal of Modern Applied Statistical Methods

Responses to suggestions made by Ruxton & Neuhäuser (2018) regarding Nguyen et al. (2016) are given.


Φ-Divergence Loss-Based Artificial Neural Network, R. L. Salamwade, D. M. Sakate, S. K. Mathur Mar 2019

Journal of Modern Applied Statistical Methods

Artificial neural networks (ANNs) can fit non-linear functions and recognize patterns better than several standard techniques. The performance of ANNs is measured using loss functions. The phi-divergence estimator is a generalization of the maximum likelihood estimator and possesses all of its properties. A neural network trained using a phi-divergence loss is proposed.
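
For reference, the phi-divergence between discrete distributions p and q is

D_{\varphi}(p, q) = \sum_{j} q_j\, \varphi\!\left(\frac{p_j}{q_j}\right), \qquad \varphi \text{ convex},\ \varphi(1) = 0,

and the choice \varphi(t) = t \log t recovers the Kullback–Leibler divergence, so minimum phi-divergence estimation contains maximum likelihood (cross-entropy) as a special case; the particular \varphi used for the proposed network's loss is specified in the article.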


On The Conditional And Unconditional Type I Error Rates And Power Of Tests In Linear Models With Heteroscedastic Errors, Patrick J. Rosopa, Alice M. Brawley, Theresa P. Atkinson, Stephen A. Robertson Mar 2019

Journal of Modern Applied Statistical Methods

Preliminary tests for homoscedasticity may be unnecessary in general linear models. Based on Monte Carlo simulations, results suggest that when testing for differences between independent slopes, the unconditional use of weighted least squares regression and HC4 regression performed the best across a wide range of conditions.
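
A minimal sketch of the two unconditional strategies, assuming the sandwich and lmtest packages; the data frame dat with columns y, x, and group is hypothetical, and the weight estimate is a rough variance-function fit rather than the simulation's design.

library(sandwich); library(lmtest)
ols <- lm(y ~ x * group, data = dat)                                   # interaction tests the slope difference
coeftest(ols, vcov = vcovHC(ols, type = "HC4"))                        # OLS with HC4 robust standard errors
w   <- 1 / exp(fitted(lm(log(resid(ols)^2) ~ x * group, data = dat)))  # crude variance-based weights
wls <- lm(y ~ x * group, data = dat, weights = w)                      # weighted least squares fit
summary(wls)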


A Robust Nonparametric Measure Of Effect Size Based On An Analog Of Cohen's D, Plus Inferences About The Median Of The Typical Difference, Rand Wilcox Mar 2019

Journal of Modern Applied Statistical Methods

The paper describes a nonparametric analog of Cohen's d, denoted Q. It is established that a confidence interval for Q can be computed via a method for computing a confidence interval for the median of D = X1 − X2, which in turn is related to making inferences about P(X1 < X2).
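
A minimal sketch of the underlying construct, assuming hypothetical samples x1 and x2 (this is not Wilcox's full inferential method):

D <- as.vector(outer(x1, x2, "-"))   # all pairwise differences X1 - X2
median(D)                            # median of the typical difference
mean(outer(x1, x2, "<"))             # sample estimate of P(X1 < X2)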


Striving For Simple But Effective Advice For Comparing The Central Tendency Of Two Populations, Graeme Ruxton, Markus Neuhäuser Mar 2019

Journal of Modern Applied Statistical Methods

Nguyen et al. (2016) offered advice to researchers in the commonly-encountered situation where they are interested in testing for a difference in central tendency between two populations. Their data and the available literature support very simple advice that strikes the best balance between ease of implementation, power, and reliability. Specifically, apply Satterthwaite’s test, with preliminary ranking of the data if a strong deviation from normality is expected or is suggested by visual inspection of the data. This simple guideline will serve well except when dealing with small samples of discrete data, when more sophisticated treatment may be required.
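
A minimal base-R sketch of this default, using hypothetical samples x and y:

t.test(x, y)                               # Welch test with Satterthwaite degrees of freedom (the default)
r <- rank(c(x, y))                         # preliminary ranking under strong non-normality
t.test(r[seq_along(x)], r[-seq_along(x)])  # the same test applied to the pooled ranks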


Robust Ancova, Curvature, And The Curse Of Dimensionality, Rand Wilcox Mar 2019

Journal of Modern Applied Statistical Methods

There is a substantial collection of robust analysis of covariance (ANCOVA) methods that effectively deals with non-normality, unequal population slope parameters, outliers, and heteroscedasticity. Some are based on the usual linear model and others are based on smoothers (nonparametric regression estimators). However, extant results are limited to one or two covariates. A minor goal here is to extend a recently-proposed method, based on the usual linear model, to situations where there are up to six covariates. The usual linear model might provide a poor approximation of the true regression surface. The main goal is to suggest a method, based on …


Should We Give Up On Causality?, Tom Knapp Mar 2019

Journal of Modern Applied Statistical Methods

No abstract provided.


Logistic Regression: An Inferential Method For Identifying The Best Predictors, Rand Wilcox Mar 2019

Journal of Modern Applied Statistical Methods

When dealing with a logistic regression model, there is a simple method for estimating the strength of the association between the jth covariate and the dependent variable when all covariates are entered into the model. An issue is determining whether the jth independent variable has a stronger or weaker association than the kth independent variable. This note describes a method for dealing with this issue that was found to perform reasonably well in simulations.
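
As a generic illustration only (not the article's inferential method), the strength of association of covariates in a logistic regression is sometimes compared by refitting the model with standardized predictors; the data frame dat and its columns are hypothetical.

fit <- glm(y ~ scale(x1) + scale(x2) + scale(x3), family = binomial, data = dat)
sort(abs(coef(fit)[-1]), decreasing = TRUE)   # covariates ranked by standardized coefficient size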