Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 102
Full-Text Articles in Physical Sciences and Mathematics
Robust Likelihood-Based Analysis Of Multivariate Data With Missing Values, Rod Little, An Hyonggin
The University of Michigan Department of Biostatistics Working Paper Series
The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter …
Multiple Testing. Part Ii. Step-Down Procedures For Control Of The Family-Wise Error Rate, Mark J. Van Der Laan, Sandrine Dudoit, Katherine S. Pollard
U.C. Berkeley Division of Biostatistics Working Paper Series
The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise error rate (FWER): the first procedure is based on maxima of test statistics (step-down maxT), while the second relies on minima of unadjusted p-values (step-down minP). A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which the …
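The step-down maxT idea can be sketched briefly. The following is a hedged illustration, not the authors' implementation: it assumes a B x m matrix of test statistics drawn from an estimated null distribution (e.g., by bootstrap) and computes monotone adjusted p-values from successive maxima over the not-yet-rejected hypotheses.

```python
# Illustrative step-down maxT adjusted p-values (Westfall-Young style).
# `null_stats` is an assumed B x m matrix of null test statistics;
# the data layout and function name are hypothetical.

def stepdown_maxt(obs, null_stats):
    """obs: list of m observed test statistics;
    null_stats: list of B lists of m null test statistics.
    Returns adjusted p-values in the original hypothesis order."""
    m = len(obs)
    B = len(null_stats)
    order = sorted(range(m), key=lambda j: -abs(obs[j]))  # most significant first
    adj = [0.0] * m
    prev = 0.0
    for rank, j in enumerate(order):
        count = 0
        for b in range(B):
            # maximum null statistic over hypotheses not yet rejected
            q = max(abs(null_stats[b][k]) for k in order[rank:])
            if q >= abs(obs[j]):
                count += 1
        p = count / B
        prev = max(prev, p)  # enforce step-down monotonicity
        adj[j] = prev
    return adj
```

A step-down procedure rejects hypotheses in order of significance, which is why the maximum is taken only over the remaining hypotheses at each step.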
Multiple Testing. Part I. Single-Step Procedures For Control Of General Type I Error Rates, Sandrine Dudoit, Mark J. Van Der Laan, Katherine S. Pollard
U.C. Berkeley Division of Biostatistics Working Paper Series
The present article proposes general single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors, such as the generalized family-wise error rate. A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which single-step common-quantile and common-cut-off procedures asymptotically …
Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng
U.C. Berkeley Division of Biostatistics Working Paper Series
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …
Kernel Estimation Of Rate Function For Recurrent Event Data, Chin-Tsang Chiang, Mei-Cheng Wang, Chiung-Yu Huang
Johns Hopkins University, Dept. of Biostatistics Working Papers
Recurrent event data are largely characterized by the rate function, but smoothing techniques for estimating the rate function have never been rigorously developed or studied in the statistical literature. This paper considers the moment and least squares methods for estimating the rate function from recurrent event data. With an independent censoring assumption on the recurrent event process, we study statistical properties of the proposed estimators and propose bootstrap procedures for the bandwidth selection and for the approximation of confidence intervals in the estimation of the occurrence rate function. It is identified that the moment method without resmoothing via a smaller bandwidth …
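A moment-type kernel estimator of the occurrence rate can be sketched as follows: smooth the pooled recurrent-event times with a kernel and divide by the number of subjects still under observation at t. This is a simplified sketch under assumed data structures, not the paper's estimator in full.

```python
# Hedged sketch: kernel estimate of a recurrent-event rate function.
# Epanechnikov kernel; `event_times` and `censor_times` layouts are assumptions.

def epanechnikov(u):
    return 0.75 * (1 - u * u) if abs(u) < 1 else 0.0

def rate_estimate(t, event_times, censor_times, h):
    """event_times: per-subject lists of recurrent-event times;
    censor_times: per-subject end of observation; h: bandwidth."""
    at_risk = sum(1 for c in censor_times if c >= t)  # subjects still observed
    if at_risk == 0:
        return 0.0
    smooth = sum(epanechnikov((t - s) / h) / h
                 for times in event_times for s in times)
    return smooth / at_risk
```

The bandwidth h governs the bias-variance trade-off that the abstract's bootstrap procedures are designed to tune.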
Unified Cross-Validation Methodology For Selection Among Estimators And A General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities And Examples, Mark J. Van Der Laan, Sandrine Dudoit
U.C. Berkeley Division of Biostatistics Working Paper Series
In Part I of this article we propose a general cross-validation criterion for selecting among a collection of estimators of a particular parameter of interest based on n i.i.d. observations. It is assumed that the parameter of interest minimizes the expectation (w.r.t. the distribution of the observed data structure) of a particular loss function of a candidate parameter value and the observed data structure, possibly indexed by a nuisance parameter. The proposed cross-validation criterion is defined as the empirical mean over the validation sample of the loss function at the parameter estimate based on the training sample, averaged over …
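The cross-validation selector described above can be sketched in a toy setting: estimating a mean under squared-error loss, with V-fold splits. The names and the constant-predictor setup are illustrative assumptions, not the paper's general formulation.

```python
# Hedged sketch of the cross-validation selector: for each candidate,
# fit on training folds, evaluate empirical loss on validation folds,
# and pick the risk minimizer.

def cv_select(data, estimators, V=5):
    """estimators: name -> function mapping a training sample to a
    parameter estimate; returns the name minimizing CV risk."""
    folds = [data[v::V] for v in range(V)]  # V-fold split
    risks = {}
    for name, fit in estimators.items():
        total, count = 0.0, 0
        for v in range(V):
            train = [x for u in range(V) if u != v for x in folds[u]]
            est = fit(train)
            for x in folds[v]:
                total += (x - est) ** 2  # squared-error loss, one choice among many
                count += 1
        risks[name] = total / count
    return min(risks, key=risks.get)
```

The oracle inequalities in the article concern how close the risk of the selected estimator comes to that of the best candidate.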
Weighting Adjustments For Unit Nonresponse With Multiple Outcome Variables, Sonya L. Vartivarian, Rod Little
The University of Michigan Department of Biostatistics Working Paper Series
Weighting is a common form of unit nonresponse adjustment in sample surveys where entire questionnaires are missing due to noncontact or refusal to participate. Weights are inversely proportional to the probability of selection and response. A common approach forms nonresponse adjustment cells based on covariate information. When the number of cells thus created is too large, a coarsening method such as response propensity stratification can be applied to reduce the number of adjustment cells. Simulations in Vartivarian and Little (2002) indicate improved efficiency and robustness of weighting adjustments based on the joint classification of the sample by two …
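Response propensity stratification can be sketched as a three-step recipe: estimate each unit's response propensity (here, crudely, by its cell response rate), coarsen the cells into a few propensity strata, and weight respondents by the inverse stratum-level response rate. The data layout and helper names are assumptions for illustration, not the paper's code.

```python
# Hedged sketch of response-propensity stratification weights.

def propensity_strata_weights(units, n_strata=2):
    """units: list of (cell, responded) pairs; returns a dict mapping the
    index of each responding unit to its nonresponse weight."""
    # step 1: response rate within each adjustment cell
    cells = {}
    for cell, resp in units:
        a, b = cells.get(cell, (0, 0))
        cells[cell] = (a + resp, b + 1)
    phat = {c: a / b for c, (a, b) in cells.items()}
    # step 2: coarsen cells into strata of similar estimated propensity
    ordered = sorted(phat, key=phat.get)
    size = -(-len(ordered) // n_strata)  # ceiling division
    stratum_of = {c: i // size for i, c in enumerate(ordered)}
    # step 3: inverse of the stratum-level response rate as the weight
    weights = {}
    for i, (cell, resp) in enumerate(units):
        if resp:
            s = stratum_of[cell]
            members = [c for c in stratum_of if stratum_of[c] == s]
            num = sum(cells[c][0] for c in members)
            den = sum(cells[c][1] for c in members)
            weights[i] = den / num
    return weights
```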
Estimating Predictors For Long- Or Short-Term Survivors, Lu Tian, Wei Wang, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Smooth Quantile Ratio Estimation With Regression: Estimating Medical Expenditures For Smoking Attributable Diseases, Francesca Dominici, Scott L. Zeger
Johns Hopkins University, Dept. of Biostatistics Working Papers
In this paper we introduce a semi-parametric regression model for estimating the difference in the expected value of two positive and highly skewed random variables as a function of covariates. Our method extends Smooth Quantile Ratio Estimation (SQUARE), a novel estimator of the mean difference of two positive random variables, to a regression model.
The methodological development of this paper is motivated by a common problem in econometrics where we are interested in estimating the difference in the average expenditures between two populations, say with and without a disease, taking covariates into account. Let Y1 and Y2 be two positive …
A Nonparametric Comparison Of Conditional Distributions With Nonnegligible Cure Fractions, Yi Li, Jin Feng
Harvard University Biostatistics Working Paper Series
No abstract provided.
Survival Analysis With Heterogeneous Covariate Measurement Error, Yi Li, Louise Ryan
Harvard University Biostatistics Working Paper Series
No abstract provided.
Loss Function Based Ranking In Two-Stage, Hierarchical Models, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway
Johns Hopkins University, Dept. of Biostatistics Working Papers
Several authors have studied the performance of optimal, squared error loss (SEL) estimated ranks. Though these are effective, in many applications interest focuses on identifying the relatively good (e.g., in the upper 10%) or relatively poor performers. We construct loss functions that address this goal and evaluate candidate rank estimates, some of which optimize specific loss functions. We study performance for a fully parametric hierarchical model with a Gaussian prior and Gaussian sampling distributions, evaluating performance for several loss functions. Results show that though SEL-optimal ranks and percentiles do not specifically focus on classifying with respect to a percentile cut …
Statistical Inference For Infinite Dimensional Parameters Via Asymptotically Pivotal Estimating Functions, Meredith A. Goldwasser, Lu Tian, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Model Comparisons Using Information Measures, C. Mitchell Dayton
Journal of Modern Applied Statistical Methods
Methodologists have criticized the use of significance tests in the behavioral sciences but have failed to provide alternative data analysis strategies that appeal to applied researchers. For purposes of comparing alternate models for data, information-theoretic measures such as Akaike AIC have advantages in comparison with significance tests. Model-selection procedures based on a min(AIC) strategy, for example, are holistic rather than dependent upon a series of sometimes contradictory binary (accept/reject) decisions.
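The min(AIC) strategy mentioned above amounts to computing AIC = -2*loglik + 2k for each candidate model and retaining the minimizer. A minimal sketch, with placeholder log-likelihoods and parameter counts:

```python
# Hedged sketch of min(AIC) model comparison.

def aic(loglik, k):
    """Akaike Information Criterion: -2*log-likelihood + 2*(number of parameters)."""
    return -2.0 * loglik + 2.0 * k

def best_by_aic(candidates):
    """candidates: name -> (loglik, n_params); returns the min-AIC model name."""
    return min(candidates, key=lambda m: aic(*candidates[m]))
```

Unlike a chain of accept/reject significance tests, this compares all candidates on one scale in a single holistic step.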
Fortune Cookies, Measurement Error, And Experimental Design, Greogry R. Hancock
Fortune Cookies, Measurement Error, And Experimental Design, Gregory R. Hancock
Journal of Modern Applied Statistical Methods
This article pertains to the theoretical and practical detriments of measurement error in traditional univariate and multivariate experimental design, and points toward modern methods that facilitate greater accuracy in effect size estimates and power in hypothesis testing.
A Comparison Of Equivalence Testing In Combination With Hypothesis Testing And Effect Sizes, Christopher J. Mecklin
Journal of Modern Applied Statistical Methods
Equivalence testing, an alternative to testing for statistical significance, is little used in educational research. Equivalence testing is useful in situations where the researcher wishes to show that two means are equivalent for practical purposes, rather than merely not significantly different. A simulation study assessed the relationships between effect size, sample size, statistical significance, and statistical equivalence.
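A common way to test equivalence is the two one-sided tests (TOST) procedure: equivalence is declared only if the mean difference is shown to be both above -delta and below +delta. The sketch below uses a normal (z) approximation and a user-supplied equivalence margin; both are simplifying assumptions, not necessarily the article's setup.

```python
# Hedged sketch of the TOST equivalence procedure under a normal approximation.
from statistics import NormalDist

def tost_equivalence(mean1, mean2, se_diff, delta, alpha=0.05):
    """Declare equivalence if both one-sided tests reject at level alpha:
    H01: mu1 - mu2 <= -delta   and   H02: mu1 - mu2 >= +delta."""
    z = NormalDist()
    d = mean1 - mean2
    p_lower = 1 - z.cdf((d + delta) / se_diff)  # test of d <= -delta
    p_upper = z.cdf((d - delta) / se_diff)      # test of d >= +delta
    return max(p_lower, p_upper) < alpha
```

Note the asymmetry with significance testing: here a small observed difference with a tight standard error is what produces a rejection.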
Approximate Bayesian Confidence Intervals For The Variance Of A Gaussian Distribution, Vincent A. R. Camara
Journal of Modern Applied Statistical Methods
The aim of the present study is to obtain and compare confidence intervals for the variance of a Gaussian distribution. Considering respectively the squared-error and the Higgins-Tsokos loss functions, approximate Bayesian confidence intervals for the variance of a normal population are derived. Using normal data and SAS software, the obtained approximate Bayesian confidence intervals are then compared to the ones obtained with the well-known classical method. It is shown that the proposed approximate Bayesian approach relies only on the observations. The classical method, that uses the Chi-square statistic, does …
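The classical benchmark named in the abstract is the chi-square interval (n-1)S²/χ²₁₋α/₂ ≤ σ² ≤ (n-1)S²/χ²α/₂. A sketch follows; since the Python standard library has no chi-square quantile, the Wilson-Hilferty approximation is substituted here, which is my simplification, not part of the study.

```python
# Hedged sketch: classical chi-square confidence interval for a normal variance.
from statistics import NormalDist

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation:
    chi2_p(df) ~ df * (1 - 2/(9 df) + z_p * sqrt(2/(9 df)))^3."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3

def variance_ci(sample_var, n, alpha=0.05):
    df = n - 1
    lo = df * sample_var / chi2_quantile(1 - alpha / 2, df)
    hi = df * sample_var / chi2_quantile(alpha / 2, df)
    return lo, hi
```

The approximation is accurate for moderate degrees of freedom; exact quantiles would come from a statistical library.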
Using Zero-Inflated Count Regression Models To Estimate The Fertility Of U. S. Women, Dudley L. Poston Jr., Sherry L. Mckibben
Journal of Modern Applied Statistical Methods
In the modeling of count variables there is sometimes a preponderance of zero counts. This article concerns the estimation of Poisson regression models (PRM) and negative binomial regression models (NBRM) to predict the average number of children ever born (CEB) to women in the U.S. The PRM and NBRM will often under-predict zeros because they do not account for the zero counts of women who are not trying to have children. Analyses of the fertility of U.S. white and Mexican-origin women show that zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models perform better in many respects than the Poisson and negative binomial models. …
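The zero-inflated Poisson model mixes a point mass at zero (the "structural" zeros, e.g., women not trying to have children) with an ordinary Poisson count. A minimal sketch of its probability mass function, with illustrative parameter names:

```python
# Hedged sketch of the zero-inflated Poisson (ZIP) probability model:
# with probability pi the count is a structural zero; otherwise Poisson(lam).
from math import exp, factorial

def zip_pmf(y, pi, lam):
    pois = exp(-lam) * lam ** y / factorial(y)
    return pi * (y == 0) + (1 - pi) * pois
```

Note that zip_pmf(0, pi, lam) = pi + (1-pi)e^{-lam}, which is always at least the plain Poisson zero probability, capturing the excess zeros.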
Test Of Homogeneity For Umbrella Alternatives In Dose-Response Relationship For Poisson Variables, Chengjie Xiong, Yan Yan, Ming Ji
Journal of Modern Applied Statistical Methods
This article concerns the testing and estimation of a dose-response effect in medical studies. We study the statistical test of homogeneity against umbrella alternatives in a sequence of Poisson distributions associated with an ordered dose variable. We propose a test similar to Cochran-Armitage’s trend test and study the asymptotic null distribution and the power of the test. We also propose an estimator of the vertex point when the umbrella pattern is confirmed and study the performance of the estimator. A real data set pertaining to the number of visible revertant colonies associated with different doses of test agents in an …
Alphabet Letter Recognition And Emergent Literacy Abilities Of Rising Kindergarten Children Living In Low-Income Families, Stephanie Wehry
Journal of Modern Applied Statistical Methods
Alphabet letter recognition item responses from 1,299 rising kindergarten children from low-income families were used to determine the dimensionality of letter recognition ability. The rising kindergarteners were enrolled in preschool classrooms implementing a research-based early literary curriculum. Item responses from the TERA-3 subtests were also analyzed. Results indicated that alphabet letter recognition was unitary. The ability of boys and younger children was lower than that of girls and older children. Child-level letter recognition was highly associated with TERA-3 measures of letter knowledge and conventions of print. Classroom-level mean letter recognition ability accounted for most of the variance in classroom mean TERA-3 scores.
A Note On Mles For Normal Distribution Parameters Based On Disjoint Partial Sums Of A Random Sample, W. J. Hurley
Journal of Modern Applied Statistical Methods
Maximum likelihood estimators are computed for the parameters of a normal distribution based on disjoint partial sums of a random sample. It has application in the disaggregation of financial data.
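A hedged reconstruction of the setting: only disjoint block sums S_j of n_j observations are available, and S_j ~ N(n_j μ, n_j σ²). Maximizing the resulting likelihood gives the estimators below; this is the standard derivation for that model, not necessarily the note's exact formulation.

```python
# Hedged sketch: normal MLEs from disjoint partial sums.
# mu_hat  = (sum of all block sums) / (total sample size)
# var_hat = (1/k) * sum_j (S_j - n_j*mu_hat)^2 / n_j   over k blocks

def normal_mle_from_partial_sums(sums, sizes):
    N = sum(sizes)
    k = len(sums)
    mu = sum(sums) / N
    var = sum((s - n * mu) ** 2 / n for s, n in zip(sums, sizes)) / k
    return mu, var
```

Only the block sums enter, which is what makes the approach usable for disaggregating financial data reported as period totals.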
Deconstructing Arguments From The Case Against Hypothesis Testing, Shlomo S. Sawilowsky
Journal of Modern Applied Statistical Methods
The main purpose of this article is to contest the propositions that (1) hypothesis tests should be abandoned in favor of confidence intervals, and (2) science has not benefited from hypothesis testing. The minor purpose is to propose that (1) descriptive statistics, graphics, and effect sizes do not obviate the need for hypothesis testing, (2) significance testing (reporting p values and leaving it to the reader to determine significance) is subjective and outside the realm of the scientific method, and (3) Bayesian and qualitative methods should be used for Bayesian and qualitative research studies, respectively.
Conventional And Robust Paired And Independent-Samples T Tests: Type I Error And Power Rates, Katherine Fradette, H. J. Keselman, Lisa Lix, James Algina, Rand R. Wilcox
Journal of Modern Applied Statistical Methods
Monte Carlo methods were used to examine Type I error and power rates of 2 versions (conventional and robust) of the paired and independent-samples t tests under nonnormality. The conventional (robust) versions employed least squares means and variances (trimmed means and Winsorized variances) to test for differences between groups.
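The "robust" version described, trimmed means with Winsorized variances, is commonly implemented as a Yuen-type t statistic. A sketch under assumptions of my own (20% default trimming, the two-sample statistic only, no degrees-of-freedom or p-value computation):

```python
# Hedged sketch of a Yuen-type two-sample t statistic using
# trimmed means and Winsorized variances.
from math import sqrt

def trimmed_mean(xs, g):
    s = sorted(xs)
    return sum(s[g:len(s) - g]) / (len(s) - 2 * g)

def winsorized_var(xs, g):
    s = sorted(xs)
    # pull the g smallest/largest values in to the remaining extremes
    w = [min(max(x, s[g]), s[len(s) - g - 1]) for x in xs]
    m = sum(w) / len(w)
    return sum((v - m) ** 2 for v in w) / (len(w) - 1)

def yuen_t(x, y, trim=0.2):
    gx, gy = int(trim * len(x)), int(trim * len(y))
    hx, hy = len(x) - 2 * gx, len(y) - 2 * gy  # effective sample sizes
    dx = (len(x) - 1) * winsorized_var(x, gx) / (hx * (hx - 1))
    dy = (len(y) - 1) * winsorized_var(y, gy) / (hy * (hy - 1))
    return (trimmed_mean(x, gx) - trimmed_mean(y, gy)) / sqrt(dx + dy)
```

Trimming discards the extreme g observations in each tail, while Winsorizing replaces them, which is why the variance term uses the Winsorized rather than the trimmed sample.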
Fitting Generalized Linear Mixed Models For Point-Referenced Spatial Data, Armin Gemperli, Penelope Vounatsou
Journal of Modern Applied Statistical Methods
Non-Gaussian point-referenced spatial data are frequently modeled using generalized linear mixed models (GLMM) with location-specific random effects. Spatial dependence can be introduced in the covariance matrix of the random effects. Maximum likelihood-based or Bayesian estimation implemented via Markov chain Monte Carlo (MCMC) for such models is computationally demanding especially for large sample sizes because of the large number of random effects and the inversion of the covariance matrix involved in the likelihood. We review three fitting procedures, the Penalized Quasi Likelihood method, the MCMC, and the Sampling-Importance-Resampling method. They are assessed in terms of estimation accuracy, ease of implementation, and …
Jmasm9: Converting Kendall's Tau For Correlational Or Meta-Analytic Analyses, David A. Walker
Journal of Modern Applied Statistical Methods
Expanding on past research, this study provides researchers with a detailed table for use in meta-analytic applications when engaged in assorted examinations of various r-related statistics, such as Kendall’s tau (τ) and Cohen’s d, that estimate the magnitude of experimental or observational effect. A program to convert from the lesser-used tau coefficient to other effect size indices when conducting correlational or meta-analytic analyses is presented.
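One well-known chain of conversions in this spirit uses Greiner's relation, r = sin(πτ/2), which is exact under bivariate normality, followed by the standard r-to-d formula. The sketch below illustrates that chain; it does not reproduce the article's specific table or program.

```python
# Hedged sketch: converting Kendall's tau to a Pearson-type r and on to Cohen's d.
from math import sin, pi, sqrt

def tau_to_r(tau):
    """Greiner's relation (exact for bivariate normal data)."""
    return sin(pi * tau / 2.0)

def r_to_d(r):
    """Standard conversion from a correlation effect size to Cohen's d."""
    return 2.0 * r / sqrt(1.0 - r * r)
```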
Joint Modeling And Estimation For Recurrent Event Processes And Failure Time Data, Chiung-Yu Huang, Mei-Cheng Wang
Johns Hopkins University, Dept. of Biostatistics Working Papers
Recurrent event data are commonly encountered in longitudinal follow-up studies related to biomedical science, econometrics, reliability, and demography. In many studies, recurrent events serve as important measurements for evaluating disease progression, health deterioration, or insurance risk. When analyzing recurrent event data, an independent censoring condition is typically required for the construction of statistical methods. Nevertheless, in some situations, the terminating time for observing recurrent events could be correlated with the recurrent event process and, as a result, the assumption of independent censoring is violated. In this paper, we consider joint modeling of a recurrent event process and a failure time …
P* Index Of Segregation: Distribution Under Reassignment, Charles F. Bond, F. D. Richard
Journal of Modern Applied Statistical Methods
Students of intergroup relations have measured segregation with a P* index. In this article, we describe the distribution of this index under a stochastic model. We derive exact, closed-form expressions for the mean, variance, and skewness of P* under random segregation. These yield equivalent expressions for a second segregation index: η2. Our analytic results reveal some of the distributional properties of these indices, inform new standardizations of the indices, and enable small-sample significance testing. Two illustrative examples are presented.
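For readers unfamiliar with the index itself, the P* exposure index in Lieberson's formulation is xP*y = Σᵢ (xᵢ/X)(yᵢ/tᵢ): the average share of group y in the units, as experienced by members of group x. A minimal sketch (the distributional results under reassignment are not reproduced here):

```python
# Hedged sketch: P* exposure index from unit-level counts.

def p_star(x_counts, y_counts, totals):
    """x_counts, y_counts: per-unit counts of groups x and y;
    totals: per-unit total population. Returns xP*y."""
    X = sum(x_counts)
    return sum((xi / X) * (yi / ti)
               for xi, yi, ti in zip(x_counts, y_counts, totals))
```

Complete segregation of the two groups drives the index to 0; proportional mixing drives it toward group y's overall share.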
A Critical Examination Of The Use Of Preliminary Tests In Two-Sample Tests Of Location, Kimberly T. Perry
Journal of Modern Applied Statistical Methods
This paper explores the appropriateness of testing the equality of two means using either a t test, the Welch test, or the Wilcoxon-Mann-Whitney test for two independent samples based on the results of using two classes of preliminary tests (i.e., tests for population variance equality and symmetry in underlying distributions).
Confidence Intervals For P(X Less Than Y) In The Exponential Case With Common Location Parameter, Ayman Baklizi
Journal of Modern Applied Statistical Methods
The problem considered is interval estimation of the stress-strength reliability R = P(X < Y), where X and Y are independent exponential random variables with scale parameters θ and λ respectively and a common location parameter μ. Several types of asymptotic, approximate and bootstrap intervals are investigated. Performances are assessed using simulation techniques and compared in terms of attainment of the nominal confidence level, symmetry of lower and upper error rates, and expected length. Recommendations concerning their usage are given.
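In this setup R reduces to λ/(θ + λ), since the common location shifts both variables and cancels. The sketch below uses a plug-in estimator with a percentile bootstrap interval; that resampling scheme is one simple choice for illustration, not the paper's full comparison.

```python
# Hedged sketch: plug-in estimate and percentile bootstrap CI for
# R = P(X < Y) with exponentials sharing a common location.
import random

def r_hat(xs, ys):
    mu = min(min(xs), min(ys))                  # estimate of the common location
    theta = sum(x - mu for x in xs) / len(xs)   # scale estimates
    lam = sum(y - mu for y in ys) / len(ys)
    return lam / (theta + lam)

def bootstrap_ci(xs, ys, B=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    stats = sorted(r_hat([rng.choice(xs) for _ in xs],
                         [rng.choice(ys) for _ in ys]) for _ in range(B))
    return stats[int(B * alpha / 2)], stats[int(B * (1 - alpha / 2)) - 1]
```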
Random Regression Models Based On The Elliptically Contoured Distribution Assumptions With Applications To Longitudinal Data, Alfred A. Bartolucci, Shimin Zheng, Sejong Bae, Karan P. Singh
Journal of Modern Applied Statistical Methods
We generalize Lyles et al.’s (2000) random regression models for longitudinal data, accounting for both undetectable values and informative drop-outs in the distribution assumptions. Our models are constructed on the generalized multivariate theory which is based on the Elliptically Contoured Distribution (ECD). The estimation of the fixed parameters in the random regression models are invariant under the normal or the ECD assumptions. For the Human Immunodeficiency Virus Epidemiology Research Study data, ECD models fit the data better than classical normal models according to the Akaike (1974) Information Criterion. We also note that both univariate distributions of the random intercept and …