Open Access. Powered by Scholars. Published by Universities.®



Articles 1 - 30 of 102

Full-Text Articles in Physical Sciences and Mathematics

Robust Likelihood-Based Analysis Of Multivariate Data With Missing Values, Rod Little, An Hyonggin Dec 2003

The University of Michigan Department of Biostatistics Working Paper Series

The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter …


Multiple Testing. Part Ii. Step-Down Procedures For Control Of The Family-Wise Error Rate, Mark J. Van Der Laan, Sandrine Dudoit, Katherine S. Pollard Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise error rate (FWER): the first procedure is based on maxima of test statistics (step-down maxT), while the second relies on minima of unadjusted p-values (step-down minP). A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which the …
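
The step-down maxT idea can be sketched in a few lines — a hypothetical `stepdown_maxT` helper, assuming one already has observed statistics and a matrix of resampled null statistics (the paper's construction of the test statistics null distribution is more general than this):

```python
import numpy as np

def stepdown_maxT(t_obs, t_null):
    """Step-down maxT adjusted p-values.

    t_obs  : (m,) absolute observed test statistics
    t_null : (B, m) absolute test statistics under a resampled null
    """
    m = len(t_obs)
    order = np.argsort(-t_obs)                 # most significant first
    adj = np.empty(m)
    for rank, j in enumerate(order):
        # maximum over the hypotheses not yet removed by the step-down
        max_null = t_null[:, order[rank:]].max(axis=1)
        adj[j] = np.mean(max_null >= t_obs[j])
    for rank in range(1, m):                   # enforce monotone adjusted p-values
        adj[order[rank]] = max(adj[order[rank]], adj[order[rank - 1]])
    return adj
```

The step-down minP procedure follows the same skeleton with minima of unadjusted p-values in place of maxima of statistics.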


Multiple Testing. Part I. Single-Step Procedures For Control Of General Type I Error Rates, Sandrine Dudoit, Mark J. Van Der Laan, Katherine S. Pollard Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

The present article proposes general single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors, such as the generalized family-wise error rate. A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which single-step common-quantile and common-cut-off procedures asymptotically …


Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …


Kernel Estimation Of Rate Function For Recurrent Event Data, Chin-Tsang Chiang, Mei-Cheng Wang, Chiung-Yu Huang Dec 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Recurrent event data are largely characterized by the rate function, but smoothing techniques for estimating the rate function have never been rigorously developed or studied in the statistical literature. This paper considers the moment and least squares methods for estimating the rate function from recurrent event data. With an independent censoring assumption on the recurrent event process, we study statistical properties of the proposed estimators and propose bootstrap procedures for bandwidth selection and for the approximation of confidence intervals in the estimation of the occurrence rate function. It is identified that the moment method without resmoothing via a smaller bandwidth …
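
As a rough illustration of the moment-style approach, a Gaussian-kernel rate estimate under the strong simplifying assumption that every subject is observed over the whole time window (the paper's estimators handle censoring properly) might look like:

```python
import numpy as np

def kernel_rate(event_times, n_subjects, grid, h):
    """Kernel (Gaussian) estimate of the occurrence rate function.

    Assumes every subject is at risk over the entire window, so the
    risk-set size is constant at n_subjects -- a simplification of the
    censoring-adjusted estimators studied in the paper.
    """
    t = np.asarray(event_times)[None, :]       # pooled event times
    g = np.asarray(grid, dtype=float)[:, None]
    K = np.exp(-0.5 * ((g - t) / h) ** 2) / np.sqrt(2 * np.pi)
    return K.sum(axis=1) / (n_subjects * h)
```

With events pooled from a homogeneous process, the estimate at an interior time point should recover the per-subject rate.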


Unified Cross-Validation Methodology For Selection Among Estimators And A General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities And Examples, Mark J. Van Der Laan, Sandrine Dudoit Nov 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

In Part I of this article we propose a general cross-validation criterion for selecting among a collection of estimators of a particular parameter of interest based on n i.i.d. observations. It is assumed that the parameter of interest minimizes the expectation (w.r.t. the distribution of the observed data structure) of a particular loss function of a candidate parameter value and the observed data structure, possibly indexed by a nuisance parameter. The proposed cross-validation criterion is defined as the empirical mean over the validation sample of the loss function at the parameter estimate based on the training sample, averaged over …
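
Under squared-error loss the criterion reduces to the familiar V-fold cross-validated risk; a minimal sketch (hypothetical `cv_select` helper, not the paper's notation):

```python
import numpy as np

def cv_select(estimators, loss, X, V=5, seed=0):
    """Pick the estimator minimizing V-fold cross-validated risk.

    estimators : list of fit functions mapping a training array to a
                 fitted parameter value
    loss       : loss(theta, x) for a single observation x
    X          : (n,) array of i.i.d. observations
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), V)
    risks = []
    for fit in estimators:
        r = 0.0
        for v in range(V):
            train = np.concatenate([folds[u] for u in range(V) if u != v])
            theta = fit(X[train])                      # fit on training sample
            r += sum(loss(theta, x) for x in X[folds[v]])  # score on validation sample
        risks.append(r / len(X))
    return int(np.argmin(risks)), risks
```

For data centered away from zero, the selector prefers the mean or median over a degenerate constant estimator.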


Weighting Adjustments For Unit Nonresponse With Multiple Outcome Variables, Sonya L. Vartivarian, Rod Little Nov 2003

The University of Michigan Department of Biostatistics Working Paper Series

Weighting is a common form of unit nonresponse adjustment in sample surveys where entire questionnaires are missing due to noncontact or refusal to participate. Weights are inversely proportional to the probability of selection and response. A common approach computes the response weight adjustment cells based on covariate information. When the number of cells thus created is too large, a coarsening method such as response propensity stratification can be applied to reduce the number of adjustment cells. Simulations in Vartivarian and Little (2002) indicate improved efficiency and robustness of weighting adjustments based on the joint classification of the sample by two …
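
A toy version of response-propensity stratification — hypothetical names, quintile strata, and respondents weighted by the inverse of the observed stratum response rate:

```python
import numpy as np

def rp_weights(p_hat, responded, n_strata=5):
    """Nonresponse weights via response-propensity stratification.

    p_hat     : (n,) estimated response propensities for the full sample
    responded : (n,) boolean indicator of unit response
    Respondents in each propensity stratum receive weight
    1 / (observed stratum response rate); nonrespondents get 0.
    """
    edges = np.quantile(p_hat, np.linspace(0, 1, n_strata + 1))
    s = np.clip(np.searchsorted(edges, p_hat, side="right") - 1,
                0, n_strata - 1)
    w = np.zeros(len(p_hat))
    for j in range(n_strata):
        m = s == j
        rate = responded[m].mean()     # observed response rate in stratum j
        w[m & responded] = 1.0 / rate
    return w
```

By construction the respondent weights sum back to the full sample size, the usual bookkeeping check for cell-based adjustments.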


Estimating Predictors For Long- Or Short-Term Survivors, Lu Tian, Wei Wang, L. J. Wei Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.


Smooth Quantile Ratio Estimation With Regression: Estimating Medical Expenditures For Smoking Attributable Diseases, Francesca Dominici, Scott L. Zeger Nov 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this paper we introduce a semi-parametric regression model for estimating the difference in the expected value of two positive and highly skewed random variables as a function of covariates. Our method extends Smooth Quantile Ratio Estimation (SQUARE), a novel estimator of the mean difference of two positive random variables, to a regression model.

The methodological development of this paper is motivated by a common problem in econometrics where we are interested in estimating the difference in the average expenditures between two populations, say with and without a disease, taking covariates into account. Let Y1 and Y2 be two positive …


A Nonparametric Comparison Of Conditional Distributions With Nonnegligible Cure Fractions, Yi Li, Jin Feng Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.


Survival Analysis With Heterogeneous Covariate Measurement Error, Yi Li, Louise Ryan Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.


Loss Function Based Ranking In Two-Stage, Hierarchical Models, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway Nov 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Several authors have studied the performance of optimal, squared error loss (SEL) estimated ranks. Though these are effective, in many applications interest focuses on identifying the relatively good (e.g., in the upper 10%) or relatively poor performers. We construct loss functions that address this goal and evaluate candidate rank estimates, some of which optimize specific loss functions. We study performance for a fully parametric hierarchical model with a Gaussian prior and Gaussian sampling distributions, evaluating performance for several loss functions. Results show that though SEL-optimal ranks and percentiles do not specifically focus on classifying with respect to a percentile cut …
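
Under SEL, the optimal integer rank of unit k is the rank of its posterior mean rank — a standard result; a small sketch assuming posterior draws for the unit parameters are available:

```python
import numpy as np

def sel_ranks(post):
    """Squared-error-loss optimal ranks from posterior samples.

    post : (S, K) array of posterior draws for K units. The SEL-optimal
    rank of unit k is the rank of its posterior mean rank.
    """
    # rank of each unit within each posterior draw (1 = smallest)
    per_draw = post.argsort(axis=1).argsort(axis=1) + 1
    mean_rank = per_draw.mean(axis=0)
    return mean_rank.argsort().argsort() + 1
```

Loss functions targeting, say, the upper 10% replace the mean rank with classification-oriented summaries, which is the direction the paper develops.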


Statistical Inference For Infinite Dimensional Parameters Via Asymptotically Pivotal Estimating Functions, Meredith A. Goldwasser, Lu Tian, L. J. Wei Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.


Model Comparisons Using Information Measures, C. Mitchell Dayton Nov 2003

Journal of Modern Applied Statistical Methods

Methodologists have criticized the use of significance tests in the behavioral sciences but have failed to provide alternative data analysis strategies that appeal to applied researchers. For purposes of comparing alternate models for data, information-theoretic measures such as Akaike's AIC have advantages in comparison with significance tests. Model-selection procedures based on a min(AIC) strategy, for example, are holistic rather than dependent upon a series of sometimes contradictory binary (accept/reject) decisions.
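
The min(AIC) strategy is easy to state: compute AIC = 2k − 2·(maximized log-likelihood) for each candidate model and keep the smallest. A toy comparison with hypothetical log-likelihoods:

```python
def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * maximized log-likelihood."""
    return 2 * k - 2 * loglik

# two hypothetical candidate models fit to the same data
candidates = {"null": aic(-120.0, k=1), "trend": aic(-110.5, k=2)}
best = min(candidates, key=candidates.get)   # holistic pick, no p-values
```

The extra parameter in the richer model is accepted only because its likelihood gain outweighs the 2k penalty.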


Fortune Cookies, Measurement Error, And Experimental Design, Gregory R. Hancock Nov 2003

Journal of Modern Applied Statistical Methods

This article pertains to the theoretical and practical detriments of measurement error in traditional univariate and multivariate experimental design, and points toward modern methods that facilitate greater accuracy in effect size estimates and power in hypothesis testing.


A Comparison Of Equivalence Testing In Combination With Hypothesis Testing And Effect Sizes, Christopher J. Mecklin Nov 2003

Journal of Modern Applied Statistical Methods

Equivalence testing, an alternative to testing for statistical significance, is little used in educational research. Equivalence testing is useful in situations where the researcher wishes to show that two means are equivalent within a specified margin, rather than merely not significantly different. A simulation study assessed the relationships between effect size, sample size, statistical significance, and statistical equivalence.
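
Equivalence is commonly assessed with two one-sided tests (TOST); a simple sketch with a hypothetical ±delta margin (pooled degrees of freedom used for brevity; a Welch-type correction is also common):

```python
import numpy as np
from scipy import stats

def tost(x, y, delta):
    """Two one-sided tests for equivalence of two means within +/-delta.

    Returns the larger of the two one-sided p-values; equivalence is
    declared at level alpha when that p-value falls below alpha.
    """
    nx, ny = len(x), len(y)
    diff = np.mean(x) - np.mean(y)
    se = np.sqrt(np.var(x, ddof=1) / nx + np.var(y, ddof=1) / ny)
    df = nx + ny - 2                     # simple choice; Welch df also possible
    p_lower = stats.t.sf((diff + delta) / se, df)   # H0: diff <= -delta
    p_upper = stats.t.cdf((diff - delta) / se, df)  # H0: diff >= +delta
    return max(p_lower, p_upper)
```

Two samples from the same population pass the equivalence test at a modest margin, while a large mean shift fails it.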


Approximate Bayesian Confidence Intervals For The Variance Of A Gaussian Distribution, Vincent A. R. Camara Nov 2003

Journal of Modern Applied Statistical Methods

The aim of the present study is to obtain and compare confidence intervals for the variance of a Gaussian distribution. Considering respectively the squared-error and the Higgins-Tsokos loss functions, approximate Bayesian confidence intervals for the variance of a normal population are derived. Using normal data and SAS software, the approximate Bayesian confidence intervals are then compared to those obtained with the well-known classical method. It is shown that the proposed approximate Bayesian approach relies only on the observations. The classical method, which uses the Chi-square statistic, does …
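
For reference, the classical chi-square interval that such Bayesian intervals are compared against (a standard result, not the paper's Bayesian derivation):

```python
import numpy as np
from scipy import stats

def classical_var_ci(x, conf=0.95):
    """Classical chi-square confidence interval for a normal variance:
    (n-1) s^2 / chi2_{1-a/2} < sigma^2 < (n-1) s^2 / chi2_{a/2}."""
    n = len(x)
    s2 = np.var(x, ddof=1)
    a = (1 - conf) / 2
    lo = (n - 1) * s2 / stats.chi2.ppf(1 - a, n - 1)
    hi = (n - 1) * s2 / stats.chi2.ppf(a, n - 1)
    return lo, hi
```

The interval always brackets the sample variance, since the upper chi-square quantile exceeds n − 1 and the lower one falls below it.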


Using Zero-Inflated Count Regression Models To Estimate The Fertility Of U. S. Women, Dudley L. Poston Jr., Sherry L. Mckibben Nov 2003

Journal of Modern Applied Statistical Methods

In the modeling of count variables there is sometimes a preponderance of zero counts. This article concerns the estimation of Poisson regression models (PRM) and negative binomial regression models (NBRM) to predict the average number of children ever born (CEB) to women in the U.S. The PRM and NBRM will often under-predict zeros because they do not consider the zero counts of women who are not trying to have children. Analyses of the fertility of U.S. white and Mexican-origin women show that zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models perform better in many respects than the Poisson and negative binomial models. …
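
The ZIP mixture is simple to write down: with probability pi the count is a structural zero, otherwise it is Poisson. A hypothetical `zip_logpmf` helper (the regression models in the article let both pi and lam depend on covariates):

```python
import numpy as np
from scipy import stats

def zip_logpmf(y, lam, pi):
    """Log-pmf of a zero-inflated Poisson: with probability pi a
    structural zero, otherwise Poisson(lam)."""
    y = np.asarray(y)
    logp = stats.poisson.logpmf(y, lam) + np.log1p(-pi)
    zero = np.log(pi + (1 - pi) * np.exp(-lam))   # mixed zero mass
    return np.where(y == 0, zero, logp)
```

The inflated zero mass pi + (1 − pi)e^(−lam) is exactly why the plain Poisson under-predicts zeros.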


Test Of Homogeneity For Umbrella Alternatives In Dose-Response Relationship For Poisson Variables, Chengjie Xiong, Yan Yan, Ming Ji Nov 2003

Journal of Modern Applied Statistical Methods

This article concerns the testing and estimation of a dose-response effect in medical studies. We study the statistical test of homogeneity against umbrella alternatives in a sequence of Poisson distributions associated with an ordered dose variable. We propose a test similar to Cochran-Armitage’s trend test and study the asymptotic null distribution and the power of the test. We also propose an estimator to the vertex point when the umbrella pattern is confirmed and study the performance of the estimator. A real data set pertaining to the number of visible revertant colonies associated with different doses of test agents in an …


Alphabet Letter Recognition And Emergent Literacy Abilities Of Rising Kindergarten Children Living In Low-Income Families, Stephanie Wehry Nov 2003

Journal of Modern Applied Statistical Methods

Alphabet letter recognition item responses from 1,299 rising kindergarten children from low-income families were used to determine the dimensionality of letter recognition ability. The rising kindergarteners were enrolled in preschool classrooms implementing a research-based early literacy curriculum. Item responses from the TERA-3 subtests were also analyzed. Results indicated that alphabet letter recognition was unitary. The ability of boys and younger children was lower than that of girls and older children. Child-level letter recognition was highly associated with TERA-3 measures of letter knowledge and conventions of print. Classroom-level mean letter recognition ability accounted for most of the variance in classroom mean TERA-3 scores.


A Note On Mles For Normal Distribution Parameters Based On Disjoint Partial Sums Of A Random Sample, W. J. Hurley Nov 2003

Journal of Modern Applied Statistical Methods

Maximum likelihood estimators are computed for the parameters of a normal distribution based on disjoint partial sums of a random sample. It has application in the disaggregation of financial data.


Deconstructing Arguments From The Case Against Hypothesis Testing, Shlomo S. Sawilowsky Nov 2003

Journal of Modern Applied Statistical Methods

The main purpose of this article is to contest the propositions that (1) hypothesis tests should be abandoned in favor of confidence intervals, and (2) science has not benefited from hypothesis testing. The minor purpose is to propose that (1) descriptive statistics, graphics, and effect sizes do not obviate the need for hypothesis testing, (2) significance testing (reporting p values and leaving it to the reader to determine significance) is subjective and outside the realm of the scientific method, and (3) Bayesian and qualitative methods should be used for Bayesian and qualitative research studies, respectively.


Conventional And Robust Paired And Independent-Samples T Tests: Type I Error And Power Rates, Katherine Fradette, H. J. Keselman, Lisa Lix, James Algina, Rand R. Wilcox Nov 2003

Journal of Modern Applied Statistical Methods

Monte Carlo methods were used to examine Type I error and power rates of 2 versions (conventional and robust) of the paired and independent-samples t tests under nonnormality. The conventional (robust) versions employed least squares means and variances (trimmed means and Winsorized variances) to test for differences between groups.
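
The robust version described here is Yuen's test; a sketch using 20% trimmed means and Winsorized sums of squares (standard formulas, not the authors' code):

```python
import numpy as np
from scipy import stats

def yuen_t(x, y, trim=0.2):
    """Yuen's two-sample t test with trimmed means and Winsorized
    variances -- the 'robust' version described in the abstract."""
    def winsor_ss(a, g):
        a = np.sort(a)
        a = np.clip(a, a[g], a[-g - 1])   # Winsorize g values per tail
        return np.sum((a - a.mean()) ** 2)
    nx, ny = len(x), len(y)
    gx, gy = int(trim * nx), int(trim * ny)
    hx, hy = nx - 2 * gx, ny - 2 * gy     # effective (trimmed) sample sizes
    tx, ty = stats.trim_mean(x, trim), stats.trim_mean(y, trim)
    dx = winsor_ss(x, gx) / (hx * (hx - 1))
    dy = winsor_ss(y, gy) / (hy * (hy - 1))
    t = (tx - ty) / np.sqrt(dx + dy)
    df = (dx + dy) ** 2 / (dx**2 / (hx - 1) + dy**2 / (hy - 1))
    return t, 2 * stats.t.sf(abs(t), df)
```

Identical samples give t = 0, and a two-unit mean shift is detected easily despite the trimming.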


Fitting Generalized Linear Mixed Models For Point-Referenced Spatial Data, Armin Gemperli, Penelope Vounatsou Nov 2003

Journal of Modern Applied Statistical Methods

Non-Gaussian point-referenced spatial data are frequently modeled using generalized linear mixed models (GLMM) with location-specific random effects. Spatial dependence can be introduced in the covariance matrix of the random effects. Maximum likelihood-based or Bayesian estimation implemented via Markov chain Monte Carlo (MCMC) for such models is computationally demanding, especially for large sample sizes, because of the large number of random effects and the inversion of the covariance matrix involved in the likelihood. We review three fitting procedures: the Penalized Quasi-Likelihood method, MCMC, and the Sampling-Importance-Resampling method. They are assessed in terms of estimation accuracy, ease of implementation, and …


Jmasm9: Converting Kendall’S Tau For Correlational Or Meta-Analytic Analyses, David A. Walker Nov 2003

Journal of Modern Applied Statistical Methods

Expanding on past research, this study provides researchers with a detailed table for use in meta-analytic applications when engaged in assorted examinations of various r-related statistics, such as Kendall’s tau (τ) and Cohen’s d, that estimate the magnitude of experimental or observational effect. A program to convert from the lesser-used tau coefficient to other effect size indices when conducting correlational or meta-analytic analyses is presented.
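
Two of the standard conversions behind such a table are easy to state: Greiner's relation maps tau to the Pearson r implied under bivariate normality, and r converts to Cohen's d by the usual equal-n formula:

```python
import numpy as np

def tau_to_r(tau):
    """Greiner's relation: r = sin(pi * tau / 2), the Pearson correlation
    implied by Kendall's tau under bivariate normality."""
    return np.sin(np.pi * tau / 2)

def r_to_d(r):
    """Convert r to Cohen's d (equal-n two-group convention):
    d = 2r / sqrt(1 - r^2)."""
    return 2 * r / np.sqrt(1 - r**2)
```

For example, tau = 0.5 implies r ≈ 0.707, and r = 0.6 converts to d = 1.5.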


Joint Modeling And Estimation For Recurrent Event Processes And Failure Time Data, Chiung-Yu Huang, Mei-Cheng Wang Nov 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Recurrent event data are commonly encountered in longitudinal follow-up studies related to biomedical science, econometrics, reliability, and demography. In many studies, recurrent events serve as important measurements for evaluating disease progression, health deterioration, or insurance risk. When analyzing recurrent event data, an independent censoring condition is typically required for the construction of statistical methods. Nevertheless, in some situations, the terminating time for observing recurrent events could be correlated with the recurrent event process and, as a result, the assumption of independent censoring is violated. In this paper, we consider joint modeling of a recurrent event process and a failure time …


P* Index Of Segregation: Distribution Under Reassignment, Charles F. Bond, F. D. Richard Nov 2003

Journal of Modern Applied Statistical Methods

Students of intergroup relations have measured segregation with a P* index. In this article, we describe the distribution of this index under a stochastic model. We derive exact, closed-form expressions for the mean, variance, and skewness of P* under random segregation. These yield equivalent expressions for a second segregation index: η². Our analytic results reveal some of the distributional properties of these indices, inform new standardizations of the indices, and enable small-sample significance testing. Two illustrative examples are presented.
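
For the two-group case, the P* exposure index itself is a one-liner — the probability that a randomly chosen member of group x shares a unit with a member of group y (the paper's distributional results under random reassignment are not reproduced here):

```python
import numpy as np

def p_star(x, y):
    """P* exposure index for two groups across areal units:
    sum_i (x_i / X_total) * (y_i / t_i), with t_i the unit total."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    t = x + y                      # unit totals (two-group case)
    return np.sum((x / x.sum()) * (y / t))
```

Complete segregation gives P* = 0, while an even mix of two equal groups gives P* = 0.5.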


A Critical Examination Of The Use Of Preliminary Tests In Two-Sample Tests Of Location, Kimberly T. Perry Nov 2003

Journal of Modern Applied Statistical Methods

This paper explores the appropriateness of testing the equality of two means using either a t test, the Welch test, or the Wilcoxon-Mann-Whitney test for two independent samples based on the results of using two classes of preliminary tests (i.e., tests for population variance equality and symmetry in underlying distributions).


Confidence Intervals For P(X Less Than Y) In The Exponential Case With Common Location Parameter, Ayman Baklizi Nov 2003

Journal of Modern Applied Statistical Methods

The problem considered is interval estimation of the stress-strength reliability R = P(X < Y), where X and Y are independent exponential random variables with parameters θ and λ respectively and a common location parameter μ. Several types of asymptotic, approximate and bootstrap intervals are investigated. Performances are investigated using simulation techniques and compared in terms of attainment of the nominal confidence level, symmetry of lower and upper error rates, and expected length. Recommendations concerning their usage are given.
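
With rate parameters, the common location cancels and R = P(X < Y) reduces to rate_x / (rate_x + rate_y). A plug-in estimate and a percentile bootstrap interval (one of the interval types compared; `pxy_hat` and its crude pooled-minimum location estimate are illustrative, not the paper's estimators):

```python
import numpy as np

def pxy_hat(x, y):
    """Plug-in estimate of R = P(X < Y) for exponentials with a common
    location: shift by the pooled minimum, then R = rx / (rx + ry)."""
    mu = min(x.min(), y.min())          # crude common-location estimate
    rx = 1.0 / np.mean(x - mu)          # rate estimates after the shift
    ry = 1.0 / np.mean(y - mu)
    return rx / (rx + ry)

def boot_ci(x, y, B=2000, conf=0.95, seed=0):
    """Percentile bootstrap interval for R."""
    rng = np.random.default_rng(seed)
    reps = [pxy_hat(rng.choice(x, len(x)), rng.choice(y, len(y)))
            for _ in range(B)]
    a = (1 - conf) / 2
    return np.quantile(reps, [a, 1 - a])
```

With rates 2 and 1 the true R is 2/3, and the bootstrap interval sits strictly inside (0, 1).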


Random Regression Models Based On The Elliptically Contoured Distribution Assumptions With Applications To Longitudinal Data, Alfred A. Bartolucci, Shimin Zheng, Sejong Bae, Karan P. Singh Nov 2003

Journal of Modern Applied Statistical Methods

We generalize Lyles et al.’s (2000) random regression models for longitudinal data, accounting for both undetectable values and informative drop-outs in the distribution assumptions. Our models are constructed on the generalized multivariate theory which is based on the Elliptically Contoured Distribution (ECD). The estimation of the fixed parameters in the random regression models is invariant under the normal or the ECD assumptions. For the Human Immunodeficiency Virus Epidemiology Research Study data, ECD models fit the data better than classical normal models according to the Akaike (1974) Information Criterion. We also note that both univariate distributions of the random intercept and …