Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 30 of 105

Full-Text Articles in Statistics and Probability

Multiple Testing Procedures For Controlling Tail Probability Error Rates, Sandrine Dudoit, Mark J. Van Der Laan, Merrill D. Birkner Dec 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

The present article discusses and compares multiple testing procedures (MTP) for controlling Type I error rates defined as tail probabilities for the number (gFWER) and proportion (TPPFP) of false positives among the rejected hypotheses. Specifically, we consider the gFWER- and TPPFP-controlling MTPs proposed recently by Lehmann & Romano (2004) and in a series of four articles by Dudoit et al. (2004), van der Laan et al. (2004b,a), and Pollard & van der Laan (2004). The former Lehmann & Romano (2004) procedures are marginal, in the sense that they are based solely on the marginal distributions of the test statistics, i.e., …
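
As a rough illustration of what a marginal gFWER-controlling procedure looks like, the sketch below applies a generalized Bonferroni rule: reject every hypothesis whose unadjusted p-value is at most k·α/m, which bounds the probability of k or more false positives by α. The function name and the example p-values are hypothetical, and this is only a simple single-step rule of the kind associated with Lehmann & Romano, not a reproduction of any procedure compared in the article.

```python
def k_fwer_bonferroni(p_values, k, alpha):
    """Single-step generalized Bonferroni rule (illustrative sketch).

    Rejects hypothesis i when p_i <= k * alpha / m, where m is the number
    of hypotheses. With k = 1 this reduces to the ordinary Bonferroni rule.
    """
    m = len(p_values)
    threshold = k * alpha / m
    return [p <= threshold for p in p_values]


# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.02, 0.04, 0.5]
rejected = k_fwer_bonferroni(example_p, k=2, alpha=0.05)
```

Allowing k = 2 false positives doubles the per-hypothesis threshold relative to Bonferroni, which is where the gain in power comes from.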


Multiple Testing Procedures: R Multtest Package And Applications To Genomics, Katherine S. Pollard, Sandrine Dudoit, Mark J. Van Der Laan Dec 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures (MTP) for controlling a broad class of Type I error rates, in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics. The current version of multtest provides MTPs for tests concerning means, differences in means, and regression parameters in linear and Cox proportional hazards models. Procedures are provided to control Type I error rates defined as tail probabilities for arbitrary functions of the numbers of false positives and rejected hypotheses. These error rates include tail probabilities …


Semiparametric Regression In Capture-Recapture Modelling, O. Gimenez, C. Barbraud, Ciprian M. Crainiceanu, S. Jenouvrier, B.T. Morgan Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Capture-recapture models were developed to estimate survival using data arising from marking and monitoring wild animals over time. Variation in the survival process may be explained by incorporating relevant covariates. We develop nonparametric and semiparametric regression models for estimating survival in capture-recapture models. A fully Bayesian approach using MCMC simulations was employed to estimate the model parameters. The work is illustrated by a study of Snow petrels, in which survival probabilities are expressed as nonlinear functions of a climate covariate, using data from a 40-year study on marked individuals, nesting at Petrels Island, Terre Adelie.


Semi-Parametric Single-Index Two-Part Regression Models, Xiao-Hua Zhou, Hua Liang Dec 2004

UW Biostatistics Working Paper Series

In this paper, we proposed a semi-parametric single-index two-part regression model to weaken assumptions in parametric regression methods that were frequently used in the analysis of skewed data with additional zero values. The estimation procedure for the parameters of interest in the model was easily implemented. The proposed estimators were shown to be consistent and asymptotically normal. Through a simulation study, we showed that the proposed estimators have reasonable finite-sample performance. We illustrated the application of the proposed method in one real study on the analysis of health care costs.


A Bayesian Mixture Model Relating Dose To Critical Organs And Functional Complication In 3D Conformal Radiation Therapy, Tim Johnson, Jeremy Taylor, Randall K. Ten Haken, Avraham Eisbruch Nov 2004

The University of Michigan Department of Biostatistics Working Paper Series

A goal of radiation therapy is to deliver maximum dose to the target tumor while minimizing complications due to irradiation of critical organs. Technological advances in 3D conformal radiation therapy have allowed great strides in realizing this goal; however, complications may still arise. Critical organs may be adjacent to tumors or in the path of the radiation beam. Several mathematical models have been proposed that describe a relationship between dose and observed functional complication; however, only a few published studies have successfully fit these models to data using modern statistical methods that make efficient use of the data. One complication …


Choice Of Monitoring Mechanism For Optimal Nonparametric Functional Estimation For Binary Data, Nicholas P. Jewell, Mark J. Van Der Laan, Stephen Shiboski Nov 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

Optimal designs of dose levels in order to estimate parameters from a model for binary response data have a long and rich history. These designs are based on parametric models. Here we consider fully nonparametric models with interest focused on estimation of smooth functionals using plug-in estimators based on the nonparametric maximum likelihood estimator. An important application of the results is the derivation of the optimal choice of the monitoring time distribution function for current status observation of a survival distribution. The optimal choice depends in a simple way on the dose response function and the form of the functional. …


Semiparametric Binary Regression Under Monotonicity Constraints, Moulinath Banerjee, Pinaki Biswas, Debashis Ghosh Nov 2004

The University of Michigan Department of Biostatistics Working Paper Series

Summary: We study a binary regression model where the response variable $\Delta$ is the indicator of an event of interest (for example, the incidence of cancer) and the set of covariates can be partitioned as $(X,Z)$ where $Z$ (real valued) is the covariate of primary interest and $X$ (vector valued) denotes a set of control variables. For any fixed $X$, the conditional probability of the event of interest is assumed to be a monotonic function of $Z$. The effect of the control variables is captured by a regression parameter $\beta$. We show that the baseline conditional probability function (corresponding to …


Deletion/Substitution/Addition Algorithm For Partitioning The Covariate Space In Prediction, Annette Molinaro, Mark J. Van Der Laan Nov 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a new method for predicting censored (and non-censored) clinical outcomes from a highly-complex covariate space. Previously we suggested a unified strategy for predictor construction, selection, and performance assessment. Here we introduce a new algorithm which generates a piecewise constant estimation sieve of candidate predictors based on an intensive and comprehensive search over the entire covariate space. This algorithm allows us to elucidate interactions and correlation patterns in addition to main effects.


Confidence Intervals On Subsets May Be Misleading, Juliet Popper Shaffer Nov 2004

Journal of Modern Applied Statistical Methods

A combination of hypothesis testing and confidence interval construction is often used in social and behavioral science studies. Sometimes confidence intervals are computed or reported only if a null hypothesis is rejected, perhaps to see whether the range of values is of practical importance. Sometimes they are constructed or reported only if a null hypothesis is accepted, in order to assess the range of plausible nonnull values due to inadequate power to detect them. Even if always computed, they are interpreted differently, depending on whether the null value is or is not included. Furthermore, many studies in which the null …


Confidence Elicitation And Anchoring In The Respondent-Generated Intervals (RGI) Protocol, Liping Chu, S. James Press, Judith M. Tanur Nov 2004

Journal of Modern Applied Statistical Methods

The Respondent-Generated Intervals protocol (RGI) has been used to have respondents recall the answer to a factual question by giving not only a point estimate but also bounds within which they feel it is almost certain that the true value of the quantity being reported upon falls. The RGI protocol is elaborated in this article with the goal of improving the accuracy of the estimators by introducing cueing mechanisms to direct confident (and thus presumably accurate) respondents to give shorter intervals and less confident (and thus presumably less accurate) respondents to give longer ones.


Assessing Treatment Effects In Randomized Longitudinal Two-Group Designs With Missing Observations, James Algina, H. J. Keselman Nov 2004

Journal of Modern Applied Statistical Methods

SAS’s PROC MIXED can be problematic when analyzing data from randomized longitudinal two-group designs when observations are missing over time. Overall (1996, 1999) and colleagues found a number of procedures that are effective in controlling the number of false positives (Type I errors) and are yet sensitive (powerful) to detect treatment effects. Two favorable methods incorporate time in study and baseline scores to model the missing data mechanism; one method was a single-stage PROC MIXED ANCOVA solution and the other was a two-stage endpoint analysis using the change scores as dependent scores. Because the two-stage approach can lack sensitivity to …


An Overview Of The Respondent-Generated Intervals (RGI) Approach To Sample Surveys, S. James Press, Judith M. Tanur Nov 2004

Journal of Modern Applied Statistical Methods

This article brings together many years of research on the Respondent-Generated Intervals (RGI) approach to recall in factual sample surveys. Additionally presented is new research on the use of RGI in opinion surveys and the use of RGI with gamma-distributed data. The research combines Bayesian hierarchical modeling with various cognitive aspects of sample surveys.


Multivariate Contrasts For Repeated Measures Designs Under Assumption Violations, Lisa M. Lix, Aynslie M. Hinds Nov 2004

Journal of Modern Applied Statistical Methods

Conventional and approximate degrees of freedom procedures for testing multivariate interaction contrasts in groups by trials repeated measures designs were compared under assumption violation conditions. Procedures were based on either least-squares or robust estimators. Power generally favored test procedures based on robust estimators for non-normal distributions, but was influenced by the degree of departure from normality, definition of power, and magnitude of the multivariate effect size.


Modeling Incomplete Longitudinal Data, Hakan Demirtas Nov 2004

Journal of Modern Applied Statistical Methods

This article presents a review of popular parametric, semiparametric and ad-hoc approaches for analyzing incomplete longitudinal data.


Variance Stabilizing Power Transformation For Time Series, Victor M. Guerrero, Rafael Perera Nov 2004

Journal of Modern Applied Statistical Methods

A confidence interval was derived for the index of a power transformation that stabilizes the variance of a time-series. The process starts from a model-independent procedure that minimizes a coefficient of variation to yield a point estimate of the transformation index. The confidence coefficient of the interval is calibrated through a simulation.
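
One way to read "minimizes a coefficient of variation" is sketched below, under assumptions that are not stated in the abstract: the series is split into consecutive blocks, each block's standard deviation is divided by its mean raised to the power 1 − λ, and λ is chosen on a grid to make those ratios as nearly constant as possible. The function name, block length, grid, and example series are all hypothetical, and the interval construction described in the abstract is not attempted.

```python
from statistics import fmean, stdev


def power_index_by_cv(series, block_len, grid=None):
    """Grid search for a variance-stabilizing power index (illustrative sketch).

    Assumes a strictly positive series, block_len >= 2, and at least two
    complete blocks. Returns the grid value of lam that minimizes the
    coefficient of variation of sd_h / mean_h ** (1 - lam) across blocks.
    """
    if grid is None:
        grid = [i / 10 for i in range(-10, 21)]  # -1.0, -0.9, ..., 2.0
    blocks = [series[i:i + block_len]
              for i in range(0, len(series) - block_len + 1, block_len)]
    means = [fmean(b) for b in blocks]
    sds = [stdev(b) for b in blocks]

    def cv(lam):
        ratios = [s / m ** (1 - lam) for s, m in zip(sds, means)]
        return stdev(ratios) / fmean(ratios)

    return min(grid, key=cv)


# Hypothetical series whose spread grows in proportion to its level,
# the textbook case in which a log transformation (index 0) is appropriate.
proportional = [0.9, 1.1, 1.8, 2.2, 3.6, 4.4, 7.2, 8.8]
lam_hat = power_index_by_cv(proportional, block_len=2)
```

A grid search is used only to keep the sketch short; a one-dimensional numerical optimizer would serve equally well.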


Size And Power Of The RESET Test As Applied To Systems Of Equations: A Bootstrap Approach, Ghazi Shukur, Panagiotis Mantalos Nov 2004

Journal of Modern Applied Statistical Methods

The size and power of various generalizations of the RESET test for functional misspecification are investigated, using bootstrap critical values, in systems ranging from one to ten equations. The properties of 8 versions of the test are studied using Monte Carlo methods. The results are then compared with those of Shukur and Edgerton (2002), who used asymptotic critical values instead and found that, in general, only one version of the test works well regarding size properties. In our study, when applying the bootstrap critical values, we find that all the tests exhibit correct size even …


Type I Error Rates For A One Factor Within-Subjects Design With Missing Values, Miguel A. Padilla, James Algina Nov 2004

Journal of Modern Applied Statistical Methods

Missing data are a common problem in educational research. A promising technique that can be implemented in SAS PROC MIXED, and is therefore widely available, is to use maximum likelihood to estimate model parameters and base hypothesis tests on these estimates. However, it is not clear which test statistic in PROC MIXED performs better with missing data. The performance of the Hotelling-Lawley-McKeon and Kenward-Roger omnibus test statistics on the means for a single-factor within-subject ANOVA is compared. The results indicate that the Kenward-Roger statistic performed better in terms of keeping the Type I error close to the nominal …


Interval Estimation For The Scale Parameter Of Burr Type X Distribution Based On Grouped Data, Amjad D. Al-Nasser, Ayman Baklizi Nov 2004

Journal of Modern Applied Statistical Methods

The application of some bootstrap type intervals for the scale parameter of the Burr type X distribution with grouped data is proposed. The general asymptotic confidence interval procedure (Chen & Mi, 2001) is studied. The performance of these intervals is investigated and compared. Some of the bootstrap intervals give better performance for situations of small sample size and heavy censoring.


Pseudo-Random Number Generation In R For Commonly Used Multivariate Distributions, Hakan Demirtas Nov 2004

Journal of Modern Applied Statistical Methods

An increasing number of practitioners and applied statisticians have started using the R programming system in recent years for their computing and data analysis needs. As far as pseudo-random number generation is concerned, the built-in generator in R does not contain multivariate distributions. In this article, R routines for widely used multivariate distributions are presented.
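
To show the kind of routine involved, the sketch below draws from a multivariate normal distribution by the standard Cholesky construction: factor the covariance matrix as L·Lᵀ and return mean + L·z for a vector z of independent standard normals. It is written in Python with only the standard library, not in R; the function names are hypothetical, and it is not the article's code.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L with L * L^T == cov (cov must be positive definite)."""
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(cov[i][i] - partial)
            else:
                lower[i][j] = (cov[i][j] - partial) / lower[j][j]
    return lower


def draw_mvnormal(n_draws, mean, cov, rng=None):
    """Return n_draws vectors distributed as N(mean, cov)."""
    rng = rng or random.Random()
    lower = cholesky(cov)
    dim = len(mean)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # independent N(0, 1)
        draws.append([mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
                      for i in range(dim)])
    return draws


# Hypothetical mean vector and covariance matrix, for illustration only.
sample = draw_mvnormal(5, [0.0, 0.0], [[4.0, 2.0], [2.0, 3.0]], random.Random(1))
```

Passing an explicit `random.Random` instance keeps the draws reproducible without touching the global generator.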


An Algorithm And Code For Computing Exact Critical Values For The Kruskal-Wallis Nonparametric One-Way ANOVA, Sikha Bagui, Subhash Bagui Nov 2004

Journal of Modern Applied Statistical Methods

In this article, an algorithm and code are provided, using Visual Basic (VB.NET), to compute exact critical values (or percentiles) for the Kruskal-Wallis test on k independent treatment populations with equal or unequal sample sizes. The program can calculate critical values for any k, sample sizes (n_i), and significance level (α). An exact critical value table for k = 4 is also developed. The table will be useful to practitioners since it is not available in standard nonparametric statistics texts. The program can also be used to compute any other …
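
The idea of an exact critical value can be sketched by brute force: under the null hypothesis with no ties, every assignment of the ranks 1..N to the groups is equally likely, so the exact distribution of the Kruskal-Wallis statistic H = 12/(N(N+1)) · Σ R_i²/n_i − 3(N+1) follows from enumerating all assignments. The Python sketch below does this for small samples; it is not the article's VB.NET program, the function names are hypothetical, and full enumeration is feasible only for small N.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def kw_exact_distribution(sizes):
    """Exact null distribution of H for the given group sizes (no ties assumed)."""
    n_total = sum(sizes)
    scale = Fraction(12, n_total * (n_total + 1))
    counts = Counter()

    def assign(group, remaining, partial):
        # partial accumulates sum(R_i^2 / n_i) over the groups assigned so far
        if group == len(sizes):
            counts[scale * partial - 3 * (n_total + 1)] += 1
            return
        for ranks in combinations(remaining, sizes[group]):
            rest = tuple(r for r in remaining if r not in ranks)
            assign(group + 1, rest,
                   partial + Fraction(sum(ranks) ** 2, sizes[group]))

    assign(0, tuple(range(1, n_total + 1)), Fraction(0))
    total = sum(counts.values())
    return {h: Fraction(c, total) for h, c in counts.items()}


def kw_critical_value(sizes, alpha):
    """Smallest h with P(H >= h) <= alpha, or None if no such value exists."""
    dist = kw_exact_distribution(sizes)
    tail = Fraction(0)
    critical = None
    for h in sorted(dist, reverse=True):
        tail += dist[h]
        if tail > alpha:
            break
        critical = h
    return critical


# Three groups of two observations each, tested at the 10% level.
crit = kw_critical_value((2, 2, 2), Fraction(1, 10))
```

Exact rational arithmetic (`Fraction`) is used so that equal values of H are pooled correctly rather than being split by floating-point rounding.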


Aligned Rank Tests As Robust Alternatives For Testing Interactions In Multiple Group Repeated Measures Designs With Heterogeneous Covariances, Xiaosheng Lei, Janet K. Holt, T. Mark Beasley Nov 2004

Journal of Modern Applied Statistical Methods

Data simulation was used to investigate whether tests performed on aligned ranks (Beasley, 2002) could be used as robust alternatives to parametric methods for testing a split-plot interaction with non-normal data and heterogeneous covariance matrices. Results indicated that the aligned rank method does not have any distinct advantage over parametric methods in this situation.


Multivariate And Multistrata Nonparametric Tests: The Nonparametric Combination Method, Livio Corain, Luigi Salmaso Nov 2004

Journal of Modern Applied Statistical Methods

Researchers and practitioners in many scientific disciplines and industrial fields are often faced with complex problems when dealing with comparisons between two or more groups using classical parametric methods. The data arising from real problems rarely are in agreement with stringent parametric assumptions. The NonParametric Combination (NPC) methodology frees the researcher from stringent assumptions of parametric methods and allows a more flexible analysis, both in terms of specification of multivariate hypotheses and in terms of the nature of the variables involved in the analysis. An outline of NPC methodology is given, along with case studies.


Statistics And Technology: Reflections On 35 Years Of Change, James J. Higgins Nov 2004

Journal of Modern Applied Statistical Methods

From the days when statistical calculations were done on mechanical calculators to today, technology has transformed the discipline of statistics. More than just giving statisticians the power to crunch numbers, it has fundamentally changed the way we teach, do research, and consult. In this article, I give some examples of this from my 35 years as an academic statistician.


A Conversation With R. Clifford Blair On The Occasion Of His Retirement, Shlomo S. Sawilowsky Nov 2004

Journal of Modern Applied Statistical Methods

An interview was conducted on 23 November 2003 with R. Clifford Blair on the occasion of his retirement from the University of South Florida. This article is based on that interview. Biographical sketches and images of members of his academic genealogy are provided.


On Comparison Of Hypothesis Tests In The Bayesian Framework Without Loss Function, Vladimir Gercsik, Mark Kelbert Nov 2004

Journal of Modern Applied Statistical Methods

The problem is how to compare the quality of different hypothesis tests in a Bayesian framework without introducing a loss function. Three different linear orders on the set of all possible hypothesis tests are studied. The most natural order estimates the Fisher information between indicators of event and decision.


A New Goodness-Of-Fit Test For Item Response Theory, John H. Neel Nov 2004

Journal of Modern Applied Statistical Methods

Chi-square techniques for testing goodness-of-fit in item response theory are shown to give incorrect results. A new measure, CB, based on cumulants is proposed which avoids the arbitrary nature of interval creation found in chi-square techniques. The distribution of CB is estimated using Monte Carlo techniques and critical values for testing goodness-of-fit are given.


Monte Carlo Evaluation Of Ordinal D With Improved Confidence Interval, Du Feng, Norman Cliff Nov 2004

Journal of Modern Applied Statistical Methods

This article reports a Monte Carlo evaluation of the ordinal statistic d with modified confidence intervals (CI) for location comparison of two independent groups under various conditions. Type I error rate, power, and coverage of the CI of d were compared to those of Welch's t-test.
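
For readers unfamiliar with the statistic, the sketch below computes the sample value of d in its usual form: the proportion of pairs (x, y) with x > y minus the proportion with x < y, taken over all pairs formed from the two groups. The function name and data are hypothetical, and the modified confidence intervals evaluated in the article are not reproduced.

```python
def ordinal_d(x, y):
    """Sample ordinal d: P(X > Y) - P(X < Y) estimated over all cross-group pairs."""
    greater = sum(1 for xi in x for yj in y if xi > yj)
    less = sum(1 for xi in x for yj in y if xi < yj)
    return (greater - less) / (len(x) * len(y))


# Hypothetical scores for two independent groups.
group_a = [1, 2, 3]
group_b = [4, 5, 6]
d_hat = ordinal_d(group_a, group_b)
```

The statistic ranges from −1 (every x below every y) to 1 (every x above every y), with ties contributing zero.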


On A Simple Method For Analyzing Multivariate Survival Data Using Sample Survey Methods, Pingfu Fu, J. Sunil Rao Nov 2004

Journal of Modern Applied Statistical Methods

A simple technique is illustrated for analyzing multivariate survival data. The data situation arises when an individual records multiple survival events, or when individuals recording single survival events are grouped into clusters. Past work has focused on developing new methods to handle such data. Here, we use a connection between Poisson regression and survival modeling and a cluster sampling approach to adjust the variance estimates. The approach requires a parametric assumption for the marginal hazard function, but avoids specification of a joint multivariate survival distribution. A simulation study demonstrates that the proposed approach is competitive with recently developed marginal approaches …


A Note On Extending Scheffé’s Modified Multiple-Comparison Procedure To Other Analysis Situations, Xinyue Zhou, Joel R. Levin Nov 2004

Journal of Modern Applied Statistical Methods

This article extends Scheffé’s modified (sequential) multiple-comparison procedure in one-way analysis-of-variance to other analysis situations, including interaction comparisons in factorial ANOVA designs, tests of partial regression coefficients in multiple-regression analysis, and comparisons of means in one-factor multivariate analyses of variance. Researchers who are concerned with maintaining familywise Type I error rates while increasing statistical power relative to the original (simultaneous) Scheffé-based procedures are encouraged to consider these improved multiple-comparison methods.


The President’s Problem, Jann-Huei Jinn Nov 2004

Journal of Modern Applied Statistical Methods

A solution is offered to a complex combinatorial problem posed by Blom, Englund, and Sandell (1998). The problem is to determine the probability that a random permutation of the word BILLCLINTON has no equal neighbors.
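
The probability in question can be checked numerically. The sketch below counts, by memoized recursion, the distinct arrangements of a word's letters in which no two adjacent letters are equal, then divides by the total number of distinct arrangements; because every distinct arrangement corresponds to the same number of permutations of the labelled letters, this ratio equals the probability for a uniformly random permutation. This is a computational check under that reading of the problem, not the analytical solution offered in the article, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache
from math import factorial


def count_no_equal_neighbors(word):
    """Number of distinct arrangements of word with no two equal adjacent letters."""
    start = tuple(count for _, count in sorted(Counter(word).items()))

    @lru_cache(maxsize=None)
    def extend(remaining, last):
        # remaining[i] = copies of letter i still to place; last = index just placed
        if sum(remaining) == 0:
            return 1
        total = 0
        for i, count in enumerate(remaining):
            if count > 0 and i != last:
                reduced = remaining[:i] + (count - 1,) + remaining[i + 1:]
                total += extend(reduced, i)
        return total

    return extend(start, -1)


def prob_no_equal_neighbors(word):
    """Probability that a random permutation of word has no equal neighbors."""
    arrangements = factorial(len(word))
    for count in Counter(word).values():
        arrangements //= factorial(count)
    return Fraction(count_no_equal_neighbors(word), arrangements)


answer = prob_no_equal_neighbors("BILLCLINTON")
```

For a small check by hand, the word AAB has three distinct arrangements (AAB, ABA, BAA), of which only ABA qualifies, so the function returns 1/3.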