Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 30 of 89

Full-Text Articles in Statistics and Probability

Model-Robust Bayesian Regression And The Sandwich Estimator, Adam A. Szpiro, Kenneth M. Rice, Thomas Lumley Dec 2007

UW Biostatistics Working Paper Series

PLEASE NOTE THAT AN UPDATED VERSION OF THIS RESEARCH IS AVAILABLE AS WORKING PAPER 338 IN THE UNIVERSITY OF WASHINGTON BIOSTATISTICS WORKING PAPER SERIES (http://www.bepress.com/uwbiostat/paper338).

In applied regression problems there is often sufficient data for accurate estimation, but standard parametric models do not accurately describe the source of the data, so associated uncertainty estimates are not reliable. We describe a simple Bayesian approach to inference in linear regression that recovers least-squares point estimates while providing correct uncertainty bounds by explicitly recognizing that standard modeling assumptions need not be valid. Our model-robust development parallels frequentist estimating equations and leads to intervals …


New Technique For Imputing Missing Item Responses For An Ordinal Variable: Using Tennessee Youth Risk Behavior Survey As An Example., Andaleeb Abrar Ahmed Dec 2007

Electronic Theses and Dissertations

Surveys ordinarily ask questions on an ordinal scale and often result in missing data. We suggest a regression-based technique for imputing missing ordinal data. A multilevel cumulative logit model was used, with the assumption that observed responses to certain key variables can serve as covariates in predicting missing item responses of an ordinal variable. Individual predicted probabilities at each response level were obtained. Average individual predicted probabilities for each response level were used to randomly impute the missing responses using a uniform distribution. Finally, a likelihood ratio chi-square statistic was used to compare the imputed and observed distributions. Two other …


Estimating Sensitivity And Specificity From A Phase 2 Biomarker Study That Allows For Early Termination, Margaret S. Pepe PhD Dec 2007

UW Biostatistics Working Paper Series

Development of a disease screening biomarker involves several phases. In phase 2 its sensitivity and specificity are compared with established thresholds for minimally acceptable performance. Since we anticipate that most candidate markers will not prove to be useful and availability of specimens and funding is limited, early termination of a study is appropriate if accumulating data indicate that the marker is inadequate. Yet, for markers that complete phase 2, we seek estimates of sensitivity and specificity to proceed with the design of subsequent phase 3 studies.

We suggest early stopping criteria and estimation procedures that adjust for bias caused by …


Bootstrap Confidence Regions For Optimal Operating Conditions In Response Surface Methodology, Roger D. Gibb, I-Li Lu, Walter H. Carter Jr Nov 2007

COBRA Preprint Series

This article concerns the application of bootstrap methodology to construct a likelihood-based confidence region for operating conditions associated with the maximum of a response surface constrained to a specified region. Unlike classical methods based on the stationary point, proper interpretation of this confidence region does not depend on unknown model parameters. In addition, the methodology does not require the assumption of normally distributed errors. The approach is demonstrated for concave-down and saddle system cases in two dimensions. Simulation studies were performed to assess the coverage probability of these regions.

AMS 2000 subj Classification: 62F25, 62F40, 62F30, 62J05.

Key words: Stationary …


Loss-Based Estimation With Evolutionary Algorithms And Cross-Validation, David Shilane, Richard H. Liang, Sandrine Dudoit Nov 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

Many statistical inference methods rely upon selection procedures to estimate a parameter of the joint distribution of explanatory and outcome data, such as the regression function. Within the general framework for loss-based estimation of Dudoit and van der Laan, this project proposes an evolutionary algorithm (EA) as a procedure for risk optimization. We also analyze the size of the parameter space for polynomial regression under an interaction constraint along with constraints on either the polynomial or variable degree.


Resampling-Based Empirical Bayes Multiple Testing Procedures For Controlling Generalized Tail Probability And Expected Value Error Rates, Sandrine Dudoit, Houston N. Gilbert, Mark J. van der Laan Nov 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q,g) = Pr(g(Vn,Sn) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(Vn,Sn)], for arbitrary functions g(Vn,Sn) of the numbers of false positives Vn and true positives Sn. Of particular interest are error rates based on the …


Time-Series Intervention Analysis Using ITSACORR: Fatal Flaws, Bradley E. Huitema, Joseph W. McKean, Sean Laraway Nov 2007

Journal of Modern Applied Statistical Methods

The ITSACORR method (Crosbie, 1993, 1995) is evaluated for the analysis of two-phase interrupted time-series designs. It is shown that each component of the ITSACORR framework (including the structural model, the design matrix, the autocorrelation estimator, the ultimate parameter estimation scheme, and the inferential method) contains fatal flaws.


A Comparison Of Procedures For The Analysis Of Multivariate Repeated Measurements, Lisa M. Lix, Anita M. Lloyd Nov 2007

Journal of Modern Applied Statistical Methods

Three procedures for analyzing within-subjects effects in multivariate repeated measures designs are compared when group covariances are heterogeneous: the multiple regression model (MRM) with a structured covariance, Johansen’s (1980) procedure, and the multivariate Brown and Forsythe (1974) procedure. A preliminary likelihood ratio test of a Kronecker product covariance structure is sensitive to sample size and derivational assumption violations. Error rates of the procedures are generally well-controlled except when the distribution is skewed. The MRM procedure displayed few power advantages over the other procedures.


From Information Lost To Knowledge Gained: The Benefits Of Analyzing All The Research Evidence, Joseph L. Balloun, Hilton Barrett Nov 2007

Journal of Modern Applied Statistical Methods

Data analyses should reveal truths about data. To the extent possible, analyses should give a complete picture. Data analyses should not inadvertently ignore phenomena that might be discovered in sample data sets. However, common univariate or multivariate data analysis methods tend to be based on only the means, standard deviations, and Pearson correlations. The result is that many important truths are discovered, but not the whole truth. This article illustrates in a sample data set that (a) data analyses of other properties of variables and groups are feasible and practical, and (b) such analyses may reveal important information not otherwise …


Bayesian Subset Selection Of Binomial Parameters Using Possibly Misclassified Data, James D. Stamey, Thomas L. Bratcher, Dean M. Young Nov 2007

Journal of Modern Applied Statistical Methods

Three Bayesian approaches are considered for the selection of binomial proportion parameters when data is subject to misclassification. The cases where the misclassification is non-differential and differential were considered, thus extending previous work which considered only non-differential misclassification. In this article, various selection criteria are applied to a simulated data set and a real data set.


Regarding Lui K. J. (2006). Interval Estimation Of Risk Difference In Simple Compliance Randomized Trials. JMASM, 5, 395-407., Ian R. White Nov 2007

Journal of Modern Applied Statistical Methods

No abstract provided.


Optimal Trimming And Outlier Elimination, Philip H. Ramsey, Patricia P. Ramsey Nov 2007

Journal of Modern Applied Statistical Methods

Five data sets with known true values are used to determine the optimal number of pairs that should be trimmed in order to produce the minimum relative error. The optimal trimming in the five data sets is found to be 1%, 5%, 7%, 10% and 28%. The 28% rate is shown to be an outlier among the five data sets. Results of four data sets are used to establish cutoff values for outlier detection in two robust methods of outlier detection.
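
The search for an optimal number of trimmed pairs can be sketched as follows. This is a minimal illustration only: it assumes the estimator is a symmetrically trimmed mean (the abstract does not name the estimator), and the function names `trimmed_mean` and `best_trim` are hypothetical, not taken from the article.

```python
def trimmed_mean(values, pairs):
    """Mean after dropping the `pairs` smallest and `pairs` largest values."""
    ordered = sorted(values)
    if 2 * pairs >= len(ordered):
        raise ValueError("cannot trim every observation")
    kept = ordered[pairs:len(ordered) - pairs]
    return sum(kept) / len(kept)


def best_trim(values, true_value):
    """Return (pairs, relative_error) minimising |estimate - true| / |true|.

    Assumes true_value is non-zero. Ties are broken in favour of
    trimming fewer pairs.
    """
    max_pairs = (len(values) - 1) // 2  # always keep at least one value
    candidates = []
    for k in range(max_pairs + 1):
        rel_error = abs(trimmed_mean(values, k) - true_value) / abs(true_value)
        candidates.append((rel_error, k))
    rel_error, k = min(candidates)
    return k, rel_error
```

For example, `best_trim([9.0, 10.0, 10.0, 10.0, 30.0], 10.0)` trims one pair, because removing the single low and single high value already recovers the true value exactly.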


Multiple Comparison Of Medians Using Permutation Tests, Scott J. Richter, Melinda H. McCann Nov 2007

Journal of Modern Applied Statistical Methods

A robust method is proposed for simultaneous pairwise comparison using permutation tests and median differences. The new procedure provides strong control of familywise error rate and has better power properties than the median procedure of Nemenyi/Levy. It can be more powerful than the Tukey-Kramer procedure using mean differences, especially for nonnormal distributions and unequal sample sizes.
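
The building block of such a procedure, a permutation test on the difference of two sample medians, can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: it handles a single pair of groups, enumerates every split exactly (practical only for small samples), and omits the familywise error control described in the abstract. The function name `permutation_median_test` is hypothetical.

```python
from itertools import combinations
from statistics import median


def permutation_median_test(x, y):
    """Exact two-sided permutation p-value for |median(x) - median(y)|."""
    pooled = list(x) + list(y)
    n_x = len(x)
    observed = abs(median(x) - median(y))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_x pooled values as "group x".
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(median(group_x) - median(group_y))
        if diff >= observed - 1e-12:  # small tolerance for float ties
            extreme += 1
        total += 1
    return extreme / total
```

With `x = [1, 2, 3]` and `y = [10, 11, 12]`, 4 of the 20 possible splits give a median difference at least as large as the observed one, so the p-value is 0.2.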


The Non-Parametric Difference Score: A Workable Solution For Analyzing Two-Wave Change When The Measures Themselves Change Across Waves, Jennifer E. V. Lloyd, Bruno D. Zumbo Nov 2007

Journal of Modern Applied Statistical Methods

The non-parametric difference score is introduced. It is a workable solution to the problem of analyzing change over two waves (i.e., a pretest-posttest design) when the measures themselves vary over time. An example highlighting the solution’s implementation is provided, as is a discussion of the solution’s assumptions, strengths, and limitations.


The Effect Of Different Degrees Of Freedom Of The Chi-Square Distribution On The Statistical Power Of The T, Permutation T, And Wilcoxon Tests, Michèle Weber Nov 2007

Journal of Modern Applied Statistical Methods

The Chi-square distribution is used quite often in Monte Carlo studies to examine statistical power of competing statistics. The power spectrum of the t-test, Wilcoxon test, and permutation t test are compared under various degrees of freedom for this distribution. The two t tests have similar power, which is generally less than the Wilcoxon.


Probability Coverage And Interval Length For Welch’s And Yuen’s Techniques: Shift In Location, Change In Scale, And (Un)Equal Sizes, S. Jonathan Mends-Cole Nov 2007

Journal of Modern Applied Statistical Methods

Coverage for Welch’s technique was less than the confidence level when size was inversely proportional to variance and skewness was extreme. Under negative kurtosis, coverage for Yuen’s technique was attenuated. Under skewness and heteroscedasticity, coverage for Yuen’s technique was more accurate than Welch’s technique.


Tests For 2 X 2 Tables In Clinical Trials, Vic Hasselblad, Yulia Lokhnygina Nov 2007

Journal of Modern Applied Statistical Methods

Five standard tests are compared: chi-squared, Fisher's exact, Yates’ correction, Fisher’s exact mid-p, and Barnard’s. Yates’ is always inferior to Fisher’s exact. Fisher’s exact is so conservative that one should look for alternatives. For certain sample sizes, Fisher’s mid-p or Barnard’s test maintain the nominal alpha and have superior power.
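
The difference between Fisher's exact p-value and its mid-p variant can be illustrated with a short sketch. It assumes the one-sided form of both tests (the abstract does not say which form the authors compare) and uses hypothetical function names; the mid-p value counts only half the probability of the observed table, which is why it is less conservative.

```python
from math import comb


def hypergeom_pmf(k, row1, row2, col1):
    """P(top-left cell = k) with all margins of the 2 x 2 table fixed."""
    return comb(row1, k) * comb(row2, col1 - k) / comb(row1 + row2, col1)


def fisher_one_sided(a, b, c, d):
    """Return (exact_p, mid_p) for the table [[a, b], [c, d]].

    One-sided alternative: the top-left count a is larger than expected.
    """
    row1, row2, col1 = a + b, c + d, a + c
    k_max = min(row1, col1)
    p_observed = hypergeom_pmf(a, row1, row2, col1)
    # Probability of tables strictly more extreme than the observed one.
    tail = sum(hypergeom_pmf(k, row1, row2, col1)
               for k in range(a + 1, k_max + 1))
    exact_p = tail + p_observed
    mid_p = tail + 0.5 * p_observed
    return exact_p, mid_p
```

For the table [[3, 1], [1, 3]] the exact p-value is 17/70 and the mid-p value is 9/70.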


Semi Parametric Estimation Of Some Reliability Measures Of Geometric Distribution, Mathachan Pathiyil, E.S. Jeevanand Nov 2007

Journal of Modern Applied Statistical Methods

Semi parametric estimators of the survival function, the hazard function, and the mean residual life function of geometric distribution using uncensored and Type II censored samples are obtained. The accuracy of the estimators so obtained is investigated empirically using simulated samples. The results are applied to a real life data set for illustration.


Large Deviations Techniques For Error Exponents To Multiple Hypothesis LAO Testing, Leader Navaei Nov 2007

Journal of Modern Applied Statistical Methods

In this article the problem of multiple hypotheses testing using a theory of large deviations is studied. The reliability matrix of Logarithmically Asymptotically Optimal (LAO) tests is introduced and described, and the conditions for the positivity of all its elements are indicated.


Inference On Overlapping Coefficients In Two Exponential Populations, Mohammad Fraiwan Al-Saleh, Hani M. Samawi Nov 2007

Journal of Modern Applied Statistical Methods

Three measures of overlap, namely Matusita’s measure ρ, Morisita’s measure λ and Weitzman’s measure Δ, are investigated in this article for two exponential populations with different means. It is well known that the estimators of those measures of overlap are biased. The bias of these estimators depends on the unknown overlap parameters. There are no closed-form exact formulas for the variances of those estimators or for their exact sampling distributions. Monte Carlo evaluations are used to study the bias and precision of the proposed overlap measures. The bootstrap method and Taylor series approximation are used to construct confidence intervals for the overlap measures.


Performance Of Some Correlation Coefficients When Applied To Zero-Clustered Data, L. W. Huson Nov 2007

Journal of Modern Applied Statistical Methods

Zero-clustered data occur widely in medical research and are characterised by the presence of a group of observations of value zero in a distribution of otherwise continuous non-negative responses. A simulation study was conducted to investigate the properties of a number of correlation coefficients applied to samples of zero-clustered data.


The Correlation Coefficients, Rudy A. Gideon Nov 2007

Journal of Modern Applied Statistical Methods

A generalized method of defining and interpreting correlation coefficients is given. Seven correlation coefficients are defined — three for continuous data and four on the ranks of the data. A quick calculation of the rank based correlation coefficients using a 0-1 graph-matrix is shown. Examples and comparisons are given.


Covariate Dependent Markov Models For Analysis Of Repeated Binary Outcomes, M.A. Islam, R.I. Chowdhury, K.P. Singh Nov 2007

Journal of Modern Applied Statistical Methods

Covariate dependence in higher order Markov models is examined. First order Markov models with covariate dependence are discussed and are generalized to higher order. A simple alternative is also proposed. The estimation procedure is discussed for higher order with a number of covariates. The proposed model takes into account the past transitions. Transitions are fitted and are tested in order to examine their influence on the most recent transitions. Applications are illustrated using maternal morbidity during pregnancy. The binary outcome at each visit during pregnancy is observed for each subject and then the covariate dependent Markov models are …


Operating Characteristics Of The DIF MIMIC Approach Using Jöreskog’s Covariance Matrix With ML And WLS Estimation For Short Scales, Michaela N. Gelin, Bruno D. Zumbo Nov 2007

Journal of Modern Applied Statistical Methods

Type I error rate of a structural equation modeling (SEM) approach for investigating differential item functioning (DIF) in short scales was studied. Muthén’s SEM model for DIF was examined using a covariance matrix (Jöreskog, 2002). It is conditioned on the latent variable, while testing the effect of the grouping variable over-and-above the underlying latent variable. Thus, it is a multiple-indicators, multiple-causes (MIMIC) DIF model. Type I error rates were determined using data reflective of short scales with ordinal item response formats typically found in the social and behavioral sciences. Results indicate Type I error rates for the DIF MIMIC model, …


A Simple Method For Finding Empirical Likelihood Type Intervals For The ROC Curve, Ayman Baklizi Nov 2007

Journal of Modern Applied Statistical Methods

Interval estimation of the ROC curve is considered using empirical likelihood techniques. Suggested is a procedure that is very simple computationally and avoids the constrained optimization problems usually faced with empirical likelihood methods. Various modifications are suggested and the performance of the intervals is evaluated in terms of their coverage probability. The results show that some of the suggested intervals compete well with other intervals known in the literature.


The Effect Of GARCH(1,1) On The Granger Causality Test In Stable VAR Models, Panagiotis Mantalos, Ghazi Shukur, Pär Sjölander Nov 2007

Journal of Modern Applied Statistical Methods

Using Monte Carlo methods, the properties of the Granger causality test in stable VAR models are studied in the presence of different magnitudes of GARCH effects in the error terms. Analysis reveals that substantial GARCH effects influence the size properties of the Granger causality test, especially in small samples. The power functions of the test are usually slightly lower when GARCH effects are imposed among the residuals compared with the case of white noise residuals.


Generalized Linear Mixed-Effects Models For The Analysis Of Odor Detection Data, Sandra Hall, Matthew S. Mayo, Xu-Feng Niu, James C. Walker Nov 2007

Journal of Modern Applied Statistical Methods

Olfactory detection has become a science of interest. Seven individuals’ odor detection abilities are explored and an attempt is made to characterize all subjects with one generalized linear mixed effects model. Two methods of fitting the models were used and simulations were conducted to discover which method yielded the best results.


A Modified X̄ Control Chart For Samples Drawn From Finite Populations, Michael B. C. Khoo Nov 2007

Journal of Modern Applied Statistical Methods

The X̄ chart works well under the assumption of random sampling from infinite populations. However, many process monitoring scenarios may consist of random sampling from finite populations. A modified X̄ chart is proposed in this article to solve the problems encountered by the standard X̄ chart when samples are drawn from finite populations.


Optimum Choice Of Covariates For A Series Of SBIBDs Obtained Through Projective Geometry, Ganesh Dutta, Premadhis Das, Nripes Kumar Mandal Nov 2007

Journal of Modern Applied Statistical Methods

A block design set-up is considered in the presence of a number of controllable covariates. The problem is that of choosing the values of the covariates so that, for a given block design, it is optimum in the sense of attaining minimum variance for the estimation of each of the covariate parameters. In the case of incomplete block designs, the choice of the values of the covariates depends heavily on the allocation of treatments to the plots of blocks; more specifically on the method of construction of the incomplete block design. In this paper the situation where the block design is …


A New Generalization Of Negative Polya-Eggenberger Distribution And Its Applications, Anwar Hassan, Sheikh Nilal Ahmad Nov 2007

Journal of Modern Applied Statistical Methods

A new generalization of negative Polya-Eggenberger distribution (GNPED) has been obtained by mixing the negative binomial distribution with generalized beta distribution-Π defined by Nadarajah and Kotz (2003). Some special cases and properties of GNPED have been studied. Further, the proposed model has been fitted to two data sets (used by Gupta & Ong, 2004) that provide a satisfactory fit and better alternative as compared to negative binomial and some of its mixture models and extensions. Also, the negative Polya-Eggenberger distribution (NPED), obtained by mixing negative binomial with beta distribution of I-kind, has been fitted to the same data sets for …