Statistical Methodology

Articles 1 - 26 of 26

Full-Text Articles in Design of Experiments and Sample Surveys

Modeling And Fitting Two-Way Tables Containing Outliers, David L. Farnsworth Feb 2023

Articles

A model is proposed for two-way tables of measurement data containing outliers. The two independent variables are categorical and error free. Neither missing values nor replication are present. The model consists of the sum of a customary additive part that can be fit using least squares and a part that is composed of outliers. Recommendations are made for methods for identifying cells containing outliers and for fitting the model. A graph of the observations is used to determine the outliers’ locations. For all cells containing an outlier, replacement values are determined simultaneously using a classical missing-data tool. The result is …
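
The customary additive part mentioned in this abstract can be illustrated with a minimal sketch. For a complete two-way table with no replication, the least-squares fit of a grand mean plus row and column effects reduces to row and column means. The function name `fit_additive` and the example table are hypothetical, and the sketch does not attempt the paper's outlier identification or replacement steps.

```python
from statistics import mean


def fit_additive(table):
    """Least-squares fit of y[i][j] ~ mu + a[i] + b[j] for a complete table.

    With one observation per cell and no missing values, the estimates are
    the grand mean, row means minus the grand mean, and column means minus
    the grand mean.
    """
    mu = mean(v for row in table for v in row)
    row_eff = [mean(row) - mu for row in table]
    col_eff = [mean(col) - mu for col in zip(*table)]
    fitted = [[mu + a + b for b in col_eff] for a in row_eff]
    resid = [[y - f for y, f in zip(row, frow)]
             for row, frow in zip(table, fitted)]
    return mu, row_eff, col_eff, resid


# Hypothetical data: an additive table with one inflated cell (bottom right).
table = [[10.0, 12.0, 14.0],
         [11.0, 13.0, 15.0],
         [12.0, 14.0, 30.0]]
mu, row_eff, col_eff, resid = fit_additive(table)
for row in resid:
    print([round(r, 2) for r in row])
```

In this made-up table the largest residual sits at the inflated cell, but part of its excess leaks into the other residuals of the same row and column, so the residuals of a plain least-squares fit do not isolate the outlier cleanly.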


Optimal Design For A Causal Structure, Zaher Kmail Aug 2019

Department of Statistics: Dissertations, Theses, and Student Work

Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.

Historically, traditional optimal design theory focuses on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …


What’s Brewing? A Statistics Education Discovery Project, Marla A. Sole, Sharon L. Weinberg Jan 2017

Publications and Research

We believe that students learn best, are actively engaged, and are genuinely interested when working on real-world problems. This can be done by giving students the opportunity to work collaboratively on projects that investigate authentic, familiar problems. This article shares one such project that was used in an introductory statistics course. We describe the steps taken to investigate why customers are charged more for iced coffee than hot coffee, which included collecting data and using descriptive and inferential statistical analysis. Interspersed throughout the article, we describe strategies that can help teachers implement the project and scaffold material to assist students …


Combined Computational-Experimental Design Of High-Temperature, High-Intensity Permanent Magnetic Alloys With Minimal Addition Of Rare-Earth Elements, Rajesh Jha May 2016

FIU Electronic Theses and Dissertations

AlNiCo magnets are known for high-temperature stability and superior corrosion resistance and have been widely used for various applications. The reported magnetic energy density, (BH)max, for these magnets is around 10 MGOe. Theoretical calculations show that a (BH)max of 20 MGOe is achievable, which would help close the gap between AlNiCo and Rare-Earth Element (REE) based magnets. An extended family of AlNiCo alloys consisting of eight elements was studied in this dissertation, and hence it is important to determine the composition-property relationships between each of the alloying elements and their influence on the bulk properties.

In …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Best Practice Recommendations For Data Screening, Justin A. Desimone, Peter D. Harms, Alice J. Desimone Feb 2015

Department of Management: Faculty Publications

Survey respondents differ in their levels of attention and effort when responding to items. There are a number of methods researchers may use to identify respondents who fail to exert sufficient effort in order to increase the rigor of analysis and enhance the trustworthiness of study results. Screening techniques are organized into three general categories, which differ in impact on survey design and potential respondent awareness. Assumptions and considerations regarding appropriate use of screening techniques are discussed along with descriptions of each technique. The utility of each screening technique is a function of survey design and administration. Each technique has …


Robust Optimization Of Biological Protocols, Patrick Flaherty, Ronald W. Davis Jan 2015

Mathematics and Statistics Department Faculty Publication Series

When conducting high-throughput biological experiments, it is often necessary to develop a protocol that is both inexpensive and robust. Standard approaches are either not cost-effective or arrive at an optimized protocol that is sensitive to experimental variations. Here, we describe a novel approach that directly minimizes the cost of the protocol while ensuring the protocol is robust to experimental variation. Our approach uses a risk-averse conditional value-at-risk criterion in a robust parameter design framework. We demonstrate this approach on a polymerase chain reaction protocol and show that our improved protocol is less expensive than the standard protocol and more robust …


Adaptive Pair-Matching In The Search Trial And Estimation Of The Intervention Effect, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Jan 2014

U.C. Berkeley Division of Biostatistics Working Paper Series

In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics are used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes are measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed (i.i.d.) pairs of units, as current practice assumes. …


A General Procedure Of Estimating Population Mean Using Information On Auxiliary Attribute, Sachin Malik, Rajesh Singh, Florentin Smarandache Jan 2014

Branch Mathematics and Statistics Faculty and Staff Publications

This paper deals with the problem of estimating the finite population mean when some information on an auxiliary attribute is available. It is shown that the proposed estimator is more efficient than the usual mean estimator and other existing estimators. The results are illustrated numerically using an empirical population considered in the literature.


A General Family Of Dual To Ratio-Cum-Product Estimator In Sample Surveys, Florentin Smarandache, Rajesh Singh, Mukesh Kumar, Pankaj Chauhan, Nirmala Sawan Dec 2011

Branch Mathematics and Statistics Faculty and Staff Publications

This paper presents a family of dual to ratio-cum-product estimators for the finite population mean. Under the simple random sampling without replacement (SRSWOR) scheme, expressions for the bias and mean-squared error (MSE) up to the first order of approximation are derived. We show that the proposed family is more efficient than the usual unbiased estimator, the ratio estimator, the product estimator, and the estimators of Singh (1967), Srivenkataramana (1980), Bandyopadhyaya (1980), and Singh et al. (2005). An empirical study is carried out to illustrate the performance of the constructed estimator over the others.


Estimation And Testing In Targeted Group Sequential Covariate-Adjusted Randomized Clinical Trials, Antoine Chambaz, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

This article is devoted to the construction and asymptotic study of adaptive group sequential covariate-adjusted randomized clinical trials analyzed through the prism of the semiparametric methodology of targeted maximum likelihood estimation (TMLE). We show how to build, as the data accrue group-sequentially, a sampling design which targets a user-supplied optimal design. We also show how to carry out a sound TMLE statistical inference based on such an adaptive sampling scheme (therefore extending some results so far known only in the i.i.d. setting), and how group-sequential testing applies on top of it. The procedure is robust (i.e., consistent even if the …


Confidence Intervals For The Population Mean Tailored To Small Sample Sizes, With Applications To Survey Sampling, Michael Rosenblum, Mark J. Van Der Laan Jun 2008

U.C. Berkeley Division of Biostatistics Working Paper Series

The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes.

We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability …
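
To make the tail-bound idea concrete, here is a minimal sketch of a confidence interval half-width obtained by inverting Bernstein's inequality. It is not the authors' procedure. It assumes independent observations bounded in a known interval and a variance bound supplied by the user; when none is given, the sketch falls back on the largest variance a variable on that interval can have, (upper - lower)^2 / 4. The function names are hypothetical.

```python
import math


def bernstein_tail(n, t, m, var_bound):
    """Two-sided Bernstein bound on P(|sample mean - mu| >= t).

    Assumes n independent observations with |X - mu| <= m and Var(X) <= var_bound.
    """
    return 2.0 * math.exp(-n * t * t / (2.0 * var_bound + 2.0 * m * t / 3.0))


def bernstein_half_width(n, lower, upper, alpha=0.05, var_bound=None):
    """Smallest t for which the Bernstein bound equals alpha."""
    m = upper - lower                 # bound on |X - mu| for X in [lower, upper]
    if var_bound is None:
        var_bound = m * m / 4.0       # assumption: worst-case variance on the interval
    log_term = math.log(2.0 / alpha)
    c = m * log_term / (3.0 * n)
    # Positive root of n*t^2 = log_term * (2*var_bound + 2*m*t/3).
    return c + math.sqrt(c * c + 2.0 * var_bound * log_term / n)


half_width = bernstein_half_width(n=20, lower=0.0, upper=1.0)
print(f"mean +/- {half_width:.3f}")
```

Because the bound holds for every sample size, the resulting interval is conservative: it is typically wider than a normal-approximation interval, and a tighter variance bound narrows it.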


Detailed Version: Analyzing Direct Effects In Randomized Trials With Secondary Interventions: An Application To HIV Prevention Trials, Michael A. Rosenblum, Nicholas P. Jewell, Mark J. Van Der Laan, Stephen Shiboski, Ariane Van Der Straten, Nancy Padian Oct 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

This is the detailed technical report that accompanies the paper “Analyzing Direct Effects in Randomized Trials with Secondary Interventions: An Application to HIV Prevention Trials” (an unpublished, technical report version of which is available online at http://www.bepress.com/ucbbiostat/paper223).

The version here gives full details of the models for the time-dependent analysis, and presents further results in the data analysis section. The Methods for Improving Reproductive Health in Africa (MIRA) trial is a recently completed randomized trial that investigated the effect of diaphragm and lubricant gel use in reducing HIV infection among susceptible women. 5,045 women were randomly assigned to either the …


Analyzing Direct Effects In Randomized Trials With Secondary Interventions, Michael Rosenblum, Nicholas P. Jewell, Mark J. Van Der Laan, Stephen Shiboski, Ariane Van Der Straten, Nancy Padian Sep 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

The Methods for Improving Reproductive Health in Africa (MIRA) trial is a recently completed randomized trial that investigated the effect of diaphragm and lubricant gel use in reducing HIV infection among susceptible women. 5,045 women were randomly assigned to either the active treatment arm or not. Additionally, all subjects in both arms received intensive condom counselling and provision, the "gold standard" HIV prevention barrier method. There was much lower reported condom use in the intervention arm than in the control arm, making it difficult to answer important public health questions based solely on the intention-to-treat analysis. We adapt an analysis …


New Statistical Paradigms Leading To Web-Based Tools For Clinical/Translational Science, Knut M. Wittkowski May 2005

COBRA Preprint Series

As the field of functional genetics and genomics is beginning to mature, we become confronted with new challenges. The constant drop in price for sequencing and gene expression profiling as well as the increasing number of genetic and genomic variables that can be measured makes it feasible to address more complex questions. The success with rare diseases caused by single loci or genes has provided us with a proof-of-concept that new therapies can be developed based on functional genomics and genetics.

Common diseases, however, typically involve genetic epistasis, genomic pathways, and proteomic pattern. Moreover, to better understand the underlying biological …


Censored Linear Regression For Case-Cohort Studies, Bin Nan, Menggang Yu, Jack Kalbfleisch Oct 2004

The University of Michigan Department of Biostatistics Working Paper Series

Right censored data from a classical case-cohort design and a stratified case-cohort design are considered. In the classical case-cohort design, the subcohort is obtained as a simple random sample of the entire cohort, whereas in the stratified design, the subcohort is selected by independent Bernoulli sampling with arbitrary selection probabilities. For each design and under a linear regression model, methods for estimating the regression parameters are proposed and analyzed. These methods are derived by modifying the linear ranks tests and estimating equations that arise from full-cohort data using methods that are similar to the "pseudo-likelihood" estimating equation that has been …


Non-Parametric Estimation Of ROC Curves In The Absence Of A Gold Standard, Xiao-Hua Zhou, Pete Castelluccio, Chuan Zhou Jul 2004

UW Biostatistics Working Paper Series

In evaluating the diagnostic accuracy of tests, a gold standard on the disease status is required. However, in many complex diseases, it is impossible or unethical to obtain such a gold standard. If an imperfect standard is used as if it were a gold standard, the estimated accuracy of the tests would be biased. This type of bias is called imperfect gold standard bias. In this paper we develop a maximum likelihood (ML) method for estimating ROC curves and their areas of ordinal-scale tests in the absence of a gold standard. Our simulation study shows the proposed estimates for the …


New Estimating Methods For Surrogate Outcome Data, Bin Nan Jun 2004

The University of Michigan Department of Biostatistics Working Paper Series

Surrogate outcome data arise frequently in medical research. The true outcomes of interest are expensive or hard to ascertain, but measurements of surrogate outcomes (or more generally speaking, the correlates of the true outcomes) are usually available. In this paper we assume that the conditional expectation of the true outcome given covariates is known up to a finite dimensional parameter. When the true outcome is missing at random, the efficient score function for the parameter in the conditional mean model has a simple form, which is similar to the generalized estimating functions. There is no integral equation involved as in …


Multiple Imputation For Interval Censored Data With Auxiliary Variables, Chiu-Hsieh Hsu, Jeremy Taylor, Susan Murray Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

We propose a nonparametric multiple imputation scheme, NPMLE imputation, for the analysis of interval censored survival data. Features of the method are that it converts interval-censored data problems to complete data or right censored data problems to which many standard approaches can be used, and the measures of uncertainty are easily obtained. In addition to the event time of primary interest, there are frequently other auxiliary variables that are associated with the event time. For the goal of estimating the marginal survival distribution, these auxiliary variables may provide some additional information about the event time for the interval censored observations. …


Optimal Sample Size For Multiple Testing: The Case Of Gene Expression Microarrays, Peter Muller, Giovanni Parmigiani, Christian Robert, Judith Rousseau Feb 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

We consider the choice of an optimal sample size for multiple comparison problems. The motivating application is the choice of the number of microarray experiments to be carried out when learning about differential gene expression. However, the approach is valid in any application that involves multiple comparisons in a large number of hypothesis tests. We discuss two decision problems in the context of this setup: the sample size selection and the decision about the multiple comparisons. We adopt a decision theoretic approach, using loss functions that combine the competing goals of discovering as many differentially expressed genes as possible, while keeping …


Robust Likelihood-Based Analysis Of Multivariate Data With Missing Values, Rod Little, An Hyonggin Dec 2003

The University of Michigan Department of Biostatistics Working Paper Series

The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter …


Weighting Adjustments For Unit Nonresponse With Multiple Outcome Variables, Sonya L. Vartivarian, Rod Little Nov 2003

The University of Michigan Department of Biostatistics Working Paper Series

Weighting is a common form of unit nonresponse adjustment in sample surveys where entire questionnaires are missing due to noncontact or refusal to participate. Weights are inversely proportional to the probability of selection and response. A common approach computes the response weight adjustment cells based on covariate information. When the number of cells thus created is too large, a coarsening method such as response propensity stratification can be applied to reduce the number of adjustment cells. Simulations in Vartivarian and Little (2002) indicate improved efficiency and robustness of weighting adjustments based on the joint classification of the sample by two …
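
The adjustment-cell idea described in this abstract can be sketched as follows. This is a generic weighting-class adjustment, not the specific method the paper evaluates: each respondent's base weight (the inverse of its selection probability) is inflated by the inverse of the weighted response rate in its cell. The function name, the record layout, and the data are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjusted_weights(units):
    """Weighting-class adjustment for unit nonresponse.

    units: list of dicts with keys
      'pi'        selection probability of the unit,
      'cell'      adjustment cell label formed from covariates,
      'responded' True if the unit returned a questionnaire.
    Returns one weight per unit; nonrespondents get weight 0.
    """
    total = defaultdict(float)   # sum of base weights per cell, all sampled units
    resp = defaultdict(float)    # sum of base weights per cell, respondents only
    for u in units:
        base = 1.0 / u["pi"]
        total[u["cell"]] += base
        if u["responded"]:
            resp[u["cell"]] += base

    weights = []
    for u in units:
        if u["responded"]:
            # Base weight times the inverse weighted response rate of the cell.
            weights.append((1.0 / u["pi"]) * total[u["cell"]] / resp[u["cell"]])
        else:
            weights.append(0.0)
    return weights


# Hypothetical sample: two cells with different response rates.
sample = [
    {"pi": 0.1, "cell": "A", "responded": True},
    {"pi": 0.1, "cell": "A", "responded": False},
    {"pi": 0.2, "cell": "B", "responded": True},
    {"pi": 0.2, "cell": "B", "responded": True},
]
weights = nonresponse_adjusted_weights(sample)
```

As long as every cell contains at least one respondent, the adjusted weights of the respondents add up to the base weights of the full sample within each cell.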


Inference For The Population Total From Probability-Proportional-To-Size Samples Based On Predictions From A Penalized Spline Nonparametric Model, Hui Zheng, Rod Little Aug 2003

The University of Michigan Department of Biostatistics Working Paper Series

Inference about the finite population total from probability-proportional-to-size (PPS) samples is considered. In previous work (Zheng and Little, 2003), penalized spline (p-spline) nonparametric model-based estimators were shown to generally outperform the Horvitz-Thompson (HT) and generalized regression (GR) estimators in terms of the root mean squared error. In this article we develop model-based, jackknife and balanced repeated replicate variance estimation methods for the p-spline based estimators. Asymptotic properties of the jackknife method are discussed. Simulations show that p-spline point estimators and their jackknife standard errors lead to inferences that are superior to HT or GR based inferences. This suggests that nonparametric …


Semiparametric Regression Models With Missing Data: The Mathematics In The Work Of Robins Et Al., Menggang Yu, Bin Nan May 2003

The University of Michigan Department of Biostatistics Working Paper Series

This review is an attempt to understand the landmark papers of Robins, Rotnitzky, and Zhao (1994) and Robins and Rotnitzky (1992). We revisit their main results and corresponding proofs using the theory outlined in the monograph by Bickel, Klaassen, Ritov, and Wellner (1993). We also discuss an illustrative example to show the details of applying these theoretical results.


Penalized Spline Nonparametric Mixed Models For Inference About A Finite Population Mean From Two-Stage Samples, Hui Zheng, Rod Little Mar 2003

The University of Michigan Department of Biostatistics Working Paper Series

Samplers often distrust model-based approaches to survey inference due to concerns about model misspecification when applied to large samples from complex populations. We suggest that the model-based paradigm can work very successfully in survey settings, provided models are chosen that take into account the sample design and avoid strong parametric assumptions. The Horvitz-Thompson (HT) estimator is a simple design-unbiased estimator of the finite population total in probability sampling designs. From a modeling perspective, the HT estimator performs well when the ratios of the outcome values and the inclusion probabilities are exchangeable. When this assumption is not met, the HT estimator …
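
The Horvitz-Thompson estimator named in this abstract weights each sampled value by the inverse of its inclusion probability. The sketch below shows the estimator and checks its design-unbiasedness by enumerating every sample of size 2 from a tiny made-up population under simple random sampling without replacement, a design chosen here only because its inclusion probabilities are easy to write down. The function name and the population values are hypothetical.

```python
from itertools import combinations


def horvitz_thompson_total(values, inclusion_probs):
    """HT estimate of a population total: sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical population with true total 30.
population = [3.0, 7.0, 8.0, 12.0]
n = 2
pi = n / len(population)   # inclusion probability of every unit under SRSWOR

# Every possible sample is equally likely, so averaging the estimates over
# all samples gives the design expectation of the estimator.
estimates = [horvitz_thompson_total(s, [pi] * n)
             for s in combinations(population, n)]
average_estimate = sum(estimates) / len(estimates)
print(average_estimate, sum(population))
```

The individual estimates range from 20 to 40 even though their average matches the true total, which illustrates that design-unbiasedness says nothing about the variability of a single estimate.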


Assessing The Accuracy Of A New Diagnostic Test When A Gold Standard Does Not Exist, Todd A. Alonzo, Margaret S. Pepe Oct 1998

UW Biostatistics Working Paper Series

Often the accuracy of a new diagnostic test must be assessed when a perfect gold standard does not exist. Use of an imperfect test biases the accuracy estimates of the new test. This paper reviews existing approaches to this problem including discrepant resolution and latent class analysis. Deficiencies with these approaches are identified. A new approach is proposed that combines the results of several imperfect reference tests to define a better reference standard. We call this the composite reference standard (CRS). Using the CRS, accuracy can be assessed using multistage sampling designs. Maximum likelihood estimates of accuracy and expressions for …