Open Access. Powered by Scholars. Published by Universities.®
Design of Experiments and Sample Surveys Commons™
Articles 1 - 26 of 26
Full-Text Articles in Design of Experiments and Sample Surveys
Modeling And Fitting Two-Way Tables Containing Outliers, David L. Farnsworth
Articles
A model is proposed for two-way tables of measurement data containing outliers. The two independent variables are categorical and error free. Neither missing values nor replication are present. The model consists of the sum of a customary additive part that can be fit using least squares and a part that is composed of outliers. Recommendations are made for methods for identifying cells containing outliers and for fitting the model. A graph of the observations is used to determine the outliers’ locations. For all cells containing an outlier, replacement values are determined simultaneously using a classical missing-data tool. The result is …
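The additive part of such a model can be illustrated with a short sketch. This is not the author's procedure for locating or replacing outliers; it only shows the customary least-squares fit of an additive two-way model to a complete, unreplicated table. The function name `fit_additive` and the example table are hypothetical.

```python
def fit_additive(table):
    """Least-squares fit of y_ij = mu + a_i + b_j to a complete two-way table.

    For a complete table with one observation per cell, the least-squares
    estimates are the grand mean plus row-mean and column-mean deviations.
    Returns (mu, row_effects, col_effects, residuals).
    """
    n_rows, n_cols = len(table), len(table[0])
    mu = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_eff = [sum(row) / n_cols - mu for row in table]
    col_eff = [
        sum(table[i][j] for i in range(n_rows)) / n_rows - mu
        for j in range(n_cols)
    ]
    resid = [
        [table[i][j] - (mu + row_eff[i] + col_eff[j]) for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return mu, row_eff, col_eff, resid


# Hypothetical table: exactly additive except for a shifted value in cell (0, 0).
example = [[10.0, 2.0, 3.0],
           [2.0, 3.0, 4.0],
           [3.0, 4.0, 5.0]]
mu, row_eff, col_eff, resid = fit_additive(example)
```

A single aberrant cell distorts its whole row and column in this fit, which is one reason the additive part and the outlier part need to be handled together rather than by inspecting raw residuals alone.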
Optimal Design For A Causal Structure, Zaher Kmail
Department of Statistics: Dissertations, Theses, and Student Work
Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.
Historically, traditional optimal design theory focuses on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …
What’s Brewing? A Statistics Education Discovery Project, Marla A. Sole, Sharon L. Weinberg
Publications and Research
We believe that students learn best, are actively engaged, and are genuinely interested when working on real-world problems. This can be done by giving students the opportunity to work collaboratively on projects that investigate authentic, familiar problems. This article shares one such project that was used in an introductory statistics course. We describe the steps taken to investigate why customers are charged more for iced coffee than hot coffee, which included collecting data and using descriptive and inferential statistical analysis. Interspersed throughout the article, we describe strategies that can help teachers implement the project and scaffold material to assist students …
Combined Computational-Experimental Design Of High-Temperature, High-Intensity Permanent Magnetic Alloys With Minimal Addition Of Rare-Earth Elements, Rajesh Jha
FIU Electronic Theses and Dissertations
AlNiCo magnets are known for high-temperature stability and superior corrosion resistance and have been widely used in various applications. The reported magnetic energy density, (BH)max, for these magnets is around 10 MGOe. Theoretical calculations show that a (BH)max of 20 MGOe is achievable, which would help close the gap between AlNiCo and Rare-Earth Elements (REE) based magnets. An extended family of AlNiCo alloys consisting of eight elements was studied in this dissertation, and hence it is important to determine the composition-property relationship between each of the alloying elements and their influence on the bulk properties.
In …
Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret
UW Biostatistics Working Paper Series
We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …
Best Practice Recommendations For Data Screening, Justin A. Desimone, Peter D. Harms, Alice J. Desimone
Department of Management: Faculty Publications
Survey respondents differ in their levels of attention and effort when responding to items. There are a number of methods researchers may use to identify respondents who fail to exert sufficient effort in order to increase the rigor of analysis and enhance the trustworthiness of study results. Screening techniques are organized into three general categories, which differ in impact on survey design and potential respondent awareness. Assumptions and considerations regarding appropriate use of screening techniques are discussed along with descriptions of each technique. The utility of each screening technique is a function of survey design and administration. Each technique has …
Robust Optimization Of Biological Protocols, Patrick Flaherty, Ronald W. Davis
Mathematics and Statistics Department Faculty Publication Series
When conducting high-throughput biological experiments, it is often necessary to develop a protocol that is both inexpensive and robust. Standard approaches are either not cost-effective or arrive at an optimized protocol that is sensitive to experimental variations. Here, we describe a novel approach that directly minimizes the cost of the protocol while ensuring the protocol is robust to experimental variation. Our approach uses a risk-averse conditional value-at-risk criterion in a robust parameter design framework. We demonstrate this approach on a polymerase chain reaction protocol and show that our improved protocol is less expensive than the standard protocol and more robust …
Adaptive Pair-Matching In The SEARCH Trial And Estimation Of The Intervention Effect, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics are used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes are measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed (i.i.d.) pairs of units, as current practice assumes. …
A General Procedure Of Estimating Population Mean Using Information On Auxiliary Attribute, Sachin Malik, Rajesh Singh, Florentin Smarandache
Branch Mathematics and Statistics Faculty and Staff Publications
This paper deals with the problem of estimating the finite population mean when some information on an auxiliary attribute is available. It is shown that the proposed estimator is more efficient than the usual mean estimator and other existing estimators. The results are illustrated numerically using an empirical population considered in the literature.
A General Family Of Dual To Ratio-Cum-Product Estimator In Sample Surveys, Florentin Smarandache, Rajesh Singh, Mukesh Kumar, Pankaj Chauhan, Nirmala Sawan
Branch Mathematics and Statistics Faculty and Staff Publications
This paper presents a family of dual to ratio-cum-product estimators for the finite population mean. Under a simple random sampling without replacement (SRSWOR) scheme, expressions for the bias and mean-squared error (MSE) up to the first order of approximation are derived. We show that the proposed family is more efficient than the usual unbiased estimator, the ratio estimator, the product estimator, and the estimators of Singh (1967), Srivenkataramana (1980), Bandyopadhyaya (1980), and Singh et al. (2005). An empirical study is carried out to illustrate the performance of the constructed estimator over the others.
Estimation And Testing In Targeted Group Sequential Covariate-Adjusted Randomized Clinical Trials, Antoine Chambaz, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
This article is devoted to the construction and asymptotic study of adaptive group sequential covariate-adjusted randomized clinical trials analyzed through the prism of the semiparametric methodology of targeted maximum likelihood estimation (TMLE). We show how to build, as the data accrue group-sequentially, a sampling design which targets a user-supplied optimal design. We also show how to carry out a sound TMLE statistical inference based on such an adaptive sampling scheme (thereby extending some results so far known only in the i.i.d. setting), and how group-sequential testing applies on top of it. The procedure is robust (i.e., consistent even if the …
Confidence Intervals For The Population Mean Tailored To Small Sample Sizes, With Applications To Survey Sampling, Michael Rosenblum, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes.
We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability …
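As a rough illustration of how a tail bound yields an interval, the sketch below inverts the classical two-sided Bernstein inequality for a sample mean. It is not the authors' construction: it assumes independent draws, a known upper bound `var_bound` on the variance, and a known bound `range_bound` on the absolute deviation of each observation from the mean. All function and parameter names are hypothetical.

```python
import math


def bernstein_half_width(n, var_bound, range_bound, alpha=0.05):
    """Half-width t solving 2 * exp(-n t^2 / (2 v + 2 M t / 3)) = alpha.

    v = var_bound (assumed upper bound on the variance),
    M = range_bound (assumed bound on |X_i - mean|).
    The equation is a quadratic in t; this returns its positive root.
    """
    log_term = math.log(2.0 / alpha)
    b = 2.0 * range_bound * log_term / 3.0
    return (b + math.sqrt(b * b + 8.0 * n * var_bound * log_term)) / (2.0 * n)


def bernstein_interval(sample, var_bound, range_bound, alpha=0.05):
    """Interval (mean - t, mean + t) with t from bernstein_half_width."""
    n = len(sample)
    mean = sum(sample) / n
    t = bernstein_half_width(n, var_bound, range_bound, alpha)
    return mean - t, mean + t


# Hypothetical usage: 0/1 responses, so variance <= 0.25 and |X - mean| <= 1.
low, high = bernstein_interval([0, 1, 1, 0, 1, 1, 1, 0], 0.25, 1.0)
```

Because the bound holds for every sample size, the resulting interval does not rely on a normal approximation, at the cost of being wider than the usual central-limit-theorem interval.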
Detailed Version: Analyzing Direct Effects In Randomized Trials With Secondary Interventions: An Application To HIV Prevention Trials, Michael A. Rosenblum, Nicholas P. Jewell, Mark J. Van Der Laan, Stephen Shiboski, Ariane Van Der Straten, Nancy Padian
U.C. Berkeley Division of Biostatistics Working Paper Series
This is the detailed technical report that accompanies the paper “Analyzing Direct Effects in Randomized Trials with Secondary Interventions: An Application to HIV Prevention Trials” (an unpublished, technical report version of which is available online at http://www.bepress.com/ucbbiostat/paper223).
The version here gives full details of the models for the time-dependent analysis, and presents further results in the data analysis section. The Methods for Improving Reproductive Health in Africa (MIRA) trial is a recently completed randomized trial that investigated the effect of diaphragm and lubricant gel use in reducing HIV infection among susceptible women. 5,045 women were randomly assigned to either the …
Analyzing Direct Effects In Randomized Trials With Secondary Interventions, Michael Rosenblum, Nicholas P. Jewell, Mark J. Van Der Laan, Stephen Shiboski, Ariane Van Der Straten, Nancy Padian
U.C. Berkeley Division of Biostatistics Working Paper Series
The Methods for Improving Reproductive Health in Africa (MIRA) trial is a recently completed randomized trial that investigated the effect of diaphragm and lubricant gel use in reducing HIV infection among susceptible women. 5,045 women were randomly assigned to either the active treatment arm or not. Additionally, all subjects in both arms received intensive condom counselling and provision, the "gold standard" HIV prevention barrier method. There was much lower reported condom use in the intervention arm than in the control arm, making it difficult to answer important public health questions based solely on the intention-to-treat analysis. We adapt an analysis …
New Statistical Paradigms Leading To Web-Based Tools For Clinical/Translational Science, Knut M. Wittkowski
COBRA Preprint Series
As the field of functional genetics and genomics begins to mature, we are confronted with new challenges. The constant drop in price for sequencing and gene expression profiling, together with the increasing number of genetic and genomic variables that can be measured, makes it feasible to address more complex questions. The success with rare diseases caused by single loci or genes has provided us with a proof of concept that new therapies can be developed based on functional genomics and genetics.
Common diseases, however, typically involve genetic epistasis, genomic pathways, and proteomic patterns. Moreover, to better understand the underlying biological …
Censored Linear Regression For Case-Cohort Studies, Bin Nan, Menggang Yu, Jack Kalbfleisch
The University of Michigan Department of Biostatistics Working Paper Series
Right censored data from a classical case-cohort design and a stratified case-cohort design are considered. In the classical case-cohort design, the subcohort is obtained as a simple random sample of the entire cohort, whereas in the stratified design, the subcohort is selected by independent Bernoulli sampling with arbitrary selection probabilities. For each design and under a linear regression model, methods for estimating the regression parameters are proposed and analyzed. These methods are derived by modifying the linear rank tests and estimating equations that arise from full-cohort data using methods that are similar to the "pseudo-likelihood" estimating equation that has been …
Non-Parametric Estimation Of ROC Curves In The Absence Of A Gold Standard, Xiao-Hua Zhou, Pete Castelluccio, Chuan Zhou
UW Biostatistics Working Paper Series
In evaluating the diagnostic accuracy of tests, a gold standard on the disease status is required. However, in many complex diseases, it is impossible or unethical to obtain such a gold standard. If an imperfect standard is used as if it were a gold standard, the estimated accuracy of the tests would be biased. This type of bias is called imperfect gold standard bias. In this paper we develop a maximum likelihood (ML) method for estimating ROC curves and their areas of ordinal-scale tests in the absence of a gold standard. Our simulation study shows the proposed estimates for the …
New Estimating Methods For Surrogate Outcome Data, Bin Nan
The University of Michigan Department of Biostatistics Working Paper Series
Surrogate outcome data arise frequently in medical research. The true outcomes of interest are expensive or hard to ascertain, but measurements of surrogate outcomes (or more generally speaking, the correlates of the true outcomes) are usually available. In this paper we assume that the conditional expectation of the true outcome given covariates is known up to a finite dimensional parameter. When the true outcome is missing at random, the efficient score function for the parameter in the conditional mean model has a simple form, which is similar to the generalized estimating functions. There is no integral equation involved as in …
Multiple Imputation For Interval Censored Data With Auxiliary Variables, Chiu-Hsieh Hsu, Jeremy Taylor, Susan Murray
The University of Michigan Department of Biostatistics Working Paper Series
We propose a nonparametric multiple imputation scheme, NPMLE imputation, for the analysis of interval censored survival data. Features of the method are that it converts interval-censored data problems to complete data or right censored data problems to which many standard approaches can be applied, and that the measures of uncertainty are easily obtained. In addition to the event time of primary interest, there are frequently other auxiliary variables that are associated with the event time. For the goal of estimating the marginal survival distribution, these auxiliary variables may provide some additional information about the event time for the interval censored observations. …
Optimal Sample Size For Multiple Testing: The Case Of Gene Expression Microarrays, Peter Muller, Giovanni Parmigiani, Christian Robert, Judith Rousseau
Johns Hopkins University, Dept. of Biostatistics Working Papers
We consider the choice of an optimal sample size for multiple comparison problems. The motivating application is the choice of the number of microarray experiments to be carried out when learning about differential gene expression. However, the approach is valid in any application that involves multiple comparisons in a large number of hypothesis tests. We discuss two decision problems in the context of this setup: the sample size selection and the decision about the multiple comparisons. We adopt a decision theoretic approach, using loss functions that combine the competing goals of discovering as many differentially expressed genes as possible, while keeping …
Robust Likelihood-Based Analysis Of Multivariate Data With Missing Values, Rod Little, An Hyonggin
The University of Michigan Department of Biostatistics Working Paper Series
The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter …
Weighting Adjustments For Unit Nonresponse With Multiple Outcome Variables, Sonya L. Vartivarian, Rod Little
The University of Michigan Department of Biostatistics Working Paper Series
Weighting is a common form of unit nonresponse adjustment in sample surveys where entire questionnaires are missing due to noncontact or refusal to participate. Weights are inversely proportional to the probability of selection and response. A common approach computes the response weight adjustment cells based on covariate information. When the number of cells thus created is too large, a coarsening method such as response propensity stratification can be applied to reduce the number of adjustment cells. Simulations in Vartivarian and Little (2002) indicate improved efficiency and robustness of weighting adjustments based on the joint classification of the sample by two …
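The adjustment-cell idea can be sketched as follows. This is a generic weighting-class adjustment, not the specific method evaluated by the authors: each respondent's base weight (the inverse selection probability) is divided by the weighted response rate in its cell. The record layout with keys `cell`, `pi`, and `responded` is a hypothetical convention chosen for the example.

```python
from collections import defaultdict


def nonresponse_weights(units):
    """Weighting-class nonresponse adjustment.

    units: list of dicts with keys
      'cell'      - adjustment cell label (assumed formed from covariates),
      'pi'        - selection probability, 0 < pi <= 1,
      'responded' - True if the unit returned a questionnaire.
    Returns one weight per unit: 0 for nonrespondents, and
    (1 / pi) / (weighted response rate of the cell) for respondents.
    """
    sampled_total = defaultdict(float)
    respondent_total = defaultdict(float)
    for u in units:
        base = 1.0 / u["pi"]
        sampled_total[u["cell"]] += base
        if u["responded"]:
            respondent_total[u["cell"]] += base

    weights = []
    for u in units:
        if not u["responded"]:
            weights.append(0.0)
            continue
        rate = respondent_total[u["cell"]] / sampled_total[u["cell"]]
        weights.append((1.0 / u["pi"]) / rate)
    return weights


# Hypothetical sample: two cells with different selection probabilities.
sample = (
    [{"cell": "A", "pi": 0.5, "responded": r} for r in (True, True, False, False)]
    + [{"cell": "B", "pi": 0.25, "responded": r} for r in (True, False)]
)
w = nonresponse_weights(sample)
```

Within each cell the adjusted respondent weights sum to the base-weight total of all sampled units, so respondents stand in for the nonrespondents who share their cell.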
Inference For The Population Total From Probability-Proportional-To-Size Samples Based On Predictions From A Penalized Spline Nonparametric Model, Hui Zheng, Rod Little
The University of Michigan Department of Biostatistics Working Paper Series
Inference about the finite population total from probability-proportional-to-size (PPS) samples is considered. In previous work (Zheng and Little, 2003), penalized spline (p-spline) nonparametric model-based estimators were shown to generally outperform the Horvitz-Thompson (HT) and generalized regression (GR) estimators in terms of the root mean squared error. In this article we develop model-based, jackknife and balanced repeated replicate variance estimation methods for the p-spline based estimators. Asymptotic properties of the jackknife method are discussed. Simulations show that p-spline point estimators and their jackknife standard errors lead to inferences that are superior to HT or GR based inferences. This suggests that nonparametric …
Semiparametric Regression Models With Missing Data: The Mathematics In The Work Of Robins et al., Menggang Yu, Bin Nan
The University of Michigan Department of Biostatistics Working Paper Series
This review is an attempt to understand the landmark papers of Robins, Rotnitzky, and Zhao (1994) and Robins and Rotnitzky (1992). We revisit their main results and corresponding proofs using the theory outlined in the monograph by Bickel, Klaassen, Ritov, and Wellner (1993). We also discuss an illustrative example to show the details of applying these theoretical results.
Penalized Spline Nonparametric Mixed Models For Inference About A Finite Population Mean From Two-Stage Samples, Hui Zheng, Rod Little
The University of Michigan Department of Biostatistics Working Paper Series
Samplers often distrust model-based approaches to survey inference due to concerns about model misspecification when applied to large samples from complex populations. We suggest that the model-based paradigm can work very successfully in survey settings, provided models are chosen that take into account the sample design and avoid strong parametric assumptions. The Horvitz-Thompson (HT) estimator is a simple design-unbiased estimator of the finite population total in probability sampling designs. From a modeling perspective, the HT estimator performs well when the ratios of the outcome values and the inclusion probabilities are exchangeable. When this assumption is not met, the HT estimator …
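For reference, the Horvitz-Thompson estimator of a population total weights each sampled outcome by the inverse of its inclusion probability. The sketch below is a minimal illustration with hypothetical names and data; it does not implement the penalized spline models proposed in the paper.

```python
def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of a population total.

    y  : outcome values for the sampled units
    pi : inclusion probabilities of those units, each in (0, 1]
    Returns the sum of y_i / pi_i over the sample.
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(yi / p for yi, p in zip(y, pi))


# Hypothetical sample in which every ratio y_i / pi_i is the same (about 100).
estimate = horvitz_thompson_total([10.0, 20.0, 30.0], [0.1, 0.2, 0.3])
```

In this toy sample the ratios of outcome to inclusion probability are all equal, an extreme case of the situation the abstract describes as favorable for the HT estimator: each unit contributes the same amount to the estimate, so it varies little from sample to sample.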
Assessing The Accuracy Of A New Diagnostic Test When A Gold Standard Does Not Exist, Todd A. Alonzo, Margaret S. Pepe
UW Biostatistics Working Paper Series
Often the accuracy of a new diagnostic test must be assessed when a perfect gold standard does not exist. Use of an imperfect test biases the accuracy estimates of the new test. This paper reviews existing approaches to this problem including discrepant resolution and latent class analysis. Deficiencies with these approaches are identified. A new approach is proposed that combines the results of several imperfect reference tests to define a better reference standard. We call this the composite reference standard (CRS). Using the CRS, accuracy can be assessed using multistage sampling designs. Maximum likelihood estimates of accuracy and expressions for …