- Keyword
- Genetics (6)
- BLUPs; Kernel function; Model/variable selection; Nonparametric regression; Penalized likelihood; REML; Score test; Smoothing parameter; Support vector machines (2)
- Self-rated health (2)
- ADL (1)
- Accelerated failure time model; Buckley-James method; Censored survival data; Elastic-net; High-dimensional covariate (1)
- Adaptive design; phase I trial; dose escalation; CRM; maximum tolerated dose (1)
- Adjacency matrix; disease mapping; epidemiology; Markov processes (1)
- Adjustment uncertainty; Bayesian model averaging; Air pollution (1)
- Age-at-onset; Asymptotic bias; Case-cohort; Case-control; Rare disease (1)
- Aggregate data design; auxiliary variables; ecological bias; efficiency; two-phase sampling; within-area confounding (1)
- Air pollution; Case-crossover design; Environmental epidemiology; Log-linear model; Poisson regression; Time series (1)
- Akaike information; asymptotic efficiency; consistency; profile likelihood; likelihood ratio test; testing on the boundary; Laplace approximation; reciprocal importance sampling; bridge sampling (1)
- All-available case estimator; Complete-case estimator; Hypertension; Maximum likelihood estimator; Missing data; Moment-based estimator (1)
- Alpha spending; Change-point; Hazard estimation; Multiple comparisons; Survival analysis (1)
- Amplifications (1)
- Annotation metadata; Gene Ontology (GO); genomics; microarray; multiple hypothesis testing; resampling (1)
- As-treated analysis; Per-protocol analysis; Causal inference; Instrumental variables; Principal stratification; Propensity scores (1)
- Asthma; Cluster Detection; Cumulative Residuals; Martingales; Spatial Scan Statistic (1)
- Asymptotic bias and variance; Clustered survival data; Efficiency; Estimating equation; Kernel smoothing; Marginal model; Sandwich estimator (1)
- Asymptotic bias; EM algorithm; Maximum likelihood estimator; Measurement error; Structural modeling; Transitional Models (1)
- Asymptotic efficiency; Conditional score method; Functional modeling; Measurement error; Longitudinal data; Semiparametric inference; Transition models (1)
- Average bioequivalence; Crossover design; Gibbs sampling; Mixture of Dirichlet Process prior; Markov Chain Monte Carlo (1)
- B-spline (1)
- BDMCMC (1)
- BRFSS (1)
- Bayes factor (1)
- Bayesian (1)
- Bayesian analysis (1)
- Bayesian statistics; Fourier basis; FFT; geostatistics; generalized linear mixed model; generalized additive model; Markov chain Monte Carlo; spatial statistics; spectral representation (1)
- Publication
- UW Biostatistics Working Paper Series (24)
- Harvard University Biostatistics Working Paper Series (23)
- The University of Michigan Department of Biostatistics Working Paper Series (12)
- U.C. Berkeley Division of Biostatistics Working Paper Series (12)
- Johns Hopkins University, Dept. of Biostatistics Working Papers (10)
Articles 1 - 30 of 101
Full-Text Articles in Entire DC Network
Lehmann Family Of Roc Curves, Mithat Gonen, Glenn Heller
Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series
Receiver operating characteristic (ROC) curves are useful in evaluating the ability of a continuous marker in discriminating between the two states of a binary outcome such as diseased/not diseased. The most popular parametric model for an ROC curve is the binormal model which assumes that the marker is normally distributed conditional on the outcome. Here we present an alternative to the binormal model based on the Lehmann family, also known as the proportional hazards specification. The resulting ROC curve and its functionals (such as the area under the curve) have simple analytic forms. We derive closed-form expressions for the asymptotic …
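Under the Lehmann (proportional hazards) specification the ROC curve has the closed form ROC(t) = t^θ, where θ is the hazard ratio between the two outcome groups, and the area under the curve is 1/(1 + θ). A minimal sketch of these closed forms (the numeric value of θ below is purely illustrative):

```python
def lehmann_roc(t, theta):
    """ROC curve under the Lehmann/proportional-hazards family: ROC(t) = t**theta."""
    return t ** theta

def lehmann_auc(theta):
    """Area under the Lehmann ROC curve: integral of t**theta over [0,1] = 1/(theta+1)."""
    return 1.0 / (theta + 1.0)

# Illustrative hazard ratio; smaller theta means better discrimination.
theta = 0.25
print([(t, lehmann_roc(t, theta)) for t in (0.05, 0.1, 0.2, 0.5)])
print(lehmann_auc(theta))
```

Note that θ = 1 recovers the chance diagonal ROC(t) = t with AUC 0.5, so the single parameter θ indexes the whole family.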
Interacting With Local And Remote Data Repositories Using The Stashr Package, Sandrah P. Eckel, Roger Peng
Johns Hopkins University, Dept. of Biostatistics Working Papers
The stashR package (a Set of Tools for Administering SHared Repositories) for R implements a simple key-value style database where character string keys are associated with data values. The key-value databases can be either stored locally on the user's computer or accessed remotely via the Internet. Methods specific to the stashR package allow users to share data repositories or access previously created remote data repositories. In particular, methods are available for the S4 classes localDB and remoteDB to insert, retrieve, or delete data from the database as well as to synchronize local copies of the data to the remote version …
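The key-value repository idea can be illustrated outside of R. The sketch below is a Python analogy to the local-repository concept, not the stashR API itself: the class and method names (`LocalDB`, `insert`, `retrieve`, `delete`) are hypothetical stand-ins for the corresponding localDB operations.

```python
import os
import pickle
import tempfile

class LocalDB:
    """Minimal key-value repository: each string key maps to one pickled
    file on disk. Illustrative analogy to stashR's localDB class; names
    and behavior here are assumptions, not the package's actual interface."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.directory, key)

    def insert(self, key, value):
        with open(self._path(key), "wb") as f:
            pickle.dump(value, f)

    def retrieve(self, key):
        with open(self._path(key), "rb") as f:
            return pickle.load(f)

    def delete(self, key):
        os.remove(self._path(key))

db = LocalDB(tempfile.mkdtemp())
db.insert("trial-arm-a", [1.2, 3.4, 5.6])
print(db.retrieve("trial-arm-a"))
```

A remote variant would resolve the same keys against a URL instead of a directory, which is the local/remote symmetry the abstract describes.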
A Likelihood Based Method For Real Time Estimation Of The Serial Interval And Reproductive Number Of An Epidemic, Laura Forsberg White, Marcello Pagano
Harvard University Biostatistics Working Paper Series
No abstract provided.
A Semiparametric Approach For The Nonparametric Transformation Survival Model With Multiple Covariates, Xiao Song, Shuangge Ma, Jian Huang, Xiao-Hua Zhou
UW Biostatistics Working Paper Series
The nonparametric transformation model for survival time makes no parametric assumptions on the forms of either the transformation function or the error distribution, which makes it appealingly flexible for modeling censored survival data. Current approaches for estimation of the regression parameters involve maximizing discontinuous objective functions, which are numerically infeasible to implement in the case of multiple covariates. Based on the partial rank estimator (Khan & Tamer, 2004), we propose a smoothed partial rank estimator which maximizes a …
Gamma Shape Mixtures For Heavy-Tailed Distributions, Sergio Venturini, Francesca Dominici, Giovanni Parmigiani
Johns Hopkins University, Dept. of Biostatistics Working Papers
An important question in health services research is the estimation of the proportion of medical expenditures that exceed a given threshold. Typically, medical expenditures present highly skewed, heavy-tailed distributions, for which a) simple variable transformations are insufficient to achieve a tractable low-dimensional parametric form and b) nonparametric methods are not efficient in estimating exceedance probabilities for large thresholds. Motivated by this context, in this paper we propose a general Bayesian approach for the estimation of tail probabilities of heavy-tailed distributions, based on a mixture of gamma distributions in which the mixing occurs over the shape parameter. This family provides …
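The exceedance probability of such a mixture is just the weighted sum of gamma tail probabilities. A sketch for the case of integer shapes (where the gamma tail has a closed Erlang form), with illustrative weights and scale that are assumptions, not fitted values from the paper:

```python
import math

def erlang_survival(c, shape, scale):
    """P(X > c) for a gamma with integer shape (Erlang), via the closed-form
    sum exp(-x) * sum_{k<shape} x**k / k!  with x = c/scale."""
    x = c / scale
    return math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(shape))

def mixture_exceedance(c, weights, scale):
    """Tail probability of a gamma shape mixture: mixing over integer
    shapes j = 1..J with weights w_j and a common scale."""
    return sum(w * erlang_survival(c, j + 1, scale)
               for j, w in enumerate(weights))

# Hypothetical mixing weights over shapes 1..4 and a scale of 1000 (dollars).
weights = [0.4, 0.3, 0.2, 0.1]
print(mixture_exceedance(5000.0, weights, 1000.0))
```

Heavier upper tails are produced by shifting weight toward larger shapes; the Bayesian treatment in the paper places a prior over the weights rather than fixing them as done here.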
Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
SUMMARY. We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model …
Spatio-Temporal Analysis Of Areal Data And Discovery Of Neighborhood Relationships In Conditionally Autoregressive Models, Subharup Guha, Louise Ryan
Harvard University Biostatistics Working Paper Series
No abstract provided.
Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh
Harvard University Biostatistics Working Paper Series
No abstract provided.
Analysis Of Case-Control Age-At-Onset Data Using A Modified Case-Cohort Method, Bin Nan, Xihong Lin
The University of Michigan Department of Biostatistics Working Paper Series
Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are thus oversampled in case-control studies. Logistic regression is a common tool for estimating the relative risks of the disease associated with a set of covariates. Very often in such a study, the ages at onset of the disease for all cases and the ages at survey for controls are known. Standard logistic regression analysis using …
Smoothed Rank Regression With Censored Data, Glenn Heller
Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series
A weighted rank estimating function is proposed to estimate the regression parameter vector in an accelerated failure time model with right censored data. In general, rank estimating functions are discontinuous in the regression parameter, creating difficulties in determining the asymptotic distribution of the estimator. A local distribution function is used to create a rank based estimating function that is continuous and monotone in the regression parameter vector. A weight is included in the estimating function to produce a bounded influence estimate. The asymptotic distribution of the regression estimator is developed and simulations are performed to examine its finite sample properties. …
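The smoothing device can be sketched concretely: in a Gehan-type rank estimating function the discontinuous indicator I(e_i ≤ e_j) is replaced by a local distribution function such as Φ((e_j − e_i)/h), making the function continuous in the regression parameter. The sketch below shows this substitution for a one-covariate accelerated failure time model; it illustrates the smoothing idea only and is not the paper's exact weighted estimator.

```python
import math

def normal_cdf(u):
    """Standard normal distribution function, used as the local smoother."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def smoothed_rank_score(beta, x, y, delta, h=0.1):
    """Smoothed Gehan-type rank estimating function for log-times y,
    covariate x, censoring indicators delta (1 = event). The step function
    I(e_i <= e_j) is replaced by Phi((e_j - e_i) / h); as h -> 0 the
    discontinuous score is recovered. Illustrative sketch."""
    e = [yi - beta * xi for xi, yi in zip(x, y)]  # AFT residuals (log scale)
    score = 0.0
    for i in range(len(x)):
        if not delta[i]:              # censored residuals contribute no events
            continue
        for j in range(len(x)):
            score += (x[i] - x[j]) * normal_cdf((e[j] - e[i]) / h)
    return score

# Toy data: log survival times, one covariate, censoring indicators.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 1.2, 1.9, 3.3]
delta = [1, 1, 0, 1]
print(smoothed_rank_score(0.5, x, y, delta))
```

Because the smoothed score is continuous (and, with a suitable weight, monotone) in β, standard root-finding rather than grid search over a discontinuous surface can be used.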
Properties Of Monotonic Effects, Tyler J. Vanderweele, James M. Robins
COBRA Preprint Series
Various relationships are shown to hold between monotonic effects, weak monotonic effects, and the monotonicity of certain conditional expectations. These relationships are considered for both binary and non-binary variables. Counterexamples are provided to show that the results do not hold under less restrictive conditions. The idea of monotonic effects is furthermore used to relate signed edges on a directed acyclic graph to qualitative effect modification.
Multiple Testing With An Empirical Alternative Hypothesis, James E. Signorovitch
Harvard University Biostatistics Working Paper Series
An optimal multiple testing procedure is identified for linear hypotheses under the general linear model, maximizing the expected number of false null hypotheses rejected at any significance level. The optimal procedure depends on the unknown data-generating distribution, but can be consistently estimated. Drawing information together across many hypotheses, the estimated optimal procedure provides an empirical alternative hypothesis by adapting to underlying patterns of departure from the null. Proposed multiple testing procedures based on the empirical alternative are evaluated through simulations and an application to gene expression microarray data. Compared to a standard multiple testing procedure, it is not unusual for …
Doubly Penalized Buckley-James Method For Survival Data With High-Dimensional Covariates, Sijian Wang, Bin Nan, Ji Zhu, David G. Beer
The University of Michigan Department of Biostatistics Working Paper Series
Recent interest in cancer research focuses on predicting patients' survival by investigating gene expression profiles based on microarray analysis. We propose a doubly penalized Buckley-James method for the semiparametric accelerated failure time model to relate high-dimensional genomic data to censored survival outcomes, which uses a mixture of L1-norm and L2-norm penalties. Similar to the elastic-net method for linear regression model with uncensored data, the proposed method performs automatic gene selection and parameter estimation, where highly correlated genes are able to be selected (or removed) together. The two-dimensional tuning parameter is determined by cross-validation and uniform design. …
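The elastic-net mixture of L1 and L2 penalties has a simple closed-form coordinate update in the uncensored, standardized linear-regression analogy (the Buckley-James step additionally imputes censored outcomes first, which is not shown here): the L1 part soft-thresholds a coefficient and the L2 part shrinks it, which is what allows groups of correlated genes to enter or leave together.

```python
import math

def soft_threshold(z, lam):
    """S(z, lam) = sign(z) * max(|z| - lam, 0): the L1 (lasso) operator."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def elastic_net_update(z, lam1, lam2):
    """Coordinate-descent update minimizing
    0.5*(b - z)**2 + lam1*|b| + 0.5*lam2*b**2
    for a standardized covariate: soft-threshold by lam1, then shrink by
    the ridge factor 1/(1 + lam2)."""
    return soft_threshold(z, lam1) / (1.0 + lam2)

# Large partial correlations survive thresholding; small ones are zeroed out.
print(elastic_net_update(2.0, 0.5, 1.0))   # selected and shrunk
print(elastic_net_update(0.3, 0.5, 1.0))   # removed from the model
```

The two tuning parameters (lam1, lam2) correspond to the two-dimensional tuning parameter the abstract selects by cross-validation and uniform design.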
A Note On Bias Due To Fitting Prospective Multivariate Generalized Linear Models To Categorical Outcomes Ignoring Retrospective Sampling Schemes, Bhramar Mukherjee, Ivy Liu
The University of Michigan Department of Biostatistics Working Paper Series
Outcome dependent sampling designs are commonly used in economics, market research and epidemiological studies. Case-control sampling design is a classic example of outcome dependent sampling, where exposure information is collected on subjects conditional on their disease status. In many situations, the outcome under consideration may have multiple categories instead of a simple dichotomization. For example, in a case-control study, there may be disease sub-classification among the “cases” based on progression of the disease, or in terms of other histological and morphological characteristics of the disease. In this note, we investigate the issue of fitting prospective multivariate generalized linear models to …
Exploiting Gene-Environment Independence For Analysis Of Case-Control Studies: An Empirical Bayes Approach To Trade Off Between Bias And Efficiency, Bhramar Mukherjee, Nilanjan Chatterjee
The University of Michigan Department of Biostatistics Working Paper Series
Standard prospective logistic regression analysis of case-control data often leads to very imprecise estimates of gene-environment interactions due to small numbers of cases or controls in cells of crossing genotype and exposure. In contrast, under the assumption of gene-environment independence, modern “retrospective” methods, including the “case-only” approach, can estimate the interaction parameters much more precisely, but they can be seriously biased when the underlying assumption of gene-environment independence is violated. In this article, we propose a novel approach to analyze case-control data that can relax the gene-environment independence assumption using an empirical Bayes framework. In the special case, involving a …
Large Cluster Asymptotics For Gee: Working Correlation Models, Hyoju Chung, Thomas Lumley
UW Biostatistics Working Paper Series
This paper presents large cluster asymptotic results for generalized estimating equations. The complexity of the working correlation model is characterized in terms of the number of working correlation components to be estimated. When the cluster size is relatively large, we may encounter a situation where a high-dimensional working correlation matrix is modeled and estimated from the data. In the present asymptotic setting, the cluster size and the complexity of the working correlation model grow with the number of independent clusters. We show the existence, weak consistency and asymptotic normality of marginal regression parameter estimators using the results of empirical process theory and …
Statistical Analysis Of Air Pollution Panel Studies: An Illustration, Holly Janes, Lianne Sheppard, Kristen Shepherd
UW Biostatistics Working Paper Series
The panel study design is commonly used to evaluate the short-term health effects of air pollution. Standard statistical methods for analyzing longitudinal data are available, but the literature reveals that the techniques are not well understood by practitioners. We illustrate these methods using data from the 1999 to 2002 Seattle panel study. Marginal, conditional, and transitional approaches for modeling longitudinal data are reviewed and contrasted with respect to their parameter interpretation and methods for accounting for correlation and dealing with missing data. We also discuss and illustrate techniques for controlling for time-dependent and time-independent confounding, and for exploring and summarizing …
Bayesian Hidden Markov Modeling Of Array Cgh Data, Subharup Guha, Yi Li, Donna Neuberg
Harvard University Biostatistics Working Paper Series
Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about DNA copy number. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for …
Exploration Of Distributional Models For A Novel Intensity-Dependent Normalization, Nicola Lama, Patrizia Boracchi, Elia Mario Biganzoli
COBRA Preprint Series
Currently used gene intensity-dependent normalization methods, based on regression smoothing techniques, usually approach the two problems of location bias detrending and data re-scaling without taking into account the censoring characteristic of certain gene expressions produced by experiment measurement constraints or by previous normalization steps. Moreover, the bias vs variance balance control of normalization procedures is not often discussed but left to the user's experience. Here an approximate maximum likelihood procedure to fit a model smoothing the dependences of log-fold gene expression differences on average gene intensities is presented. Central tendency and scaling factor were modeled by means of B-splines smoothing …
Targeted Maximum Likelihood Learning, Mark J. Van Der Laan, Daniel Rubin
U.C. Berkeley Division of Biostatistics Working Paper Series
Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution. Suppose that one has available an estimate of the density of the data generating distribution such as a maximum likelihood estimator according to a given or data adaptively selected model. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter of the density estimator is typically too biased and might not even converge at the parametric rate: that is, the density estimator was targeted to be a good estimator of the density and …
Crude Cumulative Incidence In The Form Of A Horvitz-Thompson Like And Kaplan-Meier Like Estimator, Laura Antolini, Elia Mario Biganzoli, Patrizia Boracchi
COBRA Preprint Series
The link between the nonparametric estimator of the crude cumulative incidence of a competing risk and the Kaplan-Meier estimator is exploited. The equivalence of the nonparametric crude cumulative incidence to an inverse-probability-of-censoring weighted average of the sub-distribution function is proved. The link between the estimation of crude cumulative incidence curves and Gray's family of nonparametric tests is considered. The crude cumulative incidence is proved to be a Kaplan-Meier like estimator based on the sub-distribution hazard, i.e. the quantity on which Gray's family of tests is based. A standard probabilistic formalism is adopted to have a note accessible to applied statisticians.
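The Kaplan-Meier-like form of the estimator can be sketched directly: the crude cumulative incidence of cause 1 at time t accumulates, over event times u ≤ t, the all-causes Kaplan-Meier survival just before u times the cause-1 event fraction among those at risk. A minimal sketch assuming no tied event times (the toy data below are illustrative):

```python
def cumulative_incidence(times, causes, t):
    """Nonparametric crude cumulative incidence of cause 1 at time t.
    causes[i]: 0 = censored, 1 = event of interest, 2 = competing event.
    CIF(t) = sum over event times u <= t of S(u-) * d1(u) / n(u),
    where S is the overall (all-causes) Kaplan-Meier estimator.
    Assumes no tied times, for clarity."""
    data = sorted(zip(times, causes))
    n = len(data)
    surv = 1.0      # overall survival just before the current time
    cif = 0.0
    for i, (u, cause) in enumerate(data):
        if u > t:
            break
        at_risk = n - i
        if cause != 0:                 # any event updates overall survival
            if cause == 1:
                cif += surv / at_risk  # cause-1 increment, weighted by S(u-)
            surv *= 1.0 - 1.0 / at_risk
    return cif

times = [2.0, 3.0, 4.0, 5.0, 7.0]
causes = [1, 2, 0, 1, 1]
print(cumulative_incidence(times, causes, 6.0))  # -> 0.5
```

With no censoring and no competing events this reduces to the empirical distribution function, which is the Kaplan-Meier connection the note exploits.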
Cox Models With Nonlinear Effect Of Covariates Measured With Error: A Case Study Of Chronic Kidney Disease Incidence, Ciprian M. Crainiceanu, David Ruppert, Josef Coresh
Johns Hopkins University, Dept. of Biostatistics Working Papers
We propose, develop and implement the simulation extrapolation (SIMEX) methodology for Cox regression models when the log hazard function is linear in the model parameters but nonlinear in the variables measured with error (LPNE). The class of LPNE functions contains but is not limited to strata indicators, splines, quadratic and interaction terms. The first order bias correction method proposed here has the advantage that it remains computationally feasible even when the number of observations is very large and multiple models need to be explored. Theoretical and simulation results show that the SIMEX method outperforms the naive method even with small …
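The core SIMEX recipe is simple to state: add extra measurement error at several multiples λ of the known error variance, refit the naive estimator at each λ, then extrapolate the trend back to λ = −1 (the error-free case). The sketch below applies this to a simple linear model with a mismeasured covariate, using a linear extrapolant for brevity (a quadratic extrapolant is more common); it illustrates the idea only and is not the paper's Cox-model implementation.

```python
import random

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), b=200, seed=1):
    """First-order SIMEX sketch. For each lambda, add extra N(0, lambda*sigma_u^2)
    noise to the mismeasured covariate w, refit, and average over b replicates;
    then extrapolate the slope-vs-lambda trend linearly back to lambda = -1."""
    rng = random.Random(seed)
    means = []
    for lam in lambdas:
        fits = [ols_slope([wi + rng.gauss(0.0, (lam ** 0.5) * sigma_u)
                           for wi in w], y)
                for _ in range(b)]
        means.append(sum(fits) / b)
    trend = ols_slope(list(lambdas), means)
    intercept = sum(means) / len(means) - trend * sum(lambdas) / len(lambdas)
    return intercept - trend       # extrapolated value at lambda = -1

# Toy demonstration: true slope 2, covariate observed with N(0, 0.5^2) error.
rng = random.Random(7)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
w = [xi + rng.gauss(0.0, 0.5) for xi in x]
y = [2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
print(ols_slope(w, y), simex_slope(w, y, sigma_u=0.5))
```

The naive slope is attenuated toward zero by the measurement error; the SIMEX extrapolation recovers part of that loss, with the linear extrapolant deliberately undercorrecting relative to the quadratic.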
Covariate Specific Roc Curve With Survival Outcome, Xiao Song, Xiao-Hua Zhou
UW Biostatistics Working Paper Series
The receiver operating characteristic (ROC) curve has been extended to survival data recently, including the nonparametric approach by Heagerty, Lumley and Pepe (2000) and the semiparametric approach by Heagerty and Zheng (2005) using standard survival analysis techniques based on two different time-dependent ROC curve definitions. However, both approaches cannot adjust for the effect of covariates on the accuracy of the biomarker. To account for the covariate effect, we propose semiparametric models for covariate specific ROC curves corresponding to the two time-dependent ROC curve definitions, respectively. We show that the estimators are consistent and converge to Gaussian processes. In the case …
Spatial Cluster Detection For Censored Outcome Data, Andrea J. Cook, Diane Gold, Yi Li
Harvard University Biostatistics Working Paper Series
No abstract provided.
Diagnosing Bias In The Inverse Probability Of Treatment Weighted Estimator Resulting From Violation Of Experimental Treatment Assignment, Yue Wang, Maya L. Petersen, David Bangsberg, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Inverse probability of treatment weighting (IPTW) is frequently used to estimate the causal effects of treatments and interventions. The consistency of the IPTW estimator relies not only on the well-recognized assumption of no unmeasured confounders (Sequential Randomization Assumption or SRA), but also on the assumption of experimentation in the assignment of treatment (Experimental Treatment Assignment or ETA). In finite samples, violations in the ETA assumption can occur due simply to chance; certain treatments become rare or non-existent for certain strata of the population. Such practical violations of the ETA assumption occur frequently in real data, and can result in significant …
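The estimator at issue can be sketched in a few lines: each treated subject's outcome is weighted by the inverse of their estimated probability of receiving treatment, Horvitz-Thompson style. Propensities near zero (practical ETA violations) make individual weights explode, which is exactly the instability the paper diagnoses. The data below are toy values, and the paper's parametric-bootstrap diagnostic itself is not reproduced.

```python
def iptw_mean(outcomes, treatments, propensity, treated=1):
    """Inverse-probability-of-treatment-weighted (Horvitz-Thompson) estimate
    of the mean counterfactual outcome under treatment level `treated`.
    Subjects at the other level contribute zero; weights are 1/P(A=a|W)."""
    total = 0.0
    for y, a, p in zip(outcomes, treatments, propensity):
        pa = p if treated == 1 else 1.0 - p
        if a == treated:
            total += y / pa
    return total / len(outcomes)

y = [3.0, 1.0, 4.0, 2.0]
a = [1, 0, 1, 0]
p = [0.8, 0.4, 0.5, 0.3]     # estimated P(A=1 | W) for each subject
print(iptw_mean(y, a, p, treated=1))
```

Replacing, say, p = 0.5 with p = 0.01 for one treated subject multiplies that subject's contribution fifty-fold, illustrating how near-violations of ETA dominate the estimate in finite samples.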
Extending Marginal Structural Models Through Local, Penalized, And Additive Learning, Daniel Rubin, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Marginal structural models (MSMs) allow one to form causal inferences from data, by specifying a relationship between a treatment and the marginal distribution of a corresponding counterfactual outcome. Following their introduction in Robins (1997), MSMs have typically been fit after assuming a semiparametric model, and then estimating a finite dimensional parameter. van der Laan and Dudoit (2003) proposed to instead view MSM fitting not as a task of semiparametric parameter estimation, but of nonparametric function approximation. They introduced a class of causal effect estimators based on mapping loss functions suitable for the unavailable counterfactual data to those suitable for the …
Conditional Likelihood Methods For Haplotype-Based Association Analysis Using Matched Case-Control Data, Jinbo Chen, Carmen Rodriguez
UPenn Biostatistics Working Papers
Genetic epidemiologists routinely assess disease susceptibility in relation to haplotypes, i.e., combinations of alleles on a single chromosome. We study statistical methods for inferring haplotype-related disease risk using SNP genotype data from matched case-control studies, where controls are individually matched to cases on some selected factors. Assuming a logistic regression model for haplotype-disease association, we propose two conditional likelihood approaches that address the issue that haplotypes cannot be inferred with certainty from SNP genotype data (phase ambiguity). One approach is based on the likelihood of disease status conditioned on the total number of cases, genotypes, and other covariates within each …
Generalized Confidence Intervals For The Ratio Or Difference Of Two Means For Lognormal Populations With Zeros, Yea-Hung Chen, Xiao-Hua Zhou
UW Biostatistics Working Paper Series
We discuss in this article methods for analyzing lognormal data that may include zeros. Specifically, we are interested in interval estimation for the ratio or difference of the population means. We propose here two generalized pivotal (GP) approaches: a "true" GP method and an "approximate" GP method. Additionally, we propose two likelihood-based approaches: a signed log-likelihood ratio (SLLR) method and a modified SLLR method. Our simulation studies suggest that the approximate generalized pivotal approach outperforms all other known methods; it results in highly accurate coverage frequencies and fairly low bias, even in small sample settings.
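For a single lognormal mean exp(μ + σ²/2), the standard generalized pivotal quantity construction draws pivots for μ and σ² from a normal and a chi-square variate and takes Monte Carlo percentiles. The sketch below covers only this zero-free, one-sample ingredient (the paper additionally models the point mass at zero and compares two groups), and is an illustration of the GP idea rather than the authors' exact procedure.

```python
import math
import random

def gpq_lognormal_mean_ci(logs, alpha=0.05, draws=5000, seed=3):
    """Generalized-pivotal (1 - alpha) CI for the lognormal mean
    exp(mu + sigma^2/2), given log-scale observations `logs`.
    Each draw forms a pivot for sigma^2 from a chi-square_{n-1} variate
    and a pivot for mu from a standard normal, then maps them through
    the mean functional; percentiles of the pivots give the interval."""
    n = len(logs)
    ybar = sum(logs) / n
    s2 = sum((v - ybar) ** 2 for v in logs) / (n - 1)
    rng = random.Random(seed)
    pivots = []
    for _ in range(draws):
        u = rng.gammavariate((n - 1) / 2.0, 2.0)   # chi-square with n-1 df
        z = rng.gauss(0.0, 1.0)
        sig2 = (n - 1) * s2 / u                    # pivot for sigma^2
        mu = ybar - z * math.sqrt(sig2 / n)        # pivot for mu
        pivots.append(math.exp(mu + sig2 / 2.0))
    pivots.sort()
    lo = pivots[int((alpha / 2) * draws)]
    hi = pivots[int((1 - alpha / 2) * draws) - 1]
    return lo, hi
```

The chi-square variate is obtained from the gamma distribution with shape (n−1)/2 and scale 2, which avoids any dependence outside the standard library.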
Multiple Imputation - Review Of Theory, Implementation And Software, Ofer Harel, Xiao-Hua Zhou
UW Biostatistics Working Paper Series
Missing data is a common complication in data analysis. In many medical settings missing data can cause difficulties in estimation, precision and inference. Multiple imputation (MI) (Rubin, 1987) is a simulation-based approach to dealing with incomplete data. Although there are many different methods for dealing with incomplete data, MI has become one of the leading approaches. Since the late 1980s there has been a constant increase in the use and publication of MI-related research. This tutorial does not attempt to cover all the material concerning MI, but rather provides an overview and combines together the theory behind MI, the implementation …
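The pooling step of MI is short enough to state in full: after analyzing each of the m completed datasets, Rubin's rules average the point estimates and combine within- and between-imputation variance. A minimal sketch (the numeric inputs below are illustrative, not from any study):

```python
def pool_rubin(estimates, variances):
    """Combine m completed-data analyses with Rubin's rules:
    pooled estimate  qbar = mean of per-imputation estimates;
    total variance   T    = W + (1 + 1/m) * B,
    where W is the mean within-imputation variance and B is the
    between-imputation variance of the estimates."""
    m = len(estimates)
    qbar = sum(estimates) / m
    w = sum(variances) / m
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)
    return qbar, w + (1.0 + 1.0 / m) * b

# m = 5 imputations of the same regression coefficient and its variance.
qbar, total_var = pool_rubin([2.1, 1.9, 2.3, 2.0, 2.2],
                             [0.04, 0.05, 0.04, 0.06, 0.05])
print(qbar, total_var)
```

The (1 + 1/m) factor inflates the between-imputation component to account for using a finite number of imputations, which is why the total variance exceeds the naive within-imputation average.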
Multiple Imputation For The Comparison Of Two Screening Tests In Two-Phase Alzheimer Studies, Ofer Harel, Xiao-Hua Zhou
UW Biostatistics Working Paper Series
Two-phase designs are common in epidemiological studies of dementia, and especially in Alzheimer research. In the first phase, all subjects are screened using a common screening test(s), while in the second phase, only a subset of these subjects is tested using a more definitive verification assessment, i.e. a gold standard test. When comparing the accuracy of two screening tests in a two-phase study of dementia, inferences are commonly made using only the verified sample. It is well documented that in that case, there is a risk for bias, called verification bias. When the two screening tests have only two values (e.g. …