Open Access. Powered by Scholars. Published by Universities.®


2007


Articles 1 - 30 of 41

Full-Text Articles in Physical Sciences and Mathematics

Spatio-Temporal Associations Between GOES Aerosol Optical Depth Retrievals And Ground-Level PM2.5, Christopher J. Paciorek, Yang Liu, Hortensia Moreno-Macias, Shobha Kondragunta Dec 2007

Harvard University Biostatistics Working Paper Series

We assess the strength of association between aerosol optical depth (AOD) retrievals from the GOES Aerosol/Smoke Product (GASP) and ground-level fine particulate matter (PM2.5) to evaluate AOD as a proxy for PM2.5 in the United States. GASP AOD is retrieved from a geostationary platform and therefore provides dense temporal coverage with half-hourly observations every day, in contrast to once per day snapshots from polar-orbiting satellites. However, GASP AOD is based on a less-sophisticated instrument and retrieval algorithm. We find that correlations between GASP AOD and PM2.5 over time at fixed locations are reasonably high, except in the winter and in …


Longitudinal Data With Follow-Up Truncated By Death: Finding A Match Between Analysis Method And Research Aims, Brenda Kurland, Laura Lee Johnson, Paula Diehr Nov 2007

UW Biostatistics Working Paper Series

Diverse analysis approaches have been proposed to distinguish data missing due to death from nonresponse, and to summarize trajectories of longitudinal data truncated by death. We demonstrate how these analysis approaches arise from factorizations of the distribution of longitudinal data and survival information. Models are illustrated using hypothetical data examples (cognitive functioning in older adults, and quality of life under hospice care) and up to 10 annual assessments of longitudinal cognitive functioning data for 3814 participants in an observational study. For unconditional models, deaths do not occur, deaths are independent of the longitudinal response, or the unconditional longitudinal response averages …


A Parametric ROC Model Based Approach For Evaluating The Predictiveness Of Continuous Markers In Case-Control Studies, Ying Huang, Margaret Pepe Nov 2007

UW Biostatistics Working Paper Series

The predictiveness curve shows the population distribution of risk endowed by a marker or risk prediction model. It provides a means for assessing the model's capacity for risk stratification. Methods for making inference about the predictiveness curve have been developed using cross-sectional or cohort data. Here we consider inference based on case-control studies and prior knowledge about prevalence or incidence of the outcome. We exploit the relationship between the ROC curve and the predictiveness curve given disease prevalence. Methods are developed for deriving the predictiveness curve from a parametric ROC model. Estimation of the whole range and of a portion …


Estimation Of Dose-Response Functions For Longitudinal Data, Erica E M Moodie, David A. Stephens Nov 2007

COBRA Preprint Series

In a longitudinal study of dose-response, the presence of confounding or non-compliance compromises the estimation of the true effect of a treatment. Standard regression methods cannot remove the bias introduced by patient-selected treatment level, that is, they do not permit the estimation of the causal effect of dose. Using an approach based on the Generalized Propensity Score (GPS), a generalization of the classical, binary treatment propensity score, it is possible to construct a balancing score that provides a more meaningful estimation procedure for the true (unconfounded) effect of dose. Previously, the GPS has been applied only in a single interval …


Loss-Based Estimation With Evolutionary Algorithms And Cross-Validation, David Shilane, Richard H. Liang, Sandrine Dudoit Nov 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

Many statistical inference methods rely upon selection procedures to estimate a parameter of the joint distribution of explanatory and outcome data, such as the regression function. Within the general framework for loss-based estimation of Dudoit and van der Laan, this project proposes an evolutionary algorithm (EA) as a procedure for risk optimization. We also analyze the size of the parameter space for polynomial regression under interaction constraints along with constraints on either the polynomial or variable degree.


Identifiability And Estimation Of Causal Effects In Randomized Trials With Noncompliance And Completely Non-Ignorable Missing-Data, Hua Chen, Zhi Geng, Xiao-Hua Zhou Nov 2007

UW Biostatistics Working Paper Series

In this paper we first studied parameter identifiability in randomized clinical trials with noncompliance and missing outcomes. We showed that under certain conditions the parameters of interest were identifiable even under different types of completely non-ignorable missing data, that is, when the missing mechanism depends on the outcome. We then derived their maximum likelihood (ML) and moment estimators and evaluated their finite-sample properties in simulation studies in terms of bias, efficiency and robustness. Our sensitivity analysis showed the assumed non-ignorable missing-data model had an important impact on the estimated complier average causal effect (CACE) parameter. Our new method provides some new …


Nonparametric And Semiparametric Group Sequential Methods For Comparing Accuracy Of Diagnostic Tests, Liansheng Tang, Scott S. Emerson, Xiao-Hua Zhou Oct 2007

UW Biostatistics Working Paper Series

Comparison of the accuracy of two diagnostic tests using receiver operating characteristic (ROC) curves has typically been conducted using fixed sample designs. On the other hand, the human experimentation inherent in a comparison of diagnostic modalities argues for periodic monitoring of the accruing data to address many issues related to the ethics and efficiency of the medical study. To date, very little research has been done in the use of sequential sampling plans for comparative ROC studies, even when these studies may use expensive and unsafe diagnostic procedures. In this paper, we propose a nonparametric …


A Bayesian Image Analysis Of The Change In Tumor/Brain Contrast Uptake Induced By Radiation Via Reversible Jump Markov Chain Monte Carlo, Xiaoxi Zhang, Tim Johnson, Roderick J.A. Little Oct 2007

The University of Michigan Department of Biostatistics Working Paper Series

This work is motivated by a pilot study on the change in tumor/brain contrast uptake induced by radiation via quantitative Magnetic Resonance Imaging. The results inform the optimal timing of administering chemotherapy in the context of radiotherapy. A noticeable feature of the data is spatial heterogeneity. The tumor is physiologically and pathologically distinct from surrounding healthy tissue. Also, the tumor itself is usually highly heterogeneous. We employ a Gaussian Hidden Markov Random Field model that respects the above features. The model introduces a latent layer of discrete labels from a Markov Random Field (MRF) governed by a spatial regularization parameter. …


A Smoothing Approach To Data Masking, Yijie Zhous, Francesca Dominici, Thomas A. Louis Oct 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

Individual-level data are often not publicly available due to confidentiality. Instead, masked data are released for public use. However, analyses performed using masked data may produce invalid statistical results such as biased parameter estimates or incorrect standard errors. In this paper, we propose a data masking method using spatial smoothing, and we investigate the bias of parameter estimates resulting from analyses using the masked data for Generalized Linear Models (GLM). The method allows for varying both the form and the degree of masking by utilizing a smoothing weight function and a smoothness parameter. We show that data masking by using …


ROC Surfaces In The Presence Of Verification Bias, Yueh-Yun Chi, Xiao-Hua (Andrew) Zhou Sep 2007

UW Biostatistics Working Paper Series

In diagnostic medicine, the Receiver Operating Characteristic (ROC) surface is one of the established tools for assessing the accuracy of a diagnostic test in discriminating three disease states, and the volume under the ROC surface has served as a summary index for diagnostic accuracy. In practice, the selection for definitive disease examination may be based on initial test measurements, and induces verification bias in the assessment. We propose here a nonparametric likelihood-based approach to construct the empirical ROC surface in the presence of differential verification, and to estimate the volume under the ROC surface. Estimators of the standard deviation are …


Comparing Trends In Cancer Rates Across Overlapping Regions, Yi Li, Ram C. Tiwari Aug 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Effective Communication Of Standard Errors And Confidence Intervals, Thomas A. Louis, Scott L. Zeger Aug 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

We recommend a format for communicating an estimate with its standard error or confidence interval. The format reinforces that the associated variability is an inseparable component of the estimate and it substantially improves clarity in tabular displays.


Inference For Survival Curves With Informatively Coarsened Discrete Event-Time Data: Application To Alive, Michelle Shardell, Daniel O. Scharfstein, David Vlahov, Noya Galai Aug 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

In many prospective studies, including AIDS Link to the Intravenous Experience (ALIVE), researchers are interested in comparing event-time distributions (e.g., for human immunodeficiency virus seroconversion) between a small number of groups (e.g., risk behavior categories). However, these comparisons are complicated by participants missing visits or attending visits off schedule and seroconverting during this absence. Such data are interval-censored, or more generally, coarsened. Most analysis procedures rely on the assumption of non-informative censoring, a special case of coarsening at random that may produce biased results if not valid. Our goal is to perform inference for estimated survival functions across a small number of …


Effectively Combining Independent 2 X 2 Tables For Valid Inferences In Meta Analysis With All Available Data But No Artificial Continuity Corrections For Studies With Zero Events And Its Application To The Analysis Of Rosiglitazone's Cardiovascular Disease Related Event Data, Lu Tian, Tianxi Cai, Nikita Piankov, Pierre-Yves Cremieux, L. J. Wei Aug 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Biomarker Discovery Using Targeted Maximum Likelihood Estimation: Application To The Treatment Of Antiretroviral Resistant HIV Infection, Oliver Bembom, Maya L. Petersen, Soo-Yon Rhee, W. Jeffrey Fessel, Sandra E. Sinisi, Robert W. Shafer, Mark J. Van Der Laan Aug 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

Researchers in clinical science and bioinformatics frequently aim to learn which of a set of candidate biomarkers is important in determining a given outcome, and to rank the contributions of the candidates accordingly. This article introduces a new approach to research questions of this type, based on targeted maximum likelihood estimation of variable importance measures.

The methodology is illustrated using an example drawn from the treatment of HIV infection. Specifically, given a list of candidate mutations in the protease enzyme of HIV, we aim to discover mutations that reduce clinical virologic response to antiretroviral regimens containing the protease inhibitor lopinavir. …


An Example Of How To Write The Statistical Section Of A Bioequivalence Study Protocol For FDA Review, William F. Mccarthy Jul 2007

COBRA Preprint Series

This paper provides a detailed example of how one should write the statistical section of a bioequivalence study protocol for FDA review. Three forms of bioequivalence are covered: average bioequivalence (ABE), population bioequivalence (PBE) and individual bioequivalence (IBE). The method of analysis is based on Jones and Kenward (2003) and a modification of their SAS Macro is provided.
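For average bioequivalence, the conventional FDA criterion is that the 90% confidence interval for the test/reference geometric mean ratio fall entirely within 0.80 to 1.25 on the original scale. A minimal sketch of that decision rule (this is not the Jones and Kenward SAS macro; the numeric inputs are hypothetical):

```python
import math

def abe_decision(log_ratio_mean, se, t_crit):
    """Two one-sided tests (TOST) decision for average bioequivalence:
    declare ABE if the 90% CI for the test/reference geometric mean
    ratio lies entirely within the conventional 0.80-1.25 limits."""
    lo = math.exp(log_ratio_mean - t_crit * se)
    hi = math.exp(log_ratio_mean + t_crit * se)
    return (lo, hi), (lo >= 0.80 and hi <= 1.25)

# Hypothetical inputs: estimated mean log(T/R) difference, its standard
# error, and the critical value t_{0.95, df} from the ANOVA residual df.
(ci_lo, ci_hi), bioequivalent = abe_decision(0.02, 0.05, 1.70)
```

The same CI, compared against wider aggregate limits, underlies the population and individual bioequivalence criteria, which additionally involve variance components.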


Variable Selection For Nonparametric Varying-Coefficient Models For Analysis Of Repeated Measurements, Lifeng Wang, Hongzhe Li Jul 2007

UPenn Biostatistics Working Papers

Nonparametric varying-coefficient models are commonly used for analysis of data measured repeatedly over time, including longitudinal and functional response data. While many procedures have been developed for estimating the varying-coefficients, the problem of variable selection for such models has not been addressed. In this article, we present a regularized estimation procedure for variable selection for such nonparametric varying-coefficient models using basis function approximations and a group smoothly clipped absolute deviation penalty (gSCAD). This gSCAD procedure simultaneously selects significant variables with time-varying effects and estimates unknown smooth functions using basis function approximations. With appropriate selection of the tuning parameters, we have …


Assessment Of Sample Size And Power For The Analysis Of Clustered Matched-Pair Data, William F. Mccarthy Jul 2007

COBRA Preprint Series

This paper outlines how one can determine the sample size or power of a study design that is based on clustered matched-pair data. Detailed examples are provided.
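The paper's own derivation is only summarized here; a common heuristic for such designs inflates a sample size computed under independence by the design effect DEFF = 1 + (m - 1) * ICC, where m is the mean cluster size and ICC the within-cluster correlation. A sketch under that assumption (not necessarily the paper's exact method; inputs are hypothetical):

```python
import math

def design_effect(mean_cluster_size, icc):
    """DEFF = 1 + (m - 1) * ICC: the variance inflation that clustering
    induces relative to independent observations."""
    return 1.0 + (mean_cluster_size - 1.0) * icc

def clustered_sample_size(n_independent_pairs, mean_cluster_size, icc):
    """Inflate a matched-pair sample size computed under independence
    by the design effect, rounding up to a whole number of pairs."""
    return math.ceil(n_independent_pairs * design_effect(mean_cluster_size, icc))

# Hypothetical design: 100 pairs suffice under independence; clusters of
# 5 pairs with within-cluster correlation 0.25 give DEFF = 2, so 200 pairs.
pairs_needed = clustered_sample_size(100, 5, 0.25)
```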


Adjustment To The McNemar’s Test For The Analysis Of Clustered Matched-Pair Data, William F. Mccarthy Jul 2007

COBRA Preprint Series

This paper presents how one can adjust McNemar’s test for the analysis of clustered matched-pair data. A McNemar’s-like table for K clusters of matched-pair data is used.
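For reference, the classic unadjusted statistic that the paper modifies is McNemar's chi-square (b - c)^2 / (b + c) on the discordant counts of a 2x2 matched-pair table; the clustered variance adjustment itself is the paper's contribution and is not reproduced here. A sketch with hypothetical per-cluster counts:

```python
def mcnemar_statistic(b, c):
    """Classic (unadjusted) McNemar chi-square statistic on the discordant
    cell counts b and c of a 2x2 matched-pair table; referred to chi^2_1."""
    return (b - c) ** 2 / (b + c)

# Hypothetical (b_k, c_k) discordant counts for K = 3 clusters. Naively
# pooling them ignores within-cluster correlation; the paper's adjustment
# would instead inflate the variance to account for it.
clusters = [(5, 2), (3, 4), (6, 1)]
b = sum(bk for bk, _ in clusters)   # 14
c = sum(ck for _, ck in clusters)   # 7
stat = mcnemar_statistic(b, c)      # (14 - 7)**2 / 21 = 49/21
```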


The Existence Of Maximum Likelihood Estimates For The Binary Response Logistic Regression Model, William F. Mccarthy Jul 2007

COBRA Preprint Series

The existence of maximum likelihood estimates for the binary response logistic regression model depends on the configuration of the data points in the data set. There are three mutually exclusive and exhaustive categories for the configuration of data points in a data set: Complete Separation, Quasi-Complete Separation, and Overlap. For this paper, a binary response logistic regression model is considered. A 2 x 2 tabular presentation of the data set to be modeled is provided for each of the three categories mentioned above. In addition, the paper will present an example of a data set whose data points have a …


Lachenbruch’s Method For Determining The Sample Size Required For Testing Interactions: How It Compares To nQuery Advisor And O’Brien’s SAS UnifyPow., William F. Mccarthy Jul 2007

COBRA Preprint Series

Lachenbruch (1988) proposed a simple method based on the use of orthogonal contrasts to determine the sample size or power for testing main effects and interactions, which uses the normal distribution instead of the non-central F distribution. This method can be used for factorial designs of various sizes. The example illustrated in this paper considers a 2 x 2 factorial design. This paper will determine both sample size and power of a particular study design with anticipated (assumed) means for each cell of the 2 x 2 factorial design. Lachenbruch’s method will be compared to nQuery Advisor 6.0 (2005) and …


The Assessment Of The Degree Of Concordance Between The Observed Values And The Predicted Values Of A Mixed-Effect Model Using “Method Of Comparison” Techniques, William F. Mccarthy, Nan Guo Jul 2007

COBRA Preprint Series

In this paper, we present a methodology for determining the degree of concordance between observed and model-based predicted values of a mixed-effect model. In particular, we will compare the degree to which observed and model-based predicted values agree by using ‘method of comparison’ techniques. We will also present the results of the concordance correlation coefficient (CCC).
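The concordance correlation coefficient referred to above is, in Lin's standard form, 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2). A minimal sketch with hypothetical observed and model-predicted values (illustrative only; the paper's mixed-effect application is not reproduced):

```python
def ccc(x, y):
    """Lin's concordance correlation coefficient for paired data:
    2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2), with 1/n moments."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / n
    syy = sum((yi - my) ** 2 for yi in y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

# Hypothetical observed values and model-based predictions: a constant
# shift leaves the Pearson correlation at 1 but lowers concordance.
observed = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
ccc_perfect = ccc(observed, observed)  # 1.0
ccc_shifted = ccc(observed, shifted)   # 2*1.25 / (1.25 + 1.25 + 1) = 5/7
```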


Reporting And Interpretation In Genome-Wide Association Studies, Jon Wakefield Jul 2007

UW Biostatistics Working Paper Series

In the context of genome-wide association studies we critique a number of methods that have been suggested for flagging associations for further investigation. The p-value is by far the most commonly used measure, but requires careful calibration when the a priori probability of an association is small, and discards information by not considering the power associated with each test. The q-value is a frequentist method by which the false discovery rate (FDR) may be controlled. We advocate the use of the Bayes factor as a summary of the information in the data with respect to the comparison of the null …
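As background for the q-value mentioned above, one common construction is the Benjamini-Hochberg step-up q-value, a backward cumulative minimum over the sorted p-values. This sketch is purely illustrative of the frequentist FDR machinery the abstract contrasts with, not of the Bayes-factor approach it advocates:

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg q-values: q_(i) = min over j >= i of m*p_(j)/j,
    computed by a backward cumulative minimum over the sorted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min                # q-value in the input's order
    return q

# Hypothetical p-values from four tests.
qs = bh_qvalues([0.001, 0.01, 0.03, 0.8])
```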


Assessment Of A CGH-Based Genetic Instability, David A. Engler, Yiping Shen, J F. Gusella, Rebecca A. Betensky Jul 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Survival Analysis With Large Dimensional Covariates: An Application In Microarray Studies, David A. Engler, Yi Li Jul 2007

Harvard University Biostatistics Working Paper Series

Use of microarray technology often leads to high-dimensional and low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time …


Identifying Patients Who Need Additional Biomarkers For Better Prediction Of Health Outcome Or Diagnosis Of Clinical Phenotype, Lu Tian, Tianxi Cai, L. J. Wei Jun 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Identifying Effect Modifiers In Air Pollution Time-Series Studies Using A Two-Stage Analysis, Sandrah P. Eckel, Thomas A. Louis Jun 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

Studies of the health effects of air pollution such as the National Morbidity and Mortality Air Pollution Study (NMMAPS) relate changes in daily pollution to daily deaths in a sample of cities and calendar years. Generally, city-specific estimates are combined into regional and national estimates using two-stage models. Our two-stage analysis identifies effect modifiers of the relation between single-day lagged PM10 and daily mortality in people age 65 and older from the 50 largest NMMAPS cities. We build on the standard approach by "fractionating" city-specific analyses to produce month-year-city specific estimated air pollution effects (slopes) in Stage I. In Stage …


The Analysis Of Pixel Intensity (Myocardial Signal Density) Data: The Quantification Of Myocardial Perfusion By Imaging Methods., William F. Mccarthy, Douglas R. Thompson May 2007

COBRA Preprint Series

This paper described a number of important issues in the analysis of pixel intensity data, as well as approaches for dealing with them. We particularly emphasized the issue of clustering, which may be ubiquitous in studies of pixel intensity data. Clustering can take many forms, e.g., measurements of different sections of a heart or repeated measurements of the same research participant. Clustering typically has the effect of increasing variance estimates. When one fails to account for clustering, variance estimates may be unrealistically small, resulting in spurious significance. We illustrated several possible approaches to account for clustering, including adjusting standard errors …


Estimating The Effect Of Vigorous Physical Activity On Mortality In The Elderly Based On Realistic Individualized Treatment And Intention-To-Treat Rules, Oliver Bembom, Mark J. Van Der Laan May 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

The effect of vigorous physical activity on mortality in the elderly is difficult to estimate using conventional approaches to causal inference that define this effect by comparing the mortality risks corresponding to hypothetical scenarios in which all subjects in the target population engage in a given level of vigorous physical activity. A causal effect defined on the basis of such a static treatment intervention can only be identified from observed data if all subjects in the target population have a positive probability of selecting each of the candidate treatment options, an assumption that is highly unrealistic in this case since …


Analyzing Sequentially Randomized Trials Based On Causal Effect Models For Realistic Individualized Treatment Rules, Oliver Bembom, Mark J. Van Der Laan May 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper, we argue that causal effect models for realistic individualized treatment rules represent an attractive tool for analyzing sequentially randomized trials. Unlike a number of methods proposed previously, this approach does not rely on the assumption that intermediate outcomes are discrete or that models for the distributions of these intermediate outcomes given the observed past are correctly specified. In addition, it generalizes the methodology for performing pairwise comparisons between individualized treatment rules by allowing the user to posit a marginal structural model for all candidate treatment rules simultaneously. If only a small number of candidate treatment rules are …