Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Biostatistics

2009

Articles 1 - 30 of 64

Full-Text Articles in Physical Sciences and Mathematics

Is Survival The Only Or Even The Right Outcome For Evaluating Treatments For Out-Of-Hospital Cardiac Arrest? A Proposed Test Based On Both An Intermediate And Ultimate Outcome., Al Hallstrom Nov 2009

UW Biostatistics Working Paper Series

It is generally agreed that the goal of resuscitation is survival with neurological and physiological status similar to that preceding the cardiac arrest. Previously I have argued that the lack of improvement in outcome from resuscitation over the past 3 to 4 decades, as compared to the substantial progress made in treatment of ischemic heart disease, is a consequence of the absence of randomized clinical trials of new interventions and the use of intermediate endpoints such as return of spontaneous circulation or admittance to hospital. Proponents of these intermediate endpoints have argued that those involved in the resuscitation have no …


A Statistical Framework For The Analysis Of ChIP-Seq Data, Pei Fen Kuan, Dongjun Chung, Guangjin Pan, James A. Thomson, Ron Stewart, Sunduz Keles Nov 2009

Sunduz Keles

Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing is decreasing, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide study of transcriptional regulation. Despite its increasing and well-deserved popularity, there is little work that investigates and accounts for sources of bias in the ChIP-Seq technology. These biases typically arise from both the standard pre-processing protocol and the underlying DNA sequence of the generated data.

We study data from a naked DNA sequencing experiment, which sequences non-cross-linked DNA after deproteinizing and …


Nonlinear Models In Multivariate Population Bioequivalence Testing, Bassam Dahman Nov 2009

Theses and Dissertations

In this dissertation a methodology is proposed for simultaneously evaluating the population bioequivalence (PBE) of a generic drug to a pre-licensed drug, or the bioequivalence of two formulations of a drug, using multiple correlated pharmacokinetic metrics. The univariate criterion that is accepted by the Food and Drug Administration (FDA) for testing population bioequivalence is generalized. Very few approaches for testing multivariate extensions of PBE have appeared in the literature. One method uses the trace of the covariance matrix as a measure of total variability, and another uses a pooled variance instead of the reference variance. The former ignores the correlation …


Two-Stage Decompositions For The Analysis Of Functional Connectivity For fMRI With Application To Alzheimer's Disease Risk, Brian S. Caffo, Ciprian M. Crainiceanu, Guillermo Verduzco, Stewart H. Mostofsky, Susan Spear-Bassett, James J. Pekar Nov 2009

COBRA Preprint Series

Functional connectivity is the study of correlations in measured neurophysiological signals. Altered functional connectivity has been shown to be associated with numerous diseases including Alzheimer's disease and mild cognitive impairment. In this manuscript we use a two-stage application of the singular value decomposition to obtain data driven population-level measures of functional connectivity in functional magnetic resonance imaging (fMRI). The method is computationally simple and amenable to high dimensional fMRI data with large numbers of subjects. Simulation studies suggest the ability of the decomposition methods to recover population brain networks and their associated loadings. We further demonstrate the utility of these …


Joint Mixed-Effects Models For Longitudinal Data Analysis: An Application For The Metabolic Syndrome, John Thorp III Nov 2009

Theses and Dissertations

Mixed-effects models are commonly used to model longitudinal data as they can appropriately account for within- and between-subject sources of variability. Univariate mixed-effects modeling strategies are well developed for a single outcome (response) variable that may be continuous (e.g. Gaussian) or categorical (e.g. binary, Poisson) in nature. Only recently have extensions been discussed for jointly modeling multiple outcome variables measured longitudinally. Many disease processes are a function of several factors that are correlated. For example, the metabolic syndrome, a constellation of cardiovascular risk factors associated with an increased risk of cardiovascular disease and type 2 diabetes, is often …


Bayesian Functional Data Analysis Using WinBUGS, Ciprian M. Crainiceanu, A. Jeffrey Goldsmith Nov 2009

Johns Hopkins University, Dept. of Biostatistics Working Papers

We provide user-friendly software for Bayesian analysis of functional data models using WinBUGS 1.4. The excellent properties of Bayesian analysis in this context are due to: 1) dimensionality reduction, which leads to low dimensional projection bases; 2) the mixed model representation of functional models, which provides a modular approach to model extension; and 3) the orthogonality of the principal component bases, which contributes to excellent chain convergence and mixing properties. Our paper provides one more, essential, reason for using Bayesian analysis for functional models: the existence of software.


Analysis Of Subgroup Data In Clinical Trials, Kao-Tai Tsai, Karl E. Peace Nov 2009

Biostatistics Faculty Presentations

This conference abstract was published in the Proceedings of the Sixteenth Annual Biopharmaceutical Applied Statistics Symposium.


Analisis Data Riskesdas 2007/2008: Kontribusi Karakteristik Ibu Terhadap Status Imunisasi Anak Di Indonesia [Analysis Of Riskesdas 2007/2008 Data: The Contribution Of Maternal Characteristics To Child Immunization Status In Indonesia], Sutanto Priyo Hastono Oct 2009

Kesmas

Immunization coverage has been shown to significantly reduce morbidity and mortality from the diseases concerned, but in Indonesia coverage is relatively low. The aim of the study was to determine the relationship between maternal characteristics and child immunization status in Indonesia. The study used a cross-sectional design with a sample of children aged 1-2 years living in Indonesia. The secondary data source was the Ministry of Health (Depkes) Riskesdas survey for 2007/2008. The proportion of children aged 12-24 months who received complete immunization was 56.2% (95% CI: 55.1-57.3). The mother's education and the husband's education were found to be significantly associated with basic immunization status in children. The multilevel analysis found …


The EM Algorithm For Group Testing Regression Models Under Matrix Pooling, Christopher R. Bilder, Boan Zhang Oct 2009

Department of Statistics: Faculty Publications

No abstract provided.


Composite Likelihood EM Algorithm With Applications To Multivariate Hidden Markov Model, Xin Gao, Peter Xuekun Song Sep 2009

COBRA Preprint Series

The method of composite likelihood is useful for estimation and inference in parametric models with high-dimensional data, where the full likelihood approach leads to intractable computational complexity. We develop an extension of the EM algorithm in the framework of composite likelihood estimation in the presence of missing data or latent variables. We establish three key theoretical properties of the composite likelihood EM (CLEM) algorithm: the ascent property, the algorithmic convergence and the convergence rate. The proposed method is applied to estimate the transition probabilities in a multivariate hidden Markov model. Simulation studies are presented to demonstrate the empirical …


Readings In Targeted Maximum Likelihood Estimation, Mark J. Van Der Laan, Sherri Rose, Susan Gruber Sep 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

This is a compilation of current and past work on targeted maximum likelihood estimation. It features the original targeted maximum likelihood learning paper as well as chapters on super (machine) learning using cross validation, randomized controlled trials, realistic individualized treatment rules in observational studies, biomarker discovery, case-control studies, and time-to-event outcomes with censored data, among others. We hope this collection is helpful to the interested reader and stimulates additional research in this important area.


Robustness Of Semiparametric Efficiency In Nearly-Correct Models For Two-Phase Samples, Thomas Lumley Sep 2009

UW Biostatistics Working Paper Series

Augmented inverse-probability weighted (AIPW) estimators for incomplete-data models typically do not have full semiparametric efficiency, but do have model-robustness properties not shared by the efficient estimator. We examine the performance of efficient and AIPW estimators when the complete-data model is nearly correctly specified, in the sense that the misspecification is not reliably detectable from the data by any possible diagnostic or test. Asymptotic results for these nearly true models are obtained by representing them as sequences of misspecified models that are mutually contiguous with a correctly specified model. For some least favorable direction of model misspecification the bias in the …


Causal Inference For Nested Case-Control Studies Using Targeted Maximum Likelihood Estimation, Sherri Rose, Mark J. Van Der Laan Sep 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

A nested case-control study is conducted within a well-defined cohort arising out of a population of interest. This design is often used in epidemiology to reduce the costs associated with collecting data on the full cohort; however, the case-control sample within the cohort is a biased sample. Methods for analyzing case-control studies have largely focused on logistic regression models that provide conditional and not marginal causal estimates of the odds ratio. We previously developed a Case-Control Weighted Targeted Maximum Likelihood Estimation (TMLE) procedure for case-control study designs, which relies on the prevalence probability q0. We propose the use of …


Integrative Analysis Of Cancer Genomic Data, Shuangge Ma Sep 2009

Shuangge Ma

In the past decade, we have witnessed a period of unparalleled development in the field of cancer genomics. To address the same or similar biomedical questions, multiple cancer genomic studies have been independently designed and conducted. Cancer gene signatures identified from analysis of individual datasets often have low reproducibility. A cost-effective way of improving reproducibility is to conduct integrative analysis of datasets from multiple studies with comparable designs. To properly integrate multiple studies and conduct integrative analysis, we need to access various public data warehouses, retrieve experiment protocols and raw data, evaluate individual studies and select those with comparable designs, …


Redefining CpG Islands Using A Hidden Markov Model, Hao Wu, Brian Caffo, Harris A. Jaffee, Andrew P. Feinberg, Rafael A. Irizarry Sep 2009

Johns Hopkins University, Dept. of Biostatistics Working Papers

The DNA of most vertebrates is depleted in CpG dinucleotides, that is, a C followed by a G in the 5’ to 3’ direction. CpGs are the target for DNA methylation, a chemical modification of cytosine (C) that is heritable during cell division and is the best-characterized epigenetic mechanism. The remaining CpGs tend to cluster in regions referred to as CpG islands (CGI). Knowing CGI locations is important because they mark functionally relevant epigenetic loci in development and disease. For various mammals, including human, a widely used list of CGI is readily available from the UCSC Genome Browser. This list was derived …
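
To make the notion of CpG depletion concrete, the following minimal Python sketch counts CpG dinucleotides in a sequence and computes an observed/expected ratio. The ratio used here (CpG count times sequence length, divided by the product of the C and G counts) is a commonly used convention and is an assumption of this sketch; it is not taken from the abstract, and it is not the hidden Markov model the paper proposes. The function name `cpg_stats` is hypothetical.

```python
def cpg_stats(seq):
    """Return (CpG count, observed/expected CpG ratio) for a DNA string.

    Illustrative only: the observed/expected ratio is a common convention
    assumed for this sketch, not the method described in the abstract.
    """
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    # A CpG is a C immediately followed by a G, read 5' to 3'.
    cpg_count = s.count("CG")
    # Expected CpG count if C and G occurred independently along the sequence.
    expected = (c_count * g_count / n) if n else 0.0
    ratio = (cpg_count / expected) if expected else 0.0
    return cpg_count, ratio


if __name__ == "__main__":
    count, ratio = cpg_stats("ACGCGT")
    print(count, round(ratio, 2))
```

A ratio well below 1 over a long stretch of sequence would be consistent with the depletion the abstract describes, while a CpG-rich region would have a ratio closer to or above 1.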


Deriving Optimal Composite Scores: Relating Observational/Longitudinal Data With A Primary Endpoint, Rhonda Ellis Sep 2009

Theses and Dissertations

In numerous clinical/experimental studies, multiple endpoints are measured on each subject. It is often not clear which of these endpoints should be designated as of primary importance. The desirability function approach is a way of combining multiple responses into a single unitless composite score. The response variables may include multiple types of data: binary, ordinal, count, and interval data. Each response variable is transformed to a 0 to 1 unitless scale, with zero representing a completely undesirable response and one representing the ideal value. In desirability function methodology, weights on individual components can be incorporated to allow different levels of importance to …
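
The transformation and weighting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the linear "larger is better" transform and the weighted geometric mean are one common choice of desirability function and combination rule, and the abstract does not confirm that the dissertation uses them. The function names, the bounds `low` and `high`, and the weights are hypothetical.

```python
import math


def desirability_larger_is_better(y, low, high):
    """Map a response y linearly onto [0, 1].

    Assumed form: 0 at or below `low` (completely undesirable),
    1 at or above `high` (ideal), linear in between.
    """
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (y - low) / (high - low)))


def composite_score(d_values, weights):
    """Combine individual desirabilities with a weighted geometric mean.

    Assumes every weight is positive. Any component with desirability 0
    forces the composite score to 0.
    """
    if len(d_values) != len(weights):
        raise ValueError("d_values and weights must have the same length")
    if any(d == 0.0 for d in d_values):
        return 0.0
    total = sum(weights)
    log_sum = sum(w * math.log(d) for d, w in zip(d_values, weights))
    return math.exp(log_sum / total)


if __name__ == "__main__":
    d1 = desirability_larger_is_better(5.0, 0.0, 10.0)  # midpoint of the range
    d2 = desirability_larger_is_better(12.0, 0.0, 10.0)  # above the ideal value
    print(d1, d2, composite_score([d1, d2], [1.0, 1.0]))
```

The geometric mean is used in this sketch because a single completely undesirable component then drives the whole score to zero; an arithmetic mean would let strong components mask it.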


Marginal Hazards Model For Multivariate Failure Time Data With Auxiliary Covariates, Zhaozhi Fan, Xiao-Feng Wang Sep 2009

Xiaofeng Wang

A marginal hazards model of multivariate failure times has been developed based on the ‘working independence’ assumption [L.J. Wei, D.Y. Lin, and L. Weissfeld, Regression analysis of multivariate incomplete failure time data by modeling marginal distributions, J. Amer. Statist. Assoc. 84 (1989), pp. 1065–1073.]. In this article, we study the marginal hazards model of multivariate failure times with continuous auxiliary covariates. We consider the case of common baseline hazards for subjects from the same clusters. We extend the kernel smoothing procedure of Zhou and Wang [H. Zhou and C.Y. Wang, Failure time regression with continuous covariates measured with error, J. …


Comparing Risk Scoring Systems Beyond The ROC Paradigm In Survival Analysis, Hajime Uno, Lu Tian, Tianxi Cai, Isaac S. Kohane, L. J. Wei Aug 2009

Harvard University Biostatistics Working Paper Series

No abstract provided.


Multiple Loci Within The Major Histocompatibility Complex Confer Risk Of Psoriasis, Bing-Jian Feng, Liang-Dan Sun, Razieh Soltani-Arabshahi, Anne M. Bowcock, Rajan P. Nair, Philip Stuart, James T. Elder, Steven J. Schrodi, Ann B. Begovich, Goncalo R. Abecasis, Xue-Jun Zhang, Kristina P. Callis Duffin, Gerald G. Krueger, David E. Goldgar Jul 2009

Steven J Schrodi

Psoriasis is a common inflammatory skin disease characterized by thickened scaly red plaques. Previously we have performed a genome-wide association study (GWAS) on psoriasis with 1,359 cases and 1,400 controls, which were genotyped for 447,249 SNPs. The most significant finding was for SNP rs12191877, which is in tight linkage disequilibrium with HLA-Cw*0602, the consensus risk allele for psoriasis. However, it is not known whether there are other psoriasis loci within the MHC in addition to HLA-C. In the present study, we searched for additional susceptibility loci within the human leukocyte antigen (HLA) region through in-depth analyses of the GWAS data; …


A Sequential Algorithm To Identify The Mixing Endpoints In Liquids In Pharmaceutical Applications, Akriti Saxena Jul 2009

Theses and Dissertations

The objective of this thesis is to develop a sequential algorithm to determine, accurately and quickly, the point in time at which a product is well mixed or reaches a steady-state plateau in terms of the refractive index (RI). An algorithm using sequential non-linear model fitting and prediction is proposed. A simulation study representing typical scenarios in a liquid manufacturing process in the pharmaceutical industry was performed to evaluate the proposed algorithm. The simulated data included autocorrelated normal errors and used the Gompertz model. A set of 27 different combinations of the parameters of the Gompertz function was considered. The results …


On Quality Control Measures In Genome-Wide Association Studies: A Test To Assess The Genotyping Quality Of Individual Probands In Family-Based Association Studies And An Application To The HapMap Data, David W. Fardo, Iuliana Ionita-Laza, Christoph Lange Jul 2009

Biostatistics Faculty Publications

Allele transmissions in pedigrees provide a natural way of evaluating the genotyping quality of a particular proband in a family-based, genome-wide association study. We propose a transmission test that is based on this feature and that can be used for quality control filtering of genome-wide genotype data for individual probands. The test has one degree of freedom and assesses the average genotyping error rate of the genotyped SNPs for a particular proband. As we show in simulation studies, the test is sufficiently powerful to identify probands with an unreliable genotyping quality that cannot be detected with standard quality control filters. …


Nonparametric Population Average Models: Deriving The Form Of Approximate Population Average Models Estimated Using Generalized Estimating Equations, Alan E. Hubbard, Mark J. Van Der Laan Jun 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

For estimating regressions for repeated measures outcome data, a popular choice is the population average models estimated by generalized estimating equations (GEE). We review in this report the derivation of the robust inference (sandwich-type estimator of the standard error). In addition, we present formally how the approximation of a misspecified working population average model relates to the true model and in turn how to interpret the results of such a misspecified model.


Marginalized Frailty Models For Multivariate Survival Data, Megan Othus, Yi Li Jun 2009

Harvard University Biostatistics Working Paper Series

No abstract provided.


"Implementation Of Quasi-Least Squares With The R Package qlspack", Jichun Xie, Justine Shults Jun 2009

UPenn Biostatistics Working Papers

Quasi-least squares (QLS) is an alternative method for estimating the correlation parameters within the framework of generalized estimating equations (GEE) that has two main advantages over the moment estimates that are typically applied for GEE: (1) It guarantees a consistent estimate of the correlation parameter and a positive definite estimated correlation matrix, for several correlation structures; and (2) It allows for easier implementation of some correlation structures that have not yet been implemented in the framework of GEE. Furthermore, because QLS is a method in the framework of GEE, existing software can be employed within the QLS algorithm for estimation …


Comparing Bootstrap And Jackknife Variance Estimation Methods For Area Under The ROC Curve Using One-Stage Cluster Survey Data, Allison Dunning Jun 2009

Theses and Dissertations

The purpose of this research is to examine the bootstrap and jackknife as methods for estimating the variance of the AUC from a study using a complex sampling design and to determine which characteristics of the sampling design affect this estimation. Data from a one-stage cluster sampling design of 10 clusters were examined. Factors included three true AUCs (.60, .75, and .90), three prevalence levels (50/50, 70/30, 90/10) (non-disease/disease), and three numbers of clusters sampled (2, 5, or 7). A simulated sample was constructed for each of the 27 combinations of AUC, prevalence and number of clusters. Estimates of …


Simple, Defensible Sample Sizes Based On Cost Efficiency -- With Discussion And Rejoinder, Peter Bacchetti, Charles E. McCulloch, Mark R. Segal, Richard Simon, Peter Muller, Gary L. Rosner, James A. Hanley, Stan Shapiro Jun 2009

COBRA Preprint Series

The conventional approach of choosing sample size to provide 80% or greater power ignores the cost implications of different sample size choices. Costs, however, are often impossible for investigators and funders to ignore in actual practice. Here, we propose and justify a new approach for choosing sample size based on cost efficiency, the ratio of a study’s projected scientific and/or practical value to its total cost. By showing that a study’s projected value exhibits diminishing marginal returns as a function of increasing sample size for a wide variety of definitions of study value, we are able to develop two simple …


Nonparametric And Semiparametric Estimation Of The Three Way Receiver Operating Characteristic Surface, Jialiang Li, Xiao-Hua Zhou Jun 2009

UW Biostatistics Working Paper Series

In many situations the diagnostic decision is not limited to a binary choice. Binary statistical tools such as the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) need to be expanded to address three-category classification problems. Previous authors have suggested various ways to model the extension of the AUC but not the ROC surface. Only simple parametric approaches have been proposed for modeling the ROC measure under the assumption that test results all follow normal distributions. We study estimation methods for three-dimensional ROC surfaces with nonparametric and semiparametric estimators. Asymptotic results are provided as a basis for …


Evaluating Markers For Treatment Selection Based On Survival Time, Xiao Song, Xiao-Hua Zhou Jun 2009

UW Biostatistics Working Paper Series

For many medical conditions several treatment options may be available for treating patients. We consider evaluating markers based on a simple treatment selection policy that incorporates information on the patient's marker value exceeding a threshold. For example, colon cancer patients may be treated by surgery alone or surgery plus chemotherapy. The c-myc gene expression level may be used as a biomarker for treatment selection. Although traditional regression methods may assess the effect of the marker and treatment on outcomes, it is appealing to quantify more directly the potential impact on the population of using the marker to select treatment. A …


A Tale Of Two Streets: Incorporating Grouping Structure In High Dimensional Data Mining, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


A Machine-Learning Algorithm For Estimating And Ranking The Impact Of Environmental Risk Factors In Exploratory Epidemiological Studies, Jessica G. Young, Alan E. Hubbard, B Eskenazi, Nicholas P. Jewell Jun 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

No abstract provided.