Articles 1 - 30 of 32
Full-Text Articles in Statistical Methodology
Pragmatic Estimation Of A Spatio-Temporal Air Quality Model With Irregular Monitoring Data, Paul D. Sampson, Adam A. Szpiro, Lianne Sheppard, Johan Lindström, Joel D. Kaufman
UW Biostatistics Working Paper Series
Statistical analyses of the health effects of air pollution have increasingly used GIS-based covariates for prediction of ambient air quality in “land-use” regression models. More recently these regression models have accounted for spatial correlation structure in combining monitoring data with land-use covariates. The current paper builds on these concepts to address spatio-temporal prediction of ambient concentrations of particulate matter with aerodynamic diameter less than 2.5 μm (PM2.5) on the basis of a model representing spatially varying seasonal trends and spatial correlation structures. Our hierarchical methodology provides a pragmatic approach that fully exploits regulatory and other supplemental monitoring data which jointly …
On The Behaviour Of Marginal And Conditional Akaike Information Criteria In Linear Mixed Models, Sonja Greven, Thomas Kneib
Johns Hopkins University, Dept. of Biostatistics Working Papers
In linear mixed models, model selection frequently includes the selection of random effects. Two versions of the Akaike information criterion (AIC) have been used, based either on the marginal or on the conditional distribution. We show that the marginal AIC is no longer an asymptotically unbiased estimator of the Akaike information, and in fact favours smaller models without random effects. For the conditional AIC, we show that ignoring estimation uncertainty in the random effects covariance matrix, as is common practice, induces a bias that leads to the selection of any random effect not predicted to be exactly zero. We derive …
Survival Analysis With Error-Prone Time-Varying Covariates: A Risk Set Calibration Approach, Xiaomei Liao, David M. Zucker, Yi Li, Donna Spiegelman
Harvard University Biostatistics Working Paper Series
No abstract provided.
A Statistical Framework For The Analysis Of ChIP-Seq Data, Pei Fen Kuan, Dongjun Chung, Guangjin Pan, James A. Thomson, Ron Stewart, Sunduz Keles
Sunduz Keles
Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing is decreasing, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide study of transcriptional regulation. Despite its increasing and well-deserved popularity, there is little work that investigates and accounts for sources of biases in the ChIP-Seq technology. These biases typically arise from both the standard pre-processing protocol and the underlying DNA sequence of the generated data.
We study data from a naked DNA sequencing experiment, which sequences non-cross-linked DNA after deproteinizing and …
A New Class Of Minimum Power Divergence Estimators With Applications To Cancer Surveillance, Nirian Martin, Yi Li
Harvard University Biostatistics Working Paper Series
No abstract provided.
Quasi-Least Squares With Mixed Linear Correlation Structures, Jichun Xie, Justine Shults, Jon Peet, Dwight Stambolian, Mary F. Cotch
UPenn Biostatistics Working Papers
Quasi-least squares (QLS) is a two-stage computational approach for estimation of the correlation parameters in the framework of generalized estimating equations (GEE). We prove two general results for the class of mixed linear correlation structures: namely, that the stage one QLS estimate of the correlation parameter always exists and is feasible (yields a positive definite estimated correlation matrix) for any correlation structure, while the stage two estimator exists and is unique (and therefore consistent) with probability one, for the class of mixed linear correlation structures. Our general results justify the implementation of QLS for particular members of the class of …
Readings In Targeted Maximum Likelihood Estimation, Mark J. Van Der Laan, Sherri Rose, Susan Gruber
U.C. Berkeley Division of Biostatistics Working Paper Series
This is a compilation of current and past work on targeted maximum likelihood estimation. It features the original targeted maximum likelihood learning paper as well as chapters on super (machine) learning using cross validation, randomized controlled trials, realistic individualized treatment rules in observational studies, biomarker discovery, case-control studies, and time-to-event outcomes with censored data, among others. We hope this collection is helpful to the interested reader and stimulates additional research in this important area.
Causal Inference For Nested Case-Control Studies Using Targeted Maximum Likelihood Estimation, Sherri Rose, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
A nested case-control study is conducted within a well-defined cohort arising out of a population of interest. This design is often used in epidemiology to reduce the costs associated with collecting data on the full cohort; however, the case control sample within the cohort is a biased sample. Methods for analyzing case-control studies have largely focused on logistic regression models that provide conditional and not marginal causal estimates of the odds ratio. We previously developed a Case-Control Weighted Targeted Maximum Likelihood Estimation (TMLE) procedure for case-control study designs, which relies on the prevalence probability q0. We propose the use of …
Targeted Maximum Likelihood Estimation: A Gentle Introduction, Susan Gruber, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
This paper provides a concise introduction to targeted maximum likelihood estimation (TMLE) of causal effect parameters. The interested analyst should gain sufficient understanding of TMLE from this introductory tutorial to be able to apply the method in practice. A program written in R is provided. This program implements a basic version of TMLE that can be used to estimate the effect of a binary point treatment on a continuous or binary outcome.
Comparing Risk Scoring Systems Beyond The ROC Paradigm In Survival Analysis, Hajime Uno, Lu Tian, Tianxi Cai, Isaac S. Kohane, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Combinatorial Mixtures Of Multiparameter Distributions, Valeria Edefonti, Giovanni Parmigiani
Johns Hopkins University, Dept. of Biostatistics Working Papers
We introduce combinatorial mixtures, a flexible class of models for inference on mixture distributions whose components have multidimensional parameters. The key idea is to allow each element of the component-specific parameter vectors to be shared by a subset of other components. This approach allows for mixtures that range from very flexible to very parsimonious, and unifies inference on component-specific parameters with inference on the number of components. We develop Bayesian inference and computation approaches for this class of distributions, and illustrate them in an application. This work was originally motivated by the analysis of cancer subtypes: in terms of …
Shrinkage Estimation Of Expression Fold Change As An Alternative To Testing Hypotheses Of Equivalent Expression, Zahra Montazeri, Corey M. Yanofsky, David R. Bickel
COBRA Preprint Series
Research on analyzing microarray data has focused on the problem of identifying differentially expressed genes to the neglect of the problem of how to integrate evidence that a gene is differentially expressed with information on the extent of its differential expression. Consequently, researchers currently prioritize genes for further study either on the basis of volcano plots or, more commonly, according to simple estimates of the fold change after filtering the genes with an arbitrary statistical significance threshold. While the subjective and informal nature of the former practice precludes quantification of its reliability, the latter practice is equivalent to using a …
The Effect Of Correlation In False Discovery Rate Estimation, Armin Schwartzman, Xihong Lin
Harvard University Biostatistics Working Paper Series
No abstract provided.
Spatial Cluster Detection For Repeatedly Measured Outcomes While Accounting For Residential History, Andrea J. Cook, Diane Gold, Yi Li
Harvard University Biostatistics Working Paper Series
No abstract provided.
Marginalized Frailty Models For Multivariate Survival Data, Megan Othus, Yi Li
Harvard University Biostatistics Working Paper Series
No abstract provided.
Spatial Cluster Detection For Weighted Outcomes Using Cumulative Geographic Residuals, Andrea J. Cook, Yi Li, David Arterburn, Ram C. Tiwari
Harvard University Biostatistics Working Paper Series
No abstract provided.
On The C-Statistics For Evaluating Overall Adequacy Of Risk Prediction Procedures With Censored Survival Data, Hajime Uno, Tianxi Cai, Michael J. Pencina, Ralph B. D'Agostino, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Estimating Subject-Specific Dependent Competing Risk Profile With Censored Event Time Observations, Yi Li, Lu Tian, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Resampling-Based Multiple Hypothesis Testing With Applications To Genomics: New Developments In The R/Bioconductor Package Multtest, Houston N. Gilbert, Katherine S. Pollard, Mark J. Van Der Laan, Sandrine Dudoit
U.C. Berkeley Division of Biostatistics Working Paper Series
The multtest package is a standard Bioconductor package containing a suite of functions useful for executing, summarizing, and displaying the results from a wide variety of multiple testing procedures (MTPs). In addition to many popular MTPs, the central methodological focus of the multtest package is the implementation of powerful joint multiple testing procedures. Joint MTPs are able to account for the dependencies between test statistics by effectively making use of (estimates of) the test statistics joint null distribution. To this end, two additional bootstrap-based estimates of the test statistics joint null distribution have been developed for use in the …
A Class Of Semiparametric Mixture Cure Survival Models With Dependent Censoring, Megan Othus, Yi Li, Ram C. Tiwari
Harvard University Biostatistics Working Paper Series
No abstract provided.
Collaborative Targeted Maximum Likelihood Estimation, Mark J. Van Der Laan, Susan Gruber
U.C. Berkeley Division of Biostatistics Working Paper Series
Collaborative double robust targeted maximum likelihood estimators represent a fundamental further advance over standard targeted maximum likelihood estimators of causal inference and variable importance parameters. The targeted maximum likelihood approach involves fluctuating an initial density estimate, (Q), in order to make a bias/variance tradeoff targeted towards a specific parameter in a semi-parametric model. The fluctuation involves estimation of a nuisance parameter portion of the likelihood, g. TMLE and other double robust estimators have been shown to be consistent and asymptotically normally distributed (CAN) under regularity conditions, when either one of these two factors of the likelihood of the data is …
Joint Multiple Testing Procedures For Graphical Model Selection With Applications To Biological Networks, Houston N. Gilbert, Mark J. Van Der Laan, Sandrine Dudoit
U.C. Berkeley Division of Biostatistics Working Paper Series
Gaussian graphical models have become popular tools for identifying relationships between genes when analyzing microarray expression data. In the classical undirected Gaussian graphical model setting, conditional independence relationships can be inferred from partial correlations obtained from the concentration matrix (= inverse covariance matrix) when the sample size n exceeds the number of parameters p which need to be estimated. In situations where n < p, another approach to graphical model estimation may rely on calculating unconditional (zero-order) and first-order partial correlations. In these settings, the goal is to identify a lower-order conditional independence graph, sometimes referred to as a ‘0-1 graph’. For either choice of graph, model selection may involve a multiple testing problem, in which edges in a graph are drawn only after rejecting hypotheses involving (saturated or lower-order) partial correlation parameters. Most multiple testing procedures applied in previously proposed graphical model selection algorithms rely on standard, marginal testing methods which do not take into account the joint distribution of the test statistics derived from (partial) correlations. We propose and implement a multiple testing framework useful when testing for edge inclusion during graphical model selection. Two features of our methodology include (i) a computationally efficient and asymptotically valid test statistics joint null distribution derived from influence curves for correlation-based parameters, and (ii) the application of empirical Bayes joint multiple testing procedures which can effectively control a variety of popular Type I error rates by incorporating joint null distributions such as those described here (Dudoit and van der Laan, 2008).
Using a dataset from Arabidopsis thaliana, we observe that the use of more sophisticated, modular approaches to multiple testing allows one to identify greater numbers of edges when approximating an undirected graphical model using a 0-1 graph. Our framework may also be extended to edge testing algorithms for other types of graphical models (e.g., for classical undirected, bidirected, and directed acyclic graphs).
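The two kinds of partial correlation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure (which concerns multiple testing and is not reproduced here); it only computes full-order partial correlations from the inverse covariance matrix and a single first-order partial correlation from pairwise correlations. The function names `partial_correlations` and `first_order_partial` are hypothetical, chosen for this illustration.

```python
import numpy as np

def partial_correlations(cov):
    """Full-order partial correlations from a covariance matrix.

    Uses the standard identity rho_ij|rest = -omega_ij / sqrt(omega_ii * omega_jj),
    where omega is the concentration (inverse covariance) matrix.
    Requires an invertible covariance matrix.
    """
    omega = np.linalg.inv(np.asarray(cov, dtype=float))
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def first_order_partial(r_ij, r_ik, r_jk):
    """First-order partial correlation of i and j given a single variable k."""
    return (r_ij - r_ik * r_jk) / np.sqrt((1.0 - r_ik**2) * (1.0 - r_jk**2))

# Illustrative chain structure X1 - X2 - X3 with correlation 0.5 between neighbours.
rho = 0.5
corr = np.array([[rho ** abs(i - j) for j in range(3)] for i in range(3)])
pcor = partial_correlations(corr)
```

In this three-variable example the full-order and first-order quantities coincide, and the partial correlation between X1 and X3 given X2 is zero, which corresponds to a missing edge in the conditional independence graph.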
The Importance Of Scale For Spatial-Confounding Bias And Precision Of Spatial Regression Estimators, Christopher J. Paciorek
Harvard University Biostatistics Working Paper Series
Increasingly, regression models are used when residuals are spatially correlated. Prominent examples include studies in environmental epidemiology to understand the chronic health effects of pollutants. I consider the effects of residual spatial structure on the bias and precision of regression coefficients, developing a simple framework in which to understand the key issues and derive informative analytic results. When the spatial residual is induced by an unmeasured confounder, regression models with spatial random effects and closely-related models such as kriging and penalized splines are biased, even when the residual variance components are known. Analytic and simulation results show how the bias …
Analysis Of Randomized Comparative Clinical Trial Data For Personalized Treatment Selections, Tianxi Cai, Lu Tian, Peggy H. Wong, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Correlated Binary Regression Using Orthogonalized Residuals, Richard C. Zink, Bahjat F. Qaqish
COBRA Preprint Series
This paper focuses on marginal regression models for correlated binary responses when estimation of the association structure is of primary interest. A new estimating function approach based on orthogonalized residuals is proposed. This procedure allows a new representation and addresses some of the difficulties of the conditional-residual formulation of alternating logistic regressions of Carey, Zeger & Diggle (1993). The new method is illustrated with an analysis of data on impaired pulmonary function.
Group Comparison Of Eigenvalues And Eigenvectors Of Diffusion Tensors, Armin Schwartzman, Robert F. Dougherty, Jonathan E. Taylor
Harvard University Biostatistics Working Paper Series
No abstract provided.
Validation Of Differential Gene Expression Algorithms: Application Comparing Fold Change Estimation To Hypothesis Testing, David R. Bickel, Corey M. Yanofsky
COBRA Preprint Series
Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms. The widespread confusion on which method to use in practice has been exacerbated by the finding that simply ranking genes by their fold changes sometimes outperforms popular statistical tests.
Algorithms may be compared by quantifying each method's error in predicting expression ratios, whether such ratios are defined across microarray channels or between two independent groups. For …
Measures To Summarize And Compare The Predictive Capacity Of Markers, Wen Gu, Margaret Pepe
UW Biostatistics Working Paper Series
The predictive capacity of a marker in a population can be described using the population distribution of risk (Huang et al., 2007; Pepe et al., 2008a; Stern, 2008). Virtually all standard statistical summaries of predictability and discrimination can be derived from it (Gail and Pfeiffer, 2005). The goal of this paper is to develop methods for making inference about risk prediction markers using summary measures derived from the risk distribution. We describe some new clinically motivated summary measures and give new interpretations to some existing statistical measures. Methods for estimating these summary measures are described along with distribution theory that …
Weighting And Prediction In Sample Surveys, Rod Little
The University of Michigan Department of Biostatistics Working Paper Series
A fundamental technique in survey sampling is to weight included units by the inverse of their probability of inclusion, which may be known (as in the case of sampling weights) or estimated (as in the case of nonresponse weights). The technique is closely associated with the design-based approach to survey inference, with the idea that units in the sample are representing a certain number of units in the population. I discuss weighting from a modeling perspective. Some common misconceptions of weighting will be addressed, including the idea that modelers can ignore the sampling weights, or that weighting necessarily reduces bias …
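As a rough illustration of the inverse-probability weighting described in this abstract, the sketch below weights each sampled unit by the inverse of its inclusion probability. It is a generic textbook construction (a Horvitz-Thompson total and a weighted, Hájek-style mean), not the modeling approach the paper goes on to discuss; the function names and the toy data are assumptions made for this example.

```python
import numpy as np

def _check_probs(incl_prob):
    """Validate inclusion probabilities: each must lie in (0, 1]."""
    p = np.asarray(incl_prob, dtype=float)
    if np.any((p <= 0) | (p > 1)):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return p

def ht_total(y, incl_prob):
    """Horvitz-Thompson estimate of a population total: sum of y_i / p_i."""
    p = _check_probs(incl_prob)
    return float(np.sum(np.asarray(y, dtype=float) / p))

def weighted_mean(y, incl_prob):
    """Weighted (Hajek-style) mean: sum(w_i * y_i) / sum(w_i), with w_i = 1 / p_i."""
    w = 1.0 / _check_probs(incl_prob)
    return float(np.sum(w * np.asarray(y, dtype=float)) / np.sum(w))

# Toy sample: a unit sampled with probability 0.25 "represents" 4 population units.
y = [10.0, 20.0, 30.0]
p = [0.5, 0.5, 0.25]
total = ht_total(y, p)      # 10/0.5 + 20/0.5 + 30/0.25 = 180.0
mean = weighted_mean(y, p)  # 180 / (2 + 2 + 4) = 22.5
```

The weights 2, 2, and 4 reflect the idea quoted above that each sampled unit stands in for a certain number of units in the population.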
Multilevel Functional Principal Component Analysis, Chong-Zhi Di, Ciprian M. Crainiceanu, Brian S. Caffo, Naresh M. Punjabi
Chongzhi Di
The Sleep Heart Health Study (SHHS) is a comprehensive landmark study of sleep and its impacts on health outcomes. A primary metric of the SHHS is the in-home polysomnogram, which includes two electroencephalographic (EEG) channels for each subject, at two visits. The volume and importance of these data present enormous challenges for analysis. To address these challenges, we introduce multilevel functional principal component analysis (MFPCA), a novel statistical methodology designed to extract core intra- and inter-subject geometric components of multilevel functional data. Though motivated by the SHHS, the proposed methodology is generally applicable, with potential relevance to many modern scientific …