Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistical Models

Series

2006

Articles 1 - 21 of 21

Full-Text Articles in Physical Sciences and Mathematics

Gamma Shape Mixtures For Heavy-Tailed Distributions, Sergio Venturini, Francesca Dominici, Giovanni Parmigiani Dec 2006

Johns Hopkins University, Dept. of Biostatistics Working Papers

An important question in health services research is the estimation of the proportion of medical expenditures that exceed a given threshold. Typically, medical expenditures present highly skewed, heavy-tailed distributions, for which a) simple variable transformations are insufficient to achieve a tractable low-dimensional parametric form and b) nonparametric methods are not efficient in estimating exceedance probabilities for large thresholds. Motivated by this context, in this paper we propose a general Bayesian approach for the estimation of tail probabilities of heavy-tailed distributions, based on a mixture of gamma distributions in which the mixing occurs over the shape parameter. This family provides …


Spatio-Temporal Analysis Of Areal Data And Discovery Of Neighborhood Relationships In Conditionally Autoregressive Models, Subharup Guha, Louise Ryan Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Statistical Analysis Of Air Pollution Panel Studies: An Illustration, Holly Janes, Lianne Sheppard, Kristen Shepherd Oct 2006

UW Biostatistics Working Paper Series

The panel study design is commonly used to evaluate the short-term health effects of air pollution. Standard statistical methods for analyzing longitudinal data are available, but the literature reveals that the techniques are not well understood by practitioners. We illustrate these methods using data from the 1999 to 2002 Seattle panel study. Marginal, conditional, and transitional approaches for modeling longitudinal data are reviewed and contrasted with respect to their parameter interpretation and methods for accounting for correlation and dealing with missing data. We also discuss and illustrate techniques for controlling for time-dependent and time-independent confounding, and for exploring and summarizing …


Procedure Models, C. F. Bartley, W. W. Watson Oct 2006

Publications (YM)

This procedure establishes the responsibilities and process for documenting activities that constitute scientific investigation modeling. Planning requirements for conducting modeling are contained in LP-2.29Q-BSC, Planning for Science Activities.


Cox Models With Nonlinear Effect Of Covariates Measured With Error: A Case Study Of Chronic Kidney Disease Incidence, Ciprian M. Crainiceanu, David Ruppert, Josef Coresh Sep 2006

Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose, develop and implement the simulation extrapolation (SIMEX) methodology for Cox regression models when the log hazard function is linear in the model parameters but nonlinear in the variables measured with error (LPNE). The class of LPNE functions contains but is not limited to strata indicators, splines, quadratic and interaction terms. The first order bias correction method proposed here has the advantage that it remains computationally feasible even when the number of observations is very large and multiple models need to be explored. Theoretical and simulation results show that the SIMEX method outperforms the naive method even with small …


Spatial Cluster Detection For Censored Outcome Data, Andrea J. Cook, Diane Gold, Yi Li Sep 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Adjustment Uncertainty In Effect Estimation, Ciprian M. Crainiceanu, Francesca Dominici, Giovanni Parmigiani Aug 2006

Johns Hopkins University, Dept. of Biostatistics Working Papers

The selection of confounders and their functional relationship with the outcome affects exposure effect estimates. In practice, there is often substantial uncertainty about this selection, which we define here as "adjustment uncertainty." We address the problem of estimating the effect of exposure on an outcome with focus on quantifying the effect of unknown confounders from a large set of potential confounders. We propose a general statistical framework for handling adjustment uncertainty in exposure effect estimation, a specific implementation called "Structured Estimation under Adjustment Uncertainty (STEADy)", and associated visualization tools. Theoretical results and simulation studies show that STEADy consistently estimates …


Bayesian Smoothing Of Irregularly-Spaced Data Using Fourier Basis Functions, Christopher J. Paciorek Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Predicting Future Responses Based On Possibly Misspecified Working Models, Tianxi Cai, Lu Tian, Scott D. Solomon, L.J. Wei Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


An Informative Bayesian Structural Equation Model To Assess Source-Specific Health Effects Of Air Pollution, Margaret C. Nikolov, Brent A. Coull, Paul J. Catalano, John J. Godleski Jul 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Mixed Multiplicative Factor Analysis Model For Air Pollution Exposure Assessment, Margaret C. Nikolov, Brent A. Coull, Paul J. Catalano, John J. Godleski Jul 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Relative Risk Regression In Medical Research: Models, Contrasts, Estimators, And Algorithms, Thomas Lumley, Richard Kronmal, Shuangge Ma Jul 2006

UW Biostatistics Working Paper Series

The relative risk or prevalence ratio is a natural and familiar summary of association between a binary outcome and an exposure or intervention. For rare events, the relative risk can be approximately estimated by logistic regression. For common events estimation is more difficult. We review proposed estimation algorithms for relative risk regression. Some of these give inconsistent estimates or invalid standard errors. We show that the methods that give correct inference can be viewed as arising from a family of quasilikelihood estimating functions for the same generalized linear model, differing in their efficiency and in their robustness to outlying values …
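
The abstract's point that logistic regression approximates the relative risk only for rare events can be illustrated with a small worked example. The sketch below is not from the paper: the 2x2 counts and the function names `risk_ratio` and `odds_ratio` are hypothetical, chosen only to show that the odds ratio (the quantity a logistic regression coefficient exponentiates to) stays close to the relative risk when the outcome is rare and drifts away from it when the outcome is common.

```python
def risk_ratio(a, b, c, d):
    """Relative risk from a 2x2 table.

    a: exposed with event,   b: exposed without event
    c: unexposed with event, d: unexposed without event
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
rare = (20, 9980, 10, 9990)    # risks of 0.2% vs 0.1%
common = (600, 400, 300, 700)  # risks of 60% vs 30%

for label, table in (("rare", rare), ("common", common)):
    rr = risk_ratio(*table)
    odds = odds_ratio(*table)
    print(f"{label}: relative risk = {rr:.3f}, odds ratio = {odds:.3f}")
```

In both tables the relative risk is 2, but the odds ratio is about 2.002 in the rare-event table and 3.5 in the common-event table, which is why common outcomes call for estimation methods that target the relative risk directly.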


Causal Comparisons In Randomized Trials Of Two Active Treatments: The Effect Of Supervised Exercise To Promote Smoking Cessation, Jason Roy, Joseph W. Hogan Jul 2006

COBRA Preprint Series

In behavioral medicine trials, such as smoking cessation trials, two or more active treatments are often compared. Noncompliance by some subjects with their assigned treatment poses a challenge to the data analyst. Causal parameters of interest might include those defined by subpopulations based on their potential compliance status under each assignment, using the principal stratification framework (e.g., causal effect of new therapy compared to standard therapy among subjects that would comply with either intervention). Even if subjects in one arm do not have access to the other treatment(s), the causal effect of each treatment typically can only be identified from …


Semiparametric Bayesian Modeling Of Multivariate Average Bioequivalence, Pulak Ghosh Dr., Mithat Gonen May 2006

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Bioequivalence trials are usually conducted to compare two or more formulations of a drug. Simultaneous assessment of bioequivalence on multiple endpoints is called multivariate bioequivalence. Despite the fact that some tests for multivariate bioequivalence are suggested, current practice usually involves univariate bioequivalence assessments ignoring the correlations between the endpoints such as AUC and Cmax. In this paper we develop a semiparametric Bayesian test for bioequivalence under multiple endpoints. Specifically, we show how the correlation between the endpoints can be incorporated in the analysis and how this correlation affects the inference. Resulting estimates and posterior probabilities "borrow strength" from one another …


Profile Likelihood Estimation Of Partially Linear Panel Data Models With Fixed Effects, Liangjun Su, Aman Ullah May 2006

Research Collection School Of Economics

We consider consistent estimation of partially linear panel data models with fixed effects. We propose profile-likelihood-based estimators for both the parametric and nonparametric components in the models and establish convergence rates and asymptotic normality for both estimators.


Semiparametric Latent Variable Regression Models For Spatio-Temporal Modeling Of Mobile Source Particles In The Greater Boston Area, Alexandros Gryparis, Brent A. Coull, Joel Schwartz, Helen H. Suh Apr 2006

Harvard University Biostatistics Working Paper Series

Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic …


Super Learning: An Application To Prediction Of HIV-1 Drug Susceptibility, Sandra E. Sinisi, Maya L. Petersen, Mark J. Van Der Laan Apr 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Many statistical methods exist that can be used to learn a predictor based on observed data. Examples include decision trees, neural networks, support vector regression, least angle regression, Logic Regression, and the Deletion/Substitution/Addition algorithm. The optimal algorithm for prediction will vary depending on the underlying data-generating distribution. In this article, we introduce a "super learner," a prediction algorithm that applies any set of candidate learners and uses cross-validation to select among them. Theory shows that asymptotically the super learner performs essentially as well as or better than any of the candidate learners. We briefly present the theory behind the super learner, …
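
The idea of applying a set of candidate learners and using cross-validation to select among them can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the two candidate learners (`fit_mean`, `fit_line`), the helper names, the squared-error loss, the fold assignment, and the simulated data are all assumptions made for the example.

```python
import random


def fit_mean(xs, ys):
    """Candidate learner 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate learner 2: simple least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_risk(fit, xs, ys, k=5):
    """Mean squared prediction error estimated by k-fold cross-validation."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - model(xs[i])) ** 2 for i in held_out)
    return total / n


def select_learner(candidates, xs, ys, k=5):
    """Return the name of the candidate with the smallest cross-validated risk."""
    risks = {name: cv_risk(fit, xs, ys, k) for name, fit in candidates.items()}
    return min(risks, key=risks.get), risks


# Simulated data with a linear signal (illustrative only).
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.1) for x in xs]

candidates = {"mean": fit_mean, "line": fit_line}
best, risks = select_learner(candidates, xs, ys)
print(best, risks)
```

Because the selection rests only on held-out prediction error, any learner that maps training data to a prediction function can be added to `candidates` without changing the selector.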


Causal Effect Models For Intention To Treat And Realistic Individualized Treatment Rules, Mark J. Van Der Laan Mar 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

An important class of models in causal inference is the so-called marginal structural models, which model the comparison between counterfactual outcome distributions corresponding with a static treatment intervention, conditional on user supplied baseline covariates, based on observing a longitudinal data structure on a sample of n independent and identically distributed experimental units. Identification of a static treatment regimen specific outcome distribution based on observational data requires, beyond the so-called sequential randomization assumption, that each experimental unit has positive probability of following the static treatment regimen. The latter assumption is called the experimental treatment assignment assumption (ETA) (which is parameter specific). …


On The Equivalence Of Case-Crossover And Time Series Methods In Environmental Epidemiology, Yun Lu, Scott L. Zeger Mar 2006

Johns Hopkins University, Dept. of Biostatistics Working Papers

Time series and case-crossover methods are often viewed as competing alternatives in environmental epidemiologic studies. Several recent studies have compared the time series and case-crossover methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for over-dispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using …


Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan Mar 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a general and formal statistical framework for the multiple tests of associations between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known fixed gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating genome-wide transcript levels or DNA copy numbers to possibly censored biological and …