Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Estimating Percentile-Specific Causal Effects: A Case Study Of Micronutrient Supplementation, Birth Weight, And Infant Mortality, Francesca Dominici, Scott L. Zeger, Giovanni Parmigiani, Joanne Katz, Parul Christian Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

In developing countries, higher infant mortality is partially caused by poor maternal and fetal nutrition. Clinical trials of micronutrient supplementation are aimed at reducing the risk of infant mortality by increasing birth weight. Because infant mortality is greatest among low birth weight (LBW) infants (≤ 2500 grams), an effective intervention may need to increase the birth weight among the smallest babies. Although a trial conducted in Nepal has demonstrated that supplementation increases birth weight, there is inconclusive evidence that the supplementation improves survival. It has been hypothesized that a potential benefit of the treatment …


Ranking USRDS Provider-Specific SMRs From 1998-2001, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Provider profiling (ranking, "league tables") is prevalent in health services research. Similarly, comparing educational institutions and identifying differentially expressed genes depend on ranking. Effective ranking procedures must be structured by a hierarchical (Bayesian) model and guided by a ranking-specific loss function. However, even optimal methods can perform poorly, and estimates must be accompanied by uncertainty assessments. We use the 1998-2001 Standardized Mortality Ratio (SMR) data from the United States Renal Data System (USRDS) as a platform to identify issues and approaches. Our analyses extend Liu et al. (2004) by combining evidence over multiple years via an AR(1) model; by considering estimates …


Semiparametric Regression In Capture-Recapture Modelling, O. Gimenez, C. Barbraud, Ciprian M. Crainiceanu, S. Jenouvrier, B.T. Morgan Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Capture-recapture models were developed to estimate survival using data arising from marking and monitoring wild animals over time. Variation in the survival process may be explained by incorporating relevant covariates. We develop nonparametric and semiparametric regression models for estimating survival in capture-recapture models. A fully Bayesian approach using MCMC simulations was employed to estimate the model parameters. The work is illustrated by a study of Snow petrels, in which survival probabilities are expressed as nonlinear functions of a climate covariate, using data from a 40-year study on marked individuals, nesting at Petrels Island, Terre Adelie.


The Proportional Odds Model For Assessing Rater Agreement With Multiple Modalities, Elizabeth Garrett-Mayer, Steven N. Goodman, Ralph H. Hruban Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this paper, we develop a model for evaluating ordinal rating systems where we assume that the true underlying disease state is continuous in nature. Our approach is motivated by a dataset with 35 microscopic slides with 35 representative duct lesions of the pancreas. Each of the slides was evaluated by eight raters using two novel rating systems (PanIN illustrations and PanIN nomenclature), where each rater used each system to rate the slide with slide identity masked between evaluations. We find that the two methods perform equally well but that differentiation of higher grade lesions is more consistent across raters …


Cross-Study Validation And Combined Analysis Of Gene Expression Microarray Data, Elizabeth Garrett-Mayer, Giovanni Parmigiani, Xiaogang Zhong, Leslie Cope, Edward Gabrielson Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Investigations of transcript levels on a genomic scale using hybridization-based arrays led to formidable advances in our understanding of the biology of many human illnesses. At the same time, these investigations have generated controversy, because of the probabilistic nature of the conclusions, and the surfacing of noticeable discrepancies between the results of studies addressing the same biological question. In this article we present simple and effective data analysis and visualization tools for gauging the degree to which the findings of one study are reproduced by others, and for integrating multiple studies in a single analysis.

We describe these approaches in …


On Marginalized Multilevel Models And Their Computation, Michael E. Griswold, Scott L. Zeger Nov 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Clustered data analysis is characterized by the need to describe both systematic variation in a mean model and cluster-dependent random variation in an association model. Marginalized multilevel models embrace the robustness and interpretations of a marginal mean model, while retaining the likelihood inference capabilities and flexible dependence structures of a conditional association model. Although there has been increasing recognition of the attractiveness of marginalized multilevel models, there has been a gap in their practical application arising from a lack of readily available estimation procedures. We extend the marginalized multilevel model to allow for nonlinear functions in both the mean and …


Spatially Adaptive Bayesian P-Splines With Heteroscedastic Errors, Ciprian M. Crainiceanu, David Ruppert, Raymond J. Carroll Nov 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Penalized splines (P-splines), which use low-rank spline bases to make computations tractable while maintaining accuracy as good as smoothing splines, are an increasingly popular tool for nonparametric smoothing. This paper extends penalized spline methodology by both modeling the variance function nonparametrically and using a spatially adaptive smoothing parameter. These extensions have been studied before, but never together and never in the multivariate case. This combination is needed for satisfactory inference and can be implemented effectively by Bayesian MCMC. The variance process controlling the spatially-adaptive shrinkage of the mean and the variance of the heteroscedastic error process are modeled as log-penalized …


Bayesian Hierarchical Distributed Lag Models For Summer Ozone Exposure And Cardio-Respiratory Mortality, Yi Huang, Francesca Dominici, Michelle L. Bell Oct 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this paper, we develop Bayesian hierarchical distributed lag models for estimating associations between daily variations in summer ozone levels and daily variations in cardiovascular and respiratory (CVDRESP) mortality counts for 19 large U.S. cities included in the National Morbidity, Mortality, and Air Pollution Study (NMMAPS) for the period 1987-1994.

At the first stage, we define a semi-parametric distributed lag Poisson regression model to estimate city-specific relative rates of CVDRESP associated with short-term exposure to summer ozone. At the second stage, we specify a class of distributions for the true city-specific relative rates to estimate an overall effect by …


A Hypothesis Test For The End Of A Common Source Outbreak, Ron Brookmeyer, Xiaojun You Sep 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

The objective of this paper is to develop a hypothesis testing procedure to determine whether a common source outbreak has ended. We do not assume that the calendar date of exposure to the pathogen is known. We assume an underlying parametric model for the incubation period distribution of a 2-parameter exponential model with a guarantee time, although the parameters are not assumed to be known. The hypothesis testing procedure is based on the spacings between ordered calendar dates of disease onset of the cases. A simulation study was performed to evaluate the robustness of the methods to a lognormal model …
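For readers unfamiliar with the model named in this abstract, the block below writes out the standard two-parameter exponential density with a guarantee time. The symbols are assumptions introduced here for illustration (the abstract does not give notation): $T$ is the incubation period, $\gamma$ the guarantee time, and $\theta$ the scale. It is a sketch of the textbook form, not necessarily the parameterization used by the authors.

```latex
% Two-parameter exponential distribution with guarantee time \gamma \ge 0
% and scale \theta > 0 (notation assumed, not taken from the paper).
\[
  f(t \mid \gamma, \theta) =
  \begin{cases}
    \dfrac{1}{\theta} \exp\!\left( -\dfrac{t - \gamma}{\theta} \right), & t \ge \gamma, \\[1.5ex]
    0, & t < \gamma,
  \end{cases}
  \qquad
  \Pr(T > t) = \exp\!\left( -\dfrac{t - \gamma}{\theta} \right) \quad \text{for } t \ge \gamma .
\]
```

Under this form no case can have an incubation period shorter than $\gamma$, which is what "guarantee time" refers to.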


Optimal Sampling Times In Bioequivalence Studies Using A Simulated Annealing Algorithm, Leena Choi, Brian Caffo, Charles Rohde Sep 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

In pharmacokinetic (PK) studies, blood samples are taken over time on subjects after the administration of a drug to measure the time-course of the plasma drug concentrations. In bioequivalence studies, the trapezoidal rule on the sampled time points is often used to estimate the area under the plasma concentration-time curve, a quantity of principal interest. This manuscript investigates the choice of sampling time points to estimate the area under the curve. In particular, we explore the relative merits of several objective functions, those functions which are minimized with respect to the sampling times to obtain an optimal study design. We …
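The trapezoidal-rule estimate of the area under the concentration-time curve mentioned in this abstract can be sketched in a few lines of Python. The function name, the sampling schedule, and the concentration values below are hypothetical and chosen only for illustration; this is not the authors' code, and it does not attempt their simulated annealing search over sampling times.

```python
def trapezoid_auc(times, concentrations):
    """Estimate the area under a concentration-time curve by the trapezoidal rule."""
    if len(times) != len(concentrations):
        raise ValueError("times and concentrations must have the same length")
    if len(times) < 2:
        raise ValueError("need at least two sampling times")
    auc = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Area of the trapezoid between consecutive sampling times.
        auc += 0.5 * dt * (concentrations[i] + concentrations[i - 1])
    return auc


# Hypothetical sampling times (hours) and plasma concentrations (mg/L).
sample_times = [0.0, 1.0, 2.0, 4.0]
sample_conc = [0.0, 4.0, 3.0, 1.0]
print(trapezoid_auc(sample_times, sample_conc))
```

Because the estimate depends on where the sampling times fall, different schedules give different approximations to the same underlying curve, which is why the choice of time points is a design question.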


Effect Of Misreported Family History On Mendelian Mutation Prediction Models, Hormuzd A. Katki Sep 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

People with familial history of disease often consult with genetic counselors about their chance of carrying mutations that increase disease risk. To aid them, genetic counselors use Mendelian models that predict whether the person carries deleterious mutations based on their reported family history. Such models rely on accurate reporting of each member's diagnosis and age of diagnosis, but this information may be inaccurate. Commonly encountered errors in family history can significantly distort predictions, and thus can alter the clinical management of people undergoing counseling, screening, or genetic testing. We derive general results about the distortion in the carrier probability estimate …


On Time Series Analysis Of Public Health And Biomedical Data, Scott L. Zeger, Rafael A. Irizarry, Roger D. Peng Sep 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research.
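As a small illustration of the correlation among repeated responses over time that this abstract refers to, the Python sketch below computes a sample lag-1 autocorrelation. The function name and the example series are hypothetical; the paper itself surveys a much broader set of methods.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: correlation between each value and its predecessor."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    numer = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return numer / denom


# Hypothetical series: a steadily rising one and an alternating one.
trend = [1.0, 2.0, 3.0, 4.0, 5.0]
alternating = [1.0, -1.0, 1.0, -1.0]
print(lag1_autocorrelation(trend))
print(lag1_autocorrelation(alternating))
```

A value far from zero signals that consecutive observations are not independent, so methods that assume independent responses would misstate the uncertainty.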


Studying Effects Of Primary Care Physicians And Patients On The Trade-Off Between Charges For Primary Care And Specialty Care Using A Hierarchical Multivariate Two-Part Model, John W. Robinson, Scott L. Zeger, Christopher B. Forrest Aug 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan.

Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads.

Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for …


A Hierarchical Multivariate Two-Part Model For Profiling Providers' Effects On Healthcare Charges, John W. Robinson, Scott L. Zeger, Christopher B. Forrest Aug 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Procedures for analyzing and comparing healthcare providers' effects on health services delivery and outcomes have been referred to as provider profiling. In a typical profiling procedure, patient-level responses are measured for clusters of patients treated by providers that, in turn, can be regarded as statistically exchangeable. Thus, a hierarchical model naturally represents the structure of the data. When provider effects on multiple responses are profiled, a multivariate model, rather than a series of univariate models, can capture associations among responses at both the provider and patient levels. When responses are in the form of charges for healthcare services and sampled …


Quantitative Methods For Tracking Cognitive Change 3 Years After Coronary Artery Bypass Surgery, Sarah Barry, Scott L. Zeger, Ola A. Selnes, Maura A. Grega, Louis M. Borowicz, Jr., Guy M. Mckhann Jun 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Background: The analysis and interpretation of change in cognitive function test scores after Coronary Artery Bypass Grafting (CABG). Longitudinal studies with multiple outcomes present considerable statistical challenges. Application of hierarchical linear statistical models can estimate the effects of a surgical intervention on the time course of multiple biomarkers.

Methods: We use an "analyze then summarize" approach whereby we estimate the intervention effects separately for each cognitive test and then pool them, taking appropriate account of their statistical correlations. The model accounts for dropouts at follow-up, the chance of which may be related to past cognitive score, by implicitly imputing the …


Bayesian Geostatistical Design, Peter J. Diggle, Soren Lophaven Jun 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

This paper describes the use of model-based geostatistics for choosing the optimal set of sampling locations, collectively called the design, for a geostatistical analysis. Two types of design situations are considered. These are retrospective design, which concerns the addition of sampling locations to, or deletion of locations from, an existing design, and prospective design, which consists of choosing optimal positions for a new set of sampling locations. We propose a Bayesian design criterion which focuses on the goal of efficient spatial prediction whilst allowing for the fact that model parameter values are unknown. The results show that in this situation …


A Model Based Background Adjustment For Oligonucleotide Expression Arrays, Zhijin Wu, Rafael A. Irizarry, Robert Gentleman, Francisco Martinez Murillo, Forrest Spencer May 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

High density oligonucleotide expression arrays are widely used in many areas of biomedical research. Affymetrix GeneChip arrays are the most popular. In the Affymetrix system, a fair amount of further pre-processing and data reduction occurs following the image processing step. Statistical procedures developed by academic groups have been successful at improving the default algorithms provided by the Affymetrix system. In this paper we present a solution to one of the pre-processing steps, background adjustment, based on a formal statistical framework. Our solution greatly improves the performance of the technology in various practical applications.

Affymetrix GeneChip arrays use short oligonucleotides to …


Seasonal Analyses Of Air Pollution And Mortality In 100 U.S. Cities, Roger D. Peng, Francesca Dominici, Roberto Pastor-Barriuso, Scott L. Zeger, Jonathan M. Samet May 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Time series models relating short-term changes in air pollution levels to daily mortality counts typically assume that the effects of air pollution on the log relative rate of mortality do not vary with time. However, these short-term effects might plausibly vary by season. Changes in the sources of air pollution and meteorology can result in changes in characteristics of the air pollution mixture across seasons. The authors develop Bayesian semi-parametric hierarchical models for estimating time-varying effects of pollution on mortality in multi-site time series studies. The methods are applied to the updated National Morbidity and Mortality Air Pollution Study database …


Optimal Sample Size For Multiple Testing: The Case Of Gene Expression Microarrays, Peter Muller, Giovanni Parmigiani, Christian Robert, Judith Rousseau Feb 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

We consider the choice of an optimal sample size for multiple comparison problems. The motivating application is the choice of the number of microarray experiments to be carried out when learning about differential gene expression. However, the approach is valid in any application that involves multiple comparisons in a large number of hypothesis tests. We discuss two decision problems in the context of this setup: the sample size selection and the decision about the multiple comparisons. We adopt a decision theoretic approach, using loss functions that combine the competing goals of discovering as many differentially expressed genes as possible, while keeping …