Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

Articles 1 - 12 of 12

Full-Text Articles in Biostatistics

An Analysis Of Risk Reduction Choices In DCIS Breast Cancer Patients, Lauren Soltesz Dec 2012

Statistics

The main focus of this paper was to evaluate possible demographic and clinical characteristics associated with a woman’s choice of breast conserving surgery (BCS), unilateral mastectomy (ULM), or bilateral risk reduction mastectomy (BRRM). The cohort consisted of patients presenting to the City of Hope National Medical Center with ductal carcinoma in situ breast cancer who elected to have cancer directed surgery (N=305). Analyses to examine associations of patient characteristics with type of surgery were conducted using a multinomial logistic regression. Results showed that older women were more likely to choose breast conserving surgery over bilateral risk reduction mastectomy than younger …


Big Data And The Future, Sherri Rose Jul 2012

Sherri Rose

No abstract provided.


Differential Patterns Of Interaction And Gaussian Graphical Models, Masanao Yajima, Donatello Telesca, Yuan Ji, Peter Muller Apr 2012

COBRA Preprint Series

We propose a methodological framework to assess heterogeneous patterns of association amongst components of a random vector expressed as a Gaussian directed acyclic graph. The proposed framework is likely to be useful when primary interest focuses on potential contrasts characterizing the association structure between known subgroups of a given sample. We provide inferential frameworks as well as an efficient computational algorithm to fit such a model and illustrate its validity through a simulation. We apply the model to Reverse Phase Protein Array data on Acute Myeloid Leukemia patients to show the contrast of association structure between refractory patients and relapsed …


Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris Jan 2012

Jeffrey S. Morris

In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …


Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do Jan 2012

Jeffrey S. Morris

Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data from the different platforms.

Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …


Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di Jan 2012

Chongzhi Di

To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcomes, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …


Comparing The Cohort Design And The Nested Case-Control Design In The Presence Of Both Time-Invariant And Time-Dependent Treatment And Competing Risks: Bias And Precision, Peter C. Austin Jan 2012

Peter Austin

Purpose: Observational studies using electronic administrative health care databases are often used to estimate the effects of treatments and exposures. Traditionally, a cohort design has been used to estimate these effects, but increasingly studies are using a nested case-control (NCC) design. The relative statistical efficiency of these two designs has not been examined in detail.

Methods: We used Monte Carlo simulations to compare these two designs in terms of the bias and precision of effect estimates. We examined three different settings: (A): treatment occurred at baseline and there was a single outcome of interest; (B): treatment was time-varying and there …


Using Ensemble-Based Methods For Directly Estimating Causal Effects: An Investigation Of Tree-Based G-Computation, Peter C. Austin Jan 2012

Peter Austin

Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible treatment. However, only the outcome under the actual treatment received is observed, whereas the potential outcomes under the other treatments are considered missing data. Some authors have proposed that parametric regression models be used to estimate potential outcomes. In this study, we examined the use of …
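
The potential-outcome idea described above can be illustrated with a minimal g-computation sketch. It assumes a single discrete confounder and uses stratum-specific outcome means as the outcome model, which stands in for the regression or tree-based models the abstract mentions; the function name `g_computation_ate` and the record layout are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def g_computation_ate(records):
    """Estimate the average treatment effect by g-computation.

    records: list of (confounder, treatment, outcome) tuples, with
    treatment coded 0 or 1. The outcome model here is simply the mean
    outcome within each (confounder, treatment) cell. This is an
    assumption made for illustration, not the paper's model.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, a, y in records:
        sums[(x, a)] += y
        counts[(x, a)] += 1

    def predict(x, a):
        # Predicted outcome for confounder level x under treatment a.
        if counts.get((x, a), 0) == 0:
            raise ValueError(f"no observations for confounder={x!r}, treatment={a}")
        return sums[(x, a)] / counts[(x, a)]

    # For every subject, predict both potential outcomes, then average
    # the individual differences over the observed confounder distribution.
    differences = [predict(x, 1) - predict(x, 0) for x, _, _ in records]
    return sum(differences) / len(differences)


if __name__ == "__main__":
    toy = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 5.0), (1, 1, 5.0)]
    print(g_computation_ate(toy))
```

Replacing `predict` with predictions from any fitted outcome model leaves the rest of the procedure unchanged, which is why flexible learners can be plugged into the same averaging step.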


Regression Trees For Predicting Mortality In Patients With Cardiovascular Disease: What Improvement Is Achieved By Using Ensemble-Based Methods?, Peter C. Austin Jan 2012

Peter Austin

In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1991-2001 and …


Generating Survival Times To Simulate Cox Proportional Hazards Models With Time-Varying Covariates, Peter C. Austin Jan 2012

Peter Austin

Simulations and Monte Carlo methods serve an important role in modern statistical research. They allow for an examination of the performance of statistical procedures in settings in which analytic and mathematical derivations may not be feasible. A key element in any statistical simulation is the existence of an appropriate data-generating process: one must be able to simulate data from a specified statistical model. We describe data-generating processes for the Cox proportional hazards model with time-varying covariates when event times follow an exponential, Weibull, or Gompertz distribution. We consider three types of time-varying covariates: first, a dichotomous time-varying covariate that can …
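
As a rough illustration of the kind of data-generating process described above, the sketch below inverts the cumulative hazard for an exponential baseline with one binary covariate that switches from 0 to 1 at a fixed time. The function names, the parameters (`baseline_rate`, `log_hazard_ratio`, `switch_time`), and the restriction to a single switch are assumptions made for illustration; the code is not taken from the paper.

```python
import math
import random


def simulate_event_time(u, baseline_rate, log_hazard_ratio, switch_time):
    """Return one event time by inverting the cumulative hazard.

    u is a uniform draw in (0, 1]. The hazard equals baseline_rate before
    switch_time and baseline_rate * exp(log_hazard_ratio) afterwards.
    """
    target = -math.log(u)  # unit-exponential draw; equals H(T) at the event time
    hazard_after = baseline_rate * math.exp(log_hazard_ratio)
    cum_hazard_at_switch = baseline_rate * switch_time
    if target < cum_hazard_at_switch:
        # The event occurs before the covariate switches on.
        return target / baseline_rate
    # The event occurs after the switch: use up the remaining cumulative
    # hazard at the post-switch rate.
    return switch_time + (target - cum_hazard_at_switch) / hazard_after


def simulate_cohort(n, baseline_rate, log_hazard_ratio, switch_time, seed=0):
    """Draw n independent event times with a seeded generator."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [
        simulate_event_time(1.0 - rng.random(), baseline_rate,
                            log_hazard_ratio, switch_time)
        for _ in range(n)
    ]


if __name__ == "__main__":
    times = simulate_cohort(5, baseline_rate=0.1,
                            log_hazard_ratio=math.log(2.0), switch_time=3.0)
    print(times)
```

The same inversion idea extends to other baseline hazards by swapping in the corresponding cumulative hazard and its inverse.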


Analysis Of Binary Data Via Spatial-Temporal Autologistic Regression Models, Zilong Wang Jan 2012

Theses and Dissertations--Statistics

Spatial-temporal autologistic models are useful for binary data that are measured repeatedly over time on a spatial lattice. They can account for the effects of potential covariates and for spatial-temporal statistical dependence among the data. However, the traditional parametrization of the spatial-temporal autologistic model presents difficulties in interpreting model parameters across varying levels of statistical dependence, where its non-negative autocovariates could bias the realizations toward 1. In order to achieve interpretable parameters, a centered spatial-temporal autologistic regression model has been developed. Two efficient statistical inference approaches, the expectation-maximization pseudo-likelihood approach (EMPL) and the Monte Carlo expectation-maximization likelihood approach (MCEML), have been proposed. Also, Bayesian …


Alternatives To Mixture Model Analysis Of Correlated Binomial Data, N. Rao Chaganty, Roy Sabo, Yihao Deng Jan 2012

Mathematics & Statistics Faculty Publications

While univariate instances of binomial data are readily handled with generalized linear models, cases of multivariate or repeated-measures binomial data are complicated by the possibility of correlated responses. Likelihood-based estimation can be applied by using mixture distribution models, though this approach can present computational challenges. The logistic transformation can be used to bypass these concerns and allow for alternative estimating procedures. One popular alternative is the generalized estimating equation (GEE) method, though systematic errors can lead to infeasible correlation estimates or nonconvergence problems. Our approach is the coupling of the quasi-least squares (QLS) method with a rarely used matrix factorization, …