Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

University of Louisville

Electronic Theses and Dissertations

Articles 1 - 30 of 40

Full-Text Articles in Biostatistics

Bayesian Strategies For Propensity Score Estimation In Causal Inference., Uthpala I. Wanigasekara Dec 2023

Causal inference is a method used in various fields to draw causal conclusions based on data. It involves using assumptions, study designs, and estimation strategies to minimize the impact of confounding variables. Propensity scores are used to estimate outcome effects through matching methods, stratification, weighting methods, and the Covariate Balancing Propensity Score method. However, they can be sensitive to estimation techniques and can lead to unstable findings. Researchers have proposed integrating weighting with regression adjustment in parametric models to improve causal inference validity. The first project focuses on Bayesian joint and two-stage methods for propensity score analysis. Propensity score modeling …


Causal Inference For The Effect Of Continuous Treatment On Time-To-Event Outcomes And Mediation Analysis On Health Disparities In Observational Studies., Triparna Poddar Dec 2023

The dissertation comprises two projects related to causal inference based on observational data. In healthcare research, where abundant observational data such as claims data and electronic records are available, researchers often aim to study the treatment effect and the pathway of that effect. However, estimating treatment effects in observational data presents challenges due to confounding factors. The first project focuses on estimating continuous treatment effects for survival outcomes, while the second concentrates on mediation analysis, allowing the exploration of the pathway of the causal effect. Both projects involve addressing confounding variables. In the first project, I investigate estimation of the …


Statistical Inference On Lung Cancer Screening Using The National Lung Screening Trial Data., Farhin Rahman Aug 2023

This dissertation consists of three research projects on cancer screening probability modeling. In these projects, the three key modeling parameters (sensitivity, sojourn time, transition density) for cancer screening were estimated, and the long-term outcomes (including overdiagnosis as one outcome), the optimal screening time/age, the lead time distribution, and the probability of overdiagnosis at the future screening time were simulated to provide a statistical perspective on the effectiveness of cancer screening programs. In the first part of this dissertation, statistical inference was conducted for male and female smokers using the National Lung Screening Trial (NLST) chest X-ray data. A …


Bayesian Methods For Graphical Models With Neighborhood Selection., Sagnik Bhadury Dec 2022

Graphical models determine associations between variables through the notion of conditional independence. Gaussian graphical models are a widely used class of such models, where the relationships are formalized by non-null entries of the precision matrix. However, in high-dimensional cases, covariance estimates are typically unstable. Moreover, it is natural to expect only a few significant associations to be present in many realistic applications. This necessitates the injection of sparsity techniques into the estimation method. Classical frequentist methods, like GLASSO, use penalization techniques for this purpose. Fully Bayesian methods, on the contrary, are slow because they require iteratively sampling over a quadratic …


Statistical Methods For Personalized Treatment Selection And Survival Data Analysis Based On Observational Data With High-Dimensional Covariates., Don Ramesh Dinendra Sudaraka Tholkage Aug 2022

Due to the wide availability of functional data from multiple disciplines, the studies of functional data analysis have become popular in the recent literature. However, the related development in censored survival data has been relatively sparse. In Chapter 2, we consider the problem of analyzing time-to-event data in the presence of functional predictors. We develop a conditional generalized Kaplan-Meier (KM) estimator that incorporates functional predictors using kernel weights and rigorously establish its asymptotic properties. In addition, we propose to select the optimal bandwidth based on a time-dependent Brier score. We then carry out extensive numerical studies to examine the …


Statistical Methods For Assessing Drug Interactions And Identifying Effect Modifiers Using Observational Data., Qian Xu May 2022

This dissertation consists of three projects related to causal inference based on observational data. In the first project, we propose a doubly robust method to identify the effect modifiers and estimate optimal treatment. Observational studies differ from experimental studies in that assignment of subjects to treatments is not randomized but rather occurs due to natural mechanisms, which are usually hidden from the researchers. Many statistical methods to identify the treatment effect and select the optimal personalized treatment for experimental studies may not be suitable for observational studies anymore. In this project, we propose a flexible outcome model to select the …


Estimating Treatment Effect On Medical Cost And Examining Medical Cost Trajectory Using Splines And Change Point Techniques., Indranil Ghosh Dec 2021

In the world of growing medical needs, other than the clinical outcomes, the cost of healthcare is one of the important aspects to evaluate. The cost of treatment could be the deciding factor when choosing between two treatment options that are equally likely to be effective. In the literature, the most used quantity for the cost of treatment is cumulative lifetime cost since the diagnosis of a disease. While it provides a bird's-eye view of the treatment cost, it fails to capture the underlying pattern of the treatment cost trajectory. We developed a marginal structural functional model (MSFM) using an …


Predictive Modeling Of Clinical Outcomes For Hospitalized COVID-19 Patients Utilizing CyTOF And Clinical Data., Onajia Stubblefield Aug 2021

In December 2019, an outbreak of a novel coronavirus initiated a global pandemic. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a virus that causes the disease coronavirus disease 2019 (COVID-19). Symptoms of infection with COVID-19 vary widely between individuals. While some infected individuals are asymptomatic, others need more extensive care and require hospitalization. Indeed, the COVID-19 pandemic was characterized by a shortage of hospital beds which presented additional complications in providing adequate care for patients. In this study, we used a combination of T cell population data collected from mass cytometry analysis and clinical markers to form a predictive …


Bayesian Variable Selection Strategies In Longitudinal Mixture Models And Categorical Regression Problems., Md Nazir Uddin Aug 2021

In this work, we seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data. To develop this method, we consider data from the Health and Retirement Survey (HRS) conducted by the University of Michigan. Considering yearly out-of-pocket expenditures as the longitudinal response variable, we consider a Bayesian mixture model with $K$ components. The data consist of a large collection of demographic, financial, and health-related baseline characteristics, and we wish to find a subset of these that impact cluster membership. An initial mixture model without any cluster-level predictors is fit to the data through an MCMC …


Estimating Cumulative Incidence Rate On Interval Censored Data In An Illness-Death Model., Chen Qian May 2021

Phase IV clinical trials are designed to monitor long-term side effects caused over time by medical treatment. For instance, in advanced primary cancer treatment, childhood cancer survivors are often at risk of developing undesired events, such as cardiotoxicity, during their adulthood. Such problems could be due to their cancer or the treatment they received for their cancer such as radiation or intensive chemotherapy. Cardiotoxicity can be diagnosed with electrophysiology with measurements of fraction shortening, afterload, etc. Often the primary focus of a study could be on estimating the cumulative incidence of a particular outcome of interest such as cardiotoxicity. However, …


Observational Studies In Group Testing And Potential Applications., Alexander Christopher Noll May 2021

The use of group testing to identify individuals with targeted outcomes in a population can greatly improve the efficiency, speed, and cost effectiveness of testing a population for an outcome, or at least for identifying the prevalence of an outcome in a population. The implementation of causal inference techniques can provide the basis for an observational study that would allow an investigator to gather estimates for treatment effectiveness if group testing was conducted on the population in a certain way. This thesis examines a simulation of the above outlined principles in order to demonstrate a potential application for determining treatment …


Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das Dec 2020

Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as Microarrays, bulk RNA-Sequencing, single cell RNA-Sequencing, etc. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Further, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …


Modified-Half-Normal Distribution And Different Methods To Estimate Average Treatment Effect., Jingchao Sun Dec 2020

This dissertation consists of three projects related to Modified-Half-Normal distribution and causal inference. In my first project, a new distribution called Modified-Half-Normal distribution was introduced. I explored a few of its distributional properties, the procedures for generating random samples based on Bayesian approaches, and the parameter estimation based on the method of moments. The second project deals with the problem of selection bias in estimating the average treatment effect (ATE) from observational data. I combined the propensity score based inverse probability of treatment weighting (IPTW) method and the directed acyclic graph (DAG) to solve this problem. The third project …


Aspects Of Causal Inference., John A. Craycroft Dec 2020

Observational studies differ from experimental studies in that assignment of subjects to treatments is not randomized but rather occurs due to natural mechanisms, which are usually hidden from researchers. Yet objectives of the two studies are frequently the same: identify the causal – rather than merely associational – relationship between some treatment or exposure and an outcome. The statistical issues that arise in properly analyzing observational data for this goal are numerous and fascinating, and these issues are encompassed in the domain of causal inference. The research presented in this dissertation explores several distinct aspects of causal inference. This dissertation …


Marginal Methods And Software For Clustered Data With Cluster- And Group-Size Informativeness., Mary Elizabeth Gregg Aug 2020

Clustered data result when observations have some natural organizational association. In such data, cluster size is defined as the number of observations belonging to a cluster. A phenomenon termed informative cluster size (ICS) occurs when observation outcomes vary in a systematic way related to the cluster size. An additional form of informativeness, termed informative within-cluster group size (IWCGS), arises when the distribution of group-defining categorical covariates within clusters similarly carries information related to outcomes. Standard methods for the marginal analysis of clustered data can produce biased estimates and inference when data have informativeness. A reweighting methodology has been developed that …


Linear Methods For Regression With Small Sample Sizes Relative To The Number Of Variables., Rajesh Sikder Aug 2020

In data sets where there are a small number of observations but a large number of variables observed for each observation, ordinary least squares estimation cannot be used for regression models. There are many alternatives, including stepwise regression, penalized methods such as ridge regression and the LASSO, and methods based on derived inputs such as principal components regression and partial least squares regression. In this thesis, these five methods are described. K-fold cross validation is also discussed as a way of determining regularization parameters for each method. The performance of these methods in estimation and prediction is also examined through …
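The point about ordinary least squares failing when variables outnumber observations can be illustrated with a small sketch. It is not taken from the thesis: the simulated data, the penalty value `lam=1.0`, and the helper name `ridge_fit` are all assumptions chosen for illustration. With `n < p`, the matrix `X'X` is rank-deficient, so the least squares normal equations have no unique solution, while the ridge penalty makes the system solvable.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^{-1} X'y, without an intercept.

    Hypothetical helper for illustration only.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated data (an assumption): 20 observations, 50 variables.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + 0.1 * rng.normal(size=n)

# X'X is p x p but has rank at most n, so it cannot be inverted when n < p.
rank_xtx = np.linalg.matrix_rank(X.T @ X)

# Adding lam to the diagonal makes the matrix positive definite.
beta_ridge = ridge_fit(X, y, lam=1.0)
print(rank_xtx, beta_ridge.shape)
```

In practice the penalty would be tuned, for example by the K-fold cross validation the abstract mentions, rather than fixed in advance.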


Novel Bayesian Methodology For The Analysis Of Single-Cell RNA Sequencing Data., Michael Sekula May 2020

With single-cell RNA sequencing (scRNA-seq) technology, researchers are able to gain a better understanding of health and disease through the analysis of gene expression data at the cellular level; however, scRNA-seq data tend to have high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts, which create new statistical problems that need to be addressed. This dissertation includes three research projects that propose Bayesian methodology suitable for scRNA-seq analysis. In the first project, a hurdle model for identifying differentially expressed genes across cell types in scRNA-seq data is presented. This model incorporates a correlated random …


Statistical Methods For Estimating And Testing Treatment Effect For Multiple Treatment Groups In Observational Studies., Xiaofang Yan Dec 2019

Note: Abstract would not save due to an issue with some of the characters.


Novel Bayesian Methodology In Multivariate Problems., Debamita Kundu Aug 2019

This dissertation involves developing novel Bayesian methodology for multivariate problems. In particular, it focuses on two contexts: shrinkage based variable selection in multivariate regression and simultaneous covariance estimation of multiple groups. Both these projects are centered around fully Bayesian inference schemes based on hierarchical modeling to capture context-specific features of the data and the development of computationally efficient estimation algorithms. Variable selection over a potentially large set of covariates in a linear model is quite popular. In the Bayesian context, common prior choices can lead to a posterior expectation of the regression coefficients that is a sparse (or nearly sparse) …


Innate Immunity, The Hepatic Extracellular Matrix, And Liver Injury: Mathematical Modeling Of Metastatic Potential And Tumor Development In Alcoholic Liver Disease., Shanice V. Hudson Dec 2018

The overarching goals of the current work are to fill key gaps in the current understanding of alcohol consumption and the risk of metastasis to the liver. Considering the evidence this research group has compiled confirming that the hepatic matrisome responds dynamically to injury, an altered extracellular matrix (ECM) profile appears to be a key feature of pre-fibrotic inflammatory injury in the liver. This group has demonstrated that the hepatic ECM responds dynamically to alcohol exposure, in particular, sensitizing the liver to LPS-induced inflammatory damage. Although the study of alcohol in its role as a contributing factor to oncogenesis and …


Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor Aug 2018

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery of biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve as a basis for making higher order inferences in metabolomics, …


Generalized Spatiotemporal Modeling And Causal Inference For Assessing Treatment Effects For Multiple Groups For Ordinal Outcome., Soutik Ghosal Aug 2018

This dissertation consists of three projects and can be categorized in two broad research areas: generalized spatiotemporal modeling and causal inference based on observational data. In the first project, I introduce a Bayesian hierarchical mixed effect hurdle model with a nested random effect structure to model the count for primary care providers and understand their spatial and temporal variation. This study further enables us to identify the health professional shortage areas and the possible impacting factors. In the second project, I have unified popular parametric and nonparametric propensity score-based methods to assess the treatment effect of multiple groups for ordinal …


Sample Size Calculations And Normalization Methods For RNA-Seq Data., Xiaohong Li Dec 2017

High-throughput RNA sequencing (RNA-seq) has become the preferred choice for transcriptomics and gene expression studies. With the rapid growth of RNA-seq applications, sample size calculation methods for RNA-seq experiment design and data normalization methods for DEG analysis are important issues to be explored and discussed. The underlying theme of this dissertation is to develop novel sample size calculation methods in RNA-seq experiment design using test statistics. I have also proposed two novel normalization methods for analysis of RNA-seq data. In chapter one, I present the test statistical methods including Wald’s test, log-transformed Wald’s test and likelihood ratio test statistics for …


Functional Data Analysis Methods For Predicting Disease Status., Sarah Kendrick Dec 2017

Introduction: Differential scanning calorimetry (DSC) is used to determine thermally-induced conformational changes of biomolecules within a blood plasma sample. Recent research has indicated that DSC curves (or thermograms) may have different characteristics based on disease status and, thus, may be useful as a monitoring and diagnostic tool for some diseases. Since thermograms are curves measured over a range of temperature values, they are often considered as functional data. In this dissertation we propose and apply functional data analysis (FDA) techniques to analyze DSC data from the Lupus Family Registry and Repository (LFRR). The aim is to develop FDA methods to …


Bayesian Approach On Short Time-Course Data Of Protein Phosphorylation, Causal Inference For Ordinal Outcome And Causal Analysis Of Dietary And Physical Activity In T2DM Using NHANES Data., You Wu Aug 2017

This dissertation contains three different projects in proteomics and causal inferences. In the first project, I apply a Bayesian hierarchical model to assess the stability of phosphorylated proteins under short-time cold ischemia. This study provides inference on the stability of these phosphorylated proteins, which is valuable when using these proteins as biomarkers for a disease. In the second project, I perform a comparative study of different confounding-adjusted methods to estimate the treatment effect when the outcome variable is ordinal using observational data. The adjusted U-statistics method is compared with other methods such as ordinal logistic regression, propensity score based stratification and …


Estimation Of The Three Key Parameters And The Lead Time Distribution In Lung Cancer Screening., Ruiqi Liu Aug 2017

This dissertation contains three research projects on cancer screening probability modeling. Cancer screening is the primary technique for early detection. The goal of screening is to catch the disease early before clinical symptoms appear. In these projects, the three key parameters and lead time distribution were estimated to provide a statistical point of view on the effectiveness of cancer screening programs. In the first project, a cancer screening probability model was used to analyze the computed tomography (CT) scan group in the National Lung Screening Trial (NLST) data. Three key parameters were estimated using a Bayesian approach and Markov Chain Monte Carlo …


Likelihood-Based Methods For Analysis Of Copy Number Variation Using Next Generation Sequencing Data., Udika Iroshini Bandara Aug 2017

A Copy Number Variation (CNV) detection problem is considered using Circular Binary Segmentation (CBS) procedures, including newly developed procedures based on likelihood ratio tests with the parametric bootstrap for models based on discrete distributions for count data (Poisson and negative binomial) and a widely-used DNAcopy package. Results from the literature concerning maximum likelihood estimation for the negative binomial distribution are reviewed. The Newton-Raphson method is used to find the root of the derivative of the profile log likelihood function when applicable, and it is proven that this method converges to the true Maximum Likelihood Estimate (MLE), if the starting point …


Propensity Score Based Methods For Estimating The Treatment Effects Based On Observational Studies., Younathan Abdia Aug 2016

This dissertation consists of two interconnected research projects. The first project was a study of propensity score based statistical methods for estimating the average treatment effect (ATE) and the average treatment effect among treated (ATT) when there are two treatment groups. The ATE is defined as the mean of the individual causal effects in the whole population, while ATT is defined as the treatment effect for the treated population. Propensity score based statistical methods, such as matching, regression, stratification, inverse probability weighting (IPW), and doubly robust (DR) methods were used to estimate the ATE and ATT. Simulation studies and case …
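The distinction between the ATE and the ATT, and the role of inverse probability weighting, can be sketched on simulated data. This is an illustrative sketch under stated assumptions, not the dissertation's method: the data-generating model, the use of the true propensity score in place of an estimated one, the normalized form of the weights, and the helper names `ipw_ate` and `ipw_att` are all assumptions.

```python
import numpy as np

def ipw_ate(y, t, e):
    """Normalized IPW estimate of the ATE (hypothetical helper)."""
    w1 = t / e                # weights for treated units
    w0 = (1 - t) / (1 - e)    # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def ipw_att(y, t, e):
    """Normalized IPW estimate of the ATT (hypothetical helper).

    Treated units keep weight 1; controls are weighted by the odds e/(1-e).
    """
    w0 = (1 - t) * e / (1 - e)
    return np.sum(t * y) / np.sum(t) - np.sum(w0 * y) / np.sum(w0)

# Simulated observational data (an assumption): one confounder x drives
# both treatment assignment and the outcome; the true effect is 2 for everyone.
rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))      # true propensity score P(T=1 | x)
t = rng.binomial(1, e)
y = 2.0 * t + x + rng.normal(size=n)

naive = y[t == 1].mean() - y[t == 0].mean()  # confounded comparison
ate_hat = ipw_ate(y, t, e)
att_hat = ipw_att(y, t, e)
print(naive, ate_hat, att_hat)
```

Because the simulated effect is constant, the ATE and ATT coincide here; with heterogeneous effects the two estimands would generally differ. In a real analysis the propensity score would be estimated from the covariates rather than known.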


Semi-Parametric Methods For Personalized Treatment Selection And Multi-State Models., Chathura K. Siriwardhana May 2016

This dissertation contains three research projects on personalized medicine and a project on multi-state modelling. The idea behind personalized medicine is selecting the best treatment that maximizes the clinical outcomes of interest for an individual based on his or her genetic and genomic information. We propose a method for treatment assignment based on individual covariate information for a patient. Our method covers more than two treatments, can be applied with a broad set of models, and has very desirable large sample properties. An empirical study using simulations and a real data analysis show the applicability of the proposed procedure. …


Propensity Score Methods : A Simulation And Case Study Involving Breast Cancer Patients., John Craycroft May 2016

Observational data presents unique challenges for analysis that are not encountered with experimental data resulting from carefully designed randomized controlled trials. Selection bias and unbalanced treatment assignments can obscure estimations of treatment effects, making the process of causal inference from observational data highly problematic. In 1983, Paul Rosenbaum and Donald Rubin formalized an approach for analyzing observational data that adjusts treatment effect estimates for the set of non-treatment variables that are measured at baseline. The propensity score is the conditional probability of assignment to a treatment group given the covariates. Using this score, one may balance the covariates across treatment …