Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 13 of 13

Full-Text Articles in Biostatistics

Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang Oct 2023

Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang

Statistical Science Theses and Dissertations

Spatially resolved transcriptomics (SRT) quantifies expression levels at different spatial locations, providing a new and powerful tool to investigate novel biological insights. As experimental technologies enhance both in capacity and efficiency, there arises a growing demand for the development of analytical methodologies.

One question in SRT data analysis is to identify genes whose expressions exhibit spatially correlated patterns, called spatially variable (SV) genes. Most current methods to identify SV genes are built upon the geostatistical model with Gaussian process, which could limit the models' ability to identify complex spatial patterns. In order to overcome this challenge and capture more types …


Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile May 2023

Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile

Statistical Science Theses and Dissertations

Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …


Regression Modeling Of Complex Survival Data Based On Pseudo-Observations, Rong Rong Dec 2022

Regression Modeling Of Complex Survival Data Based On Pseudo-Observations, Rong Rong

Statistical Science Theses and Dissertations

The restricted mean survival time (RMST) is a clinically meaningful summary measure in studies with survival outcomes. Statistical methods have been developed for regression analysis of RMST to investigate impacts of covariates on RMST, which is a useful alternative to the Cox regression analysis. However, existing methods for regression modeling of RMST are not applicable to left-truncated right-censored data that arise frequently in prevalent cohort studies, for which the sampling bias due to left truncation and informative censoring induced by the prevalent sampling scheme must be properly addressed. Meanwhile, statistical methods have been developed for regression modeling of the cumulative …


Sars-Cov-2 Pandemic Analytical Overview With Machine Learning Predictability, Anthony Tanaydin, Jingchen Liang, Daniel W. Engels Jan 2021

Sars-Cov-2 Pandemic Analytical Overview With Machine Learning Predictability, Anthony Tanaydin, Jingchen Liang, Daniel W. Engels

SMU Data Science Review

Understanding diagnostic tests and examining important features of novel coronavirus (COVID-19) infection are essential steps for controlling the current pandemic of 2020. In this paper, we study the relationship between clinical diagnosis and analytical features of patient blood panels from the US, Mexico, and Brazil. Our analysis confirms that among adults, the risk of severe illness from COVID-19 increases with pre-existing conditions such as diabetes and immunosuppression. Although more than eight months into pandemic, more data have become available to indicate that more young adults were getting infected. In addition, we expand on the definition of COVID-19 test and discuss …


Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, Guanshen Wang Dec 2020

Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, Guanshen Wang

Statistical Science Theses and Dissertations

This dissertation investigates: (1) A Bayesian Semi-supervised Approach to Keyphrase Extraction with Only Positive and Unlabeled Data, (2) Jackknife Empirical Likelihood Confidence Intervals for Assessing Heterogeneity in Meta-analysis of Rare Binary Events.

In the big data era, people are blessed with a huge amount of information. However, the availability of information may also pose great challenges. One big challenge is how to extract useful yet succinct information in an automated fashion. As one of the first few efforts, keyphrase extraction methods summarize an article by identifying a list of keyphrases. Many existing keyphrase extraction methods focus on the unsupervised setting, …


Compressed Dna Representation For Efficient Amr Classification, John Partee, Robert Hazell, Anjli Solsi, John Santerre Aug 2020

Compressed Dna Representation For Efficient Amr Classification, John Partee, Robert Hazell, Anjli Solsi, John Santerre

SMU Data Science Review

In this paper, we explore a representation methodology for the compression of DNA isolates. Using lossless string compression via tokenization of frequently repeated segments of DNA, we reduce the length of the isolates to be counted as k-mers for classification. With this new representation, we apply a previously established feature sampling method to dramatically reduce the feature space. In understanding the genetic diversity, we also look at conserving biological function across these spaces. Using a random forest model we were able to predict the resistance or susceptibility of bacteria with 85-90\% accuracy, with a 30-50\% reduction in overall isolate length, …


Causal Inference And Prediction On Observational Data With Survival Outcomes, Xiaofei Chen Jul 2020

Causal Inference And Prediction On Observational Data With Survival Outcomes, Xiaofei Chen

Statistical Science Theses and Dissertations

Infants with hypoplastic left heart syndrome require an initial Norwood operation, followed some months later by a stage 2 palliation (S2P). The timing of S2P is critical for the operation’s success and the infant’s survival, but the optimal timing, if one exists, is unknown. We attempt to estimate the optimal timing of S2P by analyzing data from the Single Ventricle Reconstruction Trial (SVRT), which randomized patients between two different types of Norwood procedure. In the SVRT, the timing of the S2P was chosen by the medical team; thus with respect to this exposure, the trial constitutes an observational study, and …


Sensitivity Analysis For Incomplete Data And Causal Inference, Heng Chen May 2020

Sensitivity Analysis For Incomplete Data And Causal Inference, Heng Chen

Statistical Science Theses and Dissertations

In this dissertation, we explore sensitivity analyses under three different types of incomplete data problems, including missing outcomes, missing outcomes and missing predictors, potential outcomes in \emph{Rubin causal model (RCM)}. The first sensitivity analysis is conducted for the \emph{missing completely at random (MCAR)} assumption in frequentist inference; the second one is conducted for the \emph{missing at random (MAR)} assumption in likelihood inference; the third one is conducted for one novel assumption, the ``sixth assumption'' proposed for the robustness of instrumental variable estimand in causal inference.


Inference Of Heterogeneity In Meta-Analysis Of Rare Binary Events And Rss-Structured Cluster Randomized Studies, Chiyu Zhang Dec 2019

Inference Of Heterogeneity In Meta-Analysis Of Rare Binary Events And Rss-Structured Cluster Randomized Studies, Chiyu Zhang

Statistical Science Theses and Dissertations

This dissertation contains two topics: (1) A Comparative Study of Statistical Methods for Quantifying and Testing Between-study Heterogeneity in Meta-analysis with Focus on Rare Binary Events; (2) Estimation of Variances in Cluster Randomized Designs Using Ranked Set Sampling.

Meta-analysis, the statistical procedure for combining results from multiple studies, has been widely used in medical research to evaluate intervention efficacy and safety. In many practical situations, the variation of treatment effects among the collected studies, often measured by the heterogeneity parameter, may exist and can greatly affect the inference about effect sizes. Comparative studies have been done for only one or …


Sample Size Calculation Of Clinical Trials With Correlated Outcomes, Dateng Li Aug 2019

Sample Size Calculation Of Clinical Trials With Correlated Outcomes, Dateng Li

Statistical Science Theses and Dissertations

In this thesis, we investigate sample size calculation for three kinds of clinical trials: (1). Randomized controlled trials (RCTs) with longitudinal count outcomes; (2). Cluster randomized trials (CRTs) with count outcomes; (3). CRTs with multiple binary co-primary endpoints.


Robust And Adaptive Design Approaches For Stepped Wedge Cluster Randomized Trials, Jijia Wang Jan 2019

Robust And Adaptive Design Approaches For Stepped Wedge Cluster Randomized Trials, Jijia Wang

Statistical Science Theses and Dissertations

The stepped wedge (SW) cluster randomized design has been increasingly employed by pragmatic trials in health services research. In this study, based on the GEE approach, I present a closed-form sample size that is applicable to both closed-cohort and cross-sectional SW trials with outcomes from the exponential family. On the other hand, I proposed a Bayesian adaptive design for cross-sectional SW cluster randomized trials. It is more adaptable than traditional designs because it allows early termination of the trial when interim data indicate that the intervention is sufficient efficacious or inefficacious. A decision to terminate or continue the trial will …


Association Tests For Genetic Effect And Its Interaction With Environmental Factors, Zhengyang Zhou Jul 2018

Association Tests For Genetic Effect And Its Interaction With Environmental Factors, Zhengyang Zhou

Statistical Science Theses and Dissertations

My research is in the area of statistical genetics, and it contains three projects: (1) Differentiating the Cochran-Armitage (CA) trend test and Pearson’s chi-square test: location and dispersion; (2) Decomposing Pearson’s chi-square test: a linear regression and its departure from linearity; (3) Testing nonlinear gene-environment (GxE) interaction through varying coefficient and linear mixed models.

(1) In genetic case-control association studies, a standard practice is to perform the CA trend test with 1 degree-of-freedom (df) under the assumption of an additive model. However, when the true genetic model is recessive or near recessive, it is outperformed by Pearson’s chi-square test with …


Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia Apr 2018

Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia

Statistical Science Theses and Dissertations

This research contains two topics: (1) PBNPA: a permutation-based non-parametric analysis of CRISPR screen data; (2) RCRnorm: an integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data from FFPE samples.

Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single spe- cific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level …