Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Statistical Methods For Analyzing Multi-Omics Data: Dependence Structure And Missing Values, Wenda Zhang Jul 2022

Theses and Dissertations

Advancements in high-throughput technologies have made it possible to generate vast amounts of "omics" data, including genomics, proteomics, transcriptomics, epigenomics, metabolomics, and microbiomics. Combining multiple data sources and performing joint analyses with all available information and the phenotypic outcome can reflect various aspects of complex biological systems, such as revealing regulation processes, discovering novel associations between biological entities, and identifying relevant biomarkers for certain diseases or phenotypic outcomes. This dissertation focuses on developing statistical models for analyzing multi-omics data. It comprises three topics: (1) integrative analysis for multi-omics data with missing observations in intermediate variables; (2) …


Modified EM Algorithm In SMCURE Package Based On Proportional Hazards Mixture Cure Model With Offset Terms, Jiaying Yi Jul 2022

Theses and Dissertations

The mixture cure model is a useful survival-analysis method for populations that include both a cured proportion and an uncured proportion. The R package SMCURE applies the EM algorithm to estimate the coefficients of covariates in the mixture cure model. Although an offset term is specified in the SMCURE statement, the offset term is not appropriately handled in the algorithm. This thesis aims to adjust the EM algorithm for the proportional hazards mixture cure model in the SMCURE package. In addition, the offset term can be specified separately in the incidence part or the latency part. The numerical experiments include a simulation study and real …


Statistical Methods For Analyzing Dependence Structures With Applications In Single-Cell Experiments, Zhen Yang Jul 2022

Theses and Dissertations

This dissertation focuses on studying methods in dependence structure analysis. In particular, it consists of two topics: (1) modeling dynamic correlation in zero-inflated bivariate count data; and (2) gene co-expression latent factor analysis for cell-type clustering.

In Chapter 2, a zero-inflated negative binomial model for analyzing the dynamic correlation in zero-inflated bivariate count data is proposed. Interactions between biological molecules in a cell are tightly coordinated and often highly dynamic. As a result of these varying signaling activities, changes in gene co-expression patterns can often be observed. The advancements in next-generation sequencing technologies bring new statistical challenges for studying these …
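
For readers unfamiliar with the term, the following is a minimal sketch of a univariate zero-inflated negative binomial probability mass function. It is not the dissertation's bivariate dynamic-correlation model; the function names and the parameterization (mixing weight `pi`, size `r`, success probability `p`) are assumptions chosen for illustration only.

```python
import math


def nb_pmf(k: int, r: float, p: float) -> float:
    """Negative binomial pmf: probability of k failures before the r-th
    success, with success probability p in (0, 1) and size r > 0."""
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(p) + k * math.log1p(-p))


def zinb_pmf(k: int, pi: float, r: float, p: float) -> float:
    """Zero-inflated negative binomial pmf (illustrative parameterization).

    With probability pi the count is a structural zero; otherwise it is
    drawn from a negative binomial with parameters (r, p).
    """
    base = (1.0 - pi) * nb_pmf(k, r, p)
    # Zeros arise from both the structural-zero component and the NB component.
    return pi + base if k == 0 else base
```

The extra mass at zero is what lets such a model accommodate the excess zeros typical of single-cell count data, which a plain negative binomial would underestimate.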


Predicting Lower Body Soft Tissue Injuries In American Football With GPS Data, Nicholas Tice Apr 2022

Theses and Dissertations

It is of utmost importance to sports organizations that they keep their players as healthy as possible and contributing to the success of the team. Advancements in technology and investments by sports clubs have allowed researchers to better understand the role of load management in high-level athletes to mitigate injury risk. Through GPS tracking data provided by a collaborating Division I American college football team, we seek to predict lower body soft tissue injuries in future training sessions and reduce the number of potentially avoidable injuries within the organization. The difficulty of analyzing the injury data set is that the …


A Comparison Of Inference Methods In High-Dimensional Linear Regression, Imtiaz Ebna Mannan Apr 2022

Theses and Dissertations

Building confidence/credible intervals for high-dimensional (p >> n) linear models has been the subject of exploration for many years. In this paper, we explore three specific setups. First, we look at the Bayesian paradigm for the LASSO model. A double-exponential prior is applied to the regression coefficients, and from that a posterior distribution is derived to obtain the quantiles needed to calculate the credible intervals for the regression coefficients. Second, we explore the de-sparsified LASSO estimates and, using their asymptotic normality, calculate the confidence intervals for the model coefficients. Finally, we incorporate an adaptive LASSO model. To …
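
As a reference for the first two setups, the block below sketches their textbook formulations. The notation (λ, σ̂, Θ̂, Σ̂, and the level α) is assumed here and is not taken from the thesis, whose exact prior and estimator may differ.

```latex
% Double-exponential (Laplace) prior on each coefficient, rate \lambda > 0
\pi(\beta_j \mid \lambda) = \frac{\lambda}{2}\, e^{-\lambda |\beta_j|},
\qquad j = 1, \dots, p .

% Equal-tailed (1-\alpha) credible interval from posterior quantiles q_\gamma
\bigl[\, q_{\alpha/2}(\beta_j \mid y),\; q_{1-\alpha/2}(\beta_j \mid y) \,\bigr] .

% De-sparsified LASSO: bias-corrected estimate built from the LASSO fit \hat\beta,
% with \hat\Theta an approximate inverse of \hat\Sigma = X^\top X / n
\hat b = \hat\beta + \frac{1}{n}\, \hat\Theta X^\top \bigl( y - X \hat\beta \bigr) .

% Asymptotic-normality-based (1-\alpha) confidence interval for \beta_j
\hat b_j \;\pm\; z_{1-\alpha/2}\, \hat\sigma\,
\sqrt{ \frac{ \bigl( \hat\Theta \hat\Sigma \hat\Theta^\top \bigr)_{jj} }{ n } } .
```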