Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons



Full-Text Articles in Physical Sciences and Mathematics

Multivariate Joint Models And Dynamic Predictions, Md Akhtar Hossain Apr 2020


Theses and Dissertations

The joint modeling of longitudinal and time-to-event data is an active area of statistical research that has received a lot of attention. The standard joint models, referred to as univariate joint models, allow simultaneous modeling of a single longitudinal outcome and a single time-to-event under an assumption of independent censoring. The majority of the joint modeling research in the last two decades has focused on extending and improving the univariate joint models. While many of the practical applications involve data on multivariate longitudinal outcomes and multiple time-to-events possibly informatively censored by some other terminal time-to-event, the developments of joint …


The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling Jan 2020


Theses and Dissertations

Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.

Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …


Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero Jan 2020


Theses and Dissertations

Compositional data (CD) is mostly analyzed as relative data, using ratios of components and log-ratio transformations so that known multivariable statistical methods can be applied. Therefore, CD in which some components equal zero represents a problem. Furthermore, when the data is measured longitudinally, observations are spatially related and appear to come from a mixture population, the analysis becomes highly complex. For this matter, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might a two component mixture …


Extension Of Risk-Based Measure Of Time-Varying Prognostic Discrimination For Survival Models, Shujie Chen Jul 2019


Theses and Dissertations

The Cox proportional hazards (PH) model and time dependent PH model are the most popular survival models in survival analysis. The hazard discrimination summary HDS(t) proposed by Liang and Heagerty [2017] is used to evaluate the mean hazard difference between cases and controls at time t. Liang and Heagerty [2017] evaluated the discrimination performance under the PH model and time dependent PH model with right censoring.

In this thesis, first, we further investigate their method via comprehensive simulations including 1) We extend the simulation in Liang and Heagerty [2017] under the PH model by adding more scenarios such as different …


Randomization Analysis Driven Software, Steph-Yves Louis Apr 2019


Theses and Dissertations

The application of a method of randomization for a clinical trial frequently reduces to using Simple Randomization. Even though the latter method provides favorable characteristics, if the collected sample is not large enough, it still presents the highest chance of imbalance both marginally in the treatment groups and locally in terms of the covariates. Methods of Permuted Block Randomization, Urn Randomization, Stratified Permuted Block Randomization, and Minimization represent popular alternative methods that one should consider depending on the goal of the study. A comparison of the previously mentioned methods is carried out to evaluate their performance with samples that are not …
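
The imbalance issue described in this abstract can be illustrated with a minimal sketch. It is not the software developed in the thesis; the function names, the block size of 4, the sample size of 30, and the seed are all assumptions chosen for illustration. It compares the worst-case difference in arm sizes under simple randomization and under permuted block randomization.

```python
import random

def simple_randomization(n, rng):
    """Assign each of n participants to arm 'A' or 'B' by an independent fair coin flip."""
    return [rng.choice("AB") for _ in range(n)]

def permuted_block_randomization(n, rng, block_size=4):
    """Assign participants in shuffled blocks, each holding equal numbers of 'A' and 'B'."""
    assert block_size % 2 == 0, "block size must be even for two equal arms"
    assignments = []
    while len(assignments) < n:
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n]  # the last block may be cut short

def imbalance(assignments):
    """Absolute difference between the two arm sizes."""
    return abs(assignments.count("A") - assignments.count("B"))

rng = random.Random(2019)  # hypothetical seed, for reproducibility only
n = 30                     # hypothetical small trial
simple_max = max(imbalance(simple_randomization(n, rng)) for _ in range(1000))
block_max = max(imbalance(permuted_block_randomization(n, rng)) for _ in range(1000))
print("worst imbalance, simple:", simple_max)
print("worst imbalance, permuted block:", block_max)
```

With two arms, the permuted block scheme can never be out of balance by more than half a block, whereas simple randomization has no such bound in small samples.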


Site- And Location-Adjusted Approaches To Adaptive Allocation Clinical Trial Designs, Brian S. Di Pace Jan 2019


Theses and Dissertations

Response-Adaptive (RA) designs are used to adaptively allocate patients in clinical trials. These methods have been generalized to include Covariate-Adjusted Response-Adaptive (CARA) designs, which adjust treatment assignments for a set of covariates while maintaining features of the RA designs. Challenges may arise in multi-center trials if differential treatment responses and/or effects among sites exist. We propose Site-Adjusted Response-Adaptive (SARA) approaches to account for inter-center variability in treatment response and/or effectiveness, including either a fixed site effect or both random site and treatment-by-site interaction effects to calculate conditional probabilities. These success probabilities are used to update assignment probabilities for allocating patients …


Methods For Evaluating Dropout Attrition In Survey Data, Camille J. Hochheimer Jan 2019


Theses and Dissertations

As researchers increasingly use web-based surveys, the ease of dropping out in the online setting is a growing issue in ensuring data quality. One theory is that dropout or attrition occurs in phases that can be generalized to phases of high dropout and phases of stable use. In order to detect these phases, several methods are explored. First, existing methods and user-specified thresholds are applied to survey data, where significant changes in the dropout rate between two questions are interpreted as the start or end of a high dropout phase. Next, survey dropout is considered as a time-to-event outcome and …


Assessing The Impact Of Incorporating Residential Histories Into The Spatial Analysis Of Cancer Risk, Anny-Claude Joseph Jan 2019


Theses and Dissertations

In many spatial epidemiologic studies, investigators use residential location at diagnosis as a surrogate for unknown environmental exposures or as a geographic basis for assigning measured exposures. Inherently, they make assumptions about the timing and location of pertinent exposures which may prove problematic when studying long latency diseases such as cancer.

In this work we explored how the association between environmental exposures and disease risk for long-latency health outcomes like cancer is affected by residential mobility. We used simulation studies conditioned on real data to evaluate the extent to which the commonly held assumption of no residential mobility 1) affected …


Methods For Joint Normalization And Comparison Of Hi-C Data, John C. Stansfield Jan 2019


Theses and Dissertations

The development of chromatin conformation capture technology has opened new avenues of study into the 3D structure and function of the genome. Chromatin structure is known to influence gene regulation, and differences in structure are now emerging as a mechanism of regulation between, e.g., cell differentiation and disease vs. normal states. Hi-C sequencing technology now provides a way to study the 3D interactions of the chromatin over the whole genome. However, like all sequencing technologies, Hi-C suffers from several forms of bias stemming from both the technology and the DNA sequence itself. Several normalization methods have been developed for normalizing …


Genome-Wide Systems Genetics Of Alcohol Consumption And Dependence, Kristin Mignogna Jan 2019


Theses and Dissertations

Widely effective treatment for alcohol use disorder is not yet available, because the exact biological mechanisms that underlie this disorder are not completely understood. One way to gain a better understanding of these mechanisms is to examine the genetic frameworks that contribute to the risk for developing this disorder. This dissertation examines genetic association data in combination with gene expression networks in the brain to identify functional groups of genes associated with alcohol consumption and dependence.

The first study took advantage of the behavioral complexity of human samples, and experimental capabilities provided by mouse models, by co-analyzing gene expression networks …


Bayesian Nonparametric Analysis Of Longitudinal Data With Non-Ignorable Non-Monotone Missingness, Yu Cao Jan 2019


Theses and Dissertations

In longitudinal studies, outcomes are measured repeatedly over time, but in reality clinical studies are full of missing data points of monotone and non-monotone nature. Often this missingness is related to the unobserved data, so that it is non-ignorable. In such contexts, the pattern-mixture model (PMM) is a popular tool for analyzing the joint distribution of outcome and missingness patterns. The unobserved outcomes are then imputed using the distribution of observed outcomes, conditioned on missing patterns. However, the existing methods suffer from model identification issues if data is sparse in specific missing patterns, which is very likely to happen with a …


Spectral Methods For The Detection And Characterization Of Topologically Associated Domains, Kellen Garrison Cresswell Jan 2019


Theses and Dissertations

The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops which is relatively stable across cell-lines and even across species. These TADs dynamically reorganize during development of disease, and exhibit cell- and condition-specific differences. Identifying such hierarchical structures and how they change between conditions is a critical step in understanding genome regulation and disease development. Despite their importance, there are relatively few tools for identification of TADs and even fewer for …


Clustering Biological Data With Self-Adjusting High-Dimensional Sieve, Josselyn Gonzalez Apr 2018


Theses and Dissertations

Data classification as a preprocessing technique is a crucial step in the analysis and understanding of numerical data. Cluster analysis, in particular, provides insight into the inherent patterns found in data which makes the interpretation of any follow-up analyses more meaningful. A clustering algorithm groups together data points according to a predefined similarity criterion. This allows the data set to be broken up into segments which, in turn, gives way for a more targeted statistical analysis. Cluster analysis has applications in numerous fields of study and, as a result, countless algorithms have been developed. However, the quantity of options makes …


Adjusting For Mis-Reporting In Count Data, Gelareh Rahimighazikalayeh Jan 2018


Theses and Dissertations

Any counting system is prone to recording errors including underreporting and overreporting. Ignoring the misreporting pattern in count data can give rise to bias in the estimation of model parameters. Accordingly, Poisson, negative binomial and generalized Poisson regression have been expanded in some instances to capture reporting biases. However, to our knowledge, no program has been developed to allow users to apply all of these models when needed. In the first part of the dissertation, we review the available models for underreported counts and develop a Stata command to estimate Poisson, negative binomial and generalized Poisson regression models for underreported …


Comparison Of The Performance Of Simple Linear Regression And Quantile Regression With Non-Normal Data: A Simulation Study, Marjorie Howard Jan 2018


Theses and Dissertations

Linear regression is a widely used method for analysis that is well understood across a wide variety of disciplines. In order to use linear regression, a number of assumptions must be met. These assumptions, specifically normality and homoscedasticity of the error distribution, can at best be met only approximately with real data. Quantile regression requires fewer assumptions, which offers a potential advantage over linear regression. In this simulation study, we compare the performance of linear (least squares) regression to quantile regression when these assumptions are violated, in order to investigate under what conditions quantile regression becomes the more advantageous method …


Estimation Procedures For Complex Survival Models And Their Applications In Epidemiology Studies, Jie Zhou Jan 2018


Theses and Dissertations

In this dissertation, we aim to address three important questions in practice, which can be solved through complex survival models. The first project focuses on studying the longitudinal fitness effect on cardiovascular disease (CVD) mortality. In the second project, we study the disease-death relation between CVD and all-cause mortality and evaluate important covariate effects on the disease or death transitions. In the third project, we compare antiretroviral treatment (ART) for HIV patients and consider both treatment effect and side effect of the drugs. The first two projects are motivated by the Aerobics Center Longitudinal Study (ACLS) datasets and the third …


Examining The Confirmatory Tetrad Analysis (Cta) As A Solution Of The Inadequacy Of Traditional Structural Equation Modeling (Sem) Fit Indices, Hangcheng Liu Jan 2018


Theses and Dissertations

Structural Equation Modeling (SEM) is a framework of statistical methods that allows us to represent complex relationships between variables. SEM is widely used in economics, genetics and the behavioral sciences (e.g. psychology, psychobiology, sociology and medicine). Model complexity is defined as a model’s ability to fit different data patterns, and it plays an important role in model selection when applying SEM. As in linear regression, the number of free model parameters is typically used in traditional SEM model fit indices as a measure of the model complexity. However, using only the number of free model parameters to indicate SEM model complexity …


Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry Jan 2018


Theses and Dissertations

The Brisbane Longitudinal Twin Study (BLTS) was conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and …


Estimating The Respiratory Lung Motion Model Using Tensor Decomposition On Displacement Vector Field, Kingston Kang Jan 2018


Theses and Dissertations

Modern big data often emerge as tensors. Standard statistical methods are inadequate to deal with datasets of large volume, high dimensionality, and complex structure. Therefore, it is important to develop algorithms such as low-rank tensor decomposition for data compression, dimensionality reduction, and approximation.

With the advancement in technology, high-dimensional images are becoming ubiquitous in the medical field. In lung radiation therapy, the respiratory motion of the lung introduces variabilities during treatment as the tumor inside the lung is moving, which brings challenges to the precise delivery of radiation to the tumor. Several approaches to quantifying this uncertainty propose using a …


Weighted Quantile Sum Regression For Analyzing Correlated Predictors Acting Through A Mediation Pathway On A Biological Outcome, Bhanu M. Evani Jan 2017


Theses and Dissertations

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Virginia Commonwealth University, 2017.

Major Director: Robert A. Perera, Asst. Professor, Department of Biostatistics

This work examines mediated effects of a set of correlated predictors using the recently developed Weighted Quantile Sum (WQS) regression method. Traditionally, mediation analysis has been conducted using the multiple regression method, first proposed by Baron and Kenny (1986), which has since …


Comparing The Structural Components Variance Estimator And U-Statistics Variance Estimator When Assessing The Difference Between Correlated Aucs With Finite Samples, Anna L. Bosse Jan 2017


Theses and Dissertations

Introduction: The structural components variance estimator proposed by DeLong et al. (1988) is a popular approach used when comparing two correlated AUCs. However, this variance estimator is biased and could be problematic with small sample sizes.

Methods: A U-statistics based variance estimator approach is presented and compared with the structural components variance estimator through a large-scale simulation study under different finite-sample size configurations.

Results: The U-statistics variance estimator was unbiased for the true variance of the difference between correlated AUCs regardless of the sample size and had lower RMSE than the structural components variance estimator, providing better type 1 error …
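
As background for the two estimators compared in this abstract, the sketch below computes the empirical AUC as a two-sample U-statistic together with the structural components variance of DeLong et al. (1988) for a single AUC. It is an illustration under stated assumptions, not the thesis's code: the function names and the toy scores are hypothetical, and the extension to the difference between two correlated AUCs (which additionally needs the covariance of the components across the two markers) is not shown.

```python
def psi(x, y):
    """Kernel: 1 if the case score exceeds the control score, 0.5 for a tie, else 0."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0

def auc_and_structural_variance(cases, controls):
    """Return the empirical AUC and its structural components variance estimate."""
    m, n = len(cases), len(controls)
    # Structural components: average kernel value for each case and each control.
    v10 = [sum(psi(x, y) for y in controls) / n for x in cases]
    v01 = [sum(psi(x, y) for x in cases) / m for y in controls]
    auc = sum(v10) / m
    # Sample variances of the components around the AUC.
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return auc, s10 / m + s01 / n

# Hypothetical marker scores for four cases and four controls.
cases = [0.9, 0.8, 0.6, 0.55]
controls = [0.7, 0.5, 0.4, 0.3]
auc, var = auc_and_structural_variance(cases, controls)
print(auc, var)
```

In this toy data set 14 of the 16 case-control pairs are correctly ordered, so the AUC is 0.875.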


Evaluation Of Goodness-Of-Fit Tests For The Cox Proportional Hazards Model With Time-Varying Covariates, Shanshan Hong Jan 2017


Theses and Dissertations

The proportional hazards (PH) model, proposed by Cox (1972), is one of the most popular survival models for analyzing time-to-event data. To use the PH model properly, one must examine whether the data satisfy the PH assumption. An alternative model should be suggested if the PH assumption is invalid. The main purpose of this thesis is to examine the performance of five existing methods for assessing the PH assumption. Through extensive simulations, the powers of five different existing methods are compared; these methods include the likelihood ratio test, the Schoenfeld residuals test, the scaled Schoenfeld residuals test, Lin et al. …


Marginal Structural Cox Model For Survival Data With Treatment-Confounder Feedback, Yanan Zhang Jan 2017


Theses and Dissertations

In an observational longitudinal study, there can be time-varying exposure/treatment and time-varying confounders. When the confounders affect the exposure and prior exposure also has an impact on levels of confounders, there is treatment-confounder feedback. To admit estimation of unbiased causal effects, three conditions need to hold: exchangeability, positivity, and consistency. The traditional method of conditioning on potential confounders does not meet these three conditions. Therefore, parameter estimates from the traditional Cox model are biased causal effect estimates when treatment-confounder feedback exists. The marginal structural Cox model can be used to address this issue. By calculating and including inverse …


Longitudinal And Geographical Modeling Of Circular Data With An Application To Sudden Infant Death Syndrome, Xinyan Cai Jan 2017


Theses and Dissertations

The aim of this thesis is to study seasonality of death in U.S. infants who died from SIDS. We also propose to investigate secular trends and geographical patterns of seasonal patterns of mortality. The application of circular statistics is used to describe the seasonality of the month of death in infants who died from SIDS in 1990, 2000 and 2010. The secular trends of seasonal patterns of SIDS mortality are investigated using a circular linear regression model after adjusting for potential confounders. The geographical variation in seasonal patterns of SIDS mortality is explored from the U.S. map and quantified by …
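
To illustrate why circular statistics suit month-of-death data, the sketch below computes a circular mean month and the mean resultant length. It is a minimal illustration, not the thesis's analysis: the mapping of each month to the angle at its midpoint, the function name, and the example months are assumptions.

```python
import math

def circular_summary(months):
    """Return (mean month, mean resultant length) for months coded 1..12."""
    # Map each month to the angle at its midpoint on a 12-month circle.
    angles = [2 * math.pi * (m - 0.5) / 12 for m in months]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    r = math.hypot(c, s)                       # 0 = no seasonality, 1 = all in one month
    mean_angle = math.atan2(s, c) % (2 * math.pi)
    mean_month = mean_angle * 12 / (2 * math.pi) + 0.5
    return mean_month, r

# Hypothetical winter deaths in December, January, and February.
mean_month, r = circular_summary([12, 1, 2])
print(mean_month, r)
```

The arithmetic mean of 12, 1, and 2 is 5 (May), which is misleading for winter months; the circular mean places the center in January. When deaths are spread evenly over all twelve months, the mean resultant length is zero and the mean direction is not meaningful.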


Statistical Methods For Multivariate And Correlated Data, Xinling Xu Jan 2017


Theses and Dissertations

A commonly encountered data type in real life is count data, especially in self-reported behavioral studies. One issue with self-reported count data is inaccuracy. In the first part of the dissertation, we address one specific type of inaccuracy in bivariate count data: heaping. Copula functions are used for the formulation of the bivariate distribution. Using copula functions for solving data inaccuracy problems is still a new area, which we explore in this dissertation.

We also discuss the methods for variable selection when the explanatory variables are highly correlated. In particular, our method is …


The Generalized Monotone Incremental Forward Stagewise Method For Modeling Longitudinal, Clustered, And Overdispersed Count Data: Application Predicting Nuclear Bud And Micronuclei Frequencies, Rebecca Lehman Jan 2017


Theses and Dissertations

With the influx of high-dimensional data there is an immediate need for statistical methods that are able to handle situations when the number of predictors greatly exceeds the number of samples. One such area of growth is in examining how environmental exposures to toxins impact the body long term. The cytokinesis-block micronucleus assay can measure the genotoxic effect of exposure as a count outcome. To investigate potential biomarkers, high-throughput assays that assess gene expression and methylation have been developed. It is of interest to identify biomarkers or molecular features that are associated with elevated micronuclei (MN) or nuclear bud (Nbud) …


Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour Dec 2016


Theses and Dissertations

Survivability of cancer strongly depends on the stage of cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, together on all stages and separately for every stage. We named these models joint and stage-specific models, respectively. The obtained results for the cancers which we investigated reveal that the best model to predict the survivability of the cancer for one specific stage is the model which is specifically built for that …


Novel Methods For Analyzing Longitudinal Data With Measurement Error In The Time Variable, Caroline Munindi Mulatya Jun 2016


Theses and Dissertations

In some longitudinal studies, the observed time points are often confounded with measurement error due to the sampling conditions, resulting in data with measurement error in the time variable. This type of data occurs mainly in observational studies when the onset of a longitudinal process is unknown or in clinical trials when individual visits do not take place as specified by the study protocol, but are often rounded to coincide with the study protocol. Methodological and inferential implications of error in time varying covariates for both linear and nonlinear models have been studied widely. In this dissertation, we shift attention …


Detecting Association Of Gene-Environment Interactions In Common And Rare Variants For Hypertension, Miguelangel Diaz Medina May 2016


Theses and Dissertations

Subsequent malignant neoplasms (SMNs) or secondary cancers are one of the most negative effects resulting from cancer treatment such as chemotherapy or radiation. Given the severity and high incidence of mortality faced by cancer survivors, it is critical that we understand the cause of SMNs so that preventive measures or intervention can be done for individuals facing a higher risk of SMN incidence. The purpose of this thesis is to test the efficacy of newly developed statistical methods used to identify gene-environment interactions that are associated with a specific disease, in this case, SMNs, considering both common and rare variants, …


Are We Missing The Forest For The Trees? Quantifying The Maintenance Of Diversity In Temperate Deciduous Forests, Kathryn Barry May 2016


Theses and Dissertations

One of the most pressing questions of community ecology is: Why do we have so many species? Over 100 hypotheses have been proposed to answer this question for woody plants over the past 70 years, yet there remains no consensus among community ecologists. In this dissertation, I explore the evidence supporting several different hypotheses (Chapter 1). I provide evidence that negative density dependence, where individuals perform poorly near members of their own species, may only be relevant for canopy tree species (Chapter 2). Understory species do not demonstrate negative density dependence while canopy trees demonstrate negative density dependence that increases …