Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 31 - 60 of 138
Full-Text Articles in Physical Sciences and Mathematics
Multivariate Joint Models And Dynamic Predictions, Md Akhtar Hossain
Theses and Dissertations
The joint modeling of longitudinal and time-to-event data is an active area of statistical research that has received a lot of attention. The standard joint models, referred to as univariate joint models, allow simultaneous modeling of a single longitudinal outcome and a single time-to-event under an assumption of independent censoring. The majority of the joint modeling research in the last two decades has focused on extending and improving the univariate joint models. While many of the practical applications involve data on multivariate longitudinal outcomes and multiple time-to-events possibly informatively censored by some other terminal time-to-event, the developments of joint …
The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling
Theses and Dissertations
Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.
Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …
Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero
Theses and Dissertations
Compositional data (CD) is mostly analyzed as relative data, using ratios of components and log-ratio transformations so that known multivariable statistical methods can be applied. Therefore, CD in which some components equal zero represents a problem. Furthermore, when the data is measured longitudinally, observations are spatially related and appear to come from a mixture population, the analysis becomes highly complex. To address this, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might a two component mixture …
Extension Of Risk-Based Measure Of Time-Varying Prognostic Discrimination For Survival Models, Shujie Chen
Theses and Dissertations
The Cox proportional hazards (PH) model and time dependent PH model are the most popular survival models in survival analysis. The hazard discrimination summary HDS(t) proposed by Liang and Heagerty [2017] is used to evaluate the mean hazard difference between cases and controls at time t. Liang and Heagerty [2017] evaluated the discrimination performance under the PH model and time dependent PH model with right censoring.
In this thesis, first, we further investigate their method via comprehensive simulations including 1) We extend the simulation in Liang and Heagerty [2017] under the PH model by adding more scenarios such as different …
Randomization Analysis Driven Software, Steph-Yves Louis
Theses and Dissertations
In clinical trials, the choice of a randomization method frequently comes down to using Simple Randomization. Even though the latter method provides favorable characteristics, if the collected sample is not large enough, it still presents the highest chance of imbalance both marginally in the treatment groups and locally in terms of the covariates. Methods of Permuted Block Randomization, Urn Randomization, Stratified Permuted Block Randomization, and Minimization represent popular alternative methods that one should consider depending on the goal of the study. A comparison of the previously mentioned methods is carried out to evaluate their performance with samples that are not …
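To make the contrast between Simple Randomization and Permuted Block Randomization concrete, here is a minimal sketch for a two-arm trial. It is an illustration only and is not taken from the thesis; the function names, the arm labels `A` and `B`, the block size, and the seed are all assumptions chosen for the example.

```python
import random

def simple_randomization(n, rng):
    """Assign each of n participants to arm 'A' or 'B' by an independent fair coin flip."""
    return [rng.choice("AB") for _ in range(n)]

def permuted_block_randomization(n, block_size, rng):
    """Assign n participants using shuffled blocks that each hold equal numbers of 'A' and 'B'."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equally sized arms")
    assignments = []
    while len(assignments) < n:
        # Each block is balanced before shuffling, so imbalance can only
        # arise from a final, partially used block.
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n]

def imbalance(assignments):
    """Absolute difference in arm sizes (marginal imbalance)."""
    return abs(assignments.count("A") - assignments.count("B"))

rng = random.Random(2024)  # hypothetical seed, fixed only for reproducibility
print("simple imbalance: ", imbalance(simple_randomization(20, rng)))
print("blocked imbalance:", imbalance(permuted_block_randomization(20, 4, rng)))
```

With a sample size that is a multiple of the block size, the blocked scheme is exactly balanced, while the coin-flip scheme can drift, which is the small-sample concern the abstract raises. Covariate-level (local) imbalance is not addressed by this sketch.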
Site- And Location-Adjusted Approaches To Adaptive Allocation Clinical Trial Designs, Brian S. Di Pace
Theses and Dissertations
Response-Adaptive (RA) designs are used to adaptively allocate patients in clinical trials. These methods have been generalized to include Covariate-Adjusted Response-Adaptive (CARA) designs, which adjust treatment assignments for a set of covariates while maintaining features of the RA designs. Challenges may arise in multi-center trials if differential treatment responses and/or effects among sites exist. We propose Site-Adjusted Response-Adaptive (SARA) approaches to account for inter-center variability in treatment response and/or effectiveness, including either a fixed site effect or both random site and treatment-by-site interaction effects to calculate conditional probabilities. These success probabilities are used to update assignment probabilities for allocating patients …
Methods For Evaluating Dropout Attrition In Survey Data, Camille J. Hochheimer
Theses and Dissertations
As researchers increasingly use web-based surveys, the ease of dropping out in the online setting is a growing issue in ensuring data quality. One theory is that dropout or attrition occurs in phases that can be generalized to phases of high dropout and phases of stable use. In order to detect these phases, several methods are explored. First, existing methods and user-specified thresholds are applied to survey data where significant changes in the dropout rate between two questions are interpreted as the start or end of a high dropout phase. Next, survey dropout is considered as a time-to-event outcome and …
Assessing The Impact Of Incorporating Residential Histories Into The Spatial Analysis Of Cancer Risk, Anny-Claude Joseph
Theses and Dissertations
In many spatial epidemiologic studies, investigators use residential location at diagnosis as a surrogate for unknown environmental exposures or as a geographic basis for assigning measured exposures. Inherently, they make assumptions about the timing and location of pertinent exposures which may prove problematic when studying long latency diseases such as cancer.
In this work we explored how the association between environmental exposures and disease risk for long-latency health outcomes like cancer is affected by residential mobility. We used simulation studies conditioned on real data to evaluate the extent to which the commonly held assumption of no residential mobility 1) affected …
Methods For Joint Normalization And Comparison Of Hi-C Data, John C. Stansfield
Theses and Dissertations
The development of chromatin conformation capture technology has opened new avenues of study into the 3D structure and function of the genome. Chromatin structure is known to influence gene regulation, and differences in structure are now emerging as a mechanism of regulation between, e.g., cell differentiation and disease vs. normal states. Hi-C sequencing technology now provides a way to study the 3D interactions of the chromatin over the whole genome. However, like all sequencing technologies, Hi-C suffers from several forms of bias stemming from both the technology and the DNA sequence itself. Several normalization methods have been developed for normalizing …
Genome-Wide Systems Genetics Of Alcohol Consumption And Dependence, Kristin Mignogna
Theses and Dissertations
Widely effective treatment for alcohol use disorder is not yet available, because the exact biological mechanisms that underlie this disorder are not completely understood. One way to gain a better understanding of these mechanisms is to examine the genetic frameworks that contribute to the risk for developing this disorder. This dissertation examines genetic association data in combination with gene expression networks in the brain to identify functional groups of genes associated with alcohol consumption and dependence.
The first study took advantage of the behavioral complexity of human samples, and experimental capabilities provided by mouse models, by co-analyzing gene expression networks …
Bayesian Nonparametric Analysis Of Longitudinal Data With Non-Ignorable Non-Monotone Missingness, Yu Cao
Theses and Dissertations
In longitudinal studies, outcomes are measured repeatedly over time, but in reality clinical studies are full of missing data points of monotone and non-monotone nature. Often this missingness is related to the unobserved data so that it is non-ignorable. In such a context, the pattern-mixture model (PMM) is one popular tool to analyze the joint distribution of outcome and missingness patterns. Then the unobserved outcomes are imputed using the distribution of observed outcomes, conditioned on missing patterns. However, the existing methods suffer from model identification issues if data is sparse in specific missing patterns, which is very likely to happen with a …
Spectral Methods For The Detection And Characterization Of Topologically Associated Domains, Kellen Garrison Cresswell
Theses and Dissertations
The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops which is relatively stable across cell-lines and even across species. These TADs dynamically reorganize during development of disease, and exhibit cell- and condition-specific differences. Identifying such hierarchical structures and how they change between conditions is a critical step in understanding genome regulation and disease development. Despite their importance, there are relatively few tools for identification of TADs and even fewer for …
Clustering Biological Data With Self-Adjusting High-Dimensional Sieve, Josselyn Gonzalez
Theses and Dissertations
Data classification as a preprocessing technique is a crucial step in the analysis and understanding of numerical data. Cluster analysis, in particular, provides insight into the inherent patterns found in data which makes the interpretation of any follow-up analyses more meaningful. A clustering algorithm groups together data points according to a predefined similarity criterion. This allows the data set to be broken up into segments which, in turn, gives way for a more targeted statistical analysis. Cluster analysis has applications in numerous fields of study and, as a result, countless algorithms have been developed. However, the quantity of options makes …
Adjusting For Mis-Reporting In Count Data, Gelareh Rahimighazikalayeh
Theses and Dissertations
Any counting system is prone to recording errors including underreporting and overreporting. Ignoring the misreporting pattern in count data can give rise to bias in the estimation of model parameters. Accordingly, Poisson, negative binomial and generalized Poisson regression have been expanded in some instances to capture reporting biases. However, to our knowledge, no program has been developed to allow users to apply all of these models when needed. In the first part of the dissertation, we review the available models for underreported counts and develop a Stata command to estimate Poisson, negative binomial and generalized Poisson regression models for underreported …
Comparison Of The Performance Of Simple Linear Regression And Quantile Regression With Non-Normal Data: A Simulation Study, Marjorie Howard
Theses and Dissertations
Linear regression is a widely used method for analysis that is well understood across a wide variety of disciplines. In order to use linear regression, a number of assumptions must be met. These assumptions, specifically normality and homoscedasticity of the error distribution, can at best be met only approximately with real data. Quantile regression requires fewer assumptions, which offers a potential advantage over linear regression. In this simulation study, we compare the performance of linear (least squares) regression to quantile regression when these assumptions are violated, in order to investigate under what conditions quantile regression becomes the more advantageous method …
Estimation Procedures For Complex Survival Models And Their Applications In Epidemiology Studies, Jie Zhou
Theses and Dissertations
In this dissertation, we aim to address three important questions in practice, which can be solved through complex survival models. The first project focuses on studying the longitudinal fitness effect on cardiovascular disease (CVD) mortality. In the second project, we study the disease-death relation between CVD and all-cause mortality and evaluate important covariate effects on the disease or death transitions. In the third project, we compare antiretroviral treatment (ART) for HIV patients and consider both treatment effect and side effect of the drugs. The first two projects are motivated by the Aerobics Center Longitudinal Study (ACLS) datasets and the third …
Examining The Confirmatory Tetrad Analysis (CTA) As A Solution Of The Inadequacy Of Traditional Structural Equation Modeling (SEM) Fit Indices, Hangcheng Liu
Theses and Dissertations
Structural Equation Modeling (SEM) is a framework of statistical methods that allows us to represent complex relationships between variables. SEM is widely used in economics, genetics and the behavioral sciences (e.g. psychology, psychobiology, sociology and medicine). Model complexity is defined as a model’s ability to fit different data patterns and it plays an important role in model selection when applying SEM. As in linear regression, the number of free model parameters is typically used in traditional SEM model fit indices as a measure of the model complexity. However, using only the number of free model parameters to indicate SEM model complexity …
Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry
Theses and Dissertations
The Brisbane Longitudinal Twin Study (BLTS) was being conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and …
Estimating The Respiratory Lung Motion Model Using Tensor Decomposition On Displacement Vector Field, Kingston Kang
Theses and Dissertations
Modern big data often emerge as tensors. Standard statistical methods are inadequate to deal with datasets of large volume, high dimensionality, and complex structure. Therefore, it is important to develop algorithms such as low-rank tensor decomposition for data compression, dimensionality reduction, and approximation.
With the advancement in technology, high-dimensional images are becoming ubiquitous in the medical field. In lung radiation therapy, the respiratory motion of the lung introduces variabilities during treatment as the tumor inside the lung is moving, which brings challenges to the precise delivery of radiation to the tumor. Several approaches to quantifying this uncertainty propose using a …
Weighted Quantile Sum Regression For Analyzing Correlated Predictors Acting Through A Mediation Pathway On A Biological Outcome, Bhanu M. Evani
Theses and Dissertations
A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Virginia Commonwealth University, 2017. Major Director: Robert A. Perera, Asst. Professor, Department of Biostatistics.
This work examines mediated effects of a set of correlated predictors using the recently developed Weighted Quantile Sum (WQS) regression method. Traditionally, mediation analysis has been conducted using the multiple regression method, first proposed by Baron and Kenny (1986), which has since …
Comparing The Structural Components Variance Estimator And U-Statistics Variance Estimator When Assessing The Difference Between Correlated AUCs With Finite Samples, Anna L. Bosse
Theses and Dissertations
Introduction: The structural components variance estimator proposed by DeLong et al. (1988) is a popular approach used when comparing two correlated AUCs. However, this variance estimator is biased and could be problematic with small sample sizes.
Methods: A U-statistics based variance estimator approach is presented and compared with the structural components variance estimator through a large-scale simulation study under different finite-sample size configurations.
Results: The U-statistics variance estimator was unbiased for the true variance of the difference between correlated AUCs regardless of the sample size and had lower RMSE than the structural components variance estimator, providing better type 1 error …
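As background for the AUC comparison described above, the empirical AUC can itself be written as a two-sample U-statistic: the proportion of (case, control) pairs in which the case scores higher, with ties counted as one half. The sketch below computes only that point estimate. It does not implement the structural components estimator of DeLong et al. or the U-statistics variance estimator studied in the thesis, and the function name and example scores are assumptions made for illustration.

```python
def empirical_auc(case_scores, control_scores):
    """Empirical AUC: fraction of (case, control) pairs ranked correctly, ties counted as 1/2."""
    total = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                total += 1.0   # case correctly ranked above control
            elif x == y:
                total += 0.5   # tie contributes half a concordant pair
    return total / (len(case_scores) * len(control_scores))

# Hypothetical marker values, invented for this example.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.4]
print(empirical_auc(cases, controls))
```

Two markers measured on the same subjects give two such estimates that share data, which is why their difference needs a variance estimator that accounts for the correlation.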
Evaluation Of Goodness-Of-Fit Tests For The Cox Proportional Hazards Model With Time-Varying Covariates, Shanshan Hong
Theses and Dissertations
The proportional hazards (PH) model, proposed by Cox (1972), is one of the most popular survival models for analyzing time-to-event data. To use the PH model properly, one must examine whether the data satisfy the PH assumption. An alternative model should be suggested if the PH assumption is invalid. The main purpose of this thesis is to examine the performance of five existing methods for assessing the PH assumption. Through extensive simulations, the powers of five different existing methods are compared; these methods include the likelihood ratio test, the Schoenfeld residuals test, the scaled Schoenfeld residuals test, Lin et al. …
Marginal Structural Cox Model For Survival Data With Treatment-Confounder Feedback, Yanan Zhang
Theses and Dissertations
In an observational longitudinal study, there can be time-varying exposure/treatment and time-varying confounders. When the confounders affect the exposure and prior exposure also has an impact on levels of confounders, there is treatment-confounder feedback. To admit estimation of unbiased causal effects, three conditions need to hold: exchangeability, positivity, and consistency. The traditional method of conditioning on potential confounders does not meet these 3 conditions. Therefore, parameter estimates from the traditional Cox model are biased causal effect estimates when treatment-confounder feedback exists. The marginal structural Cox model can be used to address this issue. By calculating and including inverse …
Longitudinal And Geographical Modeling Of Circular Data With An Application To Sudden Infant Death Syndrome, Xinyan Cai
Theses and Dissertations
The aim of this thesis is to study seasonality of death in U.S. infants who died from SIDS. We also propose to investigate secular trends and geographical variation in seasonal patterns of mortality. Circular statistics are used to describe the seasonality of the month of death in infants who died from SIDS in 1990, 2000 and 2010. The secular trends of seasonal patterns of SIDS mortality are investigated using a circular linear regression model after adjusting for potential confounders. The geographical variation in seasonal patterns of SIDS mortality is explored from the U.S. map and quantified by …
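To illustrate why month of death calls for circular rather than ordinary statistics, the sketch below maps months to angles on a circle and computes the mean direction and the mean resultant length. This is a generic textbook calculation, not the thesis's model. The function name, the convention that month 1 sits at angle 0, and the example data are assumptions made for illustration.

```python
import math

def circular_summary(months):
    """Return (mean_month, resultant_length) for month numbers 1..12 treated as angles."""
    # Assumed convention: month m maps to angle 2*pi*(m - 1)/12, so January is at angle 0.
    angles = [2 * math.pi * (m - 1) / 12 for m in months]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    resultant_length = math.hypot(c, s)          # near 1: concentrated; near 0: no clear season
    mean_angle = math.atan2(s, c) % (2 * math.pi)
    mean_month = mean_angle * 12 / (2 * math.pi) + 1
    return mean_month, resultant_length

# Hypothetical winter-clustered months (November, December, January).
mean_month, r = circular_summary([11, 12, 1])
print(round(mean_month, 3), round(r, 3))
```

For these three months the arithmetic mean is 8 (August), which is on the wrong side of the year, whereas the circular mean falls in December. A resultant length close to 0, as for months spread evenly over the year, would indicate no seasonal concentration.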
Statistical Methods For Multivariate And Correlated Data, Xinling Xu
Theses and Dissertations
A commonly encountered data type in real life is count data, especially in self-reported behavioral studies. One issue with self-reported count data is inaccuracy. In the first part of the dissertation, we address one specific type of inaccuracy in bivariate count data: heaping. Copula functions are used for the formulation of the bivariate distribution. Using copula functions for solving data inaccuracy problems is still a new area, which we explore in this dissertation.
We also discuss the methods for variable selection when the explanatory variables are highly correlated. In particular, our method is …
The Generalized Monotone Incremental Forward Stagewise Method For Modeling Longitudinal, Clustered, And Overdispersed Count Data: Application Predicting Nuclear Bud And Micronuclei Frequencies, Rebecca Lehman
Theses and Dissertations
With the influx of high-dimensional data there is an immediate need for statistical methods that are able to handle situations when the number of predictors greatly exceeds the number of samples. One such area of growth is in examining how environmental exposures to toxins impact the body long term. The cytokinesis-block micronucleus assay can measure the genotoxic effect of exposure as a count outcome. To investigate potential biomarkers, high-throughput assays that assess gene expression and methylation have been developed. It is of interest to identify biomarkers or molecular features that are associated with elevated micronuclei (MN) or nuclear bud (Nbud) …
Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour
Theses and Dissertations
Survivability of cancer strongly depends on the stage of cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, together on all stages and separately for every stage. We named these models joint and stage-specific models respectively. The obtained results for the cancers which we investigated reveal that the best model to predict the survivability of the cancer for one specific stage is the model which is specifically built for that …
Novel Methods For Analyzing Longitudinal Data With Measurement Error In The Time Variable, Caroline Munindi Mulatya
Theses and Dissertations
In some longitudinal studies, the observed time points are often confounded with measurement error due to the sampling conditions, resulting in data with measurement error in the time variable. This type of data occurs mainly in observational studies when the onset of a longitudinal process is unknown or in clinical trials when individual visits do not take place as specified by the study protocol, but are often rounded to coincide with the study protocol. Methodological and inferential implications of error in time varying covariates for both linear and nonlinear models have been studied widely. In this dissertation, we shift attention …
Detecting Association Of Gene-Environment Interactions In Common And Rare Variants For Hypertension, Miguelangel Diaz Medina
Theses and Dissertations
Subsequent malignant neoplasms (SMNs) or secondary cancers are one of the most negative effects resulting from cancer treatment such as chemotherapy or radiation. Given the severity and high incidence of mortality faced by cancer survivors, it is critical that we understand the cause of SMNs so that preventive measures or intervention can be done for individuals facing a higher risk of SMN incidence. The purpose of this thesis is to test the efficacy of newly developed statistical methods used to identify gene-environment interactions that are associated with a specific disease, in this case, SMNs, considering both common and rare variants, …
Are We Missing The Forest For The Trees? Quantifying The Maintenance Of Diversity In Temperate Deciduous Forests, Kathryn Barry
Theses and Dissertations
One of the most pressing questions of community ecology is: Why do we have so many species? Over 100 hypotheses have been proposed to answer this question for woody plants over the past 70 years, yet there remains no consensus among community ecologists. In this dissertation, I explore the evidence supporting several different hypotheses (Chapter 1). I provide evidence that negative density dependence, where individuals perform poorly near members of their own species, may only be relevant for canopy tree species (Chapter 2). Understory species do not demonstrate negative density dependence while canopy trees demonstrate negative density dependence that increases …