Open Access. Powered by Scholars. Published by Universities.®
- Institution
- University of Kentucky (3)
- Virginia Commonwealth University (3)
- Bucknell University (1)
- COBRA (1)
- James Madison University (1)
- Misericordia University (1)
- Purdue University (1)
- Southern Methodist University (1)
- University of Arkansas, Fayetteville (1)
- University of Louisville (1)
- University of Massachusetts Amherst (1)
- University of Nebraska Medical Center (1)
- University of Texas Rio Grande Valley (1)
- Washington University in St. Louis (1)
- Keyword
- Statistics (2)
- American ginseng (1)
- Average Causal Effect (1)
- Bayesian Adjustment for Confounding (1)
- Bayesian methods (1)
- Bayesian modeling (1)
- Bias correction (1)
- Biomarker (1)
- Biostatistics (1)
- COVID-19 (1)
- Canonical GLM (1)
- Compositional data (1)
- Computational neuroscience (1)
- Coronavirus (1)
- Covariate Adjustment (1)
- Diabetes (1)
- Differential abundance analysis (1)
- Differential expression (1)
- Dirichlet (1)
- Distance Correlation (1)
- Electronic health record (1)
- Evolution (1)
- Fisher scoring algorithm (1)
- Fokker-Planck (1)
- GLMM (1)
- Gamma distribution (1)
- Gated recurrent unit (1)
- Gene Set Test (1)
- Gene expression (1)
- Gene set analysis (1)
- Publication
- Theses and Dissertations--Statistics (3)
- Theses and Dissertations (2)
- Biology and Medicine Through Mathematics Conference (1)
- Doctoral Dissertations (1)
- Electronic Theses and Dissertations (1)
- Faculty Journal Articles (1)
- Graduate Theses and Dissertations (1)
- Masters Theses, 2020-current (1)
- McKelvey School of Engineering Theses & Dissertations (1)
- School of Mathematical and Statistical Sciences Faculty Publications and Presentations (1)
- Statistical Science Theses and Dissertations (1)
- Student Research Poster Presentations 2020 (1)
- The Journal of Purdue Undergraduate Research (1)
- The University of Michigan Department of Biostatistics Working Paper Series (1)
- Theses & Dissertations (1)
Articles 1 - 18 of 18
Full-Text Articles in Statistical Models
Multi-Level Small Area Estimation Based On Calibrated Hierarchical Likelihood Approach Through Bias Correction With Applications To Covid-19 Data, Nirosha Rathnayake
Theses & Dissertations
Small area estimation (SAE) has been widely used in a variety of applications to draw estimates in geographic domains represented as a metropolitan area, district, county, or state. The direct estimation methods provide accurate estimates when the sample size of study participants within each area unit is sufficiently large, but it might not always be realistic to have large sample sizes of study participants when considering small geographical regions. Meanwhile, high dimensional socio-ecological data exist at the community level, providing an opportunity for model-based estimation by incorporating rich auxiliary information at the individual and area levels. Thus, it is critical …
Gene Set Testing By Distance Correlation, Sho-Hsien Su
Graduate Theses and Dissertations
Pathways are the functional building blocks of complex diseases such as cancers. Pathway-level studies may provide insights into some important biological processes. The gene set test is an important tool for studying the differential expression of a gene set between two groups, e.g., cancer vs. normal. The differential expression of a gene set could be due to a difference in mean, variability, or both. However, most existing gene set tests target only the mean difference and overlook other types of differential expression. In this thesis, we propose to use the recently developed distance correlation for gene set testing. To assess the …
Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das
Electronic Theses and Dissertations
Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as Microarrays, bulk RNA-Sequencing, single cell RNA-Sequencing, etc. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Further, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …
A Differential Geometry-Based Machine Learning Algorithm For The Brain Age Problem, Justin Asher, Khoa Tan Dang, Maxwell Masters
The Journal of Purdue Undergraduate Research
No abstract provided.
Causal Inference And Prediction On Observational Data With Survival Outcomes, Xiaofei Chen
Statistical Science Theses and Dissertations
Infants with hypoplastic left heart syndrome require an initial Norwood operation, followed some months later by a stage 2 palliation (S2P). The timing of S2P is critical for the operation’s success and the infant’s survival, but the optimal timing, if one exists, is unknown. We attempt to estimate the optimal timing of S2P by analyzing data from the Single Ventricle Reconstruction Trial (SVRT), which randomized patients between two different types of Norwood procedure. In the SVRT, the timing of the S2P was chosen by the medical team; thus with respect to this exposure, the trial constitutes an observational study, and …
Statistical Inference Of Adaptation At Multiple Genomic Scales Using Supervised Classification And A Hidden Markov Model, Lauren A. Sugden
Biology and Medicine Through Mathematics Conference
No abstract provided.
Modeling Species Distribution And Habitat Suitability Of American Ginseng (Panax Quinquefolius) In Virginia, Jacob D. J. Peters
Masters Theses, 2020-current
American ginseng (Panax quinquefolius) is a well-known and sought-after medicinal plant native to North America that is facing increased threat of extinction due to overharvesting, herbivory, and habitat loss. Species distribution and habitat suitability models may be valuable to landowners interested in sustainable harvest or to institutions interested in the conservation and restoration of the species. With unequal sampling efforts across a region of interest, it is likely that some locations with appropriate habitat may be misrepresented in model predictions. This study refined a state-derived species distribution model for ginseng through increased sampling effort across the Cumberland Plateau …
Predicting Disease Progression Using Deep Recurrent Neural Networks And Longitudinal Electronic Health Record Data, Seunghwan Kim
McKelvey School of Engineering Theses & Dissertations
Electronic Health Records (EHR) are widely adopted and used throughout healthcare systems and are able to collect and store longitudinal data that can be used to describe patient phenotypes. From the underlying data structures used in the EHR, discrete data can be extracted and analyzed to improve patient care and outcomes via tasks such as risk stratification and prospective disease management. Temporality in EHR is innately present given the nature of these data, however, and traditional classification models are limited in this context by the cross-sectional nature of training and prediction processes. Finding temporal patterns in EHR is …
Bayesian Methods For The Assessment Of Reporting Errors For Data-Sparse Population-Periods With Applications To Estimating Mortality, Emily Peterson
Doctoral Dissertations
Population level mortality data is often subject to substantial reporting errors due to misclassification of cause of death, misclassification of death status, or age reporting errors. Accuracy of error-prone data sources can be assessed by comparing such data to gold standard data for the same population-period. We present Bayesian methods for assessing the extent of reporting errors across different population-periods and generalizing those to settings where gold-standard data are lacking. Firstly, we investigate misclassification errors of maternal cause of death reporting in civil registration vital statistics data. We use a Bayesian hierarchical bivariate random-walk model to estimate country-year specific sensitivity …
Predicting Diabetes Diagnoses, Sarah Netchert
Student Research Poster Presentations 2020
This study explored the traits and health state of African Americans in central Virginia in order to determine which traits put people at a higher probability of being diagnosed with diabetes. We also want to know which traits will generate the highest probability that a person will be diagnosed with diabetes. Traits included in this study were cholesterol, stabilized glucose, high-density lipoprotein levels, age (years), gender, height (inches), weight (pounds), systolic blood pressure, diastolic blood pressure, waist size (inches), and hip size (inches). There were 403 individuals included in the study, since they were the only ones screened for diabetes out of 1,046 …
Shrinkage Priors For Isotonic Probability Vectors And Binary Data Modeling, Philip S. Boonstra, Daniel R. Owen, Jian Kang
The University of Michigan Department of Biostatistics Working Paper Series
This paper outlines a new class of shrinkage priors for Bayesian isotonic regression modeling a binary outcome against a predictor, where the probability of the outcome is assumed to be monotonically non-decreasing with the predictor. The predictor is categorized into a large number of groups, and the set of differences between outcome probabilities in consecutive categories is equipped with a multivariate prior having support over the set of simplexes. The Dirichlet distribution, which can be derived from a normalized cumulative sum of gamma-distributed random variables, is a natural choice of prior, but using mathematical and simulation-based arguments, we show that …
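The construction mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the baseline Dirichlet prior only, not the shrinkage prior the paper proposes: independent gamma draws are normalized to give a Dirichlet-distributed vector of increments, and their cumulative sum gives outcome probabilities that are non-decreasing across categories. The function name, the common `shape` parameter, and the extra slack component that keeps the largest probability below 1 are all assumptions made for this example.

```python
import random
from itertools import accumulate


def sample_isotonic_probabilities(num_categories, shape=1.0, seed=None):
    """Draw one monotone non-decreasing probability vector (hypothetical helper).

    Assumption: a symmetric Dirichlet with num_categories + 1 components,
    where the last component is slack so the top probability stays below 1.
    """
    rng = random.Random(seed)
    # Independent Gamma(shape, 1) draws; normalizing them yields a Dirichlet sample.
    draws = [rng.gammavariate(shape, 1.0) for _ in range(num_categories + 1)]
    total = sum(draws)
    increments = [d / total for d in draws]
    # The cumulative sum of the first num_categories increments is non-decreasing.
    return list(accumulate(increments[:num_categories]))


probs = sample_isotonic_probabilities(5, shape=1.0, seed=1)
print(probs)
```

Smaller values of `shape` concentrate the mass on a few increments, which is one informal way to see why the choice of prior on the increments matters when the number of categories is large.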
The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling
Theses and Dissertations
Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.
Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …
Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero
Theses and Dissertations
Compositional data (CD) is mostly analyzed as relative data, using ratios of components and log-ratio transformations so that known multivariable statistical methods can be applied. Therefore, CD in which some components equal zero represents a problem. Furthermore, when the data is measured longitudinally, observations are spatially related and appear to come from a mixture population, the analysis becomes highly complex. For this reason, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might be a two-component mixture …
Semiparametric And Nonparametric Methods For Comparing Biomarker Levels Between Groups, Yuntong Li
Theses and Dissertations--Statistics
Comparing the distribution of biomarker measurements between two groups under either an unpaired or paired design is a common goal in many biomarker studies. However, analyzing biomarker data is sometimes challenging because the data may not be normally distributed and may contain a large fraction of zero values or missing values. Although several statistical methods have been proposed, they either require a normality assumption or are inefficient. We proposed a novel two-part semiparametric method for data under an unpaired setting and a nonparametric method for data under a paired setting. The semiparametric method considers a two-part model, a logistic regression for …
Estimation Of The Treatment Effect With Bayesian Adjustment For Covariates, Li Xu
Theses and Dissertations--Statistics
The Bayesian adjustment for confounding (BAC) is a Bayesian model averaging method to select and adjust for confounding factors when evaluating the average causal effect of an exposure on a certain outcome. We extend the BAC method to time-to-event outcomes. Specifically, the posterior distribution of the exposure effect on a time-to-event outcome is calculated as a weighted average of posterior distributions from a number of candidate proportional hazards models, weighing each model by its ability to adjust for confounding factors. The Bayesian Information Criterion based on the partial likelihood is used to compare different models and approximate the Bayes factor. …
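The weighting step described in this abstract can be illustrated with the standard BIC approximation to posterior model probabilities, in which each candidate model receives weight proportional to exp(-BIC/2). This is a rough sketch under stated assumptions, not the thesis's procedure: it assumes equal prior probabilities across candidate models, whereas BAC uses a structured prior over models that is not reproduced here. It also averages point estimates rather than full posterior distributions. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def bic_model_weights(bic_values):
    """Approximate posterior model probabilities from BIC values.

    Assumption: equal prior model probabilities, so weight is proportional
    to exp(-BIC / 2).
    """
    best = min(bic_values)
    # Subtracting the smallest BIC avoids underflow without changing the ratios.
    unnormalized = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def model_averaged_estimate(estimates, bic_values):
    """Weighted average of per-model exposure-effect estimates."""
    weights = bic_model_weights(bic_values)
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical log hazard ratios and partial-likelihood BICs for three models.
log_hazard_ratios = [0.42, 0.35, 0.50]
bics = [1012.3, 1010.1, 1018.7]
print(bic_model_weights(bics))
print(model_averaged_estimate(log_hazard_ratios, bics))
```

Because the weights depend only on BIC differences, a model whose BIC is several units above the best one contributes very little to the averaged estimate.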
Bayesian Kinetic Modeling For Tracer-Based Metabolomic Data, Xu Zhang
Theses and Dissertations--Statistics
Kinetic modeling of the time dependence of metabolite concentrations including the unstable isotope labeled species is an important approach to simulate metabolic pathway dynamics. It is also essential for quantitative metabolic flux analysis using tracer data. However, as the metabolic networks are complex including extensive compartmentation and interconnections, the parameter estimation for enzymes that catalyze individual reactions needed for kinetic modeling is challenging. As the parameter space is large and multi-dimensional while kinetic data are comparatively sparse, the estimation procedure (especially the point estimation methods) often encounters multiple local maxima such that standard maximum likelihood methods may yield …
Sex And Age Differences In Prevalence And Risk Factors For Prediabetes In Mexican-Americans, Kristina Vatcheva, Belinda M. Reininger, Susan P. Fisher-Hoch, Joseph B. McCormick
School of Mathematical and Statistical Sciences Faculty Publications and Presentations
AIMS:
Over one-third of Americans have prediabetes, while 9.4% have type 2 diabetes. The aim of our study was to estimate the prevalence of prediabetes by age and sex in Mexican Americans, a population with a known 28.2% prevalence of type 2 diabetes, and to identify critical socio-demographic and clinical factors associated with prediabetes.
METHODS:
Data were collected between 2004 and 2017 from the Cameron County Hispanic Cohort in Texas. Weighted crude and sex- and age-stratified prevalences were calculated. Survey-weighted logistic regression analyses were conducted to identify risk factors for prediabetes.
RESULTS:
The prevalence of prediabetes (32%) was slightly higher than …
Enhancing Models And Measurements Of Traffic-Related Air Pollutants For Health Studies Using Dispersion Modeling And Bayesian Data Fusion, Stuart A. Batterman, Veronica J. Berrocal, Chad Milando, Owais Gilani, Saravanan Arunachalam, K. Max Zhang
Faculty Journal Articles
Research Report 202 describes a study led by Dr. Stuart Batterman at the University of Michigan, Ann Arbor and colleagues. The investigators evaluated the ability to predict traffic-related air pollution using a variety of methods and models, including a line source air pollution dispersion model and sophisticated spatiotemporal Bayesian data fusion methods. Exposure assessment for traffic-related air pollution is challenging because the pollutants are a complex mixture and vary greatly over space and time. Because extensive direct monitoring is difficult and expensive, a number of modeling approaches have been developed, but each model has its own limitations and errors.
Dr. …