Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

University of Kentucky

Theses/Dissertations

Articles 1 - 30 of 123

Full-Text Articles in Physical Sciences and Mathematics

The Performance Of Marginal Modeling Methods For Rare Events With Application To Opioid Overdose Mortality And Morbidity, Shawn Nigam Jan 2024

Theses and Dissertations--Epidemiology and Biostatistics

Opioid misuse is a nationwide epidemic, with Kentucky having one of the highest opioid overdose-related fatality rates across all US states. These rates have increased significantly over the past decade, with particularly large increases during the COVID-19 pandemic. This dissertation aims to study the behavior of these increases and the methods for the marginal modeling of count outcomes related to opioid overdose.

Opioid overdose-related fatality rates in Kentucky increased significantly during the COVID-19 pandemic. In this chapter, we characterize the changes in opioid overdose fatality rates in Kentucky and identify associations between potential factors and fatality rates. County-level opioid overdose …


On Generative Models And Joint Architectures For Document-Level Relation Extraction, Aviv Brokman Jan 2024

Theses and Dissertations--Statistics

Biomedical text is being generated at a high rate in scientific literature publications and electronic health records. Within these documents lies a wealth of potentially useful information in biomedicine. Relation extraction (RE), the process of automating the identification of structured relationships between entities within text, represents a highly sought-after goal in biomedical informatics, offering the potential to unlock deeper insights and connections from this vast corpus of data. In this dissertation, we tackle this problem with a variety of approaches.

We review the recent history of the field of document-level RE. Several themes emerge. First, graph neural networks dominate the …


Differential Impacts Of Weather Anomalies On Household Energy Expenditure Shares: A Comparison Of Clustered Panel Analysis Methods, Jordan Champion Jan 2024

Theses and Dissertations--Agricultural Economics

Recent emphasis on environmental justice has highlighted deficiencies in our energy system that produce disparities in accessibility and affordability for the most vulnerable. Meanwhile, the realities of a gradually warming climate and the onset of a global energy crisis (IEA 2022) have coincidently contributed to spikes in both energy prices and demand. These implications threaten to further exacerbate existing disparities for income-constrained and vulnerable populations, enhancing their risk of falling into prolonged insecurity. To ensure our transition to a just, sustainable future, we must first ensure equitable access to affordable and reliable energy for everyone. Combining household-level panel and state-level …


Striving For Appropriate Antibiotic Use: A Biomarker Initiative, And Outcomes Associated With Azithromycin Exposure, Amanda Gusovsky Jan 2023

Theses and Dissertations--Pharmacy

The introduction of antibiotics into clinical practice is considered the greatest medical breakthrough of the 20th century. However, the use of antibiotics can contribute to the development of resistance. In the United States (U.S.), approximately 2.8 million people are infected with antibiotic-resistant bacteria each year, and more than 35,000 people die as a result. Moreover, some antibiotics are known to cause cardiac side effects including QT prolongation, hypotension, and ventricular arrhythmias. The U.S. Centers for Disease Control and Prevention (CDC) defines appropriate antibiotic use as the effort to use “the right antibiotic, at the right dose, for the right …


Potential Alzheimer's Disease Plasma Biomarkers, Taylor Estepp Jan 2023

Theses and Dissertations--Epidemiology and Biostatistics

In this series of studies, we examined the potential of a variety of blood-based plasma biomarkers for the identification of Alzheimer's disease (AD) progression and cognitive decline. With the end goal of studying these biomarkers via mixture modeling, we began with a literature review of the methodology. An examination of the biomarkers with demographics and other health factors found evidence of minimal risk of confounding along the causal pathway from biomarkers to cognitive performance. Further study examined the usefulness of linear combinations of biomarkers, achieved via partial least squares (PLS) analysis, as predictors of various cognitive assessment scores and clinical …


High Dimensional Data Analysis: Variable Screening And Inference, Lei Fang Jan 2023

Theses and Dissertations--Statistics

This dissertation focuses on the problem of high dimensional data analysis, which arises in many fields including genomics, finance, and social sciences. In such settings, the number of features or variables is much larger than the number of observations, posing significant challenges to traditional statistical methods.

To address these challenges, this dissertation proposes novel methods for variable screening and inference. The first part of the dissertation focuses on variable screening, which aims to identify a subset of important variables that are strongly associated with the response variable. Specifically, we propose a robust nonparametric screening method to effectively select the predictors …


Statistical Intervals For Neural Network And Its Relationship With Generalized Linear Model, Sheng Yuan Jan 2023

Theses and Dissertations--Statistics

Neural networks have experienced widespread adoption and have become integral in cutting-edge domains like computer vision, natural language processing, and various contemporary fields. However, addressing the statistical aspects of neural networks has been a persistent challenge, with limited satisfactory results. In my research, I focused on exploring statistical intervals applied to neural networks, specifically confidence intervals and tolerance intervals. I employed variance estimation methods, such as direct estimation and resampling, to assess neural networks and their performance under outlier scenarios. Remarkably, when outliers were present, the resampling method with infinitesimal jackknife estimation yielded confidence intervals that closely aligned with nominal …


Finite Mixtures Of Mean-Parameterized Conway-Maxwell-Poisson Models, Dongying Zhan Jan 2023

Theses and Dissertations--Statistics

For modeling count data, the Conway-Maxwell-Poisson (CMP) distribution is a popular generalization of the Poisson distribution due to its ability to characterize data over- or under-dispersion. While the classic parameterization of the CMP has been well-studied, its main drawback is that it does not directly model the mean of the counts. This is mitigated by using a mean-parameterized version of the CMP distribution. In this work, we are concerned with the setting where count data may be comprised of subpopulations, each possibly having varying degrees of data dispersion. Thus, we propose a finite mixture of mean-parameterized CMP distributions. An …


A Novel Nonparametric Test For Heterogeneity Detection And Assessment Of Fluid Removal Among CRRT Patients In ICU, Shaowli Kabir Apr 2022

Theses and Dissertations--Epidemiology and Biostatistics

Over the past decade, acute kidney injury (AKI) has occurred in 20%-50% of patients admitted to the intensive care unit (ICU) in the United States. Continuous renal replacement therapy (CRRT) has become a popular treatment method among these critically ill patients. However, there are multiple complications in implementing this treatment, including discrepancies between practiced and prescribed fluid removal, possibly related to the heterogeneity among these patients. Within mixture modeling, there are several techniques for detecting heterogeneity, each with its specific limitations. In this dissertation a novel nonparametric ‘d test’ will be used to detect heterogeneity among CRRT patients in the ICU. …


Energy Integrated Ratio Analysis Of The Anomalous Precession Frequency In The Fermilab Muon G-2 Experiment, Ritwika Chakraborty Jan 2022

Theses and Dissertations--Physics and Astronomy

The muon’s anomalous magnetic moment, a_μ, provides a unique way for probing physics beyond the standard model experimentally as it gathers contributions from all the known and unknown forces and particles in nature. The theoretical prediction of a_μ has been in greater than 3σ tension with the experimental measurement since the results of the Muon g-2 Experiment at the Brookhaven National Laboratory (E-821) were published in the early 2000s with a precision of 540 ppb. To settle this tension, the new Fermilab Muon g-2 Experiment (E-989) is currently taking data with the aim of …


Deriving The Distributions And Developing Methods Of Inference For R2-Type Measures, With Applications To Big Data Analysis, Gregory S. Hawk Jan 2022

Theses and Dissertations--Statistics

As computing capabilities and cloud-enhanced data sharing have accelerated exponentially in the 21st century, our access to Big Data has revolutionized the way we see data around the world, from healthcare to investments to manufacturing to retail and supply-chain. In many areas of research, however, the cost of obtaining each data point makes more than just a few observations impossible. While machine learning and artificial intelligence (AI) are improving our ability to make predictions from datasets, we need better statistical methods to improve our ability to understand and translate models into meaningful and actionable insights.

A central goal in the …


Opioid Use Disorder Treatment With Buprenorphine: Analysis Of Treatment Utilization And Associated Outcomes In Kentucky, Feitong Lei Jan 2022

Theses and Dissertations--Epidemiology and Biostatistics

Opioid use disorder (OUD) is chronic opioid use that results in clinically significant suffering, impairment, or even death. The opioid epidemic in the United States has become a public health and economic crisis, affecting patients' well-being and the nation's overall health and welfare. Eastern Kentucky was among the first regions affected by the opioid crisis, and Kentucky has historically ranked among the top five states for age-adjusted drug overdose mortality rate.

There are three medications (buprenorphine, methadone, naltrexone) approved by the U.S. Food and Drug Administration to treat OUD. As a partial opioid agonist, buprenorphine is a safe medication for …


Multivariate Statistical Modeling For Radio-Genomics Study, Tiantian Zeng Jan 2022

Theses and Dissertations--Statistics

Radiogenomics is a new direction in cancer research that focuses on the associations among radiomics, genomics and clinical outcome. Currently, the major challenge for radiogenomics lies in the effective integration of genomics and imaging data for promising clinical outcome prediction. Herein, we propose a multivariate joint model that can integrate imaging and genomic data to better predict the clinical outcome. Specifically, we jointly consider two multivariate group lasso models: one regresses imaging features on genomic features, and the other regresses the patient’s clinical outcome on genomic features. An L1 penalty term is introduced for each variable, and weight in the penalty …


Addressing Ascertainment Bias In The Study Of Cardiovascular Disease Burden In Opioid Use Disorders - Application Of Natural Language Processing Of Electronic Health Records, Jade Huang Singleton Jan 2022

Theses and Dissertations--Epidemiology and Biostatistics

In the United States, the prevalence of long-term exposure to opioid drugs, for both medically and nonmedically indicated purposes, has increased considerably since the mid-1990s. Concerns have emerged about the potential health effects of opioid use. There is also growing interest in other possible connections with opioid use including cardiovascular disease. Electronic health records (EHR) contain information about patient care in the form of structured codes and unstructured notes. Natural language processing (NLP) provides a tool for processing unstructured textual data in EHR clinical notes and extracts useful information for research with structured formats. The purpose of this dissertation was …


Statistical Theory For Specialized Linear Regression Adjustment Methods Compared To Multiple Linear Regression In The Presence And Absence Of Interaction Effects, Leon Su Jan 2022

Theses and Dissertations--Statistics

When building models to investigate outcomes and variables of interest, researchers often want to adjust for other variables. There is a variety of ways that these adjustments are performed. In this work, we will consider four approaches to adjustment utilized by researchers in various fields. We will compare the efficacy of these methods to what we call the “true model method”, fitting a multiple linear regression model in which adjustment variables are model covariates. Our goal is to show that these adjustment methods have inferior performance to the true model method by comparing model parameter estimates, power, type I error, …


Beta Mixture And Contaminated Model With Constraints And Application With Micro-Array Data, Ya Qi Jan 2022

Theses and Dissertations--Statistics

This dissertation research concentrates on the Contaminated Beta (CB) model and its application in micro-array data analysis. The Modified Likelihood Ratio Test (MLRT) introduced by [Chen et al., 2001] is used for testing the omnibus null hypothesis of no contamination of Beta(1,1) ([Dai and Charnigo, 2008]). To increase the test power, we design constraints for the two-component CB model that put the mode toward the left end of the distribution, reflecting the abundance of small p-values in micro-array data. A three-component CB model might be useful for distinguishing highly differentially expressed genes from moderately differentially expressed genes. If the null hypothesis above …


Investigations Into The Genetics Of Mixed Pathologies In Dementia, Adam Dugan Jan 2021

Theses and Dissertations--Epidemiology and Biostatistics

Alzheimer’s disease (AD) is an irreversible, progressive brain disorder that leads to a loss of memory and thinking skills. While tremendous progress has been made in our understanding of the genetics underlying AD, currently known genetic variants explain only approximately 30% of the heritable risk of developing AD. One hurdle to AD research is that it can only be definitively diagnosed at autopsy, making cruder, clinic-based diagnoses more common. In recent years, several brain pathologies that mimic AD’s clinical presentation have been identified including brain arteriolosclerosis, hippocampal sclerosis (HS), and, most recently, limbic-predominant age-related TDP-43 encephalopathy (LATE). It has become …


Evaluating The Incidence Of Melanoma And Lung Cancer Of Current And Former Active-Duty U.S. Military Who Were Deployed In Support Of Operation Enduring Freedom And Operation Iraqi Freedom, Brian Kovacic Jan 2021

Theses and Dissertations--Epidemiology and Biostatistics

The incidence of melanoma and lung cancer has been gradually increasing in the United States over the past three decades, with the reputed causes being etiological and environmental exposures and tobacco usage. There has been concern that melanoma and lung cancer incidence among military personnel may be associated with deployment to environments with intense sun exposure and increased smoking rates due to post-traumatic stress disorder. The aim of this study was to examine associations between deployment in support of Operation Enduring Freedom (OEF) or Operation Iraqi Freedom (OIF), from 2001 through 2015, with subsequent melanoma and lung cancer incidence. …


Design And Analyses Of School-Based Violence Prevention Cluster Randomized Trials, Md. Tofial Azam Jan 2021

Theses and Dissertations--Epidemiology and Biostatistics

Interpersonal violence such as teen dating violence is a severe public health problem. Teen dating violence, including sexual violence (unwanted sexual contacts or activities), physical and psychological dating violence, sexual harassment, and stalking, affects high school students' physical and mental health and academic achievement in the United States. Dating violence is linked to psychological abuse perpetration in the future, depression, anxiety, and hostility. The teen dating violence victimization experience was related to antisocial behavior, drug abuse, increased heavy drinking, depression, suicidal ideation, smoking, and adult interpersonal violence victimization during adolescence. The detrimental effects of interpersonal violence demonstrate the critical importance …


Estimating And Testing Treatment Effects With Misclassified Multivariate Data, Zi Ye Jan 2021

Theses and Dissertations--Statistics

Clinical trials are often used to assess drug efficacy and safety. Participants are sometimes pre-stratified into different groups by diagnostic tools. However, these diagnostic tools are fallible. The traditional method ignores this problem and assumes the diagnostic devices are perfect. This assumption will lead to inefficient and biased estimators. In this era of personalized medicine and measurement-based care, the issues of bias and efficiency are of paramount importance. Despite this prominence, only a few studies have evaluated the treatment effect in the presence of misclassification in some special cases, and most others focus on assessing the accuracy of the diagnostic devices. In …


Dimension Reduction Techniques In Regression, Pei Wang Jan 2021

Theses and Dissertations--Statistics

Because of the advances of modern technology, the size of the collected data nowadays is larger and the structure is more complex. To deal with such kinds of data, sufficient dimension reduction (SDR) and reduced rank (RR) regression are two powerful tools. This dissertation focuses on these two tools and it is composed of three projects. In the first project, we introduce a new SDR method through a novel approach of feature filter to recover the central mean subspace exhaustively along with a method to determine the dimension, two variable selection methods, and extensions to multivariate response and large p …


Maternal Proximity To Mountaintop Removal Mining And Birth Defects In Appalachian Kentucky, 1997-2003, Daniel B. Cooper Jan 2021

Theses and Dissertations--Public Health (M.P.H. & Dr.P.H.)

Background: Extraction of coal through mountaintop removal mining (MTR) alters many dimensions of the landscape, and explosive blasts, exposed rock, and coal washing have the potential to pollute air and water with substances known to increase risk of developmental and birth anomalies. Previous research suggests that infants born to mothers living in MTR coal mining counties have higher prevalence of most types of birth defects.

Objectives: This study seeks to examine further the relationship between MTR activity and birth defects by employing individual level exposure estimation through precise satellite data of MTR activity in the Appalachian region and maternal residence …


Sexual Behaviors Associated With Online Partner-Seeking Among Men Who Have Sex With Men From Small/Midsized Towns Or Rural Areas In Kentucky, Vira Pravosud Jan 2021

Theses and Dissertations--Epidemiology and Biostatistics

The HIV epidemic remains one of the most significant public health issues in the United States, particularly among men who have sex with men (MSM). New avenues for partner-seeking have emerged over the past three decades, including through the Internet, social media, and geosocial networking applications. Consisting of three cross-sectional studies, this dissertation research aimed to determine associations between the use of various online tools for partner-seeking (hereafter collectively referred to as “apps”) and HIV-related sexual behaviors among 252 young adult MSM residing in small/midsized towns or rural areas in Central Kentucky, a group that has been under-represented in the …


Novel Nonparametric Testing Approaches For Multivariate Growth Curve Data: Finite-Sample, Resampling And Rank-Based Methods, Ting Zeng Jan 2021

Theses and Dissertations--Statistics

Multivariate growth curve data naturally arise in various fields, for example, biomedical science, public health, agriculture, social science and so on. For data of this type, the classical approach is to conduct multivariate analysis of variance (MANOVA) based on Wilks' Lambda and other multivariate statistics, which require the assumptions of multivariate normality and homogeneity of within-cell covariance matrices. However, data being analyzed nowadays show marked departure from multivariate normal distribution and homoscedasticity. In this dissertation, we investigate nonparametric testing approaches for multivariate growth curve data from three aspects, i.e., finite-sample, resampling and rank-based methods.

The first project proposes an approximate …


Novel Methods For Characterizing Conditional Quantiles In Zero-Inflated Count Regression Models, Xuan Shi Jan 2021

Theses and Dissertations--Statistics

Despite their popularity in diverse disciplines, quantile regression methods are primarily designed for the continuous response setting and cannot be directly applied to the discrete (or count) response setting. There can also be challenges when modeling count responses, such as the presence of excess zero counts, formally known as zero-inflation. To address the aforementioned challenges, we propose a comprehensive model-aware strategy that synthesizes quantile regression methods with estimation of zero-inflated count regression models. Various competing computational routines are examined, while residual analysis and model selection procedures are included to validate our method. The performance of these methods is characterized through …


Innovative Statistical Models In Cancer Immunotherapy Trial Design, Jing Wei Jan 2021

Theses and Dissertations--Statistics

A challenge arising in cancer immunotherapy trial design is the presence of non-proportional hazards (NPH) patterns in survival curves. We considered three different NPH patterns caused by delayed treatment effect, cure rate and responder rate of treatment group in this dissertation. These three NPH patterns would violate the proportional hazard model assumption and ignoring any of them in an immunotherapy trial design will result in substantial loss of statistical power.

In this dissertation, four models to deal with NPH patterns are discussed. First, a piecewise proportional hazards model is proposed to incorporate delayed treatment effect into the trial design consideration. …


Nonparametric Analysis Of Clustered And Multivariate Data, Yue Cui Jan 2020

Theses and Dissertations--Statistics

In this dissertation, we investigate three distinct but interrelated problems for nonparametric analysis of clustered data and multivariate data in pre-post factorial design.

In the first project, we propose a nonparametric approach for one-sample clustered data in pre-post intervention design. In particular, we consider the situation where for some clusters all members are only observed at either pre or post intervention but not both. This type of clustered data is referred to as partially complete clustered data. Unlike most of its parametric counterparts, we do not assume specific models for data distributions, intra-cluster dependence structure or variability, in effect …


Estimation Of The Treatment Effect With Bayesian Adjustment For Covariates, Li Xu Jan 2020

Theses and Dissertations--Statistics

The Bayesian adjustment for confounding (BAC) is a Bayesian model averaging method to select and adjust for confounding factors when evaluating the average causal effect of an exposure on a certain outcome. We extend the BAC method to time-to-event outcomes. Specifically, the posterior distribution of the exposure effect on a time-to-event outcome is calculated as a weighted average of posterior distributions from a number of candidate proportional hazards models, weighing each model by its ability to adjust for confounding factors. The Bayesian Information Criterion based on the partial likelihood is used to compare different models and approximate the Bayes factor. …


Measuring Variability In Model Performance Measures, Matthew Rutledge Jan 2020

Theses and Dissertations--Statistics

As data become increasingly available, statisticians are confronted with both larger sample sizes and larger numbers of predictors. While both of these factors are beneficial in building better predictive models and allowing for better inference, models can become difficult to interpret and often include variables of little practical significance. This dissertation provides methods that assist model builders to better understand and select from a collection of candidate models. We study the asymptotic distribution of AIC and propose a graphical tool to assist practitioners in comparing and contrasting candidate models. Real-world examples show how this graphic might be used and a …


Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu Jan 2020

Theses and Dissertations--Statistics

A common problem in regression analysis (linear or nonlinear) is assessing the lack-of-fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem into a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …