Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

2020

Articles 1 - 30 of 599

Full-Text Articles in Physical Sciences and Mathematics

Statistical Modeling Of Quarrying Activities And Their Impact On Residents’ Satisfaction, Jefferson M. Domingues, Vania F. L. Miranda, Denise C. Rezende, Yara S. Lares, Saulo R. Ferreira, Izabela R. C. De Oliveira Dec 2020

Journal of Environmental Science and Sustainable Development

This research aims to analyse the impact of quarrying on the health and perception of neighbouring communities. A standard questionnaire survey was conducted to collect data from quarry neighbours in a residential neighbourhood located in the city of Lavras, Minas Gerais, Brazil. Residences were grouped by proximity to a quarrying company into three zones bounded by equally spaced radii: Area I (closest to the quarrying company, at 630 m), Area II (730 m), and Area III (farthest from the quarrying company, at 830 m). Data gathered from 177 residents were analysed with logistic regression models. …


DACH1 Mutation Frequency In Endometrial Cancer Is Associated With High Tumor Mutation Burden, Mckayla J. Riggs, Nan Lin, Chi Wang, Dava W. Piecoro, Rachel W. Miller, Oliver A. Hampton, Mahadev Rao, Frederick R. Ueland, Jill M. Kolesar Dec 2020

Obstetrics and Gynecology Faculty Publications

OBJECTIVE: DACH1 is a transcriptional repressor and tumor suppressor gene frequently mutated in melanoma, bladder, and prostate cancer. Loss of DACH1 expression is associated with poor prognostic features and reduced overall survival in uterine cancer. In this study, we utilized the Oncology Research Information Exchange Network (ORIEN) Avatar database to determine the frequency of DACH1 mutations in patients with endometrial cancer in our Kentucky population.

METHODS: We obtained clinical and genomic data for 65 patients with endometrial cancer from the Markey Cancer Center (MCC). We examined the clinical attributes of the cancers by DACH1 status by comparing whole-exome sequencing (WES), …


Medical Marijuana And Opioids (MEMO) Study: Protocol Of A Longitudinal Cohort Study To Examine If Medical Cannabis Reduces Opioid Use Among Adults With Chronic Pain, Chinazo O. Cunningham, Joanna L. Starrels, Chenshu Zhang, Marcus A. Bachhuber, Nancy L. Sohler, Frances R. Levin, Haruka Minami, Deepika E. Slawek, Julia H. Arnsten Dec 2020

School of Medicine Faculty Publications

Introduction: In the USA, opioid analgesic use and overdoses have increased dramatically. One rapidly expanding strategy to manage chronic pain in the context of this epidemic is medical cannabis. Cannabis has analgesic effects, but it also has potential adverse effects. Further, its impact on opioid analgesic use is not well studied. Managing pain in people living with HIV is particularly challenging, given the high prevalence of opioid analgesic and cannabis use. This study's overarching goal is to understand how medical cannabis use affects opioid analgesic use, with attention to Δ9-tetrahydrocannabinol and cannabidiol content, HIV outcomes and adverse events. Methods and …


The Family Of Bicircular Matroids Closed Under Duality, Vaidy Sivaraman, Daniel Slilaty Dec 2020

Mathematics and Statistics Faculty Publications

We characterize the 3-connected members of the intersection of the class of bicircular and cobicircular matroids. Aside from some exceptional matroids with rank and corank at most 5, this class consists of just the free swirls and their minors.


Analysis And Implementation Of The Maximum Likelihood Expectation Maximization Algorithm For Find, Angus Boyd Jameson Dec 2020

Student Research Projects

This thesis presents an organized explanation and breakdown of the Maximum Likelihood Expectation Maximization image reconstruction algorithm. This background research was used to develop a means of implementing the algorithm into the imaging code for UNH's Field Deployable Imaging Neutron Detector to improve its ability to resolve complex neutron sources. This thesis provides an overview of this implementation scheme and includes the results of a couple of reconstruction tests for the algorithm. A discussion is given on the current state of the algorithm and its integration with the neutron detector system, and suggestions are given for how the work and …
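
For orientation, the textbook MLEM update for Poisson count data can be sketched as below. This is a minimal illustration under stated assumptions, not the thesis's implementation: the names `mlem_step` and `mlem`, the dense list-of-lists system matrix, and the flat starting image are all hypothetical choices made for this example.

```python
def mlem_step(estimate, system, counts):
    """One textbook MLEM update.

    estimate: current source intensities, one per pixel (hypothetical layout)
    system:   system[i][j] = probability that pixel j contributes to detector bin i
    counts:   measured counts, one per detector bin
    """
    n_det, n_pix = len(system), len(estimate)
    # Forward-project the current estimate into detector space.
    forward = [sum(system[i][j] * estimate[j] for j in range(n_pix))
               for i in range(n_det)]
    # Ratio of measured to predicted counts (0 where nothing is predicted).
    ratio = [counts[i] / forward[i] if forward[i] > 0 else 0.0
             for i in range(n_det)]
    # Sensitivity of each pixel: column sums of the system matrix.
    sens = [sum(system[i][j] for i in range(n_det)) for j in range(n_pix)]
    # Back-project the ratios and rescale the estimate multiplicatively.
    return [
        estimate[j] * sum(system[i][j] * ratio[i] for i in range(n_det)) / sens[j]
        if sens[j] > 0 else 0.0
        for j in range(n_pix)
    ]


def mlem(system, counts, n_iter=50):
    """Run n_iter MLEM updates from a flat (all-ones) starting image."""
    estimate = [1.0] * len(system[0])
    for _ in range(n_iter):
        estimate = mlem_step(estimate, system, counts)
    return estimate
```

Because the update is multiplicative, a non-negative starting image stays non-negative, and after each step the forward-projected total matches the total measured counts.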


A Spectral Adjustment For Spatial Confounding, Yawen Guan, Garritt L. Page, Brian J. Reich, Massimo Ventrucci, Shu Yang Dec 2020

Department of Statistics: Faculty Publications

Adjusting for an unmeasured confounder is generally an intractable problem, but in the spatial setting it may be possible under certain conditions. In this paper, we derive necessary conditions on the coherence between the treatment variable of interest and the unmeasured confounder that ensure the causal effect of the treatment is estimable. We specify our model and assumptions in the spectral domain to allow for different degrees of confounding at different spatial resolutions. The key assumption that ensures identifiability is that confounding present at global scales dissipates at local scales. We show that this assumption in the spectral domain is …


Variation In Personality Among Semi-Wild Myanmar Timber Elephants, Sateesh Venkatesh Dec 2020

Theses and Dissertations

This study examines two personality traits: exploration and neophobia, which could influence human-elephant conflicts. Thirty-one semi-wild elephants were tested over two trials using a custom novel puzzle tube containing three tasks and three rewards. Our studies show that elephants do vary significantly between individuals in both exploration and neophobia.


Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, Guanshen Wang Dec 2020

Statistical Science Theses and Dissertations

This dissertation investigates: (1) A Bayesian Semi-supervised Approach to Keyphrase Extraction with Only Positive and Unlabeled Data, (2) Jackknife Empirical Likelihood Confidence Intervals for Assessing Heterogeneity in Meta-analysis of Rare Binary Events.

In the big data era, people are blessed with a huge amount of information. However, the availability of information may also pose great challenges. One big challenge is how to extract useful yet succinct information in an automated fashion. As one of the first few efforts, keyphrase extraction methods summarize an article by identifying a list of keyphrases. Many existing keyphrase extraction methods focus on the unsupervised setting, …


Examining Multiple Imputation For Measurement Error Correction In Count Data With Excess Zeros, Shalima Zalsha Dec 2020

Statistical Science Theses and Dissertations

Measurement error and missing data are two common problems in wildlife population surveys. These data are collected from the environment and may be missing or measured with error when the observer’s ability to see the animal is obscured. Methods such as video transects for estimating red snapper abundance and aerial surveys for estimating moose population sizes are highly affected by these problems since total abundance will be underestimated if missing/mismeasured counts are ignored. We shall refer to this problem as visibility bias; it occurs when the true counts are observed when visibility is high, partially observed when visibility is low …


Improved Statistical Methods For Time-Series And Lifetime Data, Xiaojie Zhu Dec 2020

Statistical Science Theses and Dissertations

In this dissertation, improved statistical methods for time-series and lifetime data are developed. First, an improved trend test for time series data is presented. Then, robust parametric estimation methods based on system lifetime data with known system signatures are developed.

In the first part of this dissertation, we consider a test for the monotonic trend in time series data proposed by Brillinger (1989). It has been shown that when there are highly correlated residuals or short record lengths, Brillinger’s test procedure tends to have a significance level much higher than the nominal level. This could be related to the discrepancy between …


Bayesian Modeling For Longitudinal Count Data: Applications In Biomedical Research, Morshed Alam Dec 2020

Theses & Dissertations

Biomedical count data, such as the number of seizures for epilepsy patients, the number of new tumors at each visit, or the number of vomiting episodes after each chemo-radiation session for cancer patients, are common. Often these counts are measured longitudinally from patients or within clusters in multi-site trials. The Poisson and negative binomial models may not be adequate when data exhibit over- or under-dispersion, respectively. In contrast, a variety of dispersion conditions in count data can be captured by the Conway-Maxwell Poisson (CMP) model.

This doctoral dissertation relates to developing a statistical methodology to model longitudinal count data distributed as CMP via …


Multi-Level Small Area Estimation Based On Calibrated Hierarchical Likelihood Approach Through Bias Correction With Applications To COVID-19 Data, Nirosha Rathnayake Dec 2020

Theses & Dissertations

Small area estimation (SAE) has been widely used in a variety of applications to draw estimates in geographic domains represented as a metropolitan area, district, county, or state. The direct estimation methods provide accurate estimates when the sample size of study participants within each area unit is sufficiently large, but it might not always be realistic to have large sample sizes of study participants when considering small geographical regions. Meanwhile, high dimensional socio-ecological data exist at the community level, providing an opportunity for model-based estimation by incorporating rich auxiliary information at the individual and area levels. Thus, it is critical …


Influence Of Some Climatic Elements On Radon Concentration In Saeva Dupka Cave, Bulgaria, Peter Nojarov, Petar Stefanov, Karel Turek Dec 2020

International Journal of Speleology

This study reveals the influence of some climatic elements on radon concentration in Saeva Dupka Cave, Bulgaria. The research is based mainly on statistical methods. Radon concentration in the cave is determined by two main mechanisms. The first is penetration of radon from soil and rocks around the cave (present all year round, but with a leading role during the warm half of the year). The second is thermodynamic exchange of air between the inside of the cave and the outside atmosphere (cold half of the year). Climatic factors that affect radon concentration in the cave are temperatures (air, …


Adaptive Ensemble Of Classifiers With Regularization For Imbalanced Data Classification, Chen Wang, Chengyuan Deng, Zhoulu Yu, Dafeng Hui, Xiaofeng Gong, Ruisen Luo Dec 2020

Biology Faculty Research

The dynamic ensemble selection of classifiers is an effective approach for processing label-imbalanced data classifications. However, such a technique is prone to overfitting, owing to the lack of regularization methods and the dependence on local geometry of data. In this study, focusing on binary imbalanced data classification, a novel dynamic ensemble method, namely adaptive ensemble of classifiers with regularization (AER), is proposed, to overcome the stated limitations. The method solves the overfitting problem through a new perspective of implicit regularization. Specifically, it leverages the properties of stochastic gradient descent to obtain the solution with the minimum norm, thereby achieving regularization; …


Exponential And Hypoexponential Distributions: Some Characterizations, George Yanev Dec 2020

School of Mathematical and Statistical Sciences Faculty Publications and Presentations

The (general) hypoexponential distribution is the distribution of a sum of independent exponential random variables. We consider the particular case when the involved exponential variables have distinct rate parameters. We prove that the following converse result is true. If for some n ≥ 2, X1, X2, ..., Xn are independent copies of a random variable X with unknown distribution F and a specific linear combination of the Xj's has hypoexponential distribution, then F is exponential. Thus, we obtain new characterizations of the exponential distribution. As corollaries of the main results, we extend some previous characterizations established recently …
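
To make the distinct-rate case concrete, the following sketch evaluates the standard closed-form density and distribution function of a sum of independent exponentials with pairwise distinct rates. It illustrates the definition only, not the paper's characterization results; the function names are hypothetical.

```python
import math


def hypoexp_weights(rates):
    """Partial-fraction weights w_i = prod_{j != i} rate_j / (rate_j - rate_i).

    Only valid when all rates are pairwise distinct.
    """
    if len(set(rates)) != len(rates):
        raise ValueError("rates must be pairwise distinct")
    weights = []
    for i, ri in enumerate(rates):
        w = 1.0
        for j, rj in enumerate(rates):
            if j != i:
                w *= rj / (rj - ri)
        weights.append(w)
    return weights


def hypoexp_pdf(x, rates):
    """Density of the sum of independent Exp(rate_i) variables at x."""
    if x < 0:
        return 0.0
    return sum(w * r * math.exp(-r * x)
               for w, r in zip(hypoexp_weights(rates), rates))


def hypoexp_cdf(x, rates):
    """Distribution function of the sum of independent Exp(rate_i) variables at x."""
    if x < 0:
        return 0.0
    return 1.0 - sum(w * math.exp(-r * x)
                     for w, r in zip(hypoexp_weights(rates), rates))
```

With rates 1 and 2 the distribution function reduces to (1 - e^{-x})^2, which gives a quick sanity check. The weights always sum to 1, so the distribution function is 0 at x = 0.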


The Local Stability Of A Modified Multi-Strain SIR Model For Emerging Viral Strains, Miguel Fudolig, Reka Howard Dec 2020

Department of Statistics: Faculty Publications

We study a novel multi-strain SIR epidemic model with selective immunity by vaccination. A newer strain is made to emerge in the population when a preexisting strain has reached equilibrium. We assume that this newer strain does not exhibit cross-immunity with the original strain, hence those who are vaccinated and recovered from the original strain become susceptible to the newer strain. Recent events involving the COVID-19 virus show that it is possible for a viral strain to emerge from a population at a time when the influenza virus, a well-known virus with a vaccine readily available, is active in a …


Confirmative Evaluation: New CIPP Evaluation Model, Tia L. Finney Dec 2020

Journal of Modern Applied Statistical Methods

Struggling trainees often require a substantial investment of time, effort, and resources from medical educators. An emergent challenge involves developing effective ways to accurately identify struggling students and better understand the primary causal factors underlying their poor performance. Identifying the potential reasons for poor performance in medical school is a key first step in developing suitable remediation plans. The SOM Modified Program is a remediation program that aims to ensure academic success for medical students. The purpose of this study is to determine the impact of modifying the CIPP evaluation model by adding a confirmative evaluation step to the model. …


Can Statcast Variables Explain The Variation In Weighted Runs Created Plus?, Ryan Kupiec Dec 2020

Student Research

The release of Statcast data in 2015 was revolutionary for data analysis in the game of baseball. Many analysts have begun using this data regularly, but none have used it exclusively. Often older, less reliable statistics (on-base percentage) are still used in favor of the newer statistics (weighted runs created plus). In this paper, we attempt to explain the variation in weighted runs created plus (wRC+) using Statcast variables such as exit velocity and launch angle. We find that exit velocity, along with other Statcast variables, can explain as much as 70% of the variation in wRC+. Launch angle can …


The Effect Of Area Deprivation On COVID-19 Risk In Louisiana, K. C. Madhav, Evrim Oral, Susanne Straif-Bourgeois, Ariane L. Rung, Edward S. Peters Dec 2020

School of Public Health Faculty Publications

Background: Louisiana in the summer of 2020 had the highest per capita case count for COVID-19 in the United States, and COVID-19 deaths disproportionately affect the African American population. Neighborhood deprivation has been observed to be associated with poorer health outcomes. The purpose of this study was to examine the relationship between neighborhood deprivation and COVID-19 in Louisiana. Methods: The Area Deprivation Index (ADI) was calculated and used to classify neighborhood deprivation at the census tract level. A total of 17 US census variables were used to calculate the ADI for each of the 1148 census tracts in Louisiana. The …


Cellphone Laws And Teens' Calling While Driving: Analysis Of Repeated Cross-Sectional Surveys In 2013, 2015, 2017, And 2019, Li Li, Caitlin N. Pope, Rebecca R. Andridge, Julie K. Bower, Guoqing Hu, Motao Zhu Dec 2020

Graduate Center for Gerontology Faculty Publications

BACKGROUND: Distracted driving among teens is a public health and safety concern. Most states in the U.S. have sought to restrict cellphone use while driving by enacting laws. This study examines the difference in prevalence of self-reported calling while driving (CWD) between states with different cellphone bans.

METHODS: Demographics and CWD data were extracted from state Youth Risk Behavior Surveys (YRBS) from 14 states in 2013, 2015, 2017, and 2019. The state YRBS is conducted every 2 years with a representative sample of 9th through 12th grade students attending public school. States were grouped by type of cellphone law(s): no …


Principal Component Analysis For Predicting The Party Of The Legislators, Afsana Mimi Dec 2020

Publications and Research

In Spring 2020, I did a project, "Decision Tree Predicting the Party of Legislators," and constructed a decision tree model to predict legislators' parties based on their votes. We also used this model to identify legislators who frequently voted against their parties. We used the legislators' roll call votes from the Office of the Clerk, U.S. House of Representatives data sets (categorical values), collected in 2018 and 2019. In this new project, we study the 2018 and 2019 vote data using Principal Component Analysis (PCA). The goal is to find a (compressed) model using unsupervised learning to distinguish the legislators' parties, and PCA …


A Brief On Optimal Transport, Austin G. Vandegriffe Dec 2020

Graduate Student Research & Creative Works

Optimal transport is an interesting and exciting application of measure theory to optimization and analysis. In the following, I will bring you through a detailed treatment of random variable couplings, transport plans, and basic properties of transport plans, finishing with the Wasserstein distance on spaces of probability measures with compact support. No detail is left out in this presentation, but some results have further generality and more intricate consequences when tools like measure disintegration are used; this is left for future work.


A Brief On Characteristic Functions, Austin G. Vandegriffe Dec 2020

Graduate Student Research & Creative Works

Characteristic functions (CFs) are often used in problems involving convergence in distribution, independence of random variables, infinitely divisible distributions, and stochastics. The most famous use of characteristic functions is in the proof of the Central Limit Theorem, also known as the Fundamental Theorem of Statistics. Though less frequent, CFs have also been used in problems of nonparametric time series analysis and in machine learning. Moreover, CFs uniquely determine their distribution, much like the moment generating functions (MGFs), but the major difference is that CFs always exist, whereas MGFs can fail to exist, e.g. for the Cauchy distribution. This makes CFs more robust in …
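
The existence claim can be seen in one line, and the Cauchy case makes the contrast explicit. The following is a standard worked illustration, not taken from the paper itself; $\varphi_X$ and $M_X$ denote the CF and MGF of a random variable $X$.

```latex
% The CF is an expectation of a bounded quantity, so it is always finite:
\varphi_X(t) = \mathbb{E}\!\left[e^{itX}\right],
\qquad \left|e^{itX}\right| = 1
\;\Longrightarrow\; \left|\varphi_X(t)\right| \le 1 \quad \text{for all } t \in \mathbb{R}.

% Standard Cauchy, with density f(x) = \frac{1}{\pi (1 + x^2)}:
\varphi_X(t) = e^{-|t|},
\qquad
M_X(t) = \mathbb{E}\!\left[e^{tX}\right] = \infty \quad \text{for every } t \neq 0.
```

The MGF diverges because the Cauchy density decays only polynomially while $e^{tx}$ grows exponentially in one tail.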


Dynamic Neuromechanical Sets For Locomotion, Aravind Sundararajan Dec 2020

Doctoral Dissertations

Most biological systems employ multiple redundant actuators, which is a complicated problem of controls and analysis. Unless assumptions about how the brain and body work together, and assumptions about how the body prioritizes tasks are applied, it is not possible to find the actuator controls. The purpose of this research is to develop computational tools for the analysis of arbitrary musculoskeletal models that employ redundant actuators. Instead of relying primarily on optimization frameworks and numerical methods or task prioritization schemes used typically in biomechanics to find a singular solution for actuator controls, tools for feasible sets analysis are instead developed …


Application Of Crowdsourced Data In Transportation Operations And Safety, Nima Hoseinzadeh Dec 2020

Doctoral Dissertations

Crowdsourcing refers to the acquisition of data from users who contribute their information via smartphone, social media, or the internet. In transportation systems, crowdsourcing turns users into real-time sensors, providing data on traffic speed, travel time, miles traveled, incidents, roadway conditions, weather severity, irregularities in traffic patterns, and hazards. These data can be collected actively or passively in quantitative or qualitative forms. With the emergence of smartphones and navigation apps, crowdsourced data are gaining increased attention in transportation. Crowdsourced data have advantages over traditional fixed-location sensors and camera monitoring: low implementation costs, extended geographic coverage, high resolution, real-time application, increased …


Nonparametric Bayesian Deep Learning For Scientific Data Analysis, Devanshu Agrawal Dec 2020

Doctoral Dissertations

Deep learning (DL) has emerged as the leading paradigm for predictive modeling in a variety of domains, especially those involving large volumes of high-dimensional spatio-temporal data such as images and text. With the rise of big data in scientific and engineering problems, there is now considerable interest in the research and development of DL for scientific applications. The scientific domain, however, poses unique challenges for DL, including special emphasis on interpretability and robustness. In particular, a priority of the Department of Energy (DOE) is the research and development of probabilistic ML methods that are robust to overfitting and offer reliable …


Root Stage Distributions And Their Importance In Plant-Soil Feedback Models, Tyler Poppenwimer Dec 2020

Doctoral Dissertations

Roots are fundamental to PSFs, being a key mediator of these feedbacks by interacting with and affecting the soil environment and soil microbial communities. However, most PSF models aggregate roots into a homogeneous component or only implicitly simulate roots via functions. Roots are not homogeneous and root traits (nutrient and water uptake, turnover rate, respiration rate, mycorrhizal colonization, etc.) vary with age, branch order, and diameter. Trait differences among a plant’s roots lead to variation in root function and roots can be disaggregated according to their function. The impact on plant growth and resource cycling of changes in the distribution …


A Management Strategy Evaluation Of The Impacts Of Interspecific Competition And Recreational Fishery Dynamics On Vermilion Snapper (Rhomboplites Aurorubens) In The Gulf Of Mexico, Megumi C. Oshima Dec 2020

Dissertations

In the Gulf of Mexico (GOM), Vermilion Snapper (Rhomboplites aurorubens) are believed to compete directly with Red Snapper for prey and habitat. The two species share similar diets and have significant spatial overlap in the Gulf. Red Snapper are thought to be the dominant competitor, forcing Vermilion Snapper to feed on less nutritious prey when local resources are depleted. In addition to ecological pressures, GOM Vermilion Snapper support substantial commercial and recreational fisheries. Over the past decade, recreational landings have steadily increased, reaching a historical high in 2018. One cause may be stricter regulations for similar target species such as …


Survival Analysis: An Exact Method For Rare Events, Kristina Reutzel Dec 2020

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Conventional asymptotic methods for survival analysis work well when sample sizes are at least moderately large. When dealing with small sample sizes or rare events, the results from these methods have the potential to be inaccurate or misleading. To handle such data, an exact method is proposed and compared against two other methods: 1) the Cox proportional hazards model and 2) stratified logistic regression for discrete survival analysis data.


Nonparametric Estimation Of Trend Function For Stochastic Differential Equations Driven By A Weighted Fractional Brownian Motion, Abdelmalik Keddi, Fethi Madani, Amina A. Bouchentouf Dec 2020

Applications and Applied Mathematics: An International Journal (AAM)

In this paper, we consider the problem of nonparametric estimation of the trend function for stochastic differential equations driven by a weighted fractional Brownian motion (weighted-fBm). Under some general conditions, the uniform consistency, the rate of convergence, and the asymptotic normality of our estimator are established. In addition, a numerical example is provided to illustrate the validity of the considered estimator.