Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Statistics and Probability

2021

Articles 1 - 30 of 261

Full-Text Articles in Physical Sciences and Mathematics

Parameter Estimation And Inference Of Spatial Autoregressive Model By Stochastic Gradient Descent, Gan Luan Dec 2021

Dissertations

Stochastic gradient descent (SGD) is a popular iterative method for model parameter estimation in large-scale data and online learning settings since it goes through the data in only one pass. While SGD has been well studied for independent data, its application to spatially-correlated data largely remains unexplored. This dissertation develops SGD-based parameter estimation and statistical inference algorithms for the spatial autoregressive (SAR) model, a common model for spatial lattice data.

This research contains three parts. (I) The first part concerns SGD estimation and inference for the SAR mean regression model. A new SGD algorithm based on maximum likelihood estimator (MLE) …


Dependent Censoring In Survival Analysis, Zhongcheng Lin Dec 2021

Dissertations

This dissertation mainly consists of two parts. In the first part, some properties of bivariate Archimedean Copulas formed by two time-to-event random variables are discussed under the setting of left censoring, where these two variables are subject to one left-censored independent variable respectively. Some distributional results for their joint cdf under different censoring patterns are presented. Those results are expected to be useful in both model fitting and checking procedures for Archimedean copula models with bivariate left-censored data. As an application of the theoretical results that are obtained, a moment estimator of the dependence parameter in Archimedean copula models is …


Approximate Likelihood Based Estimations For Joint Models With Intractable Likelihoods, Karl Stessy M. Bisselou Dec 2021

Theses & Dissertations

This dissertation focuses on the development of approximation approaches for the joint modeling (JM) of repeated measures data and time-to-event data in the presence of analytically or numerically intractable likelihoods. Current likelihood-based inferences for JMs show several limitations including (i) intractability of integrals during marginal likelihood derivations due to the complexity in computations, and (ii) the large number of nuisance parameters (unobserved) posing a problem with convergence. The h-likelihood (HL) and synthetic likelihood (SL) are two computationally efficient estimation approaches that overcome these challenges.

In the presence of extremely high censoring rates, the HL can produce biased parameter estimates. We …


Smoking, Alcohol Consumption, And Depression In Association With Incidence Of Type 2 Diabetes Among Mexican Americans In Starr County, Texas, Gabriela Rubannelsonkumar Dec 2021

Honors Program Theses and Research Projects

Previous studies on conditions like obesity, hypertension, and type 2 diabetes mellitus (T2DM) have explored the correlations between them and various other human conditions, including aortic stiffness, left ventricular hypertrophy and sleep apnea, as they predict possibilities of developing certain diseases in Mexican Americans. This study aims to observe the correlation between lifestyle decisions that could relate to the onset of depression in normal, prediabetic, and diabetic individuals. These include smoking habits and alcohol consumption. Many papers have previously examined these lifestyle habits as they relate to obesity, hypertension, and diabetes; however, they have done so in a singular …


The Development Of Authentic Virtual Reality Scenarios To Measure Individuals’ Level Of Systems Thinking Skills And Learning Abilities, Vidanelage L. Dayarathna Dec 2021

Theses and Dissertations

This dissertation develops virtual reality modules to capture individuals’ learning abilities and systems thinking skills in dynamic environments. In the first chapter, an immersive queuing theory teaching module is developed using virtual reality technology. The objective of the study is to present systems engineering concepts in a more sophisticated environment and measure students’ learning abilities. Furthermore, the study explores the performance gaps between male and female students in manufacturing systems concepts. To investigate the gender biases toward the performance of the developed VR module, three efficacy measures (simulation sickness questionnaire, systems usability scale, and presence questionnaire) and two effectiveness measures (NASA …


Teacher Education Programs Of Top Pisa Scoring Countries, Stephanie Kafer Dec 2021

Honors Projects

This research paper aims to investigate the teacher education programs of four different countries that have consistently scored high on the international Programme for International Student Assessment (PISA) test. This project intends to answer two questions: What locations consistently perform high on the Programme for International Student Assessment (PISA) test? What do the teacher training programs look like for these locations and are there commonalities between programs of different locations? The first question is answered using statistics of PISA scores from the past twenty years and from those statistics, the top four countries that this paper focuses on are Finland, …


Functional Mixed Data Clustering With Fourier Basis Smoothing, Ishmael Amartey Dec 2021

Electronic Theses and Dissertations

Clustering is an important analytical technique that has proven to affect human life positively through its application in cancer research, market segmentation, city planning etc. In this time of growing technological systems, mixed data has seen another face of longitudinal, directional and functional attributes which is worth paying attention to and analyzing. Previous research works on clustering relied largely on the inverse weight technique and B-spline in smoothing data and assessing the performance of various clustering algorithms. In 1971, Gower proposed a method of clustering for mixed variable types which has been extended to include functional and directional variables by …


The Physiological Factors Of Diabetes And Their Effect On The Cognitive And Emotional Functioning In Older Populations: A Secondary Data Analysis, Celeste Anahi Alvidrez Dec 2021

Open Access Theses & Dissertations

Background: The rates of Type 2 Diabetes (T2D) have increased over the past 20 years in all age groups. The physiological factors that underlie T2D could have an impact on specific brain pathways that support cognitive and emotional functioning. Aims and Objective: The goal of this study was to examine whether older Mexican American individuals with a history of T2D were more likely to develop later cognitive impairment and/or depression. Hypotheses: It was predicted that elderly participants (mean age at time of interview = 87.87 years) with a history of T2D onset prior to age 65 are more likely to have …


Factors Influencing Intent To Take A Covid-19 Test In The United States, Sheila Rutto Dec 2021

Theses and Dissertations

In 2020, COVID-19 became the first pandemic in the world’s history that brought the entire world to an abrupt and unexpected halt. Since the first reported case of the disease to date, the novel coronavirus has been able to wreak havoc in literally every corner of the globe and left an ever-growing number of unprecedented fatalities. The normal way of life has been disrupted, and the level of uncertainty about the end of this pandemic continues to manifest to many. Due to the urgency to bring this pandemic under control, medical officers have been able to recommend actions that people …


Comparison Of Statistical Methods For Modeling Count Data With An Application To Length Of Hospital Stay, Gustavo A. Fernandez Dec 2021

Theses and Dissertations

Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning. Therefore, understanding hospital LOS variability is always an important healthcare focus. Hospital LOS data are count data, with discrete and nonnegative values, typically right-skewed, and often exhibiting excessive zeros. Numerous studies have been conducted to model hospital LOS to identify significant predictors contributing to its variability. Many researchers have used linear regression with or without logarithmic transformation of the outcome variable LOS, or logistic regression on a dichotomized LOS. These regression methods usually violate models’ assumptions and are subject …


A New Algorithm For Robust Affine-Invariant Clustering, Andrews Tawiah Anum Dec 2021

Open Access Theses & Dissertations

Cluster analysis is an unsupervised machine learning technique commonly employed to partition a dataset into distinct categories referred to as clusters. The k-means algorithm is a prominent distance-based clustering method. Despite its overwhelming popularity, the algorithm is not invariant under non-singular linear transformations and is not robust, i.e., can be unduly influenced by outliers. To address these deficiencies, we propose an alternative clustering procedure based on minimizing a “trimmed” variant of the negative log-likelihood function. We develop a “concentration step”, vaguely reminiscent of the classical Lloyd’s algorithm, that can iteratively reduce the objective function. Multiple real and synthetic datasets are …


Confidence Interval For The Mean Of A Beta Distribution, Sean Rangel Dec 2021

Electronic Theses and Dissertations

Statistical inference for the mean of a beta distribution has become increasingly popular in various fields of academic research. In this study, we developed a novel statistical model from likelihood-based techniques to evaluate various confidence interval techniques for the mean of a beta distribution. Simulation studies will be implemented to compare the performance of the confidence intervals. In addition to the development and study involving confidence intervals, we will also apply the confidence intervals to real biological data that was gathered by the Department of Biology at Stephen F. Austin State University and provide recommendations on the best practice.


Statistical Modeling, Learning And Computing For Stochastic Dynamics Of Complex Systems, Mohammadmahdi Hajiha Dec 2021

Graduate Theses and Dissertations

With the recent advances in sensor technology, it is much easier to collect and store streams of system operational and environmental (SOE) data. These data can be used as input to model the underlying behavior of complex engineered systems and phenomena if appropriate algorithms with well-defined assumptions are developed. This dissertation comprises research work showing the applicability of SOE data when fed into proposed tailored algorithms. The first purpose of these algorithms is to estimate and analyze the reliability of a system, as elaborated in Chapter 2. This chapter provides the derivation of closed-form expressions that …


Integration Of Blockchain Technology Into Automobiles To Prevent And Study The Causes Of Accidents, John Kim Dec 2021

Electronic Theses, Projects, and Dissertations

Automobile collisions occur daily. We now live in an information-driven world, one where technology is quickly evolving. Blockchain technology can change the automotive industry, the safety of the motoring public and its surrounding environment by incorporating this vast array of information. It can place safety and efficiency at the forefront for pedestrians and public establishments, and provide public agencies with pertinent information securely and efficiently. Other industries in which Blockchain technology has been effective include supply chain management, logistics, and banking. This paper reviews some statistical information regarding automobile collisions, Blockchain technology, Smart Contracts, Smart Cities; assesses the feasibility …


A Copula Model Approach To Identify The Differential Gene Expression, Prasansha Liyanaarachchi Dec 2021

Mathematics & Statistics Theses & Dissertations

Deoxyribonucleic acid, more commonly known as DNA, is a complex double helix-shaped molecule present in all living organisms and hosts thousands of genes. However, only a few genes exhibit differential expression and play a vital role in a particular disease such as breast cancer. Microarray technology is one of the modern technologies developed to study these gene expressions. There are two major microarray technologies available for expression analysis: Spotted cDNA array and oligonucleotide array. The focus of our research is the statistical analysis of data that arises from the spotted cDNA microarray. Numerous models have been proposed in the literature …


Gps-Denied Navigation Using Synthetic Aperture Radar Images And Neural Networks, Teresa White Dec 2021

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Unmanned aerial vehicles (UAV) often rely on GPS for navigation. GPS signals, however, are very low in power and easily jammed or otherwise disrupted. This paper presents a method for determining the navigation errors present at the beginning of a GPS-denied period utilizing data from a synthetic aperture radar (SAR) system. This is accomplished by comparing an online-generated SAR image with a reference image obtained a priori. The distortions relative to the reference image are learned and exploited with a convolutional neural network to recover the initial navigational errors, which can be used to recover the true flight trajectory throughout …


Data-Driven Statin Initiation Evaluation And Optimization For Prediabetes Population, Muhenned A. Abdulsahib Dec 2021

Graduate Theses and Dissertations

This dissertation develops quantitative models to support medical decision making of statin initiation considering the uncertainty in disease progression for prediabetes patients. A mathematical model is built to help medical decision-makers decide on statin initiation under uncertainty in future prediabetes progression. The association between the use of cholesterol drugs, such as statins, and elevated glucose levels has attracted considerable attention in the literature. Statin effects on glucose vary with respect to different levels of glucose. The first chapter of this dissertation introduces the problem and an overview of the tools that will be used to solve it. In the second chapter …


Predictors Of Poor Glycemic Control In Diabetic Clients With Mental Health Illness, Community Alliance, Omaha, Nebraska, Rachelle Flick Dec 2021

Capstone Experience

People with severe mental illness tend to die 10-25 years earlier than the general population (WHO). Main contributors to these premature deaths include comorbidities such as hypertension, cardiovascular disease, and diabetes. Diabetes prevalence in mentally ill people is 2 times higher than in the general population (WHO). The World Health Organization is taking action to improve the health of people with severe mental illness. These efforts include creating protocols of prevention, identification, assessment, and treatment for mentally ill people, as well as improving access to general health services through the integration of physical and mental health services. Community Alliance, located in …


Estimating Treatment Effect On Medical Cost And Examining Medical Cost Trajectory Using Splines And Change Point Techniques., Indranil Ghosh Dec 2021

Electronic Theses and Dissertations

In a world of growing medical needs, the cost of healthcare is, alongside clinical outcomes, one of the important aspects to evaluate. The cost of treatment could act as a decisive factor when choosing between two equally effective treatment options. In the literature, the most used quantity for the cost of treatment is the cumulative lifetime cost since the diagnosis of a disease. While it provides a bird's-eye view of the treatment cost, it fails to capture the underlying pattern of the treatment cost trajectory. We developed a marginal structural functional model (MSFM) using an …


Uncertainty Quantification In Deep And Statistical Learning With Applications In Bio-Medical Image Analysis, K. Ruwani M. Fernando Nov 2021

USF Tampa Graduate Theses and Dissertations

Deep Learning (DL) has achieved state-of-the-art performance across a broad spectrum of tasks. From a statistical standpoint, deep neural networks can be construed as universal function approximators. Although statistical modeling and deep learning methods are well-established as independent areas of research, hybridization of the two paradigms via probabilistic deep networks is an emerging trend. Through development of novel analytical methods under the statistical and deep-learning framework, we address some of the major challenges encountered in the design of intelligent systems which include class imbalance learning, probability calibration, uncertainty quantification and high dimensionality. When modeling rare events, existing methodologies require re-sampling …


Regional Expansion And Evaluation Of Potential Chemical Control For Invasive Apple Snails (Pomacea Maculata) In Southwest Louisiana, Julian M. Lucero Nov 2021

LSU Master's Theses

The integration of monitoring and chemical control is an efficient strategy for managing invasive apple snails, Pomacea maculata, in the rice (Oryza sativa L.) and crawfish systems of southwest Louisiana. However, their current distribution, expansion rates, and susceptibility to chemical control methods in this area are not well known. This study evaluated the expansion of P. maculata in southwest Louisiana and assessed potential chemical control for P. maculata among toxicity assays using various application rates. The effects of potential chemical control were also assessed on a non-target species, the red swamp crawfish (Procambarus clarkii). P. maculata …


Differential Privacy For Regression Modeling In Health: An Evaluation Of Algorithms, Joseph Ficek Nov 2021

USF Tampa Graduate Theses and Dissertations

Background: There is a need for rigorous and standardized methods of privacy protection for shared data in the health sciences. Differential privacy is one such method that has gained much popularity due to its versatility and robustness. This study evaluates differential privacy for explanatory regression modeling in the context of health research.

Methods: Surveyed and newly proposed algorithms were evaluated with respect to the accuracy (bias and RMSE) of coefficient estimates, the empirical coverage probability of confidence intervals, and the power and type I error rates of hypothesis tests. Evaluations took place in both simulated and real data from a …


Port Throughput Forecasting Using Arima And Ols Regression: Case Study : Gwangyang Port In Korea, Shin Park Oct 2021

World Maritime University Dissertations

No abstract provided.


Online And Adjusted Human Activities Recognition With Statistical Learning, Yanjia Zhang Oct 2021

USF Tampa Graduate Theses and Dissertations

Wearable human activity recognition (HAR) is a widely applied system in daily life. It has been built into many devices, such as smartphones, smartwatches, activity trackers, and health monitors. Many researchers try to develop a system that requires less memory space and power but still gives fast and accurate classification results. Moreover, having the system adjust the classifier by itself is also a direction of study. In the present study, we applied machine learning methods to both smartphone data and smartwatch data, and introduced an adjusted model for continuously generated data. Further, we also proposed a new HAR system …


Statistical Improvements For Ecological Learning About Spatial Processes, Gaetan L. Dupont Oct 2021

Masters Theses

Ecological inquiry is rooted fundamentally in understanding population abundance, both to develop theory and improve conservation outcomes. Despite this importance, estimating abundance is difficult due to the imperfect detection of individuals in a sample population. Further, accounting for space can provide more biologically realistic inference, shifting the focus from abundance to density and encouraging the exploration of spatial processes. To address these challenges, Spatial Capture-Recapture (“SCR”) has emerged as the most prominent method for estimating density reliably. The SCR model is conceptually straightforward: it combines a spatial model of detection with a point process model of the spatial distribution of …


High-Dimensional Feature Selection And Multi-Level Causal Mediation Analysis With Applications To Human Aging And Cluster-Based Intervention Studies, Hachem Saddiki Oct 2021

Doctoral Dissertations

Many questions in public health and medicine are fundamentally causal in that our objective is to learn the effect of some exposure, randomized or not, on an outcome of interest. As a result, causal inference frameworks and methodologies have gained interest as a promising tool to reliably answer scientific questions. However, the tasks of identifying and efficiently estimating causal effects from observed data still pose significant challenges under complex data generating scenarios. We focus on (1) high-dimensional settings where the number of variables is orders of magnitude higher than the number of observations; and (2) multi-level settings, where study participants …


Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan Oct 2021

Doctoral Dissertations

Carnivores are distributed widely and threatened by habitat loss, poaching, climate change, and disease. They are considered integral to ecosystem function through their direct and indirect interactions with species at different trophic levels. Given the importance of carnivores, it is of high conservation priority to understand the processes driving carnivore assemblages in different systems. It is thus essential to determine the abiotic and biotic drivers of carnivore community composition at different spatial scales and address the following questions: (i) What factors influence carnivore community composition and diversity? (ii) How do the factors influencing carnivore communities vary across spatial and temporal …


Measurement Invariance Across Immigrant And Non-Immigrant Populations On Pisa Cognitive And Non-Cognitive Scales, Maritza Casas Oct 2021

Doctoral Dissertations

International large-scale educational assessments (ILSAs) have played a relevant role in educational policies targeting immigrant students across countries as their results are used by governments as input for decision-making purposes. Given the potential impact that ILSAs can have, the psychometric features of these assessments must be carefully assessed and empirical evidence about the extent to which the inferences made based on test results are valid must be collected. To do so, the first step is to determine if the test results have the same meaning across countries and groups of examinees, that is, if the measures are invariant so that …


Using Generalizability And Rasch Measurement Theory To Ensure Rigorous Measurement In An International Development Education Evaluation, Louise Bahry Oct 2021

Doctoral Dissertations

Between the United States and Great Britain, over 30 billion USD was spent in 2018 on international aid, over a billion of which is dedicated to education programs alone. Recently, there has been increased attention on the rigorous evaluation of aid-funded programs, moving beyond counting outputs to the measurement of educational impact. The current study uses two methodological approaches (Generalizability (Brennan, 1992, 2001) and Rasch Measurement Theory (Andrich, 1978; Rasch, 1980; Wright & Masters, 1982)) to analyze data from math and literacy assessments, and self-report surveys used in an international evaluation of an educational initiative in the Democratic Republic of …


Compound Sums, Their Distributions, And Actuarial Pricing, Ang Li Oct 2021

Electronic Thesis and Dissertation Repository

Compound risk models are widely used in insurance companies to mathematically describe their aggregate amount of losses during a certain time period. However, evaluation of the distribution of compound random variables and the computation of the relevant risk measures are non-trivial. Therefore, the main purpose of this thesis is to study the bounds and simulation methods for both univariate and multivariate compound distributions. The premium setting principles related to dependent multivariate compound distributions are studied.

In the first part of this thesis, we consider the upper and lower bounds of the tail of bivariate compound distributions. Our results extend those …