Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

2021

Articles 1 - 30 of 71

Full-Text Articles in Statistical Models

Approximate Likelihood Based Estimations For Joint Models With Intractable Likelihoods, Karl Stessy M. Bisselou Dec 2021

Theses & Dissertations

This dissertation focuses on the development of approximation approaches for the joint modeling (JM) of repeated measures data and time-to-event data in the presence of analytically or numerically intractable likelihoods. Current likelihood-based inferences for JMs show several limitations including (i) intractability of integrals during marginal likelihood derivations due to the complexity in computations, and (ii) the large number of nuisance parameters (unobserved) posing a problem with convergence. The h-likelihood (HL) and synthetic likelihood (SL) are two computationally efficient estimation approaches that overcome these challenges.

In the presence of extremely high censoring rates, the HL can produce biased parameter estimates. We …


Identification And Characterization Of Forest Fire Risk Zones Leveraging Machine Learning Methods, Joshua Balson, Matt Chinchilla, Cam Lu, Jeff Washburn, Nibhrat Lohia Dec 2021

SMU Data Science Review

Across the United States, record numbers of wildfires are observed, costing billions of dollars in property damage, polluting the environment, and putting lives at risk. The ability of emergency management professionals, city planners, and private entities such as insurance companies to determine if an area is at higher risk of a fire breaking out has never been greater. This paper proposes a novel methodology for identifying and characterizing zones with increased risks of forest fires. The machine learning methods involved use widely available recorded data, thus making it possible to implement the tool quickly.


Comparing Machine Learning Techniques With State-Of-The-Art Parametric Prediction Models For Predicting Soybean Traits, Susweta Ray Dec 2021

Department of Statistics: Dissertations, Theses, and Student Work

Soybean is a significant source of protein and oil and is also widely used as animal feed. Thus, developing lines that are superior in terms of yield, protein, and oil content is important to feed the ever-growing population. In contrast to high-cost phenotyping, genotyping is both cost- and time-efficient for breeders, whereas evaluating new lines in different environments (location-year combinations) can be costly. Several genomic prediction (GP) methods have been developed to use the marker and environment data effectively to predict the yield or other relevant phenotypic traits of crops. Our study compares a conventional GP method (GBLUP), a …


Confidence Interval For The Mean Of A Beta Distribution, Sean Rangel Dec 2021

Electronic Theses and Dissertations

Statistical inference for the mean of a beta distribution has become increasingly popular in various fields of academic research. In this study, we developed a novel statistical model from likelihood-based techniques to evaluate various confidence interval techniques for the mean of a beta distribution. Simulation studies will be implemented to compare the performance of the confidence intervals. In addition to the development and study involving confidence intervals, we will also apply the confidence intervals to real biological data that was gathered by the Department of Biology at Stephen F. Austin State University and provide recommendations on the best practice.


Interpolating Missing Data And Comparing Performance Of Common Interpolation Techniques From A 30-Year Water Quality Dataset, Wako Bungula, Danelle M. Larson Dr., Killian Davis, Richard Erickson Dr., Amber Lee, Casey Mckean, Frederick Miller, Alaina Stockdill, Enrika Hlavacek Nov 2021

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Estimation Analysis For The SEIR Model With Stochastic Perturbation For The COVID-19 Outbreak In Bogotá, Viswanathan Arunachalam, Andres Rios-Gutierrez Nov 2021

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Statistical Modeling Of SARS-CoV-2 Mutation In The U.S., Yuru Jing, Angela Antonou Nov 2021

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Species Abundance Distributions And The Canon Of Classical Music, Noelle Atkin Nov 2021

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Statistical Improvements For Ecological Learning About Spatial Processes, Gaetan L. Dupont Oct 2021

Masters Theses

Ecological inquiry is rooted fundamentally in understanding population abundance, both to develop theory and improve conservation outcomes. Despite this importance, estimating abundance is difficult due to the imperfect detection of individuals in a sample population. Further, accounting for space can provide more biologically realistic inference, shifting the focus from abundance to density and encouraging the exploration of spatial processes. To address these challenges, Spatial Capture-Recapture (“SCR”) has emerged as the most prominent method for estimating density reliably. The SCR model is conceptually straightforward: it combines a spatial model of detection with a point process model of the spatial distribution of …


Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan Oct 2021

Doctoral Dissertations

Carnivores are distributed widely and threatened by habitat loss, poaching, climate change, and disease. They are considered integral to ecosystem function through their direct and indirect interactions with species at different trophic levels. Given the importance of carnivores, it is of high conservation priority to understand the processes driving carnivore assemblages in different systems. It is thus essential to determine the abiotic and biotic drivers of carnivore community composition at different spatial scales and address the following questions: (i) What factors influence carnivore community composition and diversity? (ii) How do the factors influencing carnivore communities vary across spatial and temporal …


Measurement Invariance Across Immigrant And Non-Immigrant Populations On PISA Cognitive And Non-Cognitive Scales, Maritza Casas Oct 2021

Doctoral Dissertations

International large-scale educational assessments (ILSAs) have played a relevant role in educational policies targeting immigrant students across countries, as their results are used by governments as input for decision-making purposes. Given the potential impact that ILSAs can have, the psychometric features of these assessments must be carefully assessed, and empirical evidence about the extent to which the inferences made based on test results are valid must be collected. To do so, the first step is to determine if the test results have the same meaning across countries and groups of examinees, that is, if the measures are invariant so that …


Compound Sums, Their Distributions, And Actuarial Pricing, Ang Li Oct 2021

Electronic Thesis and Dissertation Repository

Compound risk models are widely used by insurance companies to mathematically describe their aggregate amount of losses during a certain time period. However, evaluation of the distribution of compound random variables and the computation of the relevant risk measures are non-trivial. Therefore, the main purpose of this thesis is to study the bounds and simulation methods for both univariate and multivariate compound distributions. The premium setting principles related to dependent multivariate compound distributions are also studied.

In the first part of this thesis, we consider the upper and lower bounds of the tail of bivariate compound distributions. Our results extend those …


Science Is For Everybody: A Resource For Understanding Glaciers, Climate, And Modeling, Emma Watson Oct 2021

Independent Study Project (ISP) Collection

Climate change threatens the existence of glaciers worldwide. In order to properly interact with these changing systems, we must first understand them. Glacial models provide an excellent way to do this; however, the language and mathematical concepts used in their creation are generally inaccessible to a common audience. This project presents an online resource for a general audience to interact with climate science, glaciology, and glacial modeling. Long-term goals for the project include the incorporation of a glacial model of Drangajökull, Vestfirðir, NW Iceland. As such, focus for the project includes a literature review of glaciers, Drangajökull in particular, …


2021 Assessment Of The Status Of The West Coast Demersal Scalefish Resource, David Fairclough, E. A. Fisher, Sybrand Alex Hesp, Ainslie Denham, Rachel Marks Oct 2021

Fisheries research reports

No abstract provided.


Exploring The Relationship Between Mandatory Helmet Use Regulations And Adult Cyclists’ Behavior In California Using Hybrid Machine Learning Models, Fatemeh Davoudi Kakhki, Maria Chierichetti Oct 2021

Mineta Transportation Institute Publications

In California, bike fatalities increased by 8.1% from 2015 to 2016. Even though the benefits of wearing helmets in protecting cyclists against trauma in cycling crashes have been established, the use of helmets is still limited, and there is opposition to mandatory helmet use, particularly for adults. Therefore, exploring perceptions of adult cyclists regarding mandatory helmet use is a key element in understanding cyclists’ behavior and determining the impact of mandatory helmet use on their cycling rate. The goal of this research is to identify sociodemographic characteristics and cycling behaviors that are associated with the use and non-use of bicycle …


Ecological Risk Assessment For The Temperate Demersal Elasmobranch Resource, Department Of Primary Industries And Regional Development, Western Australia Oct 2021

Fisheries research reports

No abstract provided.


Squid And Cuttlefish Resources Of Western Australia, Daniel Yeoh, Danielle J. Johnston PhD, David C. Harris Sep 2021

Fisheries research reports

No abstract provided.


Otoliths Of South-Western Australian Fish: A Photographic Catalogue, Chris Dowling, Kim Smith, Elain Lek, Joshua Brown Sep 2021

Fisheries research reports

No abstract provided.


Determining Malignancy: Can Mammogram Results Help Predict The Diagnosis Of Breast Tumors?, Taylor Behrens Aug 2021

Symposium of Student Scholars

Even with advancements in treatment and preventative care, breast cancer remains an epidemic claiming more than 40,000 American male and female lives each year. The mammogram dataset that I am analyzing was initially compiled in the early 1990s by a team from the University of Wisconsin-Madison. Past research diagnoses breast cancer from fine-needle aspirates. My research focuses on predicting whether we can determine breast cancer diagnoses without the use of invasive procedures and, in particular, whether we can predict breast cancer based on mammogram data. Do measures of gray-scale texture, radius, concavity, perimeter, compactness, area, and smoothness of …


Spatial Analysis Of Landscape Characteristics, Anthropogenic Factors, And Seasonality Effects On Water Quality In Portland, Oregon, Katherine Gelsey, Daniel Ramirez Aug 2021

REU Final Reports

Urban areas often struggle with deteriorated water quality as a result of complex interactions between landscape factors such as land cover, use, and management as well as climatic variables such as weather, precipitation, and atmospheric conditions. Green stormwater infrastructure (GSI) has been introduced as a strategy to reintroduce pre-development hydrological conditions in cities, but questions remain as to how GSI interacts with other landscape factors to affect water quality. We conducted a statistical analysis of six relevant water quality indicators in 131 water quality stations in four watersheds around Portland, Oregon using data from 2015 to 2021. Indiscriminate of station …


Modeling COVID-19 Spread In Small Colleges, Riti Bahl, Nicole Eikmeier, Alexandra Fraser, Matthew Junge, Felicia Keesing, Kukai Nakahata, Lily Reeves Aug 2021

Publications and Research

We develop an agent-based model on a network meant to capture features unique to COVID-19 spread through a small residential college. We find that a safe reopening requires strong policy from administrators combined with cautious behavior from students. Strong policy includes weekly screening tests with quick turnaround and halving the campus population. Cautious behavior from students means wearing facemasks, socializing less, and showing up for COVID-19 testing. We also find that comprehensive testing and facemasks are the most effective single interventions, building closures can lead to infection spikes in other areas depending on student behavior, and faster return of test …


Empirical Fitting Of Periodically Repeating Environmental Data, Pavel Bělík, Andrew Hotchkiss, Brandon Perez, John Zobitz Aug 2021

Spora: A Journal of Biomathematics

We extend and generalize an approach for fitting models to periodically repeating data. Our method first detrends the data from a baseline function and then fits the data to a periodic (trigonometric, polynomial, or piecewise linear) function. The polynomial and piecewise linear functions are developed from assumptions of continuity and differentiability across each time period. We apply this approach to different datasets in the environmental sciences in addition to a synthetic dataset. Overall, the polynomial and piecewise linear approaches developed here performed as well as (or better than) the trigonometric approach when evaluated using statistical measures (R2 …


Modeling Reproduction Influencers Of An Endangered Oak, Camila Cortez Aug 2021

DePaul Discoveries

The endemic oak Quercus brandegeei has been labeled as endangered by the IUCN Red List of Endangered Species due to its limited genetic diversity and lack of regeneration. Oaks (Quercus) are keystone species in many parts of the world and have been facing various challenges to their survival (Westwood 2017), making efforts to support and protect endemic oaks all the more ecologically and socially imperative. There are challenges to identifying threats, as there are many unknown characteristics of Q. brandegeei’s biology that are essential to carrying out conservation efforts. To develop a greater understanding of …


Identification And Characterization Of De Novo Germline TP53 Mutation Carriers In Families With Li-Fraumeni Syndrome, Carlos C. Vera Recio Aug 2021

Dissertations & Theses (Open Access)

Li-Fraumeni syndrome (LFS) is an inherited cancer syndrome caused by a deleterious mutation in TP53. An estimated 48% of LFS patients present due to a de novo mutation (DNM) in TP53. Knowing the mutation status of an LFS patient, DNM or familial mutation (FM), requires genetic testing of both parents, which is often inaccessible, making de novo LFS patients difficult to study. Famdenovo.TP53 is a Mendelian risk prediction model used to predict the DNM status of TP53 mutation carriers based on the cancer-family history and several input genetic parameters, including disease-gene penetrance. The good predictive performance of Famdenovo.TP53 was demonstrated …


Ensemble Data Fitting For Bathymetric Models Informed By Nominal Data, Samantha Zambo Aug 2021

Dissertations

Due to the difficulty and expense of collecting bathymetric data, modeling is the primary tool to produce detailed maps of the ocean floor. Current modeling practices typically utilize only one interpolator; the industry standard is splines-in-tension.

In this dissertation, we introduce a new nominal-informed ensemble interpolator designed to improve modeling accuracy in regions of sparse data. The method is guided by a priori domain knowledge provided by artificially intelligent classifiers. We recast geomorphological classifications, such as ‘seamount’ or ‘ridge’, as nominal data, which we utilize as foundational shapes in an expanded ordinary least squares regression-based algorithm. To our knowledge …


Predictive Modeling Of Clinical Outcomes For Hospitalized COVID-19 Patients Utilizing CyTOF And Clinical Data., Onajia Stubblefield Aug 2021

Electronic Theses and Dissertations

In December 2019, an outbreak of a novel coronavirus initiated a global pandemic. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a virus that causes the disease coronavirus disease 2019 (COVID-19). Symptoms of infection with COVID-19 vary widely between individuals. While some infected individuals are asymptomatic, others need more extensive care and require hospitalization. Indeed, the COVID-19 pandemic was characterized by a shortage of hospital beds which presented additional complications in providing adequate care for patients. In this study, we used a combination of T cell population data collected from mass cytometry analysis and clinical markers to form a predictive …


Bayesian Variable Selection Strategies In Longitudinal Mixture Models And Categorical Regression Problems., Md Nazir Uddin Aug 2021

Electronic Theses and Dissertations

In this work, we seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data. To develop this method, we consider data from the Health and Retirement Survey (HRS) conducted by the University of Michigan. Taking yearly out-of-pocket expenditures as the longitudinal response variable, we consider a Bayesian mixture model with $K$ components. The data consist of a large collection of demographic, financial, and health-related baseline characteristics, and we wish to find a subset of these that impact cluster membership. An initial mixture model without any cluster-level predictors is fit to the data through an MCMC …


On The Estimation Of Heston-Nandi GARCH Using Returns And/Or Options: A Simulation-Based Approach, Xize Ye Jul 2021

Electronic Thesis and Dissertation Repository

In this thesis, the Heston-Nandi GARCH(1,1) (henceforth, HN-GARCH) option pricing model is fitted via four maximum likelihood-based estimation and calibration approaches using simulated returns and/or options. The purpose is to examine the benefits of joint estimation using both returns and options over the fundamental returns-only estimation of GARCH models. From our empirical studies, with the additional option sample, we can improve the efficiency of the estimates for HN-GARCH parameters. Nonetheless, the improvements for the risk premium factor, both in empirical standard errors and in sample RMSEs, are insignificant. In addition, option prices are simulated with a pre-defined noise structure and …


Model-Free Descriptive Modeling For Multivariate Categorical Data With An Ordinal Dependent Variable, Li Wang Jul 2021

Doctoral Dissertations

In the process of statistical modeling, descriptive modeling plays an essential role in accelerating the formulation of plausible hypotheses in the subsequent explanatory modeling and facilitating the selection of potential variables in the subsequent predictive modeling. In particular, for multivariate categorical data analysis, it is desirable to use descriptive modeling methods for uncovering and summarizing the potential association structure among multiple categorical variables in a compact manner. However, many classical methods in this case either rely on strong assumptions for parametric models or become infeasible when the data dimension is higher. To this end, we propose a model-free method …


Evaluating The Efficiency Of Markov Chain Monte Carlo Algorithms, Thuy Scanlon Jul 2021

Graduate Theses and Dissertations

Markov chain Monte Carlo (MCMC) is a simulation technique that produces a Markov chain designed to converge to a stationary distribution. In Bayesian statistics, MCMC is used to obtain samples from a posterior distribution for inference. To ensure the accuracy of estimates using MCMC samples, the convergence to the stationary distribution of an MCMC algorithm has to be checked. As computation time is a resource, optimizing the efficiency of an MCMC algorithm in terms of effective sample size (ESS) per time unit is an important goal for statisticians. In this paper, we use simulation studies to demonstrate how the Gibbs …