Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Articles 31 - 60 of 1348

Full-Text Articles in Statistical Models

Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang Oct 2023

Statistical Science Theses and Dissertations

Spatially resolved transcriptomics (SRT) quantifies expression levels at different spatial locations, providing a new and powerful tool for uncovering novel biological insights. As experimental technologies improve in both capacity and efficiency, demand is growing for new analytical methodologies.

One question in SRT data analysis is how to identify genes whose expression exhibits spatially correlated patterns, called spatially variable (SV) genes. Most current methods for identifying SV genes are built upon a geostatistical model with a Gaussian process, which could limit the models' ability to identify complex spatial patterns. In order to overcome this challenge and capture more types …


Parameter Estimation For Normally Distributed Grouped Data And Clustering Single-Cell RNA Sequencing Data Via The Expectation-Maximization Algorithm, Zahra Aghahosseinalishirazi Sep 2023

Electronic Thesis and Dissertation Repository

The Expectation-Maximization (EM) algorithm is an iterative algorithm for finding the maximum likelihood estimates in problems involving missing data or latent variables. The EM algorithm can be applied to problems consisting of evidently incomplete data or missingness situations, such as truncated distributions, censored or grouped observations, and also to problems in which the missingness of the data is not natural or evident, such as mixed-effects models, mixture models, log-linear models, and latent variables. In Chapter 2 of this thesis, we apply the EM algorithm to grouped data, a problem in which incomplete data are evident. Nowadays, data confidentiality is of …


Dynamic Influence Diagram-Based Deep Reinforcement Learning Framework And Application For Decision Support For Operators In Control Rooms, Joseph Mietkiewicz, Ammar N. Abbas, Chidera Winifred Amazu, Anders L. Madsen, Gabriele Baldissone Sep 2023

Articles

In today’s complex industrial environment, operators are often faced with challenging situations that require quick and accurate decision-making. The human-machine interface (HMI) can display too much information, leading to information overload and potentially compromising the operator’s ability to respond effectively. To address this challenge, decision support models are needed to assist operators in identifying and responding to potential safety incidents. In this paper, we present an experiment to evaluate the effectiveness of a recommendation system in addressing the challenge of information overload. The case study focuses on a formaldehyde production simulator and examines the performance of an improved Human-Machine Interface …


Modelling Long-Term Security Returns, Xinghan Zhu Aug 2023

Electronic Thesis and Dissertation Repository

This research focuses on the concerns of Canadian investors regarding portfolio diversification and preparedness for unexpected risks in retirement planning. It models market crashes and two main financial instruments as independent components to simulate clients’ portfolios. Initially exploring single distributions on mutual funds such as Laplace and t distributions, the research finds limited success. Instead, a normal-Weibull spliced distribution is introduced to model log returns. The Geometric Brownian Motion (GBM) model is employed to predict and evaluate returns on common stocks using the Maximum Likelihood Estimator (MLE), assuming that daily log returns follow a normal distribution. Additionally, the Merton Jump …
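The GBM step mentioned in this abstract can be illustrated with a minimal sketch; it is not the thesis's own code. It assumes prices observed at equal spacing `dt` and i.i.d. normal log returns, as the abstract states. The function name `gbm_mle` and its parameters are hypothetical.

```python
import math


def gbm_mle(prices, dt=1.0):
    """Maximum likelihood estimates of GBM drift and volatility.

    Assumes dS = mu * S dt + sigma * S dW, so each log return over a step
    of length dt is normal with mean (mu - sigma^2 / 2) * dt and variance
    sigma^2 * dt.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(log_returns)
    mean = sum(log_returns) / n
    # The MLE of a normal variance divides by n, not n - 1.
    var = sum((r - mean) ** 2 for r in log_returns) / n
    sigma = math.sqrt(var / dt)
    mu = mean / dt + 0.5 * sigma ** 2
    return mu, sigma
```

With daily prices and `dt = 1/252`, the returned values would be on an annualized scale; that day-count convention is an assumption, not something the abstract specifies.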


Using Geographic Information To Explore Player-Specific Movement And Its Effects On Play Success In The NFL, Hayley Horn, Eric Laigaie, Alexander Lopez, Shravan Reddy Aug 2023

SMU Data Science Review

American Football is a billion-dollar industry in the United States. The analytical aspect of the sport is an ever-growing domain, with open-source competitions like the NFL Big Data Bowl accelerating this growth. With the amount of player movement during each play, tracking data can prove valuable in many areas of football analytics. While concussion detection, catch recognition, and completion percentage prediction are all existing use cases for this data, player-specific movement attributes, such as speed and agility, may be helpful in predicting play success. This research calculates player-specific speed and agility attributes from tracking data and supplements them with descriptive …


Forecasting COVID-19 With Temporal Hierarchies And Ensemble Methods, Li Shandross Aug 2023

Masters Theses

Infectious disease forecasting efforts underwent rapid growth during the COVID-19 pandemic, providing guidance for pandemic response and insight into potential future trends. Yet despite their importance, short-term forecasting models often struggled to produce accurate real-time predictions of this complex and rapidly changing system. This gap in accuracy persisted into the pandemic and warrants the exploration and testing of new methods to glean fresh insights.

In this work, we examined the application of the temporal hierarchical forecasting (THieF) methodology to probabilistic forecasts of COVID-19 incident hospital admissions in the United States. THieF is an innovative forecasting technique that aggregates time-series data into …


Indirect Aggression And Victimization: Investigating Instrument Psychometrics, Gender Differences, And Its Relationship To Social Information Processing, Taylor Steeves Aug 2023

Electronic Theses and Dissertations

The study of indirect bullying behaviors, relational aggression and social aggression, has been of theoretical importance and interest to researchers and psychologists within the last few decades. In this investigation, using a convenience sample of 451 late adolescents attending a private university in the mid-Atlantic U.S., I examined the factor structure of two measures of indirect bullying, the Young Adult Social Behavior Scale – Victim (YASB-V) and the Young Adult Social Behavior Scale – Perpetrator (YASB-P). Using confirmatory factor analysis (CFA), I found that the YASB-V comprised a four-factor model, differing from the model that had been identified in the …


Statistical Inference On Lung Cancer Screening Using The National Lung Screening Trial Data., Farhin Rahman Aug 2023

Electronic Theses and Dissertations

This dissertation consists of three research projects on cancer screening probability modeling. In these projects, the three key modeling parameters (sensitivity, sojourn time, transition density) for cancer screening were estimated, and the long-term outcomes (including overdiagnosis as one outcome), the optimal screening time/age, the lead time distribution, and the probability of overdiagnosis at the future screening time were simulated to provide a statistical perspective on the effectiveness of cancer screening programs. In the first part of this dissertation, a statistical inference was conducted for male and female smokers using the National Lung Screening Trial (NLST) chest X-ray data. A …


A Framework For Statistical Modeling Of Wind Speed And Wind Direction, Eva Murphy Aug 2023

All Dissertations

Atmospheric near-surface wind speed and wind direction play an important role in many applications, ranging from air quality modeling, building design, and wind turbine placement to climate change research. It is therefore crucial to accurately estimate the joint probability distribution of wind speed and direction. This dissertation aims to provide a modeling framework for studying the variation of wind speed and wind direction. To this end, three projects are conducted to address some of the key issues for modeling wind vectors.

First, a conditional decomposition approach is developed to model the joint distribution of wind speed and direction. Specifically, the …


Statistical Graph Quality Analysis Of Utah State University Master Of Science Thesis Reports, Ragan Astle Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Graphical software packages have become increasingly popular in our modern world, but there are concerns within the statistical visualization field about the default settings provided by these packages, which can make it challenging to create good quality graphs that align with standard graph principles. In this thesis, we investigate whether the quality of graphs from Utah State University (USU) Plan A Master of Science (MS) thesis reports from the years 1930 to 2019 was affected by the rise of graphical software packages. We collected all data stored on the USU Digital Commons website since November 2021 to determine the specific …


Stressor: An R Package For Benchmarking Machine Learning Models, Samuel A. Haycock Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many discipline-specific researchers need a way to quickly compare the accuracy of their predictive models to other alternatives. However, many of these researchers are not experienced with multiple programming languages. Python has recently been the leader in machine learning functionality, which includes the PyCaret library that allows users to develop high-performing machine learning models with only a few lines of code. The goal of the stressor package is to help users of the R programming language access the advantages of PyCaret without having to learn Python. This allows the user to leverage R’s powerful data analysis workflows, while simultaneously …


Modeling Biphasic, Non-Sigmoidal Dose-Response Relationships: Comparison Of Brain-Cousens And Cedergreen Models For A Biochemical Dataset, Venkat D. Abbaraju, Tamaraty L. Robinson, Brian P. Weiser Aug 2023

Rowan-Virtua School of Osteopathic Medicine Faculty Scholarship

Biphasic, non-sigmoidal dose-response relationships are frequently observed in biochemistry and pharmacology, but they are not always analyzed with appropriate statistical methods. Here, we examine curve fitting methods for “hormetic” dose-response relationships where low and high doses of an effector produce opposite responses. We provide the full dataset used for modeling, and we provide the code for analyzing the dataset in SAS using two established mathematical models of hormesis, the Brain-Cousens model and the Cedergreen model. We show how to obtain and interpret curve parameters such as the ED50 that arise from modeling, and we discuss how curve parameters might change …


A Comparison Of Confidence Intervals In State Space Models, Jinyu Du Jul 2023

Statistical Science Theses and Dissertations

This thesis develops general procedures for constructing confidence intervals (CIs) of the error disturbance parameters (standard deviations) and transformations of the error disturbance parameters in time-invariant state space models (ssm). With only a set of observations, estimating individual error disturbance parameters accurately in the presence of other unknown parameters in ssm is a very challenging problem. We attempted to construct four different types of confidence intervals (Wald, likelihood ratio, score, and higher-order asymptotic intervals) for both the simple local level model and the general time-invariant ssm. We show that for a simple local level model, both the …


On Image Response Regression With High-Dimensional Data, Noah Fuerth Jun 2023

Major Papers

A recent issue in statistical analysis is modelling data when the effect variable changes at different locations. This can be difficult to accomplish when the dimensions of the covariates are very high, and when the domain of the varying coefficient functions of predictors is not necessarily regular. This research paper will investigate a method to overcome these challenges by approximating the varying coefficient functions using bivariate splines. We do this by splitting the domain of the varying coefficient functions into a number of triangles, and building the bivariate spline functions based on this triangulation. This major paper will outline detailed …


Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici Jun 2023

Electronic Thesis and Dissertation Repository

Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examined the impact of correlation on the likelihood ratio test (LRT), …


Analytical Approach For Monitoring The Behavior Of Patients With Pancreatic Adenocarcinoma At Different Stages As A Function Of Time, Aditya Chakaborty Dr, Chris P. Tsokos Dr May 2023

Biology and Medicine Through Mathematics Conference

No abstract provided.


Predicting Dengue Incidence In Central Argentina Using Google Trends Data, Sahil Chindal, Elizabet Estallo, Yanjun Qian, Michael Robert May 2023

Biology and Medicine Through Mathematics Conference

No abstract provided.


Public Acceptance Of Guidance And Regulations For Space Flight Participation, Cory Trunkhill, Robert Joslin, Joseph Keebler May 2023

Journal of Aviation Technology and Engineering

Space flight participants are not professional astronauts and not subject to the rules and guidance covering space flight crewmembers. Ordinal logistic regression of survey data was utilized to explore public acceptance of current medical screening recommendations and regulations for safety risk and implied liability for civil space flight participation. Independent variables constituted participant demographic representations while dependent variables represented current Federal Aviation Administration guidance and regulations. Odds ratios were derived based on the demographic categories to interpret likelihood of acceptance for the criteria. Significant likely acceptance of guidance and regulations was found for five of twelve demographic variables influencing public …


Evaluating Models Of Scanpath Prediction, Matthias Kümmerer, Matthias Bethge May 2023

MODVIS Workshop

No abstract provided.


Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile May 2023

Statistical Science Theses and Dissertations

Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …


Movie Recommender System Using Matrix Factorization, Roland Fiagbe May 2023

Data Science and Data Mining

Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. However, our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …
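As an illustration of the matrix factorization approach named in this abstract, the following minimal sketch fits user and item factor vectors by stochastic gradient descent on observed ratings. It is not the project's own code: the function names `factorize` and `rmse`, the hyperparameter defaults, and the L2 regularization are all assumptions made for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Fit rating ~ dot(P[user], Q[item]) on (user, item, rating) triples."""
    rng = random.Random(seed)
    # Small random starting factors; k is the number of latent dimensions.
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, r in order:
            pred = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with an L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given triples."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5
```

A predicted rating for an unwatched movie is then the dot product of the corresponding rows of `P` and `Q`. In practice the error would be measured on held-out ratings rather than on the training triples.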


Comparing Hierarchical Data Structures And Hierarchical Data Analysis, Halley Jeanne Dante, Robert Rovetti May 2023

Honors Thesis

Real-world data is inherently noisy, and data analysis can be especially complex when noise is compounded in hierarchical and multilevel data structures. Since such data structures can be described using multiple approaches, the way data is collapsed and grouped within these structures can influence its resulting interpretation and analyses. To avoid discrepancies in data collapsing and grouping, multiple statistical approaches have been developed specifically to analyze multilevel data structures. Examples of multilevel statistical models are the two-factor ANOVA and the general linear model with repeated-measures (GLM-RR) which is typically used in the context of looking at change over time. …


Factors Affecting Apothecia Production And Primary Infection By Monilinia Vaccinii-Corymbosi On Vaccinium Angustifolium, Ian Leonard May 2023

Electronic Theses and Dissertations

Mummy berry, caused by Monilinia vaccinii-corymbosi (MVC), is a prolific disease of Vaccinium angustifolium (wild blueberry) leading to decreased yield in wild blueberry fields throughout the Downeast (DE) and Midcoast (MC) regions of Maine (ME). This study aimed to identify factors affecting primary inoculum production and infection by MVC on wild blueberry, and what bud stages of wild blueberry are most susceptible to infection. Through common garden (CGE), field and incubation experiments conducted in 2021 and 2022, factors affecting carpogenic germination of MVC pseudosclerotia and relationships between susceptible wild blueberry buds and environmental factors were analyzed. The CGE conducted in …


A Probabilistic Exploration Of Food Supplementation And Assistance, Logan Mattingly May 2023

Honors College Theses

Food insecurity is a stark threat that affects households throughout our country. Dietary insufficiency manifests itself in ways that affect health and public safety. According to researchers, individuals who suffer from food insecurity have a higher risk of aggression, anxiety, suicide ideation and depression. These problems tend to be unequally distributed, falling more heavily on households with lower income. In this work, an exploratory analysis within these data sets will be performed to examine the socio-economic, biographical, nutritional, and geographical principal components of food insecurity among survey participants and how the US Supplemental Nutrition Assistance Program (SNAP) affects …


Hispanic Human Capital And Financial Aid Application In The West Census Region, Benjamin Lundy-Paine May 2023

Capstone Projects and Master's Theses

As of 2021, very few Hispanic residents in the United States held a college degree in comparison to non-Hispanic residents. Research has shown that, particularly for Hispanic students, financial aid increases college persistence. Hispanic Free Application for Federal Student Aid (FAFSA) submission rates rank among the lowest, preventing many Hispanic students from receiving financial assistance. This issue is most prevalent in the West Census Region (WCR), where there is the highest concentration of Hispanic residents. To understand what barriers may be preventing Hispanic submission in the WCR, this Capstone used logistic regression models to analyze student-level data from the National Center for …


Multidimensional Investigation Of Tennessee’s Urban Forest, Jillian L. Gorrell May 2023

Doctoral Dissertations

Preserving existing trees in urban areas and properly cultivating urban forest conservation and management opportunities is valuable to the ever-growing urban environment and necessary for creating optimal experiences and educational tools to meet the needs of increasing urban populations. This dissertation contains studies investigating several facets of the urban forest, including environmental effects of deforestation and urbanization, tree equity, and urban forest facility management and accessibility. Community education and outreach at arboreta about the importance of the tree canopy can help promote environmental stewardship. A digital questionnaire was electronically distributed to representatives of arboreta certified through the Tennessee Division of …


An Analysis Of Changes In Seasonal Dynamics And Generational Differences In The Maine Lobster Fishery, Emily Fitting May 2023

Electronic Theses and Dissertations

The American lobster (Homarus americanus) supports the most valuable single species fishery in the US. Lobster landings have been increasing steadily for the last three decades, but before that landings were more variable. The high value of the lobster fishery combined with the decline of other commercially important species in this region has created increasing dependence on the resource, and previous research questions the resilience of the fishery in the face of social and environmental changes.

Important lobster life history processes, including migration patterns, growth rates, and reproduction, are driven by ocean bottom temperature, which creates a strong seasonal cycle …


Baseball’s Evolution In The 21st Century, And How It Exemplifies Human Response To Change, Jonathan Sharpe May 2023

Honors Projects

The game of baseball has changed a lot in the past twenty years. This can be primarily attributed to the explosion in data analytics and how they are used to evaluate baseball players. This led to different player profiles being preferred and eventually changed how players are developed. As a result, the strategies employed have also evolved, producing a different game than the one seen only a couple of decades ago. This paper will explore the changes that the game has seen. On the other hand, Major League Baseball has also implemented its own changes to try and …


Effects Of Functional Network Model Definition On Biomarker Outcome Prediction, Xinyang Feng May 2023

Arts & Sciences Electronic Theses and Dissertations

Machine learning (ML) models are widely used to investigate the human connectome and to predict and understand behavior, emotion, and cognition. Prior research has organized pediatric connectome data using adult functional network models. However, this assumes that adult functional network models are appropriate and useful for predicting developmental outcomes from pediatric connectome data. We hypothesize that the application of adult brain network models could result in poor model fit, limiting the generalizability of results. Here, we test whether prediction of biological age is improved by concordant brain network models matching underlying functional connectome data. To quantify the difference in age …


UConn Baseball Batting Order Optimization, Gavin Rublewski May 2023

Honors Scholar Theses

Challenging conventional wisdom is at the very core of baseball analytics. Using data and statistical analysis, the sets of rules by which coaches make decisions can be justified, or possibly refuted. One of those sets of rules relates to the construction of a batting order. Through data collection, data adjustment, the construction of a baseball simulator, and the use of a Monte Carlo Simulation, I have assessed thousands of possible batting orders to determine the roster-specific strategies that lead to optimal run production for the 2023 UConn baseball team. This paper details a repeatable process in which basic player statistics …