Probability Commons

Statistical Models

Articles 1 - 30 of 49

Full-Text Articles in Probability

Multiscale Modelling Of Brain Networks And The Analysis Of Dynamic Processes In Neurodegenerative Disorders, Hina Shaheen Jan 2024

Theses and Dissertations (Comprehensive)

The complex nature of the human brain, with its intricate organic structure and multiscale spatio-temporal characteristics ranging from synapses to the entire brain, presents a major obstacle in brain modelling. Capturing this complexity poses a significant challenge for researchers. The complex interplay of coupled multiphysics and biochemical activities within this intricate system shapes the brain's capacity, functioning within a structure-function relationship that necessitates a specific mathematical framework. Advanced mathematical modelling approaches that incorporate the coupling of brain networks and the analysis of dynamic processes are essential for advancing therapeutic strategies aimed at treating neurodegenerative diseases (NDDs), which afflict millions of …


Exploration And Statistical Modeling Of Profit, Caleb Gibson Dec 2023

Undergraduate Honors Theses

For any company involved in sales, maximization of profit is the driving force that guides all decision-making. Many factors can influence how profitable a company can be, including external factors such as changes in inflation or consumer demand and internal factors such as pricing and product cost. By understanding specific trends in its own internal data, a company can readily identify problem areas or potential growth opportunities to help increase profitability.

In this discussion, we use an extensive data set to examine how a company might analyze its own data to identify potential changes it could investigate to drive better performance. Based …


New Developments On The Estimability And The Estimation Of Phase-Type Actuarial Models, Cong Nie Jul 2022

Electronic Thesis and Dissertation Repository

This thesis studies the estimability and the estimation methods for two models based on Markov processes: the phase-type aging model (PTAM), which models the human aging process, and the discrete multivariate phase-type model (DMPTM), which can be used to model multivariate insurance claim processes.

The principal contributions of this thesis can be categorized into two areas. First, an objective measure of estimability is proposed to quantify estimability in the context of statistical models. Existing methods for assessing estimability require the subjective specification of thresholds, which potentially limits their usefulness. Unlike these methods, the proposed measure of estimability is objective. In …


Advancements In Gaussian Process Learning For Uncertainty Quantification, John C. Nicholson May 2022

All Dissertations

Gaussian processes are among the most useful tools for modeling continuous processes in machine learning and statistics. The research presented provides advancements in uncertainty quantification using Gaussian processes from two distinct perspectives. The first provides a more fundamental means of constructing Gaussian processes that satisfy arbitrary linear operator constraints in a much more general framework than its predecessors; the second concerns the calibration of state-aware parameters in computer models. If the value of a process is known at a finite collection of points, one may use Gaussian processes to construct a surface which interpolates these values to …
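
As a rough, self-contained sketch of the interpolation idea mentioned in this abstract (my own illustration, not code from the dissertation), the snippet below conditions a Gaussian process with an assumed squared-exponential kernel on a few noise-free observations and evaluates the posterior mean and standard deviation at new points; the kernel hyperparameters and data are arbitrary choices.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

# Observed (noise-free) values of the process at a few points.
x_obs = np.array([0.0, 0.4, 1.0, 1.7, 2.5])
y_obs = np.sin(2 * x_obs)

# Points at which we want the interpolating surface.
x_new = np.linspace(0.0, 2.5, 6)

K = rbf_kernel(x_obs, x_obs) + 1e-10 * np.eye(len(x_obs))  # jitter for stability
K_s = rbf_kernel(x_new, x_obs)
K_ss = rbf_kernel(x_new, x_new)

post_mean = K_s @ np.linalg.solve(K, y_obs)                # interpolates y_obs
post_cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
post_sd = np.sqrt(np.clip(np.diag(post_cov), 0.0, None))   # pointwise uncertainty

for x, m, s in zip(x_new, post_mean, post_sd):
    print(f"x={x:4.2f}  mean={m:+.3f}  sd={s:.3f}")
```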


Early-Warning Alert Systems For Financial-Instability Detection: An Hmm-Driven Approach, Xing Gu Apr 2022

Electronic Thesis and Dissertation Repository

Regulators’ early intervention is crucial when the financial system is experiencing difficulties. Financial stability must be preserved to avert bank bailouts, which hugely drain government financial resources. Detecting periods of financial crisis in advance entails the development and customisation of accurate and robust quantitative techniques. The goal of this thesis is to construct automated systems, via the interplay of various mathematical and statistical methodologies, to signal financial-instability episodes over the near-term horizon. These signal alerts could provide regulatory bodies with the capacity to initiate an appropriate response that will thwart or at least minimise the occurrence of a financial crisis. …


Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano Apr 2022

Electrical and Computer Engineering ETDs

Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …


Identification And Characterization Of De Novo Germline Tp53 Mutation Carriers In Families With Li-Fraumeni Syndrome, Carlos C. Vera Recio Aug 2021

Dissertations & Theses (Open Access)

Li-Fraumeni syndrome (LFS) is an inherited cancer syndrome caused by a deleterious mutation in TP53. An estimated 48% of LFS patients present due to a de novo mutation (DNM) in TP53. Knowledge of a patient's DNM status, i.e., DNM or familial mutation (FM), requires genetic testing of both parents, which is often inaccessible, making de novo LFS patients difficult to study. Famdenovo.TP53 is a Mendelian risk prediction model used to predict the DNM status of TP53 mutation carriers based on cancer-family history and several input genetic parameters, including disease-gene penetrance. The good predictive performance of Famdenovo.TP53 was demonstrated …


Bayesian Variable Selection Strategies In Longitudinal Mixture Models And Categorical Regression Problems., Md Nazir Uddin Aug 2021

Electronic Theses and Dissertations

In this work, we seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data. To develop this method, we consider data from the Health and Retirement Survey (HRS) conducted by the University of Michigan. Taking yearly out-of-pocket expenditures as the longitudinal response variable, we consider a Bayesian mixture model with $K$ components. The data consist of a large collection of demographic, financial, and health-related baseline characteristics, and we wish to find a subset of these that impact cluster membership. An initial mixture model without any cluster-level predictors is fit to the data through an MCMC …


Evaluating The Efficiency Of Markov Chain Monte Carlo Algorithms, Thuy Scanlon Jul 2021

Graduate Theses and Dissertations

Markov chain Monte Carlo (MCMC) is a simulation technique that produces a Markov chain designed to converge to a stationary distribution. In Bayesian statistics, MCMC is used to obtain samples from a posterior distribution for inference. To ensure the accuracy of estimates using MCMC samples, the convergence to the stationary distribution of an MCMC algorithm has to be checked. As computation time is a resource, optimizing the efficiency of an MCMC algorithm in terms of effective sample size (ESS) per time unit is an important goal for statisticians. In this paper, we use simulation studies to demonstrate how the Gibbs …
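
As a minimal, generic illustration of the efficiency metric described in this abstract (not the paper's simulation studies), the sketch below estimates effective sample size from a chain's sample autocorrelations, truncating the sum at the first non-positive lag, and divides by wall-clock time; the AR(1) "chain" and its parameters are arbitrary stand-ins for real MCMC output.

```python
import time
import numpy as np

def effective_sample_size(chain):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = len(x)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    tau = 1.0
    for k in range(1, n):
        if acf[k] <= 0:          # simple truncation rule
            break
        tau += 2.0 * acf[k]
    return n / tau

# Toy "MCMC output": an AR(1) chain with lag-1 autocorrelation 0.9.
rng = np.random.default_rng(0)
start = time.perf_counter()
chain = np.empty(50_000)
chain[0] = 0.0
for t in range(1, len(chain)):
    chain[t] = 0.9 * chain[t - 1] + rng.normal()
elapsed = time.perf_counter() - start

ess = effective_sample_size(chain)
print(f"ESS ~ {ess:.0f}, ESS per second ~ {ess / elapsed:.0f}")
```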


Statistical Analysis Of 2017-18 Premier League Match Statistics Using A Regression Analysis In R, Bergen Campbell May 2021

Undergraduate Theses and Capstone Projects

This thesis analyzes the correlation between a team’s statistics and the success of its performances, and develops a predictive model that can be used to forecast final season results for that team. Data from the 2017-2018 Premier League season are gathered and broken down in R to highlight which factors and variables contribute most to the success or downfall of a team. A multiple linear regression model and a stepwise selection process are then used to include any factors that are significant in predicting match results.

The predictions about the 17-18 season results based on the model …
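
The abstract above mentions fitting a multiple linear regression with stepwise selection in R. As a generic, language-shifted illustration (Python with synthetic stand-in match statistics, not the thesis's data or final model), the sketch below runs a greedy forward selection that keeps adding predictors while adjusted R² improves.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for per-match team statistics (not the thesis's data).
n = 380
stats = {
    "shots_on_target": rng.normal(5, 2, n),
    "possession":      rng.normal(50, 8, n),
    "fouls":           rng.normal(11, 3, n),
    "corners":         rng.normal(5, 2, n),
}
points = 0.6 * stats["shots_on_target"] + 0.03 * stats["possession"] + rng.normal(0, 1, n)

def adjusted_r2(X, y):
    """Fit OLS with an intercept and return adjusted R^2."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    p = X1.shape[1] - 1
    return 1 - (1 - r2) * (len(y) - 1) / (len(y) - p - 1)

# Greedy forward selection: add the variable that most improves adjusted R^2.
selected, best = [], -np.inf
while True:
    candidates = [v for v in stats if v not in selected]
    scores = {v: adjusted_r2(np.column_stack([stats[s] for s in selected + [v]]), points)
              for v in candidates}
    if not scores or max(scores.values()) <= best:
        break
    best_var = max(scores, key=scores.get)
    best = scores[best_var]
    selected.append(best_var)

print("selected:", selected, "adjusted R^2:", round(best, 3))
```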


Markov Chains And Their Applications, Fariha Mahfuz Apr 2021

Math Theses

A Markov chain is a stochastic model that is used to predict future events. Markov chains are relatively simple since they require only information about the present state to predict future states. In this paper we go over the basic concepts of Markov chains and several of their applications, including the Google PageRank algorithm, weather prediction, and the gambler's ruin.

We examine how the Google PageRank algorithm works efficiently to provide a PageRank for a Google search result. We also show how a Markov chain can be used to predict the weather by building a model from real-life data.
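
As a small, self-contained illustration of the PageRank application mentioned above (my own toy example, not the thesis's), the sketch below runs power iteration on the Google matrix of a four-page web graph; the link structure and the damping factor 0.85 are arbitrary but conventional choices.

```python
import numpy as np

# Toy web graph: page i links to the pages listed in links[i].
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n = len(links)

# Column-stochastic link matrix: column j spreads page j's rank over its out-links.
M = np.zeros((n, n))
for j, outs in links.items():
    for i in outs:
        M[i, j] = 1.0 / len(outs)

d = 0.85                                   # damping factor
G = d * M + (1 - d) / n * np.ones((n, n))  # Google matrix (column-stochastic)

# Power iteration: repeatedly apply G to a uniform starting distribution.
r = np.full(n, 1.0 / n)
for _ in range(100):
    r_next = G @ r
    if np.abs(r_next - r).sum() < 1e-12:
        break
    r = r_next

print("PageRank:", np.round(r, 4), "sum =", round(r.sum(), 6))
```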


The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications, Zhenxian Gong Feb 2021

Electronic Thesis and Dissertation Repository

Financial markets and instruments are continuously evolving, displaying new and more refined stylized facts. This requires regular reviews and empirical evaluations of advanced models. There is evidence in the literature that supports stochastic volatility models over constant volatility models in capturing stylized facts such as the "smile" and "skew" seen in implied volatility surfaces. In this thesis, we target commodity and volatility-index markets and develop a novel stochastic volatility model that incorporates a mean-reverting property and a 4/2 stochastic volatility process. Commodities and volatility indexes have been shown to be mean-reverting, which means their prices tend to revert to their long-term mean …
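
As a rough numerical sketch of the ingredients named in this abstract, and explicitly not the thesis's exact model, the code below Euler-discretizes a generic 4/2-type setup: a CIR variance factor v, an instantaneous volatility of the assumed form a*sqrt(v) + b/sqrt(v), and a mean-reverting (Ornstein-Uhlenbeck) log-price. All parameter values and the precise dynamics are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameters (assumptions, not estimates from the thesis).
kappa_v, theta_v, sigma_v = 2.0, 0.04, 0.3   # CIR variance factor
a, b = 0.9, 0.02                             # 4/2 volatility weights
kappa_x, theta_x = 1.5, np.log(60.0)         # OU mean reversion of the log-price
rho = -0.5                                   # leverage correlation

T, n_steps = 1.0, 252
dt = T / n_steps

v = np.empty(n_steps + 1); v[0] = theta_v
x = np.empty(n_steps + 1); x[0] = theta_x    # log-price

for t in range(n_steps):
    z1 = rng.normal()
    z2 = rho * z1 + np.sqrt(1 - rho ** 2) * rng.normal()
    vol = a * np.sqrt(v[t]) + b / np.sqrt(v[t])          # 4/2-type volatility
    x[t + 1] = x[t] + kappa_x * (theta_x - x[t]) * dt + vol * np.sqrt(dt) * z1
    # Full-truncation Euler step for the CIR factor keeps v nonnegative.
    v[t + 1] = max(v[t] + kappa_v * (theta_v - v[t]) * dt
                   + sigma_v * np.sqrt(max(v[t], 0.0)) * np.sqrt(dt) * z2, 1e-8)

price = np.exp(x)
print(f"final price ~ {price[-1]:.2f}, average 4/2 vol ~ "
      f"{np.mean(a * np.sqrt(v) + b / np.sqrt(v)):.3f}")
```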


Modified Firearm Discharge Residue Analysis Utilizing Advanced Analytical Techniques, Complexing Agents, And Quantum Chemical Calculations, William J. Feeney Jan 2021

Graduate Theses, Dissertations, and Problem Reports

The use of gunshot residue (GSR) or firearm discharge residue (FDR) evidence faces some challenges because of instrumental and analytical limitations and the difficulties in evaluating and communicating evidentiary value. For instance, the categorization of GSR based only on elemental analysis of single, spherical particles is becoming insufficient because newer ammunition formulations produce residues with varying particle morphology and composition. Also, one common criticism of GSR practitioners is that their reports focus on the presence or absence of GSR in an item without providing an assessment of the weight of the evidence. Such reports leave the end-user with unanswered questions, …


Bayesian Techniques For Relating Genetic Polymorphisms To Diffusion Tensor Images Of Cocaine Users, Tmader Alballa Jan 2021

Theses and Dissertations

Past investigations utilizing Diffusion Tensor Imaging (DTI) have demonstrated that cocaine use disorder (CUD) yields white matter changes. We propose three Bayesian techniques to explore the relationship between Fractional Anisotropy (FA), genetic data, and years of cocaine use (YCU). CUD participants exhibit abnormalities, as measured by DTI, in different areas of the brain relative to non-drug-using controls. This dissertation is motivated by a neuroimaging genetics study in cocaine dependence, which found relationships between CUD and several genes, such as GAD and 5-HT2R.

In the first chapter, there is background on the …


Evaluating An Ordinal Output Using Data Modeling, Algorithmic Modeling, And Numerical Analysis, Martin Keagan Wynne Brown Jan 2020

Murray State Theses and Dissertations

Data and algorithmic modeling are two different approaches used in predictive analytics. The models discussed from these two approaches include the proportional odds logit model (POLR), the vector generalized linear model (VGLM), the classification and regression tree model (CART), and the random forests model (RF). Patterns in the data were analyzed using trigonometric polynomial approximations and Fast Fourier Transforms. Predictive modeling is used frequently in statistics and data science to find the relationship between the explanatory (input) variables and a response (output) variable. Both approaches prove advantageous in different cases depending on the data set. In our case, the data …
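
As a generic illustration of the algorithmic-modeling side described above (not the thesis's data, models, or tuning), the sketch below fits a classification tree and a random forest from scikit-learn to a synthetic ordinal response and compares their holdout accuracy; the simulated features and class thresholds are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)

# Synthetic data: a latent score cut into an ordinal response with 3 levels.
n = 2000
X = rng.normal(size=(n, 5))
latent = 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.8, size=n)
y = np.digitize(latent, bins=[-1.0, 1.0])   # ordinal labels 0 < 1 < 2

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

cart = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

print(f"CART accuracy: {cart.score(X_test, y_test):.3f}")
print(f"RF accuracy:   {rf.score(X_test, y_test):.3f}")
```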


How Machine Learning And Probability Concepts Can Improve Nba Player Evaluation, Harrison Miller Jan 2020

CMC Senior Theses

In this paper I break down a scholarly article, written by Sameer K. Deshpande and Shane T. Jensen, that proposed a new method to evaluate NBA players. The NBA (National Basketball Association) is the highest-level professional basketball league in America. The authors proposed a model that estimates how NBA players impact their team's chances of winning a game, using machine learning and probability concepts. I preface that by diving into these concepts and their mathematical backgrounds. These concepts include building a linear model using the ordinary least squares method, the bias …


Allocative Poisson Factorization For Computational Social Science, Aaron Schein Jul 2019

Doctoral Dissertations

Social science data often comes in the form of high-dimensional discrete data such as categorical survey responses, social interaction records, or text. These data sets exhibit high degrees of sparsity, missingness, overdispersion, and burstiness, all of which present challenges to traditional statistical modeling techniques. The framework of Poisson factorization (PF) has emerged in recent years as a natural way to model high-dimensional discrete data sets. This framework assumes that each observed count in a data set is a Poisson random variable $y \sim \mathrm{Pois}(\mu)$ whose rate parameter $\mu$ is a function of shared model parameters. This thesis examines a specific …
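
As a minimal generative sketch of the Poisson factorization setup described above (a toy under assumed gamma priors, not the models developed in the thesis), the code below draws nonnegative row and column factors and samples each count as y ~ Pois(mu), with mu given by an inner product of shared parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

n_rows, n_cols, n_factors = 50, 40, 5     # e.g., respondents x survey items

# Gamma priors keep the latent factors nonnegative (shape/scale are arbitrary).
theta = rng.gamma(shape=0.5, scale=1.0, size=(n_rows, n_factors))
beta = rng.gamma(shape=0.5, scale=1.0, size=(n_cols, n_factors))

# Each count is Poisson with rate mu_ij = sum_k theta_ik * beta_jk.
mu = theta @ beta.T
Y = rng.poisson(mu)

print("count matrix shape:", Y.shape)
print("fraction of zeros (sparsity):", np.mean(Y == 0).round(3))
print("mean count:", Y.mean().round(3))
```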


Paper Structure Formation Simulation, Tyler R. Seekins May 2019

Electronic Theses and Dissertations

On the surface, paper appears simple, but closer inspection yields a rich collection of chaotic dynamics and random variables. Predictive simulation of paper product properties is desirable for screening candidate experiments and optimizing recipes but existing models are inadequate for practical use. We present a novel structure simulation and generation system designed to narrow the gap between mathematical model and practical prediction. Realistic inputs to the system are preserved as randomly distributed variables. Rapid fiber placement (~1 second/fiber) is achieved with probabilistic approximation of chaotic fluid dynamics and minimization of potential energy to determine flexible fiber conformations. Resulting digital packed …


Predictive Distributions Via Filtered Historical Simulation For Financial Risk Management, Tyson Clark May 2019

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Filtered historical simulation with an underlying GARCH process can be used as a valuable tool in VaR analysis, as it derives risk estimates that are sensitive to the distributional properties of the historical data through the produced predictive density. I examine the applications to risk analysis that filtered historical simulation can provide, as well as an interpretation of the predictive density as a poor man’s Bayesian posterior distribution. The predictive density allows us to make associated probabilistic statements regarding the results of VaR analysis, giving a fuller measurement of risk and the ability to maintain the optimal level of risk per …
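
As a compact sketch of the filtered-historical-simulation mechanics (simulated returns and fixed, assumed GARCH(1,1) parameters rather than fitted ones, so not the report's actual analysis), the code below filters returns into standardized residuals, bootstraps them, rescales by the one-step-ahead volatility forecast, and reads off a one-day 99% VaR.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy daily return series (stand-in for historical data).
returns = rng.standard_t(df=6, size=1500) * 0.01

# Assumed GARCH(1,1): sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}.
omega, alpha, beta = 1e-6, 0.08, 0.90

sigma2 = np.empty(len(returns))
sigma2[0] = returns.var()
for t in range(1, len(returns)):
    sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]

# Filter: standardized residuals should be roughly i.i.d. if the model fits.
z = returns / np.sqrt(sigma2)

# One-step-ahead volatility forecast.
sigma_next = np.sqrt(omega + alpha * returns[-1] ** 2 + beta * sigma2[-1])

# Filtered historical simulation: bootstrap residuals, rescale by the forecast.
simulated = sigma_next * rng.choice(z, size=100_000, replace=True)
var_99 = -np.quantile(simulated, 0.01)   # loss exceeded with 1% probability

print(f"one-day 99% VaR ~ {var_99:.4%} of portfolio value")
```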


Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane Jan 2019

Statistical Science Theses and Dissertations

If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?

We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …
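
As a toy illustration of the intransitive (rock-scissors-paper) relationships discussed above (invented head-to-head results, not the thesis's data or model), the sketch below records a winner for each pair of teams and scans all triples for directed 3-cycles.

```python
from itertools import combinations

# Invented head-to-head winners for each unordered pair of teams.
beats = {
    ("Warriors", "Rockets"): "Warriors",
    ("Rockets", "Spurs"): "Rockets",
    ("Warriors", "Spurs"): "Spurs",     # creates an intransitive triple
    ("Warriors", "Jazz"): "Warriors",
    ("Rockets", "Jazz"): "Rockets",
    ("Spurs", "Jazz"): "Spurs",
}

def winner(a, b):
    """Look up the recorded winner of the pair {a, b}."""
    return beats[(a, b)] if (a, b) in beats else beats[(b, a)]

teams = sorted({t for pair in beats for t in pair})

# A triple is intransitive when the 'beats' relation forms a directed 3-cycle,
# which happens exactly when the three pairwise games have three distinct winners.
cycles = []
for a, b, c in combinations(teams, 3):
    if len({winner(a, b), winner(b, c), winner(c, a)}) == 3:
        cycles.append((a, b, c))

print("intransitive triples:", cycles)
```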


Counting And Coloring Sudoku Graphs, Kyle Oddson Jan 2019

Mathematics and Statistics Dissertations, Theses, and Final Project Papers

A sudoku puzzle is most commonly a 9 × 9 grid of 3 × 3 boxes wherein the puzzle player writes the numbers 1 - 9 with no repetition in any row, column, or box. We generalize the notion of the n² × n² sudoku grid for all integers n ≥ 2 and codify the empty sudoku board as a graph. In the main section of this paper we prove that sudoku boards and sudoku graphs exist for all such n; we prove the equivalence of [3]'s construction using unions and products of graphs to the definition of …
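
As an illustrative construction (my own sketch, not the paper's proofs or the graph-product construction of [3]), the code below builds the sudoku graph for a given n by joining two cells whenever they share a row, column, or n × n box, then reports the vertex count, edge count, and common degree.

```python
from itertools import combinations

def sudoku_graph(n):
    """Return vertices and edges of the n^2 x n^2 sudoku graph."""
    size = n * n
    cells = [(r, c) for r in range(size) for c in range(size)]
    edges = set()
    for (r1, c1), (r2, c2) in combinations(cells, 2):
        same_row = r1 == r2
        same_col = c1 == c2
        same_box = (r1 // n, c1 // n) == (r2 // n, c2 // n)
        if same_row or same_col or same_box:
            edges.add(((r1, c1), (r2, c2)))
    return cells, edges

for n in (2, 3):
    cells, edges = sudoku_graph(n)
    degree = 2 * len(edges) / len(cells)          # the sudoku graph is regular
    print(f"n={n}: {len(cells)} vertices, {len(edges)} edges, degree {degree:.0f}")
```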


An Overview And Evaluation Of Synthetc: A Statistical Model For Extra-Tropical Cyclones, Rafael Uryayev Jan 2019

Dissertations and Theses

Extratropical cyclones (ETCs) are the most common weather phenomena affecting the United States, Canada, and Europe. They can pose serious hazards over large swaths of area. In this thesis, a statistical model of ETCs, called SynthETC, is discussed. The model accounts for the genesis, track path, termination, and intensity of statistically generated ETCs. Genesis is modeled as a Poisson process, whose mean is determined by climate and historical information. Tracks are modeled as a regression mean determined by climate and historical information plus a stochastic component. Lysis is modeled using logistic regression, with climate states as covariates. Intensity is modeled …


Biodiversity And Distribution Of Benthic Foraminifera In Harrington Sound, Bermuda: The Effects Of Physical And Geochemical Factors On Dominant Taxa, Nam Le Jan 2019

Honors Theses

Harrington Sound, Bermuda, is a nearly enclosed lagoon acting as a subtropical/tropical, carbonate-rich basin in which carbonate sediments, reef patches, and carbonate-producing organisms accumulate. Here, one of the most important calcareous groups is the Foraminifera. Analyses of common benthic orders, including miliolids (Quinqueloculina and Triloculina spp.) and rotaliids (Homotrema rubrum, Elphidium spp., and Ammonia beccarii), are essential in understanding past and present environmental conditions affecting the island's coastal environment. These taxa have been studied previously; however, factors explaining their individual patterns of abundance in the Sound are not well detailed. The goal of this study is …


Generalizing Multistage Partition Procedures For Two-Parameter Exponential Populations, Rui Wang Aug 2018

University of New Orleans Theses and Dissertations

Analysis of variance (ANOVA) is a classic tool for multiple comparisons and has been widely used in numerous disciplines due to its simplicity and convenience. The ANOVA procedure is designed to test whether a number of different populations differ at all, and it is followed by the usual multiple comparison tests to rank the populations. However, the ANOVA procedure does not guarantee that the probability of selecting the best population is larger than some desired prespecified level. This shortcoming of the ANOVA procedure was overcome by researchers in the early 1950s by designing experiments with the goal of selecting the best …
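
As a quick illustration of the point above (toy data and my own sketch, not the thesis's procedures), the code below runs a one-way ANOVA F-test with SciPy on three simulated populations: a small p-value indicates the populations are not all the same, but it does not by itself identify the best one with any guaranteed probability, which is what ranking-and-selection procedures address.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(11)

# Three simulated populations with slightly different means.
group_a = rng.normal(10.0, 2.0, size=40)
group_b = rng.normal(10.5, 2.0, size=40)
group_c = rng.normal(11.0, 2.0, size=40)

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# Rejecting the null only says the means are not all equal; identifying the
# best population with a guaranteed probability requires a selection procedure.
means = {"A": group_a.mean(), "B": group_b.mean(), "C": group_c.mean()}
print("sample means:", {k: round(v, 2) for k, v in means.items()})
```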


Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, Mark Pfister Aug 2018

Electronic Theses and Dissertations

A probability distribution is a statistical function that describes the probability of possible outcomes in an experiment or occurrence. There are many different probability distributions that give the probability of an event happening, given some sample size n. An important question in statistics is to determine the distribution of the sum of independent random variables when the sample size n is fixed. For example, it is known that the sum of n independent Bernoulli random variables with success probability p is a Binomial distribution with parameters n and p. However, this is not true when the sample size …
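
As a small simulation of the two situations contrasted above (my own sketch, not the thesis's derivations), the code below checks that a fixed-n sum of Bernoulli(p) draws behaves like a Binomial(n, p), and then, for a Poisson-distributed sample size N, compares the resulting sum with a Poisson(λp) distribution, which is the classical thinning result for this particular special case.

```python
import numpy as np

rng = np.random.default_rng(5)
reps, n, p, lam = 200_000, 20, 0.3, 12.0

# Fixed sample size: the sum of n Bernoulli(p) variables is Binomial(n, p).
fixed_sums = rng.binomial(1, p, size=(reps, n)).sum(axis=1)
binom_draws = rng.binomial(n, p, size=reps)
print("fixed n   -> mean/var of sums: ",
      fixed_sums.mean().round(3), fixed_sums.var().round(3))
print("Binomial(n, p) mean/var:       ",
      binom_draws.mean().round(3), binom_draws.var().round(3))

# Random sample size: N ~ Poisson(lam), then the sum of N Bernoulli(p) variables.
N = rng.poisson(lam, size=reps)
random_sums = rng.binomial(N, p)          # same as summing N Bernoulli(p) draws
poisson_draws = rng.poisson(lam * p, size=reps)
print("Poisson N -> mean/var of sums: ",
      random_sums.mean().round(3), random_sums.var().round(3))
print("Poisson(lam * p) mean/var:     ",
      poisson_draws.mean().round(3), poisson_draws.var().round(3))
```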


Deep Learning Analysis Of Limit Order Book, Xin Xu May 2018

Arts & Sciences Electronic Theses and Dissertations

In this paper, we build a deep neural network for modeling spatial structure in the limit order book and make predictions of the future best ask or best bid price, based on the ideas of Sirignano (2016). We propose an intuitive data processing method to approximate the data that is unavailable to us, using only level I data that is more widely available. The model is based on the idea that there is local dependence between the best ask or best bid price and the sizes of related orders. First we use logistic regression to show that this approach is reasonable. To show the advantages …


Application Of Remote Sensing And Machine Learning Modeling To Post-Wildfire Debris Flow Risks, Priscilla Addison Jan 2018

Dissertations, Master's Theses and Master's Reports

Historically, post-fire debris flows (DFs) have often been more deadly than the fires that preceded them. Fires can transform a location that had no history of DFs to one that is primed for it. Studies have found that the higher the severity of the fire, the higher the probability of DF occurrence. Due to the high fatalities associated with these events, several statistical models have been developed for use as emergency decision support tools. These previous models used linear modeling approaches that produced subpar results. Our study therefore investigated the application of nonlinear machine learning modeling as an alternative. Existing models …


Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, Nicholas Bennett Jan 2018

Williams Honors College, Honors Research Projects

The purpose of this Senior Honors Project is to research, study, and demonstrate newfound knowledge of various machine learning statistical techniques that are not covered in the University of Akron’s statistics major curriculum. This report will be an overview of three machine-learning methods that were used to predict NCAA Basketball results, specifically, the March Madness tournament. The variables used for these methods, models, and tests will include numerous variables kept throughout the season for each team, along with a couple variables that are used by the selection committee when tournament teams are being picked. The end goal is to find …


Some New And Generalized Distributions Via Exponentiation, Gamma And Marshall-Olkin Generators With Applications, Hameed Abiodun Jimoh Jan 2018

Electronic Theses and Dissertations

Three new generalized distributions developed via competing risks, gamma generator, Marshall-Olkin generator and exponentiation techniques are proposed and studied. Structural properties including quantile functions, hazard rate functions, moments, conditional moments, mean deviations, Rényi entropy, the distribution of order statistics and maximum likelihood estimates are presented. Monte Carlo simulation is employed to examine the performance of the proposed distributions. Applications of the generalized distributions to real lifetime data are presented to illustrate the usefulness of the models.
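
As a small illustration of two of the generator techniques named above applied to an arbitrary exponential baseline (a sketch under my own assumptions, not the new families proposed in the thesis), the code below evaluates the exponentiated-G CDF F(x)^a and the Marshall-Olkin-G distribution via its survival function αS(x) / (1 − (1 − α)S(x)).

```python
import numpy as np

def base_cdf(x, rate=1.0):
    """Baseline exponential CDF F(x) (an arbitrary choice of parent distribution)."""
    return 1.0 - np.exp(-rate * x)

def exponentiated_cdf(x, a, rate=1.0):
    """Exponentiated-G family: G(x) = F(x)^a for a > 0."""
    return base_cdf(x, rate) ** a

def marshall_olkin_cdf(x, alpha, rate=1.0):
    """Marshall-Olkin-G family via its survival function
    S_MO(x) = alpha * S(x) / (1 - (1 - alpha) * S(x)), with S = 1 - F."""
    s = 1.0 - base_cdf(x, rate)
    return 1.0 - alpha * s / (1.0 - (1.0 - alpha) * s)

x = np.linspace(0.0, 5.0, 6)
print("x              :", x)
print("exponentiated  :", exponentiated_cdf(x, a=2.5).round(4))
print("Marshall-Olkin :", marshall_olkin_cdf(x, alpha=0.4).round(4))
# Both transforms return valid CDFs: nondecreasing, 0 at x = 0, approaching 1.
```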


Statistical Analysis Of Momentum In Basketball, Mackenzi Stump Dec 2017

Honors Projects

The “hot hand” in sports has been debated for as long as sports have been around. The debate involves whether streaks and slumps in sports are true phenomena or just simply perceptions in the mind of the human viewer. This statistical analysis of momentum in basketball analyzes the distribution of time between scoring events for the BGSU Women’s Basketball team from 2011-2017. We discuss how the distribution of time between scoring events changes with normal game factors such as location of the game, game outcome, and several other factors. If scoring events during a game were always randomly distributed, or …