Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Western University

Articles 1 - 30 of 50

Full-Text Articles in Statistical Models

Parameter Estimation For Normally Distributed Grouped Data And Clustering Single-Cell RNA Sequencing Data Via The Expectation-Maximization Algorithm, Zahra Aghahosseinalishirazi Sep 2023

Electronic Thesis and Dissertation Repository

The Expectation-Maximization (EM) algorithm is an iterative algorithm for finding the maximum likelihood estimates in problems involving missing data or latent variables. The EM algorithm can be applied to problems consisting of evidently incomplete data or missingness situations, such as truncated distributions, censored or grouped observations, and also to problems in which the missingness of the data is not natural or evident, such as mixed-effects models, mixture models, log-linear models, and latent variables. In Chapter 2 of this thesis, we apply the EM algorithm to grouped data, a problem in which incomplete data are evident. Nowadays, data confidentiality is of …
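
As a rough illustration of the E-step/M-step iteration described above, the sketch below runs EM on a two-component univariate Gaussian mixture, one of the latent-variable settings the abstract lists. It is not the grouped-data procedure from the thesis; the function names, the initialisation at the sample minimum and maximum, and the variance floor are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a Normal(mean, var) distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=100, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Hypothetical initialisation: means at the extremes, shared variance.
    mu = [min(data), max(data)]
    var = [max(overall_var, var_floor)] * 2
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weighted maximum likelihood updates of the parameters.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(spread, var_floor)  # floor avoids a degenerate component
    return weight, mu, var
```

Calling `em_two_gaussians` on a list of floats returns the mixing weights, means and variances. The latent variable here is the unobserved component label of each observation.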


Modelling Long-Term Security Returns, Xinghan Zhu Aug 2023

Electronic Thesis and Dissertation Repository

This research focuses on the concerns of Canadian investors regarding portfolio diversification and preparedness for unexpected risks in retirement planning. It models market crashes and two main financial instruments as independent components to simulate clients’ portfolios. Initially exploring single distributions on mutual funds such as Laplace and t distributions, the research finds limited success. Instead, a normal-Weibull spliced distribution is introduced to model log returns. The Geometric Brownian Motion (GBM) model is employed to predict and evaluate returns on common stocks using the Maximum Likelihood Estimator (MLE), assuming that daily log returns follow a normal distribution. Additionally, the Merton Jump …
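
To make the GBM estimation step concrete, here is a minimal sketch that estimates drift and volatility by maximum likelihood under the stated assumption that daily log returns are normally distributed. The function name `fit_gbm_mle` and the 252-trading-day default step are assumptions for illustration, not details taken from the thesis.

```python
import math


def fit_gbm_mle(prices, dt=1.0 / 252.0):
    """MLE of GBM drift (mu) and volatility (sigma) from a price series.

    Under GBM, log returns over a step dt are i.i.d.
    Normal((mu - sigma^2 / 2) * dt, sigma^2 * dt).
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(log_returns)
    if n < 2:
        raise ValueError("need at least three prices")
    mean_r = sum(log_returns) / n
    # The normal MLE of the variance divides by n, not n - 1.
    var_r = sum((r - mean_r) ** 2 for r in log_returns) / n
    sigma = math.sqrt(var_r / dt)
    # Invert mean_r = (mu - sigma^2 / 2) * dt to recover the drift.
    mu = mean_r / dt + 0.5 * sigma ** 2
    return mu, sigma
```

With daily closing prices, `fit_gbm_mle(prices)` returns annualised estimates under the 252-day convention; passing `dt=1.0` keeps them on a per-step scale.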


Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici Jun 2023

Electronic Thesis and Dissertation Repository

Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examine the impact of correlation on the likelihood ratio test (LRT), …


Regression-Based Methods For Dynamic Treatment Regimes With Mismeasured Covariates Or Misclassified Response, Dan Liu Sep 2022

Electronic Thesis and Dissertation Repository

The statistical study of dynamic treatment regimes (DTRs) focuses on estimating sequential treatment decision rules tailored to patient-level information across multiple stages of intervention. Regression-based methods in DTR have been studied in the literature with a critical assumption that all the observed variables are precisely measured. However, this assumption is often violated in many applications. One example is the STAR*D study, in which the patient's depressive score is subject to measurement error. In this thesis, we explore problems in the context of DTR with measurement error or misclassification considered in the observed data.

The first project deals with covariate measurement …


Exploring Human-Caused Fire Occurrence Prediction, Ruyi Jin Aug 2022

Undergraduate Student Research Internships Conference

Wildland Fire Science has become an increasingly hot topic in recent years. The goal of this report is to investigate human-caused wildland fire occurrence prediction. The two main predictors of interest are the mean value of the Fine Fuel Moisture Code (FFMC) and the month when a fire ignites. An Exploratory Data Analysis is presented first, after which we fit models to predict daily fire counts. We first consider Poisson models to fit the count data, but also attempt to fit Negative Binomial models to deal with overdispersion. We compare these models in the following ways: plotting the difference in …
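
The move from Poisson to Negative Binomial models can be motivated by a simple overdispersion check. The sketch below compares the sample variance with the mean and evaluates both log-likelihoods for a vector of counts. It ignores the covariates (FFMC, month) used in the report, and it uses method-of-moments estimates for the Negative Binomial parameters rather than a full maximum likelihood fit; all names are hypothetical.

```python
import math


def poisson_loglik(counts, lam):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(c * math.log(lam) - lam - math.lgamma(c + 1) for c in counts)


def negbin_loglik(counts, mean, size):
    """Log-likelihood of i.i.d. Negative Binomial counts.

    Parameterised so that variance = mean + mean^2 / size.
    """
    p = size / (size + mean)
    return sum(
        math.lgamma(c + size) - math.lgamma(size) - math.lgamma(c + 1)
        + size * math.log(p) + c * math.log(1.0 - p)
        for c in counts
    )


def compare_count_models(counts):
    """Summarise dispersion and compare Poisson and Negative Binomial fits."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    result = {
        "mean": mean,
        "variance": var,
        "dispersion": var / mean,  # close to 1 under a Poisson model
        "poisson_loglik": poisson_loglik(counts, mean),
    }
    if var > mean:
        # Method-of-moments size parameter; only defined when overdispersed.
        size = mean ** 2 / (var - mean)
        result["negbin_size"] = size
        result["negbin_loglik"] = negbin_loglik(counts, mean, size)
    return result
```

A dispersion ratio well above 1 suggests that a Poisson model understates the variability of the daily counts, which is the situation the Negative Binomial model is meant to address.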


An Analysis Of Weighted Least Squares Monte Carlo, Xiaotian Zhu Aug 2022

Electronic Thesis and Dissertation Repository

Since Longstaff and Schwartz [2001] introduced the regression-based least squares Monte Carlo (LSMC) method for pricing American options, it has been widely discussed. Building on the work of Fabozzi et al. [2017], who apply a heteroscedasticity correction to LSMC, we extend the study by introducing the methods of Park [1966] and Harvey [1976]. Our work shows that for a single-stock American call option modelled by GBM with two exercise opportunities, WLSMC or IRLSMC provides better estimates of the continuation value than LSMC. However, they do not lead to better exercise decisions and hence have little to no effect …


A Transformer-Based Classification System For Volcanic Seismic Signals, Anthony P. Rinaldi, Cindy Mora Stock, Cristián Bravo Roman, Alexander Hemming Aug 2022

Undergraduate Student Research Internships Conference

Monitoring volcanic events as they occur is a task that, to this day, requires significant human capital. The current process requires geologists to monitor seismographs around the clock, making it extremely labour-intensive and inefficient. The ability to automatically classify volcanic events as they happen in real-time would allow for quicker responses to these events by the surrounding communities. Timely knowledge of the type of event that is occurring can allow these surrounding communities to prepare or evacuate sooner depending on the magnitude of the event. Up until recently, not much research has been conducted regarding the potential for machine learning …


Bias-Corrected Bagging In Active Learning With An Actuarial Application, Yangxuan Xu Aug 2022

Undergraduate Student Research Internships Conference

The variable annuity (VA) is a modern insurance product that offers certain guaranteed protection and tax-deferred treatment. Because of the inherent complexity of the guarantees’ payoffs, a closed-form solution for fair market values (FMVs) is often not available. Most insurance companies depend on Monte Carlo (MC) simulation to price the FMVs of these products, which is an extremely computationally intensive and time-consuming approach. The metamodeling approach can be used to circumvent the heavy computation.

In the modeling stage, the bagged tree method has proved to outperform other parametric approaches. Also, a bias-corrected (BC) bagging model was tried and showed significant improvement …


Investigating Distributions Of Epochs In Wildland Fire Lifetimes, Xinlei Wang Aug 2022

Undergraduate Student Research Internships Conference

The objective of my research project is to explore the relationship between variables related to wildland fire and to model distributions of epochs in wildland fire lifetimes. Several distributional families are considered for modeling these epochs, including the exponential distribution, gamma distribution, Weibull distribution and continuous phase-type distribution. I briefly explain each of these distributions and illustrate how they are fit. Visual results of my exploratory data analysis are illustrated in two parts, data visualization and data modeling, along with my interpretation of each. Since this work is preliminary, I conclude the report with a discussion on what …


The Q-Analogue Of The Extended Generalized Gamma Distribution, Wenhao Chen Aug 2022

Undergraduate Student Research Internships Conference

This project introduces a flexible univariate probability model referred to as the q-analogue of the Extended Generalized Gamma (or q-EGG) distribution, which encompasses the majority of the most frequently used continuous distributions, including the gamma, Weibull, logistic, type-1 and type-2 beta, Gaussian, Cauchy, Student-t and F. Closed form representations of its moments and cumulative distribution function are provided. Additionally, computational techniques are proposed for determining estimates of its parameters. Both the method of moments and the maximum likelihood approach are utilized. The effect of each parameter is also graphically illustrated. Certain data sets are modeled with q-EGG distributions; goodness of …


Investigation Of Key Factors To Earthquake Insurance Take-Up Rates In Quebec And British Columbia Households And Prediction Model Building, Yongcheng Jiang Aug 2022

Undergraduate Student Research Internships Conference

Maintaining an adequate earthquake insurance take-up rate could protect the insurance industry from systemic failure. Past research has shown that British Columbia and Quebec differ significantly in their earthquake insurance take-up rates. This report investigates key factors from the structure (default options and various types) of the insurance plan and personal characteristics along with socioeconomic/demographic profiles that affect the demand for earthquake protection in the form of insurance. The report also provides a prediction model for the earthquake insurance take-up rate. The results show an importance ranking of key factors of earthquake insurance take-up; the three most important are …


Functional Structure Of Excess Return And Volatility, Chenxi Zhao Aug 2022

Undergraduate Student Research Internships Conference

Capturing the relation between excess returns and volatility can help make better decisions in the stock market in terms of portfolio allocation and asset risk management. This paper uses a minute-by-minute series of the S&P 500 from January 2009 to January 2021 and explores the best structural representation of the excess return as a function of the volatility. This is implemented via regression models for volatility and excess returns. The results reveal that there is a structural break in the relationship between the excess return and volatility based on the sign of …


New Developments On The Estimability And The Estimation Of Phase-Type Actuarial Models, Cong Nie Jul 2022

Electronic Thesis and Dissertation Repository

This thesis studies the estimability and the estimation methods for two models based on Markov processes: the phase-type aging model (PTAM), which models the human aging process, and the discrete multivariate phase-type model (DMPTM), which can be used to model multivariate insurance claim processes.

The principal contributions of this thesis can be categorized into two areas. First, an objective measure of estimability is proposed to quantify estimability in the context of statistical models. Existing methods for assessing estimability require the subjective specification of thresholds, which potentially limits their usefulness. Unlike these methods, the proposed measure of estimability is objective. In …


Early-Warning Alert Systems For Financial-Instability Detection: An HMM-Driven Approach, Xing Gu Apr 2022

Electronic Thesis and Dissertation Repository

Regulators’ early intervention is crucial when the financial system is experiencing difficulties. Financial stability must be preserved to avert bank bailouts, which hugely drain governments’ financial resources. Detecting periods of financial crisis in advance entails the development and customisation of accurate and robust quantitative techniques. The goal of this thesis is to construct automated systems via the interplay of various mathematical and statistical methodologies to signal financial instability episodes in the near-term horizon. These signal alerts could provide regulatory bodies with the capacity to initiate an appropriate response that will thwart or at least minimise the occurrence of a financial crisis. …


Statistical Applications To The Management Of Intensive Care And Step-Down Units, Yawo Mamoua Kobara Apr 2022

Electronic Thesis and Dissertation Repository

This thesis proposes three contributing manuscripts related to patient flow management, server decision-making, and ventilation time in the intensive care and step-down units system.

First, a Markov decision process (MDP) model with a Monte Carlo simulation was performed to compare two patient flow policies: prioritizing premature step-down and prioritizing rejection of patients when the intensive care unit is congested. The optimal decisions were obtained under the two strategies. The simulation results based on these optimal decisions show that a premature step-down strategy contributes to higher congestion downstream. Counter-intuitively, premature step-down should be discouraged, and patient rejection or divergence actions should …


Compound Sums, Their Distributions, And Actuarial Pricing, Ang Li Oct 2021

Electronic Thesis and Dissertation Repository

Compound risk models are widely used in insurance companies to mathematically describe their aggregate amount of losses during a certain time period. However, evaluating the distribution of compound random variables and computing the relevant risk measures are non-trivial. Therefore, the main purpose of this thesis is to study the bounds and simulation methods for both univariate and multivariate compound distributions. The premium setting principles related to dependent multivariate compound distributions are also studied.

In the first part of this thesis, we consider the upper and lower bounds of the tail of bivariate compound distributions. Our results extend those …


On The Estimation Of Heston-Nandi GARCH Using Returns And/Or Options: A Simulation-Based Approach, Xize Ye Jul 2021

Electronic Thesis and Dissertation Repository

In this thesis, the Heston-Nandi GARCH(1,1) (henceforth, HN-GARCH) option pricing model is fitted via 4 maximum likelihood-based estimation and calibration approaches using simulated returns and/or options. The purpose is to examine the benefits of the joint estimation using both returns and options over the fundamental returns-only estimation on GARCH models. From our empirical studies, with the additional option sample, we can improve the efficiency of the estimates for HN-GARCH parameters. Nonetheless, the improvements for the risk premium factor, both from empirical standard errors, and sample RMSEs, are insignificant. In addition, option prices are simulated with a pre-defined noise structure and …


A Class Of Phase-Type Ageing Models And Their Lifetime Distributions, Boquan Cheng Apr 2021

Electronic Thesis and Dissertation Repository

Ageing is a universal and ever-present biological phenomenon. Yet, describing the ageing mechanism in formal mathematical terms — in particular, capturing the ageing pattern and quantifying the ageing rate — has remained a challenging actuarial modelling endeavour. In this thesis, we propose a class of Coxian-type Markovian models. This class enables a quantitative description of the well-known characteristics of ageing, which is a genetically determined, progressive, and essentially irreversible process. The unique structure of our model features the transition rate for the ageing process and a functional form for the relationship between ageing and death with a shape parameter that …


The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications, Zhenxian Gong Feb 2021

Electronic Thesis and Dissertation Repository

Financial markets and instruments are continuously evolving, displaying new and more refined stylized facts. This requires regular reviews and empirical evaluations of advanced models. There is evidence in the literature that supports stochastic volatility models over constant volatility models in capturing stylized facts such as the "smile" and "skew" present in implied volatility surfaces. In this thesis, we target commodity and volatility index markets, and develop a novel stochastic volatility model that incorporates a mean-reverting property and a 4/2 stochastic volatility process. Commodities and volatility indexes have been shown to be mean-reverting, which means their prices tend to revert to their long-term mean …


Statistical Methods With A Focus On Joint Outcome Modeling And On Methods For Fire Science, Da Zhong Xi Nov 2020

Electronic Thesis and Dissertation Repository

Understanding the dynamics of wildfires contributes significantly to the development of fire science. Challenges in the analysis of historical fire data include defining fire dynamics within existing statistical frameworks, modeling the duration and size of fires as joint outcomes, identifying how fires are grouped into clusters of subpopulations, and assessing the effect of environmental variables in different modeling frameworks. We develop novel statistical methods to consider outcomes related to fire science jointly. These methods address these challenges by linking univariate models for separate outcomes through shared random effects, an approach referred to as joint modeling. Comparisons with existing …


Renewable-Energy Resources, Economic Growth And Their Causal Link, Yiyang Chen Aug 2020

Electronic Thesis and Dissertation Repository

This thesis examines the presence and strength of a predictive causal relationship between renewable energy prices and economic growth. We look for evidence by investigating the cases of Norway, New Zealand, and Canada’s two provinces of Alberta and Ontario. The usual vector autoregressive model (VAR) and its various improved versions still assume constant parameters over time. We devise a Markov-switching VAR (MS-VAR) model in order to accommodate the observed time-dependent causal relation changes. Our proposed modelling approach is induced by the hidden Markov model methodologies in terms of an online parameter estimation through recursive filtering. The parameters of the MS-VAR model are governed by …


Extensions Of Classification Method Based On Quantiles, Yuanhao Lai Jun 2020

Electronic Thesis and Dissertation Repository

This thesis deals with the problem of classification in general, with a particular focus on heavy-tailed or skewed data. The classification problem is first formalized by statistical learning theory and several important classification methods are reviewed, where the distance-based classifiers, including the median-based classifier and the quantile-based classifier (QC), are especially useful for heavy-tailed or skewed inputs. However, QC is limited by its model capacity and the issue of high-dimensional accumulated errors. The objective of this study is to investigate more general methods while retaining the merits of QC.

We present four extensions of QC, which appear in chronological …


Generalized 4/2 Factor Model, Yuyang Cheng Jun 2020

Electronic Thesis and Dissertation Repository

We investigate portfolio optimization, risk management, and derivative pricing for a factor stochastic model that considers the 4/2 stochastic volatility on the common/systematic factor as well as on the intrinsic factor. This setting allows us to capture stochastic volatility and stochastic covariation among assets. The model is also a generalization of existing models in the literature as it includes the mean reverting property and spillover effect to capture wider types of financial assets. At a theoretical level we identify conditions for well-defined changes of measure. A quasi-closed form solution within a 4/2 structured model is obtained for a portfolio optimization …


Edge-Cloud IoT Data Analytics: Intelligence At The Edge With Deep Learning, Ananda Mohon M. Ghosh May 2020

Electronic Thesis and Dissertation Repository

Rapid growth in numbers of connected devices, including sensors, mobile, wearable, and other Internet of Things (IoT) devices, is creating an explosion of data that are moving across the network. To carry out machine learning (ML), IoT data are typically transferred to the cloud or another centralized system for storage and processing; however, this causes latencies and increases network traffic. Edge computing has the potential to remedy those issues by moving computation closer to the network edge and data sources. On the other hand, edge computing is limited in terms of computational power and thus is not well suited for …


Point Process Modelling Of Objects In The Star Formation Complexes Of The M33 Galaxy, Dayi Li Apr 2020

Electronic Thesis and Dissertation Repository

In this thesis, Gibbs point process (GPP) models are constructed to study the spatial distribution of objects in the star formation complexes of the M33 galaxy. The GPP models circumvent the limitations of the two-point correlation function employed in the current astronomy literature by naturally accounting for the inhomogeneous distribution of these objects. The spatial distribution of these objects serves as a sensitive probe in understanding the star formation process, which is crucial in understanding the formation of galaxies and the Universe. The objects under study include the CO filament structure, giant molecular clouds (GMCs) and young stellar cluster candidates …


Statistical Modeling And Characterization Of Induced Seismicity Within The Western Canada Sedimentary Basin, Sid Kothari Oct 2019

Electronic Thesis and Dissertation Repository

In western Canada, there has been an increase in seismic activity linked to anthropogenic energy-related operations including conventional hydrocarbon production, wastewater fluid injection and more recently hydraulic fracturing (HF). Statistical modeling and characterization of the space, time and magnitude distributions of the seismicity clusters is vital for a better understanding of induced earthquake processes and development of predictive models. In this work, a statistical analysis of the seismicity in the Western Canada Sedimentary Basin was performed across past and present time periods by utilizing a compiled earthquake catalogue for Alberta and eastern British Columbia. Specifically, the frequency-magnitude statistics were analyzed …


Exploring The Estimability Of Mark-Recapture Models With Individual, Time-Varying Covariates Using The Scaled Logit Link Function, Jiaqi Mu Aug 2019

Electronic Thesis and Dissertation Repository

Mark-recapture studies are often used to estimate the survival of individuals in a population and identify factors that affect survival in order to understand how the population might be affected by changing conditions. Factors that vary between individuals and over time, like body mass, present a challenge because they can only be observed when an individual is captured. Several models have been proposed to deal with the missing-covariate problem and commonly impose a logit link function which implies that the survival probability varies between 0 and 1. In this thesis I explore the estimability of four possible models when survival …
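
The abstract is truncated before it defines the scaled logit link named in the title. As a hedged illustration only, the sketch below contrasts the standard inverse logit, which maps a linear predictor into (0, 1), with one plausible scaled form that caps the probability at an upper bound. The form `upper * inv_logit(eta)` and the parameter name `upper` are assumptions, not definitions taken from the thesis.

```python
import math


def inv_logit(eta):
    """Standard inverse logit: maps any real eta into the interval [0, 1]."""
    # Branch on the sign of eta so that math.exp never overflows.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def scaled_inv_logit(eta, upper):
    """Assumed scaled inverse logit: maps eta into [0, upper] instead of [0, 1]."""
    if not 0.0 < upper <= 1.0:
        raise ValueError("upper must lie in (0, 1]")
    return upper * inv_logit(eta)
```

Under this assumed form, the bound `upper` becomes an extra parameter to estimate, which is one reason estimability questions can arise when the covariate is only observed on capture occasions.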


Split Credibility: A Two-Dimensional Semi-Linear Credibility Model, Jingbing Qiu Aug 2019

Electronic Thesis and Dissertation Repository

In this thesis, we introduce a two-dimensional semi-linear credibility model, which is an extension of the classical credibility or split credibility models used by practicing actuaries. Our model predicts the future expected losses of a policyholder by considering its historical primary and excess losses. The optimal split point is derived based on the mean squared error criterion. We show analytically when and why splitting a policyholder’s historical losses into primary and excess parts works. In addition, we derive formulas for estimating our model parameters nonparametrically. Finally, we show the application of our model through three examples.


Bias Assessment And Reduction In Kernel Smoothing, Wenkai Ma Nov 2018

Electronic Thesis and Dissertation Repository

When performing local polynomial regression (LPR) with kernel smoothing, the choice of the smoothing parameter, or bandwidth, is critical. The performance of the method is often evaluated using the Mean Square Error (MSE). Bias and variance are two components of MSE. Kernel methods are known to exhibit varying degrees of bias. Boundary effects and data sparsity issues are two potential problems to watch for. There is a need for a tool to visually assess the potential bias when applying kernel smooths to a given scatterplot of data. In this dissertation, we propose pointwise confidence intervals for bias and demonstrate a …
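
To show how the bandwidth drives bias in local polynomial regression, here is a minimal sketch of a local linear fit at a single point with a Gaussian kernel. It does not reproduce the pointwise confidence intervals for bias proposed in the dissertation; the function name and the choice of kernel are assumptions for illustration.

```python
import math


def local_linear_fit(xs, ys, x0, bandwidth):
    """Local linear regression estimate of E[y | x = x0] with a Gaussian kernel."""
    # Kernel weights: points near x0 count more; the bandwidth sets how fast they decay.
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    # Weighted least squares for y ~ a + b * (x - x0); the estimate at x0 is a.
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    denom = s0 * s2 - s1 ** 2
    if denom == 0:
        raise ValueError("design is degenerate at this point and bandwidth")
    # Solve the 2x2 normal equations for the intercept a.
    return (s2 * t0 - s1 * t1) / denom
```

On exactly linear data the fit has no bias at any bandwidth, even at the boundary. On curved data the bias at a point grows as the bandwidth widens, which is the trade-off the bandwidth choice has to balance against variance.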


Statistical Applications In Healthcare Systems, Maryam Mojalal Apr 2018

Electronic Thesis and Dissertation Repository

This thesis consists of three contributing manuscripts related to waiting times with possible applications in health care. The first manuscript is inspired by a practical problem related to decision making in an emergency department (ED). As short-run predictions of ED censuses are particularly important for efficient allocation and management of ED resources, we model ED changes and present estimates of short-term (hourly) ED censuses at each time point. We present a Markov-chain-based algorithm to make census predictions in the near future.

Considering the variation in arrival pattern and service requirements, we apply and compare three models which best describe …