Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 134

Full-Text Articles in Physical Sciences and Mathematics

Multi-Arm Randomized Control Trials In Inflammatory Bowel Disease: A Literature Review And An Illustration Of Methods For Analysis, Sahiba Saini Nov 2023

Electronic Thesis and Dissertation Repository

This thesis aimed to review the literature on multiple-arm randomized control trials in inflammatory bowel disease (IBD) and to illustrate how to analyze these trials, focusing on appropriately controlling the Type I error rates. The literature review found 247 trials published from the inception of each database to April 2014, of which 122 (49%) were multiple-arm trials; of those, 59 (48%) were on ulcerative colitis and 63 (52%) on Crohn’s disease. A published assessment tool was adopted to assess whether control of Type I error rates was needed. Despite the common use of this trial design and …


Parameter Estimation For Normally Distributed Grouped Data And Clustering Single-Cell Rna Sequencing Data Via The Expectation-Maximization Algorithm, Zahra Aghahosseinalishirazi Sep 2023

Electronic Thesis and Dissertation Repository

The Expectation-Maximization (EM) algorithm is an iterative algorithm for finding the maximum likelihood estimates in problems involving missing data or latent variables. The EM algorithm can be applied to problems consisting of evidently incomplete data or missingness situations, such as truncated distributions, censored or grouped observations, and also to problems in which the missingness of the data is not natural or evident, such as mixed-effects models, mixture models, log-linear models, and latent variables. In Chapter 2 of this thesis, we apply the EM algorithm to grouped data, a problem in which incomplete data are evident. Nowadays, data confidentiality is of …


Nonparametric Methods For Analysis And Sizing Of Cluster Randomization Trials With Baseline Measurements, Chengchun Yu Sep 2023

Electronic Thesis and Dissertation Repository

Cluster randomization trials are popular in situations where the intervention needs to be implemented at the cluster level, where logistical, financial, and/or ethical reasons dictate the choice of randomization at the cluster level, or where minimization of contamination is needed. It is very common for cluster trials to take measurements before randomization and again at follow-up, resulting in a clustered pretest-posttest design. For continuous outcomes, the cluster-adjusted analysis of covariance approach can be used to adjust for accidental bias and improve efficiency. However, a direct application of this method is nonsensical if the measures are incompatible with an interval scale, yet …


Modelling Long-Term Security Returns, Xinghan Zhu Aug 2023

Electronic Thesis and Dissertation Repository

This research focuses on the concerns of Canadian investors regarding portfolio diversification and preparedness for unexpected risks in retirement planning. It models market crashes and two main financial instruments as independent components to simulate clients’ portfolios. Initially exploring single distributions on mutual funds such as Laplace and t distributions, the research finds limited success. Instead, a normal-Weibull spliced distribution is introduced to model log returns. The Geometric Brownian Motion (GBM) model is employed to predict and evaluate returns on common stocks using the Maximum Likelihood Estimator (MLE), assuming that daily log returns follow a normal distribution. Additionally, the Merton Jump …


Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici Jun 2023

Electronic Thesis and Dissertation Repository

Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examined the impact of correlation on the likelihood ratio test (LRT), …


Multiple Endpoints In Randomized Controlled Trials: A Review And An Illustration Of The Global Test, Lindsay Cameron Apr 2023

Electronic Thesis and Dissertation Repository

A randomized controlled trial is often used to provide high quality evidence regarding treatment interventions. Due to the complex nature of many diseases, trials usually select multiple primary outcomes to capture the efficacy of the interventions. In this thesis, we conducted a literature search to determine the prevalence of the different types of multiple outcomes that have been used in randomized controlled trials. We also reviewed the corresponding statistical methods used to deal with such outcomes. In addition, we described the benefits of using global tests as a statistical method when there are multiple primary outcomes in order to answer …


Nearby Galaxies: Modelling Star Formation Histories And Contamination By Unresolved Background Galaxies, Hadi Papei Jan 2023

Electronic Thesis and Dissertation Repository

Galaxies are complex systems of stars, gas, dust, and dark matter which evolve over billions of years, and one of the main goals of astrophysics is to understand how these complex systems form and change. Measuring the star formation history of nearby galaxies, in which thousands of stars can be resolved individually, has provided us with a clear picture of their evolutionary history and the evolution of galaxies in general.

In this work, we have developed the first public Python package, SFHPy, to measure star formation histories of nearby galaxies using their colour-magnitude diagrams. In this algorithm, an observed colour-magnitude …


Portfolio Optimization Analysis In The Family Of 4/2 Stochastic Volatility Models, Yuyang Cheng Nov 2022

Electronic Thesis and Dissertation Repository

Over the last two decades, trading of financial derivatives has increased significantly along with richer and more complex behaviour/traits on the underlying assets. The need for more advanced models to capture traits and behaviour of risky assets is crucial. In this spirit, the state-of-the-art 4/2 stochastic volatility model was recently proposed by Grasselli in 2017 and has gained great attention ever since. The 4/2 model is a superposition of a Heston (1/2) component and a 3/2 component, which is shown to be able to eliminate the limitations of these two individual models, bringing the best out of each other. Based …


Statistical Roles Of The G-Expectation Framework In Model Uncertainty: The Semi-G-Structure As A Stepping Stone, Yifan Li Oct 2022

Electronic Thesis and Dissertation Repository

The G-expectation framework is a generalization of the classical probability system based on the sublinear expectation to deal with phenomena that cannot be described by a single probabilistic model. These phenomena are closely related to the long-existing concern about model uncertainty in statistics. However, the distributions and independence in the G-framework are quite different from the classical setup. These distinctions bring difficulty when applying the idea of this framework to general statistical practice. Therefore, a fundamental and unavoidable problem is how to better understand G-version concepts from a statistical perspective.

To explore this problem, this thesis establishes a new substructure …


Regression-Based Methods For Dynamic Treatment Regimes With Mismeasured Covariates Or Misclassified Response, Dan Liu Sep 2022

Electronic Thesis and Dissertation Repository

The statistical study of dynamic treatment regimes (DTRs) focuses on estimating sequential treatment decision rules tailored to patient-level information across multiple stages of intervention. Regression-based methods in DTR have been studied in the literature with a critical assumption that all the observed variables are precisely measured. However, this assumption is often violated in many applications. One example is the STAR*D study, in which the patient's depressive score is subject to measurement error. In this thesis, we explore problems in the context of DTR with measurement error or misclassification considered in the observed data.

The first project deals with covariate measurement …


Copulas, Maximal Dependence, And Anomaly Detection In Bi-Variate Time Series, Ning Sun Aug 2022

Electronic Thesis and Dissertation Repository

This thesis focuses on discussing non-parametric estimators and their asymptotic behaviors for indices developed to characterize bi-variate time series. There are typically two types of indices depending on whether the distributional information is involved. For the indices containing the distributional information of the bivariate stationary time series, we particularly focus on the index called the tail order of maximal dependence (TOMD), which is an improvement of the tail order. For the indices without distributional information of the bivariate time series, we focus on an anomaly detection index for univariate input-output systems.

This thesis integrates three articles. The first article (Chapter …


An Analysis Of Weighted Least Squares Monte Carlo, Xiaotian Zhu Aug 2022

Electronic Thesis and Dissertation Repository

Since Longstaff and Schwartz [2001] introduced the regression-based least squares Monte Carlo (LSMC) method for pricing American options, it has been widely discussed. Building on the research of Fabozzi et al. [2017], which applies the heteroscedasticity correction method to LSMC, we extend the study by introducing the methods of Park [1966] and Harvey [1976]. Our work shows that for a single-stock American call option modelled by GBM with two exercise opportunities, WLSMC or IRLSMC provides better estimates of the continuation value than LSMC. However, they do not lead to better exercise decisions and hence have little to no effect …


New Developments On The Estimability And The Estimation Of Phase-Type Actuarial Models, Cong Nie Jul 2022

Electronic Thesis and Dissertation Repository

This thesis studies the estimability and the estimation methods for two models based on Markov processes: the phase-type aging model (PTAM), which models the human aging process, and the discrete multivariate phase-type model (DMPTM), which can be used to model multivariate insurance claim processes.

The principal contributions of this thesis can be categorized into two areas. First, an objective measure of estimability is proposed to quantify estimability in the context of statistical models. Existing methods for assessing estimability require the subjective specification of thresholds, which potentially limits their usefulness. Unlike these methods, the proposed measure of estimability is objective. In …


Early-Warning Alert Systems For Financial-Instability Detection: An Hmm-Driven Approach, Xing Gu Apr 2022

Electronic Thesis and Dissertation Repository

Regulators’ early intervention is crucial when the financial system is experiencing difficulties. Financial stability must be preserved to avert banks’ bailouts, which hugely drain governments’ financial resources. Detecting periods of financial crisis in advance entails the development and customisation of accurate and robust quantitative techniques. The goal of this thesis is to construct automated systems via the interplay of various mathematical and statistical methodologies to signal financial instability episodes in the near-term horizon. These signal alerts could provide regulatory bodies with the capacity to initiate an appropriate response that will thwart or at least minimise the occurrence of a financial crisis. …


Flexible Modelling Of Time-Dependent Covariate Effects With Correlated Competing Risks: Application To Hereditary Breast And Ovarian Cancer Families, Seungwoo Lee Apr 2022

Electronic Thesis and Dissertation Repository

This thesis aims to develop a flexible approach for modelling time-dependent covariate effects on event risk using B-splines in the presence of correlated competing risks. The performance of the proposed model was evaluated via simulation in terms of the bias and precision of the estimation of the parameters and penetrance functions. In addition, we extended the concordance index to account for time-dependent effects and competing events simultaneously and demonstrated its inference procedures. We applied our proposed methods to data arising from the BRCA1 mutation families from the breast cancer family registry to evaluate the time-dependent effects of mammographic screening and …


Statistical Applications To The Management Of Intensive Care And Step-Down Units, Yawo Mamoua Kobara Apr 2022

Electronic Thesis and Dissertation Repository

This thesis proposes three contributing manuscripts related to patient flow management, server decision-making, and ventilation time in the intensive care and step-down units system.

First, a Markov decision process (MDP) model with a Monte Carlo simulation was used to compare two patient flow policies: prioritizing premature step-down and prioritizing rejection of patients when the intensive care unit is congested. The optimal decisions were obtained under the two strategies. The simulation results based on these optimal decisions show that a premature step-down strategy contributes to higher congestion downstream. Counter-intuitively, premature step-down should be discouraged, and patient rejection or divergence actions should …


Testing Aftershock Forecasts Using Bayesian Methods, Elisa Dong Mar 2022

Electronic Thesis and Dissertation Repository

The presence of strong aftershocks can increase the seismic hazard following a large earthquake and should be considered for operational earthquake forecasting and risk management. Aftershock forecasts are generated from seismicity models during the evolution of the aftershock sequence. This work compares quantitative test results of the forecasting abilities for three competing aftershock rate models - the modified Omori law, the Epidemic Type Aftershock Sequence model, and the compound Omori law - to identify the best performing model for forecasting the largest aftershock during the early aftershock sequence. Forecasts of large aftershock probabilities are generated by either the Extreme Value …


Physical Investigation Of Downburst Winds And Applicability To Full Scale Events, Federico Canepa Feb 2022

Electronic Thesis and Dissertation Repository

Thunderstorm winds, i.e. downbursts, are cold descending currents originating from cumulonimbus clouds which, upon impinging on the ground, spread radially with high intensities. The downdraft phase of the storm and the subsequent radial outflow that is formed can cause major issues for aviation and immense damage to ground-mounted structures. Thunderstorm winds present characteristics completely different from the stationary Gaussian synoptic winds, which largely affect the mid-latitude areas of the globe in the form of extra-tropical cyclones. Downbursts are very localized winds in both space and time. It follows that their statistical investigation, by means of classical full scale anemometric …


Compound Sums, Their Distributions, And Actuarial Pricing, Ang Li Oct 2021

Electronic Thesis and Dissertation Repository

Compound risk models are widely used by insurance companies to describe mathematically their aggregate losses during a certain time period. However, evaluation of the distribution of compound random variables and the computation of the relevant risk measures are non-trivial. Therefore, the main purpose of this thesis is to study the bounds and simulation methods for both univariate and multivariate compound distributions. The premium setting principles related to dependent multivariate compound distributions are also studied.

In the first part of this thesis, we consider the upper and lower bounds of the tail of bivariate compound distributions. Our results extend those …


Nature, Nurture, Or Both? Study Of Sex And Gender And Their Effects On Pain, Maryam Ghodrati Jul 2021

Electronic Thesis and Dissertation Repository

To better understand pain, researchers should adopt a multidimensional view, such as the biopsychosocial (BPS) model, and consider physical, psychological, and social elements together. The studies in this dissertation are part of the larger SYMBIOME project, which aims to help create and develop a prognostic clinical phenotype in people after musculoskeletal (MSK) trauma. Chapter 2 presents a Confirmatory Factor Analysis (CFA) to assess the structural validity of the first section of the new Gender Pain and Expectation Scale (GPES). Our analysis indicated a 3-factor …


On The Estimation Of Heston-Nandi Garch Using Returns And/Or Options: A Simulation-Based Approach, Xize Ye Jul 2021

Electronic Thesis and Dissertation Repository

In this thesis, the Heston-Nandi GARCH(1,1) (henceforth, HN-GARCH) option pricing model is fitted via four maximum likelihood-based estimation and calibration approaches using simulated returns and/or options. The purpose is to examine the benefits of joint estimation using both returns and options over the fundamental returns-only estimation of GARCH models. Our empirical studies show that the additional option sample improves the efficiency of the estimates for the HN-GARCH parameters. Nonetheless, the improvements for the risk premium factor, in terms of both empirical standard errors and sample RMSEs, are insignificant. In addition, option prices are simulated with a pre-defined noise structure and …


Addressing Bias In Non-Experimental Studies Assessing Treatment Outcomes In Prostate Cancer, David E. Guy Jun 2021

Electronic Thesis and Dissertation Repository

We evaluated the ability of matching techniques to balance baseline characteristics between treatment groups using non-experimental data. We identified a set of balance diagnostics that assessed key differences in baseline covariates with potential for confounding. These diagnostics were used in a novel systematic approach to developing and evaluating models for use in propensity score matching that optimized balance and data retention. We then compared the performance of propensity score and coarsened exact matching strategies in optimizing balance and data retention, using non-experimental data from a pan-Canadian prostate cancer database. Both matching techniques balanced baseline covariates adequately and retained approximately 70% …


Making Sense Of Noisy Data: Theory And Applications, Lingzhi Chen Jun 2021

Electronic Thesis and Dissertation Repository

This thesis introduces a novel and interpretable index of increase which is mathematically defined based on the distance between a given function and a set of non-increasing functions. Unlike the widely used traditional statistical methods for analyzing relationships between variables, the index does not rely on assumptions such as linearity, normality, and monotonicity, which may not be satisfied. Hence, it has the flexibility to be applied directly on pairs of data points to measure and compare non-linear, asymmetric, and non-monotonic relationships between two variables.

We begin with a review of the literature and background knowledge in Chapter 2.

In Chapter …


Sample Size Formulas For Estimating Areas Under The Receiver Operating Characteristic Curves With Precision And Assurance, Grace Lu Jun 2021

Electronic Thesis and Dissertation Repository

The area under the receiver operating characteristic curve (AUC) is commonly used to quantify the discriminative ability of tests with ordinal or continuous test data. When planning a study to evaluate a new test, it is important to determine a minimum sample size required to achieve a prespecified precision of estimating AUC. However, conventional sample size formulas do not consider the probability of achieving a prespecified precision, resulting in underestimation of sample sizes. To incorporate the assurance probability, asymptotic sample size formulas were derived using different variance estimators for AUC in this thesis. The precision of AUC estimations was quantified …


A Class Of Phase-Type Ageing Models And Their Lifetime Distributions, Boquan Cheng Apr 2021

Electronic Thesis and Dissertation Repository

Ageing is a universal and ever-present biological phenomenon. Yet, describing the ageing mechanism in formal mathematical terms — in particular, capturing the ageing pattern and quantifying the ageing rate — has remained a challenging actuarial modelling endeavour. In this thesis, we propose a class of Coxian-type Markovian models. This class enables a quantitative description of the well-known characteristics of ageing, which is a genetically determined, progressive, and essentially irreversible process. The unique structure of our model features the transition rate for the ageing process and a functional form for the relationship between ageing and death with a shape parameter that …


Sample Size Formulas For Estimating Risk Ratios With The Modified Poisson Model For Binary Outcomes, Zhenni Xue Feb 2021

Electronic Thesis and Dissertation Repository

Sample size estimation is usually the first step in planning a research study. Too small a study cannot adequately address the objectives, while too large a study may waste resources or be unethical. For binary outcomes, several sample size estimation methods are available based on logistic regression models, which focus on odds ratios. In prospective studies, risk ratios are preferable for ease of interpretation and communication. In this thesis, we compared the power difference between the logistic regression model and the modified Poisson regression model via simulation studies. We then proposed sample size estimation formulas based on the modified Poisson regression …


The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications, Zhenxian Gong Feb 2021

Electronic Thesis and Dissertation Repository

Financial markets and instruments are continuously evolving, displaying new and more refined stylized facts. This requires regular reviews and empirical evaluations of advanced models. There is evidence in the literature that supports stochastic volatility models over constant volatility models in capturing stylized facts such as the "smile" and "skew" present in implied volatility surfaces. In this thesis, we target commodity and volatility index markets, and develop a novel stochastic volatility model that incorporates a mean-reverting property and a 4/2 stochastic volatility process. Commodities and volatility indexes have been shown to be mean-reverting, which means their prices tend to revert to their long-term mean …


Statistical Methods With A Focus On Joint Outcome Modeling And On Methods For Fire Science, Da Zhong Xi Nov 2020

Electronic Thesis and Dissertation Repository

Understanding the dynamics of wildfires contributes significantly to the development of fire science. Challenges in the analysis of historical fire data include defining fire dynamics within existing statistical frameworks, modeling the duration and size of fires as joint outcomes, identifying how fires are grouped into clusters of subpopulations, and assessing the effect of environmental variables in different modeling frameworks. We develop novel statistical methods to consider outcomes related to fire science jointly. These methods address these challenges by linking univariate models for separate outcomes through shared random effects, an approach referred to as joint modeling. Comparisons with existing …


A Treatise Of Pd-Lgd Correlation Modelling, Wisdom S. Avusuglo Wsa Aug 2020

Electronic Thesis and Dissertation Repository

The provision in Paragraph 468 of Basel II Framework Document for calculating loss given default (LGD) requires that parameters used in Pillar I of Basel II capital estimations must be reflective of economic downturn conditions so that relevant risks are accounted for. This provision is based on the fact that the probability of default (PD) and LGD correlations are not captured in the proposed formula for estimating economic capital. To help quantify economic downturn LGD, the Basel Committee proposed establishing a functional relationship between long-run and downturn LGD.

To the best of our knowledge, the current proposed models that map …


Ranking Comments: An Entropy-Based Method With Word Embedding Clustering, Yuyang Zhang Aug 2020

Electronic Thesis and Dissertation Repository

Automatically ranking comments by their relevance plays an important role in text mining and text summarization. In this thesis, we first introduce a new text digitalization method: the bag of word clusters model. Unlike the traditional bag of words model that treats each word as an independent item, we group semantically related words into clusters using pre-trained word2vec word embeddings and represent each comment as a distribution of word clusters. This method can extract both semantic and statistical information from texts. Next, we propose an unsupervised ranking algorithm that identifies relevant comments by their distance to the “ideal” comment. The …