Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 39

Full-Text Articles in Physical Sciences and Mathematics

Parameter Estimation For Normally Distributed Grouped Data And Clustering Single-Cell RNA Sequencing Data Via The Expectation-Maximization Algorithm, Zahra Aghahosseinalishirazi Sep 2023

Electronic Thesis and Dissertation Repository

The Expectation-Maximization (EM) algorithm is an iterative algorithm for finding the maximum likelihood estimates in problems involving missing data or latent variables. The EM algorithm can be applied to problems consisting of evidently incomplete data or missingness situations, such as truncated distributions, censored or grouped observations, and also to problems in which the missingness of the data is not natural or evident, such as mixed-effects models, mixture models, log-linear models, and latent variables. In Chapter 2 of this thesis, we apply the EM algorithm to grouped data, a problem in which incomplete data are evident. Nowadays, data confidentiality is of …
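
As a rough illustration of the grouped-data setting this abstract describes, the sketch below runs EM for a single normal distribution when only bin counts are observed. It is not taken from the thesis: the function name `em_grouped_normal`, the restriction to finite bin edges, the midpoint initialisation, and the example counts are all assumptions made for illustration. The E-step uses the standard truncated-normal moments within each bin, and the M-step is the usual normal update applied to those expected moments.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def em_grouped_normal(edges, counts, n_iter=200):
    """EM estimates of (mu, sigma) from counts in finite bins (illustrative sketch).

    edges  : increasing, finite bin boundaries, len(edges) == len(counts) + 1
    counts : number of observations that fell in each bin
    """
    n = sum(counts)
    bins = list(zip(edges[:-1], edges[1:], counts))

    # Assumed starting values: treat every observation as sitting at its bin midpoint.
    mu = sum(c * (a + b) / 2.0 for a, b, c in bins) / n
    sigma = math.sqrt(sum(c * ((a + b) / 2.0 - mu) ** 2 for a, b, c in bins) / n)

    for _ in range(n_iter):
        s1 = 0.0  # expected sum of x
        s2 = 0.0  # expected sum of x squared
        for a, b, c in bins:
            if c == 0:
                continue
            al, be = (a - mu) / sigma, (b - mu) / sigma
            z = Phi(be) - Phi(al)
            r = (phi(al) - phi(be)) / z
            # E-step: mean and variance of a normal truncated to (a, b).
            m1 = mu + sigma * r
            v = sigma ** 2 * (1.0 + (al * phi(al) - be * phi(be)) / z - r * r)
            s1 += c * m1
            s2 += c * (v + m1 * m1)
        # M-step: complete-data normal MLE applied to the expected moments.
        mu = s1 / n
        sigma = math.sqrt(s2 / n - mu * mu)
    return mu, sigma


# Hypothetical counts in six unit-width bins on [-3, 3].
mu_hat, sigma_hat = em_grouped_normal([-3, -2, -1, 0, 1, 2, 3],
                                      [21, 136, 341, 341, 136, 21])
```

Open-ended tail bins and empty bins far from the bulk of the data would need extra care (infinite limits and vanishing bin probabilities), which this sketch deliberately leaves out.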


Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici Jun 2023

Electronic Thesis and Dissertation Repository

Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examined the impact of correlation on the likelihood ratio test (LRT), …


Nearby Galaxies: Modelling Star Formation Histories And Contamination By Unresolved Background Galaxies, Hadi Papei Jan 2023

Electronic Thesis and Dissertation Repository

Galaxies are complex systems of stars, gas, dust, and dark matter which evolve over billions of years, and one of the main goals of astrophysics is to understand how these complex systems form and change. Measuring the star formation history of nearby galaxies, in which thousands of stars can be resolved individually, has provided us with a clear picture of their evolutionary history and the evolution of galaxies in general.

In this work, we have developed the first public Python package, SFHPy, to measure star formation histories of nearby galaxies using their colour-magnitude diagrams. In this algorithm, an observed colour-magnitude …


Early-Warning Alert Systems For Financial-Instability Detection: An HMM-Driven Approach, Xing Gu Apr 2022

Electronic Thesis and Dissertation Repository

Regulators’ early intervention is crucial when the financial system is experiencing difficulties. Financial stability must be preserved to avert banks’ bailouts, which hugely drain government’s financial resources. Detecting in advance periods of financial crisis entails the development and customisation of accurate and robust quantitative techniques. The goal of this thesis is to construct automated systems via the interplay of various mathematical and statistical methodologies to signal financial instability episodes in the near-term horizon. These signal alerts could provide regulatory bodies with the capacity to initiate an appropriate response that will thwart or at least minimise the occurrence of a financial crisis. …


Statistical Applications To The Management Of Intensive Care And Step-Down Units, Yawo Mamoua Kobara Apr 2022

Electronic Thesis and Dissertation Repository

This thesis proposes three contributing manuscripts related to patient flow management, server decision-making, and ventilation time in the intensive care and step-down units system.

First, a Markov decision process (MDP) model with a Monte Carlo simulation was used to compare two patient flow policies: prioritizing premature step-down and prioritizing rejection of patients when the intensive care unit is congested. The optimal decisions were obtained under the two strategies. The simulation results based on these optimal decisions show that a premature step-down strategy contributes to higher congestion downstream. Counter-intuitively, premature step-down should be discouraged, and patient rejection or divergence actions should …


On The Estimation Of Heston-Nandi GARCH Using Returns And/Or Options: A Simulation-Based Approach, Xize Ye Jul 2021

Electronic Thesis and Dissertation Repository

In this thesis, the Heston-Nandi GARCH(1,1) (henceforth, HN-GARCH) option pricing model is fitted via four maximum likelihood-based estimation and calibration approaches using simulated returns and/or options. The purpose is to examine the benefits of the joint estimation using both returns and options over the fundamental returns-only estimation on GARCH models. From our empirical studies, with the additional option sample, we can improve the efficiency of the estimates for HN-GARCH parameters. Nonetheless, the improvements for the risk premium factor, in terms of both empirical standard errors and sample RMSEs, are insignificant. In addition, option prices are simulated with a pre-defined noise structure and …


The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications, Zhenxian Gong Feb 2021

Electronic Thesis and Dissertation Repository

Financial markets and instruments are continuously evolving, displaying new and more refined stylized facts. This requires regular reviews and empirical evaluations of advanced models. There is evidence in the literature that supports stochastic volatility models over constant volatility models in capturing stylized facts such as the "smile" and "skew" present in implied volatility surfaces. In this thesis, we target commodity and volatility index markets, and develop a novel stochastic volatility model that incorporates a mean-reverting property and a 4/2 stochastic volatility process. Commodities and volatility indexes have been shown to be mean-reverting, which means their prices tend to revert to their long-term mean …


Statistical Methods With A Focus On Joint Outcome Modeling And On Methods For Fire Science, Da Zhong Xi Nov 2020

Electronic Thesis and Dissertation Repository

Understanding the dynamics of wildfires contributes significantly to the development of fire science. Challenges in the analysis of historical fire data include defining fire dynamics within existing statistical frameworks, modeling the duration and size of fires as joint outcomes, identifying how fires are grouped into clusters of subpopulations, and assessing the effect of environmental variables in different modeling frameworks. We develop novel statistical methods to consider outcomes related to fire science jointly. These methods address these challenges by linking univariate models for separate outcomes through shared random effects, an approach referred to as joint modeling. Comparisons with existing …


Ranking Comments: An Entropy-Based Method With Word Embedding Clustering, Yuyang Zhang Aug 2020

Electronic Thesis and Dissertation Repository

Automatically ranking comments by their relevance plays an important role in text mining and text summarization. In this thesis, we first introduce a new text digitalization method: the bag of word clusters model. Unlike the traditional bag of words model, which treats each word as an independent item, we group semantically related words into clusters using pre-trained word2vec word embeddings and represent each comment as a distribution of word clusters. This method can extract both semantic and statistical information from texts. Next, we propose an unsupervised ranking algorithm that identifies relevant comments by their distance to the “ideal” comment. The …
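
To make the "distribution of word clusters" representation concrete, here is a minimal sketch. It assumes the clustering of word embeddings has already been done and is available as a plain word-to-cluster dictionary; the function name `bag_of_word_clusters`, the toy vocabulary, and the cluster labels are hypothetical and not taken from the thesis.

```python
from collections import Counter


def bag_of_word_clusters(tokens, word_to_cluster, n_clusters):
    """Represent a tokenised comment as relative frequencies over word clusters.

    word_to_cluster is assumed to come from clustering pre-trained embeddings;
    tokens that are not in the mapping are ignored.
    """
    counts = Counter(word_to_cluster[t] for t in tokens if t in word_to_cluster)
    total = sum(counts.values())
    if total == 0:
        # No known words: return an all-zero vector rather than dividing by zero.
        return [0.0] * n_clusters
    return [counts[k] / total for k in range(n_clusters)]


# Hypothetical mapping: cluster 0 groups positive words, cluster 1 negative ones.
toy_clusters = {"good": 0, "great": 0, "bad": 1}
vector = bag_of_word_clusters(["good", "great", "bad", "the"], toy_clusters, 2)
```

Because "good" and "great" share a cluster, two comments that use different but related words end up with similar vectors, which is the property a plain bag of words lacks.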


Point Process Modelling Of Objects In The Star Formation Complexes Of The M33 Galaxy, Dayi Li Apr 2020

Electronic Thesis and Dissertation Repository

In this thesis, Gibbs point process (GPP) models are constructed to study the spatial distribution of objects in the star formation complexes of the M33 galaxy. The GPP models circumvent the limitations of the two-point correlation function employed in the current astronomy literature by naturally accounting for the inhomogeneous distribution of these objects. The spatial distribution of these objects serves as a sensitive probe in understanding the star formation process, which is crucial in understanding the formation of galaxies and the Universe. The objects under study include the CO filament structure, giant molecular clouds (GMCs) and young stellar cluster candidates …


A Visual Analytics System For Investigating Multimorbidity Using Supervised Machine Learning, Maede Sadat Nouri Apr 2020

Electronic Thesis and Dissertation Repository

Patterns of multimorbidity are complex and difficult to summarise using static visualization techniques like tables and charts. We present a visual analytics system with the goal of facilitating the process of making sense of data collected from patients with multimorbidity. The system reveals underlying patterns in the data visually and interactively, which enables users to easily assess both prevalence and correlation estimates of different chronic diseases among multimorbid patients with varying characteristics. To do so, the system uses count-based conditional probability, binary logistic regression, softmax regression and decision tree models to dynamically compute and visualize prevalence and correlation estimates for …


Statistical Modeling And Characterization Of Induced Seismicity Within The Western Canada Sedimentary Basin, Sid Kothari Oct 2019

Electronic Thesis and Dissertation Repository

In western Canada, there has been an increase in seismic activity linked to anthropogenic energy-related operations including conventional hydrocarbon production, wastewater fluid injection and more recently hydraulic fracturing (HF). Statistical modeling and characterization of the space, time and magnitude distributions of the seismicity clusters is vital for a better understanding of induced earthquake processes and development of predictive models. In this work, a statistical analysis of the seismicity in the Western Canada Sedimentary Basin was performed across past and present time periods by utilizing a compiled earthquake catalogue for Alberta and eastern British Columbia. Specifically, the frequency-magnitude statistics were analyzed …


Bias Assessment And Reduction In Kernel Smoothing, Wenkai Ma Nov 2018

Electronic Thesis and Dissertation Repository

When performing local polynomial regression (LPR) with kernel smoothing, the choice of the smoothing parameter, or bandwidth, is critical. The performance of the method is often evaluated using the Mean Square Error (MSE). Bias and variance are two components of MSE. Kernel methods are known to exhibit varying degrees of bias. Boundary effects and data sparsity issues are two potential problems to watch for. There is a need for a tool to visually assess the potential bias when applying kernel smooths to a given scatterplot of data. In this dissertation, we propose pointwise confidence intervals for bias and demonstrate a …


Statistical Modeling Of CO2 Flux Data, Fang He Sep 2018

Electronic Thesis and Dissertation Repository

Carbon dioxide (CO2) flux is important for agriculture and carbon cycle studies. Only a small proportion of the land is currently covered by proper equipment to directly collect CO2 flux data. The CO2 flux data have an obvious annual cycle with the phase changing from year to year. Building a model to estimate the annual effect and seasonal dynamics is a challenging task. With the help of the Moderate Resolution Imaging Spectroradiometer (MODIS), which is carried by NASA satellites, corresponding data, such as the normalized difference vegetation index (NDVI), are freely available from NASA. Our goals are modeling the …


The Periglacial Landscape Of Mars: Insight Into The 'Decameter-Scale Rimmed Depressions' In Utopia Planitia, Arya Bina Aug 2018

Electronic Thesis and Dissertation Repository

Currently, Mars appears to be in a ‘frozen’ and ‘dry’ state, with the clear majority of the planet’s surface maintaining year-round sub-zero temperatures. However, the discovery of features consistent with landforms found in periglacial environments on Earth suggests a climate history for Mars that may have involved freeze and thaw cycles. Such landforms include hummocky, polygonised, scalloped, and pitted terrains, as well as ice-rich deposits and gullies, along the mid- to high-latitude bands, typically no lower than 20° N/S. The detection of near-surface and surface ice via the Phoenix lander, excavation of ice via recent impact cratering activity as …


Stochastic Modelling Of Implied Correlation Index And Herd Behavior Index. Evidence, Properties And Pricing., Lin Fang Jul 2018

Electronic Thesis and Dissertation Repository

In this work, we provide the definition, study properties, and craft new stochastic models for two dependence indices: the implied correlation index and the herd behavior index (HIX). In particular, we model and price financial derivatives on the basic implied correlation index (CIX) as reported by CBOE. Our analysis is the first revealing the presence of heteroscedasticity in the time series of CIX leading to two Correlation Stochastic Volatility (CSV) models. We describe properties of CSV models and use discretization methods for their simulation. A partial estimation methodology is implemented on CBOE S&P 500 CIX historical data treating the …


Analysis Challenges For High Dimensional Data, Bangxin Zhao Apr 2018

Electronic Thesis and Dissertation Repository

In this thesis, we propose new methodologies targeting the areas of high-dimensional variable screening, influence measures and post-selection inference. We propose a new estimator for the correlation between the response and high-dimensional predictor variables, and based on this estimator we develop a new screening technique termed Dynamic Tilted Current Correlation Screening (DTCCS) for high-dimensional variable screening. DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method takes the widely used sure independent screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as its special cases.

Two methods …


Statistical Applications In Healthcare Systems, Maryam Mojalal Apr 2018

Electronic Thesis and Dissertation Repository

This thesis consists of three contributing manuscripts related to waiting times with possible applications in health care. The first manuscript is inspired by a practical problem related to decision making in an emergency department (ED). As short-run predictions of ED censuses are particularly important for efficient allocation and management of ED resources, we model ED changes and present estimates of short-term (hourly) ED censuses at each time point. We present a Markov-chain-based algorithm to make census predictions for the near future.

Considering the variation in arrival pattern and service requirements, we apply and compare three models which best describe …


Advances In Semi-Nonparametric Density Estimation And Shrinkage Regression, Hossein Zareamoghaddam Mar 2018

Electronic Thesis and Dissertation Repository

This thesis advocates the use of shrinkage and penalty techniques for estimating the parameters of a regression model that comprises both parametric and nonparametric components and develops semi-nonparametric density estimation methodologies that are applicable in a regression context.

First, a moment-based approach whereby a univariate or bivariate density function is approximated by means of a suitable initial density function that is adjusted by a linear combination of orthogonal polynomials is introduced. Such adjustments are shown to be mathematically equivalent to making use of standard polynomials in one or two variables. Once extended to apply to density estimation, in which case …


Some Applications Of Higher-Order Hidden Markov Models In The Exotic Commodity Markets, Heng Xiong Feb 2018

Electronic Thesis and Dissertation Repository

The liberalisation of regional and global commodity markets over the last several decades resulted in certain commodity price behaviours that require new modelling and estimation approaches. Such new approaches have important implications for the valuation and utilisation of commodity derivatives. Derivatives are becoming increasingly crucial for market participants in hedging their exposure to volatile price swings and in managing risks associated with derivative trading. The modelling of commodity-based variables is an integral part of risk management and optimal-investment strategies for commodity-linked portfolios. The characteristics of commodity price evolution cannot be captured sufficiently by one-state driven models even with the inclusion …


Advances In The Modeling Of Heavy-Tailed Distributions, Sang Jin Kang Jan 2018

Electronic Thesis and Dissertation Repository

Several advances are proposed in connection with the approximation and estimation of heavy-tailed distributions, some of which also apply to other types of distributions. It is first explained that on initially applying the Esscher transform to heavy-tailed density functions such as the Pareto, Student-t and Cauchy densities, one can utilize a moment-based technique whereby the tilted density functions are expressed as the product of a base density function and a polynomial adjustment. Alternatively, density approximants can be secured by appropriately truncating the distributions or mapping them onto compact supports. The validity of these approaches is corroborated by simulation studies. …


Statistical Modelling, Optimal Strategies And Decisions In Two-Period Economies, Jiang Wu Nov 2017

Electronic Thesis and Dissertation Repository

Motivated by some real problems, our thesis puts forward two general two-period pricing models and explores optimal buying and selling strategies in two states of the two-period decision, when the buyer's or seller's decisions in the two periods are uncertain: commodity valuations may or may not be independent, may or may not follow the same distribution, may be heavily or just lightly influenced by exogenous economic conditions, and so on. For both the example of buying laptops and the example of selling houses, the connections between each example and the two-envelope paradox encourage us to explore optimal strategies based on the works of McDonnell …


Data-Adaptive Kernel Support Vector Machine, Xin Liu Nov 2017

Electronic Thesis and Dissertation Repository

In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges …


Mining Of Primary Healthcare Patient Data With Selective Multimorbid Diseases, Annette Megerdichian Azad May 2017

Electronic Thesis and Dissertation Repository

Despite a large volume of research on the prognosis, diagnosis and overall burden of multimorbidity, very little is known about socio-demographic characteristics of multimorbid patients. This thesis aims to analyze the socio-demographic characteristics of patients with multiple chronic conditions (multimorbidity), focusing on patient groups sharing the same combination of diseases. Several methods were explored to analyze the co-occurrence of multiple chronic diseases as well as the associations between socio-demographics and chronic conditions. These methods include disease pair distributions over gender, age groups and income level quintiles, Multimorbidity Coefficients for measuring the concurrence of disease pairs and triples, and k-modes clustering …


Multiscale Wind Modelling For Sustainability And Resilience, Djordje Romanic Oct 2016

Electronic Thesis and Dissertation Repository

The research presented herein is a mix of meteorological and wind engineering disciplines. In many cases, there is a gap between these two fields and this thesis is an attempt to bridge that gap through multiscale wind modelling approaches. Data and methods used in this study cover a multitude of spatial and temporal scales. Applications are in the fields of sustainability and resilience. This relationship between multiscale wind modelling and sustainability and resilience is investigated by examining several case studies of three different developments: urban, rural and coastal.

An urban wind modelling methodology is proposed and applied for a specific development …


Advances In Portmanteau Diagnostic Tests, Jinkun Xiao Sep 2016

Electronic Thesis and Dissertation Repository

The portmanteau test serves an important role in model diagnostics for Box-Jenkins modelling procedures. A large number of portmanteau tests based on the autocorrelation function have been proposed as general-purpose goodness-of-fit tests. Since the asymptotic distributions of the statistics have a complicated form, which makes it hard to obtain the p-value directly, the gamma approximation is introduced to obtain the p-value. But the approximation will inevitably introduce approximation errors and needs a large number of observations to yield a good approximation. To avoid some pitfalls in the approximation, the Lin-McLeod test is further proposed to obtain a numeric solution to …
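
For readers unfamiliar with autocorrelation-based portmanteau statistics, the sketch below computes the well-known Ljung-Box statistic, Q = n(n+2) Σ r_k² / (n−k), summed over lags k = 1, …, m. It is offered only as a familiar example of this family, not as the test studied in the thesis; the function names and the toy series are assumptions. In a Box-Jenkins workflow the input would normally be the residuals of a fitted model.

```python
def autocorr(x, k):
    """Sample autocorrelation of the sequence x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box portmanteau statistic using lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# A strongly alternating toy series has large lag-1 autocorrelation, so Q is large.
toy_series = [1.0, -1.0] * 50
q_stat = ljung_box_q(toy_series, 1)
```

Under the usual large-sample theory, Q is compared with a chi-squared distribution whose degrees of freedom depend on the number of lags and fitted parameters; obtaining that p-value is left out of the sketch.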


Joint Analysis Of Zero-Heavy Longitudinal Outcomes: Models And Comparison Of Study Designs, Erin R. Lundy Jul 2016

Electronic Thesis and Dissertation Repository

Understanding the patterns and mechanisms of the process of desistance from criminal activity is imperative for the development of effective sanctions and legal policy. Methodological challenges in the analysis of longitudinal criminal behaviour data include the need to develop methods for multivariate longitudinal discrete data, incorporating modulating exposure variables and several possible sources of zero-inflation. We develop new tools for zero-heavy joint outcome analysis which address these challenges and provide novel insights on processes related to offending patterns. Comparisons with existing approaches demonstrate the benefits of utilizing modeling frameworks which incorporate distinct sources of zeros. An additional concern in this …


Completely Monotone And Bernstein Functions With Convexity Properties On Their Measures, Shen Shan Aug 2015

Electronic Thesis and Dissertation Repository

The concepts of completely monotone and Bernstein functions were introduced nearly one hundred years ago. They find wide applications in areas ranging from stochastic Lévy processes and complex analysis to monotone operator theory. They have well-known Bernstein and Lévy-Khintchine integral representations through which there are one-to-one correspondences between them and Radon measures on $[0,\infty)$ or $(0,\infty)$, respectively. In this thesis, we investigate subclasses of completely monotone and Bernstein functions with various convexity properties on their measures. These subclasses have intriguing applications in probability theory and convex analysis.

The convexity properties we investigate include convexity, harmonic convexity and $\beta$-convexity of …


A Spatial Analysis Of Forest Fire Survival And A Marked Cluster Process For Simulating Fire Load, Amy A. Morin Jul 2014

Electronic Thesis and Dissertation Repository

The duration of a forest fire depends on many factors, such as weather, fuel type and fuel moisture, as well as fire management strategies. Understanding how these impact the duration of a fire can lead to more effective suppression efforts as this information can be incorporated into decision support systems used by fire management agencies to help allocate suppression resources. This thesis presents a thorough survival analysis of lightning and people-caused fires in the Intensive fire management zone of Ontario, Canada from 1989 through 2004. The analysis is then extended to investigate spatial patterns across this region using proportional hazards …


Statistical Applications In Wildfire Management And Prediction, Lengyi Han May 2014

Electronic Thesis and Dissertation Repository

This thesis develops statistical methods and models and applies them to problems related to forest fires. The unifying goal of the work is to provide a data analytic basis for quantifying the uncertainty surrounding fire ignition and fire growth which builds on existing theory where possible.

The main body of the thesis is comprised of three research papers. The Fire Weather Index (FWI) plays an important role in fire management and is central to the first two papers. In the first instance, the block bootstrap confidence interval method is used to deal nonparametrically with the dependence in the FWI data. …
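
The block bootstrap idea mentioned above can be sketched as follows: resampling contiguous blocks, rather than individual points, preserves short-range dependence within each block. This is a generic moving-block percentile interval for the mean, not the thesis's procedure; the function name, the block length, the number of resamples, and the toy series are all assumptions chosen for illustration.

```python
import random


def moving_block_bootstrap_ci(series, block_len, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean via a moving-block bootstrap."""
    rng = random.Random(seed)
    n = len(series)
    n_blocks = -(-n // block_len)   # ceiling division: blocks needed to cover n points
    last_start = n - block_len      # largest valid block starting index
    means = []
    for _ in range(n_boot):
        resample = []
        for _ in range(n_blocks):
            start = rng.randint(0, last_start)  # randint is inclusive at both ends
            resample.extend(series[start:start + block_len])
        resample = resample[:n]                 # trim to the original length
        means.append(sum(resample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical dependent series: a slow cycle plus a faster one.
toy = [10 + 3 * ((t % 12) / 12) + (t % 5) for t in range(120)]
ci_low, ci_high = moving_block_bootstrap_ci(toy, block_len=12)
```

The block length controls how much dependence is retained: blocks that are too short behave like an ordinary bootstrap, while blocks that are too long leave few distinct blocks to resample.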