Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 24 of 24

Full-Text Articles in Physical Sciences and Mathematics

On Limiting Distributions For Eigenvalue Spectra Of Sample Correlation Matrices From Heavy-Tailed Populations: Literature Review, Sachini Sandhareka Wijesundara Jun 2024

Major Papers

This major paper offers an extensive review of literature concerning the limiting distributions for the eigenvalue spectrum of sample correlation matrices from a p-dimensional population, where both the dimension p and the sample size n grow to infinity. The study systematically categorizes the reviewed literature based on underlying assumptions regarding the data characteristics. Specifically, it examines several distinct cases: the independent and identically distributed (i.i.d.) case with finite fourth moments, the i.i.d. case with infinite fourth moments, the i.i.d. case with infinite second moments, and scenarios where rows and columns of the data are linearly dependent. Additionally, the major paper …


Statistical Consulting In Academia: A Review, Ke Xiao Jan 2024

Major Papers

This paper reviews the state of statistical consulting in academia by performing a literature review on this topic in Chapters 1 and 2. Chapter 1 gives an overview of general aspects of statistical consulting and the types of centers that provide such services in academia. In Chapter 2 we summarize the literature about the common logistics and processes for conducting statistical consulting in academia. In Chapters 3 and 4, we analyze data on statistical consulting centers for the largest 100 universities in the USA. We also review the literature on the future of statistical consulting in academia in the era of big data and …


Comparing Elevator Strategies For A Parking Lot, Naveed Arafat Aug 2023

Major Papers

In this paper, we compare elevator strategies for a parking garage. It is assumed that the parking garage has several floors and there is an elevator which can stop on each floor. We begin by considering 4 strategies detailed on page 23. For each strategy, we run the program 100 times and obtain 100 mean wait times. Welch's test confirms highly significant differences among the 4 strategies. Repeating the analysis multiple times, we see that the best of the 4 strategies is strategy 2, which places the elevator on floor 2 (the median floor) after use.


Excess Zeros Under GAM: Tweedie Or Two-Part?, Xianming Zeng Aug 2023

Major Papers

Positive, right-skewed data with excess zeros are encountered in many real-life situations. Two possible techniques to analyze this type of data are two-part models and Tweedie models. Two-part models assume the existence of a separate zero-generating process, while Tweedie models are based on distributions that allow mass at zero. This paper presents a simulation study of the performance of Generalized Additive Models (GAMs) fitted with Tweedie and two-part models for such data with excess zeros, using mean squared error (MSE) and relative bias to compare the two methods. We found …


On Image Response Regression With High-Dimensional Data, Noah Fuerth Jun 2023

Major Papers

A recent issue in statistical analysis is modelling data when the effect variable changes at different locations. This can be difficult to accomplish when the dimensions of the covariates are very high, and when the domain of the varying coefficient functions of predictors is not necessarily regular. This research paper will investigate a method to overcome these challenges by approximating the varying coefficient functions using bivariate splines. We do this by splitting the domain of the varying coefficient functions into a number of triangles, and building the bivariate spline functions based on this triangulation. This major paper will outline detailed …


On Maximum Likelihood Estimators For A Jump-Type Affine Diffusion Two-Factor Model, Jiaming Yin Mr. Jun 2023

Major Papers

We consider a jump-type two-factor affine diffusion model driven by a subordinator in the context of continuous time observations. We study the asymptotic properties of the maximum likelihood estimator (MLE) for the drift parameters. In particular, we prove the strong consistency and the asymptotic normality of the MLE in the subcritical case. We also present some numerical illustrations to confirm the theoretical results. The main difficulty of this major paper lies in proving the ergodicity of the model in the subcritical case and deriving the limiting behavior of the process.


On Partially Observed Tensor Regression, Dinara Miftyakhetdinova Jan 2023

Major Papers

Tensor data is widely used in modern data science. The interest lies in identifying and characterizing the relationship between tensor datasets and external covariates. These datasets, though, are often incomplete. An efficient nonconvex alternating updating algorithm proposed by J. Zhou et al. in the paper "Partially Observed Dynamic Tensor Response Regression" provides a novel approach. The algorithm handles the problem of unobserved entries by solving an optimization problem of a loss function under low-rankness, sparsity, and fusion constraints. This analysis aims to understand in detail the proposed algorithms and their theoretical proofs while potentially dropping some of the assumptions …


Uniformity Test Based On The Empirical Bernstein Distribution, Ran Sun Jan 2023

Major Papers

In this paper, we first review the origin of the Bernstein polynomial and its various applications. Then we review the importance of goodness-of-fit tests, especially the uniformity test, and examine many of the test statistics proposed so far. After that, we suggest two new statistics for testing uniformity. These two statistics are of Kolmogorov-Smirnov type and Cramér-von Mises type, respectively. We also embed the Bernstein polynomial into these test types to take advantage of its strong approximation performance. Finally, we run a Monte Carlo simulation to compare the performance of our statistics to those …


Optimal Speed Of A Machine In An Assembly Line Using The Continuous Time Markov Chain Rate Matrix, Chandi Darshani Rupasinghe Jan 2023

Major Papers

The optimal speed of a machine in an assembly line is determined using a Markov decision process type model. We develop the rate matrix that represents the inter-event time of a machine, either repair time or time to breakdown, as a function of speed. We consider the rate of time to breakdown with a variety of functions of speed. We find limiting probabilities and express profit in terms of these probabilities. We then find the optimal speed to maximize profit. Further, we assume an underlying function of speed and simulate data using R. From the simulated data, we estimate the …


On Bayesian Methods And Functional Registration Of fMRI, Xiaoxuan Wang Jan 2023

Major Papers

The application of functional magnetic resonance imaging (fMRI) has greatly improved our comprehension of the human brain and behaviour. However, after anatomical alignment, there remains large inter-individual variability in brain anatomy and functional localization, which is one of the obstacles to conducting group studies and performing group-level inference. This major paper addresses this problem by applying a new method (Bayesian Functional Registration) to decrease misalignment in functional brain systems between people by spatially transforming each subject’s functional data into a common reference map. The proposed approach allows us to assess differences in brain function across subjects. It also creates a …


A Review Of Statistical Learning Methods With Applications, Natalie R. Masse Aug 2022

Major Papers

Statistical learning refers to a set of tools for modelling and understanding complex datasets. It is a recently developed area in statistics and blends with parallel developments in computer science and, in particular, machine learning. This paper aims to outline some of the key statistical learning methods in the areas of prediction and classification of data. The goal is to discuss the theory and methodology of Ordinary Least Squares Regression, Ridge Regression, Lasso Regression, Logistic Regression, K-Nearest Neighbours method of classification, Linear and Quadratic Discriminant analysis, and Classification Trees. We then discuss the idea of Cross Validation, and demonstrate these …


To Logit Or Not To Logit Data In The Unit Interval: A Simulation Study, Kayode Idris Hamzat Aug 2022

Major Papers

In this paper, we recommend a mechanism, based on quantile estimation of data between 0 and 1, for determining whether or not to apply the logit transformation to data in the unit interval. On a simulated dataset generated from a Beta regression model, the estimated quantiles for this model perform better than those based on linear quantile regression (LQR) with logit transformation.

Further, we investigate the performance of the quantile regression estimators based on the LQR and conclude that they are better than those based on the Beta regression when the distribution is contaminated with 10% uniform numbers …


Task Interrupted By A Poisson Process, Jarrett Christopher Nantais Oct 2020

Major Papers

We consider a task which has a completion time T (if not interrupted), which is a random variable with probability density function (pdf) f(t), t>0. Before it is complete, the task may be interrupted by a Poisson process with rate lambda. If that happens, then the task must begin again, with the same completion time random variable T, but with a potentially different realization. These interruptions can recur until eventually the task is finished, with a total time of W. In this paper, we will find the Laplace Transform of W in several special cases.
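
A closed-form transform of this kind can be sanity-checked by simulation. The sketch below is not taken from the paper. It assumes the restart rule described above (a fresh draw of T after every interruption) and compares a Monte Carlo estimate of E[W] with the value (1 - f*(lambda)) / (lambda f*(lambda)), where f*(lambda) = E[exp(-lambda T)], which follows from a standard renewal argument. The function names and the deterministic example T = 1 are illustrative assumptions.

```python
import math
import random


def simulate_total_time(sample_task_time, rate, rng):
    """Simulate one realization of W under the restart-with-resampling rule."""
    total = 0.0
    while True:
        t = sample_task_time(rng)      # fresh realization of T for this attempt
        x = rng.expovariate(rate)      # time until the next Poisson interruption
        if x >= t:                     # task finishes before being interrupted
            return total + t
        total += x                     # attempt lost; time spent is x


def estimate_mean(sample_task_time, rate, n_runs, seed=0):
    """Monte Carlo estimate of E[W] from n_runs independent realizations."""
    rng = random.Random(seed)
    runs = (simulate_total_time(sample_task_time, rate, rng) for _ in range(n_runs))
    return sum(runs) / n_runs


def mean_total_time(laplace_at_rate, rate):
    """E[W] = (1 - f*(rate)) / (rate * f*(rate)), with f*(rate) = E[exp(-rate*T)]."""
    return (1.0 - laplace_at_rate) / (rate * laplace_at_rate)


if __name__ == "__main__":
    # Hypothetical example: deterministic T = 1, interruption rate 1.
    exact = mean_total_time(math.exp(-1.0), 1.0)
    approx = estimate_mean(lambda rng: 1.0, 1.0, 20000, seed=1)
    print(exact, approx)
```

For a deterministic task time t, the formula reduces to (exp(lambda t) - 1) / lambda, so the simulated and exact values in the example should agree up to Monte Carlo error.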


On Variable Selections In High-Dimensional Incomplete Data, Tao Sun Jun 2020

Major Papers

Modern statistics has entered the era of Big Data, wherein data sets are too large, high-dimensional, incomplete and complex for most classical statistical methods. This analysis of Big Data first focuses on missing data. We compare different multiple imputation methods. Drawing on the characteristics of medical high-throughput experiments, we compare multivariate imputation by chained equations (MICE), missing forest (missForest), and self-training selection (STS) methods. A phenotypic data set of common lung disease was assessed. Moreover, in terms of improving the interpretability and predictability of the model, variable selection plays a pivotal role in the following analysis. Taking the Lasso-Poisson …


Extension Of First Passage Probability, Yiping Zhang Jan 2020

Major Papers

In this paper, we consider the extension of first passage probability. First, we present the first, second, third, and generally k-th passage probability of a Markov chain moving from one state to another state, using a step-by-step calculation and two other matrix-based methods. Similarly, we compute the first passage probability of a Markov chain moving from one state to multiple states. In all discussions, we take into account both the case in which the chain moves to a different state and the case in which it returns to its starting state. Also, we find the mean number of steps needed from one state to another state in a Markov chain …
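
A step-by-step calculation of this kind can be illustrated with the textbook first-passage recursion f_ij^(1) = p_ij and f_ij^(n) = sum over k != j of p_ik f_kj^(n-1). The sketch below is an illustration under that standard recursion, not the paper's own method or code; the function name and the two-state example matrix are assumptions.

```python
def first_passage_probs(P, j, n_max):
    """Return a list whose (n-1)-th entry holds f_ij^(n) for every start state i.

    P is a row-stochastic transition matrix given as a list of lists,
    j is the target state, and n_max is the largest step count considered.
    """
    size = len(P)
    f = [P[i][j] for i in range(size)]          # n = 1: direct one-step move to j
    history = [f]
    for _ in range(2, n_max + 1):
        # First step goes to some k != j, then first passage from k in n-1 steps.
        f = [
            sum(P[i][k] * f[k] for k in range(size) if k != j)
            for i in range(size)
        ]
        history.append(f)
    return history


if __name__ == "__main__":
    # Hypothetical two-state chain.
    P = [[0.5, 0.5],
         [0.2, 0.8]]
    to_state_1 = first_passage_probs(P, j=1, n_max=3)
    print([row[0] for row in to_state_1])       # f_01^(n) for n = 1, 2, 3
    back_to_0 = first_passage_probs(P, j=0, n_max=2)
    print([row[0] for row in back_to_0])        # first-return probabilities f_00^(n)
```

Taking i equal to j gives first-return probabilities, which is the "returns to itself" case mentioned above.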


A Review Of Statistical Analysis Of Genetic Case-Control Data, Jin Zhang May 2019

Major Papers

This paper considers the analysis of genetic case-control data. We consider the allele frequency in cases and controls. Because each individual has two alleles at any autosomal locus, there will be twice as many alleles as people in the allele distribution. Simultaneously, the serological distribution is built by ignoring the difference between homozygous and heterozygous. We also consider the marker loci with multiple alleles. Traditional case-control studies provide a powerful and efficient method for evaluating the association between a candidate gene and disease. There has been debate on how the power of tests for association changes with different allelic effects. To …


Level Crossing Simulation Of A Queueing Model, Zhanxuan Ding Jan 2019

Major Papers

Simulation of the level crossing method will be used to find approximations of the distribution of the workload for several queueing models. In particular, three different types of queueing models, with different methods of handling workload bound thresholds, will be considered. Simulation applied to workload bound thresholds is new work.


Infinite Sums, Products, And Urn Models, Yiyan Ni Jan 2019

Major Papers

This paper considers an urn and its evolution in discrete time steps. The urn initially has two different colored balls (blue and red). We discuss different cases where k blue balls (k = 1, 2, 3, ...) will be added (or removed) at every step if a blue ball is withdrawn, and study the probability, P(R eventually), of eventually withdrawing a red ball. We compute this probability with two different methods, one using infinite sums and the other using infinite products. One advantage of this is that we can obtain P(R eventually) in a complex but …


Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, Yujie Song Oct 2018

Major Papers

In this major paper, we study the influence of structural breaks in the financial market model with high-dimensional data. We present a model which is capable of detecting changes in factor loadings, determining the number of factors and detecting the break date. We consider both the case where the break date is known and the case where it is unknown, and identify the type of instability. For the unknown break date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date and the existence of instability of factor loadings when the number of factors is constant. We …


Estimation In High-Dimensional Factor Models With Structural Instabilities, Wen Gao Oct 2018

Major Papers

In this major paper, we use high-dimensional models to analyze macroeconomic data which are influenced by the break point. In particular, we consider detecting the break point and study the changes in the number of factors and the factor loadings under structural instability.

Concretely, we propose two factor models which explain the processes of the pre- and post-break periods. Then, we consider the break point as known or unknown. In both situations, we derive the shrinkage estimators by minimizing the penalized least squares function and calculate the estimators of the numbers of pre- and post-break factors …


Multi-State Modeling Of Hospital Frequent Users, Yu Liang Jan 2018

Major Papers

The top 1% of frequent users account for 34% of public health system expenditures in Ontario, while the top 5% account for 66%. In this paper, we explore the efficacy of an intervention aimed at reducing hospital utilization for a group of patients defined as frequent users, by using multi-state modeling. We employ time-homogeneous, time-inhomogeneous, parametric and semi-parametric Markov processes to study the transitions of the patients between hospital, ER and outside during a follow-up period of one year. The results do not indicate any strong evidence that the intervention was beneficial.


Queues With Server Utilization Of One, Robert Aidoo Jan 2018

Major Papers

In most queueing systems of type GI/G/1, the stability condition requires that the server utilization be strictly less than 1. The standard exception is a D/D/1 system in which stability still holds for server utilization equal to 1. This paper presents other cases when server utilization can equal 1, and discusses their characteristics.


Exploring Quantitative Timed Up And Go Sensor Data With Statistical Learning Techniques, Anthony Wright Jan 2018

Major Papers

Injuries and hospitalizations due to accidental falls among seniors represent a major expense for the Canadian public health system. It is highly desirable to be able to predict risk of falls for senior individuals in order to place them in prevention programs. Recently, sensor technologies have been used to predict risk of falls and levels of frailty of individuals. A commonly used test for assessing risk of falls is known as QTUG (Quantitative 'Timed Up and Go'). The QTUG data often consist of a small set of survey answers about the individuals' historic variables (e.g., number of falls in the …


Geometric Model Of Roots Of Stochastic Matrices, Yelyzaveta Chetina Jan 2018

Major Papers

In this paper we examine the conditions under which discrete-time homogeneous Markov transition matrices have probability roots. A method based on geometric interpretation of 2x2 Markov matrices is used to find regions within the unit square corresponding to probability matrices with zero, single or multiple probability roots.
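
As a concrete illustration of the 2x2 case, write P = [[1-a, a], [b, 1-b]], which has eigenvalues 1 and lam = 1 - a - b. When lam >= 0, the matrix Pi + sqrt(lam) (I - Pi), where both rows of Pi equal the stationary distribution, is a stochastic square root of P; when lam < 0, P has no real square root at all. The sketch below uses this standard spectral construction rather than the paper's geometric method, returns only one root even where several exist, and uses a hypothetical function name and example values.

```python
import math


def stochastic_sqrt_2x2(a, b):
    """Return one stochastic square root of [[1-a, a], [b, 1-b]], or None.

    Assumes 0 <= a, b <= 1. Returns None when the second eigenvalue
    1 - a - b is negative, in which case no real square root exists.
    """
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("a and b must lie in [0, 1]")
    lam = 1.0 - a - b
    if lam < 0.0:
        return None
    if a + b == 0.0:                  # P is the identity, which is its own root
        return [[1.0, 0.0], [0.0, 1.0]]
    r = math.sqrt(lam)
    pi0 = b / (a + b)                 # stationary probability of state 0
    pi1 = a / (a + b)                 # stationary probability of state 1
    # Entries of Pi + r * (I - Pi), written out element by element.
    return [[pi0 + r * pi1, pi1 * (1.0 - r)],
            [pi0 * (1.0 - r), pi1 + r * pi0]]


if __name__ == "__main__":
    print(stochastic_sqrt_2x2(0.18, 0.18))   # root of [[0.82, 0.18], [0.18, 0.82]]
    print(stochastic_sqrt_2x2(0.7, 0.6))     # None: second eigenvalue is -0.3
```

In terms of the unit square of (a, b) pairs, the condition lam >= 0 corresponds to the region a + b <= 1.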