Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 13 of 13

Full-Text Articles in Physical Sciences and Mathematics

Statistical Consulting In Academia: A Review, Ke Xiao Jan 2024

Major Papers

This paper reviews the state of statistical consulting in academia. Chapters 1 and 2 present a literature review: Chapter 1 gives an overview of general aspects of statistical consulting and the types of centers that provide such services in academia, and Chapter 2 summarizes the literature on the common logistics and processes for conducting statistical consulting in academia. In Chapters 3 and 4, we analyze data on statistical consulting centers at the 100 largest universities in the USA. We also review the literature on the future of statistical consulting in academia in the era of big data and …


Comparing Elevator Strategies For A Parking Lot, Naveed Arafat Aug 2023

Major Papers

In this paper, we compare elevator strategies for a parking garage. We assume that the parking garage has several floors and an elevator that can stop on each floor. We begin by considering 4 strategies, detailed on page 23. For each strategy, we run the program 100 times and obtain 100 mean wait times. Welch's test confirms highly significant differences among the 4 strategies. Repeating the analysis multiple times, we see that the best of the 4 strategies is strategy 2, which places the elevator on floor 2 (the median floor) after use.
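The kind of experiment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's program: the five floors numbered 0 to 4, the uniformly random calls and destinations, the wait time measured in floors travelled, and the four strategies listed in `strategies` are all hypothetical stand-ins for the strategies on page 23. The comparison shown is a pairwise Welch t statistic computed by hand, which may differ from the form of Welch's test used in the paper.

```python
import random
import statistics

FLOORS = range(5)  # assumption: floors numbered 0..4, so floor 2 is the median


def mean_wait(rest_floor, n_calls, rng):
    """Mean wait, in floors travelled, over n_calls elevator calls.

    rest_floor=None models a hypothetical "stay where the last rider got off"
    strategy; otherwise the elevator returns to rest_floor after each use.
    """
    position = 0 if rest_floor is None else rest_floor
    total = 0
    for _ in range(n_calls):
        call = rng.choice(FLOORS)          # floor where the next rider waits
        total += abs(position - call)      # distance the elevator must travel
        destination = rng.choice(FLOORS)   # floor where the rider gets off
        position = destination if rest_floor is None else rest_floor
    return total / n_calls


def replicate(rest_floor, n_runs=100, n_calls=200, seed=0):
    """Run the simulation n_runs times and return the n_runs mean wait times."""
    rng = random.Random(seed)
    return [mean_wait(rest_floor, n_calls, rng) for _ in range(n_runs)]


def welch_t(x, y):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = (statistics.variance(x) / len(x) + statistics.variance(y) / len(y)) ** 0.5
    return (statistics.mean(x) - statistics.mean(y)) / se


# Hypothetical strategies, labelled by where the elevator rests after use.
strategies = {"rest at 0": 0, "rest at 2": 2, "rest at 4": 4, "stay put": None}
results = {
    name: replicate(floor, seed=i)
    for i, (name, floor) in enumerate(strategies.items())
}

for name, waits in results.items():
    print(name, round(statistics.mean(waits), 3))
print("Welch t, rest at 2 vs rest at 0:",
      round(welch_t(results["rest at 2"], results["rest at 0"]), 1))
```

Under these assumptions the median floor minimizes the expected distance to a uniformly random call, which is consistent with the abstract's finding that resting on floor 2 performs best.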


Excess Zeros Under Gam: Tweedie Or Two-Part?, Xianming Zeng Aug 2023

Major Papers

Positive, right-skewed data with excess zeros are encountered in many real-life situations. Two possible techniques for analyzing this type of data are two-part models and Tweedie models. Two-part models assume the existence of a separate zero-generating process, while Tweedie models are based on distributions that allow mass at zero. This paper presents a simulation study of the performance of Generalized Additive Models (GAMs) under Tweedie and two-part models for such data, using mean squared error (MSE) and relative bias to compare the two methods. We found …


On Partially Observed Tensor Regression, Dinara Miftyakhetdinova Jan 2023

Major Papers

Tensor data are widely used in modern data science. The interest lies in identifying and characterizing the relationship between tensor datasets and external covariates. These datasets, though, are often incomplete. An efficient nonconvex alternating updating algorithm proposed by J. Zhou et al. in the paper "Partially Observed Dynamic Tensor Response Regression" provides a novel approach. The algorithm handles unobserved entries by minimizing a loss function under low-rankness, sparsity, and fusion constraints. This analysis aims to understand in detail the proposed algorithms and their theoretical proofs, potentially dropping some of the assumptions …


Optimal Speed Of A Machine In An Assembly Line Using The Continuous Time Markov Chain Rate Matrix, Chandi Darshani Rupasinghe Jan 2023

Major Papers

The optimal speed of a machine in an assembly line is determined using a Markov decision process type model. We develop the rate matrix that represents the inter-event time of a machine, either repair time or time to breakdown, as a function of speed. We model the breakdown rate with a variety of functions of speed. We find limiting probabilities and express profit in terms of these probabilities. We then find the optimal speed to maximize profit. Further, we assume an underlying function of speed and simulate data using R. From the simulated data, we estimate the …
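A minimal sketch of the pipeline described above (rate matrix, limiting probabilities, profit, optimal speed) might look like this. Everything specific here is an assumption for illustration and is not taken from the paper: a two-state chain (up, down), a quadratic breakdown rate `k * speed**2`, a constant repair rate, and a profit of revenue earned while up minus a repair cost incurred while down. The optimum is found by a plain grid search.

```python
def limiting_probs(breakdown_rate, repair_rate):
    """Limiting (up, down) probabilities of the two-state chain with rate matrix

        Q = [[-breakdown_rate,  breakdown_rate],
             [ repair_rate,    -repair_rate  ]].
    """
    total = breakdown_rate + repair_rate
    return repair_rate / total, breakdown_rate / total


def profit(speed, k=1.0, repair_rate=1.0, revenue=1.0, repair_cost=0.0):
    """Long-run profit rate at a given speed (hypothetical profit function)."""
    # Assumed breakdown rate: grows quadratically with speed.
    p_up, p_down = limiting_probs(k * speed ** 2, repair_rate)
    return revenue * speed * p_up - repair_cost * p_down


def optimal_speed(grid_step=0.001, max_speed=5.0, **kwargs):
    """Grid search for the speed in (0, max_speed] that maximizes profit."""
    n_points = int(max_speed / grid_step)
    speeds = [i * grid_step for i in range(1, n_points + 1)]
    return max(speeds, key=lambda s: profit(s, **kwargs))


print(optimal_speed())
```

With the default parameters the profit reduces to `speed / (1 + speed**2)`, whose maximum is at speed 1, so the grid search can be checked against a closed form.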


To Logit Or Not To Logit Data In The Unit Interval: A Simulation Study, Kayode Idris Hamzat Aug 2022

Major Papers

In this paper, we recommend a mechanism, based on quantile estimation for data between 0 and 1, for deciding whether or not to logit-transform data in the unit interval. Using a simulated dataset generated from a Beta regression model, we find that the estimated quantiles for this model perform better than those based on linear quantile regression (LQR) with a logit transformation.

Further, we investigate the performance of the quantile regression estimators based on LQR and conclude that they perform better than those based on Beta regression when the distribution is contaminated with 10% uniform numbers …
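Quantile regression on logit-transformed data relies on a general property: quantiles are equivariant under monotone transformations, so a quantile estimated on the logit scale can be mapped back to the unit interval. The sketch below illustrates only that property on simulated Beta data. It is not the paper's procedure; the Beta(2, 5) parameters, the sample size, and the order-statistic definition of the sample quantile are assumptions chosen for the example.

```python
import math
import random


def logit(y):
    """Map a value in (0, 1) to the real line."""
    return math.log(y / (1.0 - y))


def inv_logit(z):
    """Map a real value back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def sample_quantile(values, tau):
    """Order-statistic quantile: the ceil(tau * n)-th smallest value."""
    ordered = sorted(values)
    k = max(math.ceil(tau * len(ordered)) - 1, 0)
    return ordered[k]


rng = random.Random(1)
y = [rng.betavariate(2.0, 5.0) for _ in range(1001)]  # assumed Beta(2, 5) data

tau = 0.25
q_direct = sample_quantile(y, tau)
q_via_logit = inv_logit(sample_quantile([logit(v) for v in y], tau))

print(q_direct, q_via_logit)
```

The two values agree up to floating-point rounding, because the logit is strictly increasing and therefore preserves the ordering of the sample.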


Task Interrupted By A Poisson Process, Jarrett Christopher Nantais Oct 2020

Major Papers

We consider a task which has a completion time T (if not interrupted), which is a random variable with probability density function (pdf) f(t), t>0. Before it is complete, the task may be interrupted by a Poisson process with rate lambda. If that happens, then the task must begin again, with the same completion time random variable T, but with a potentially different realization. These interruptions can recur, until eventually the task is finished, with a total time of W. In this paper, we find the Laplace Transform of W in several special cases.
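For orientation, a standard renewal argument gives a general expression for this transform. The derivation below is a sketch under the setup stated above and is not quoted from the paper; the symbols X (time to the first interruption), f^*(s) (Laplace transform of f), and W^*(s) (Laplace transform of W) are notation introduced here.

```latex
% Assumptions: X ~ Exp(\lambda) is the time to the first interruption,
% independent of T, and f^*(s) = \int_0^\infty e^{-st} f(t)\,dt.
% Condition on whether the task finishes before the first interruption:
\begin{align*}
W^*(s) &= E\!\left[e^{-sT}\mathbf{1}\{T<X\}\right]
          + E\!\left[e^{-sX}\mathbf{1}\{X<T\}\right] W^*(s),\\
E\!\left[e^{-sT}\mathbf{1}\{T<X\}\right]
       &= \int_0^\infty e^{-st}\,e^{-\lambda t} f(t)\,dt = f^*(s+\lambda),\\
E\!\left[e^{-sX}\mathbf{1}\{X<T\}\right]
       &= \int_0^\infty f(t)\int_0^t \lambda e^{-(s+\lambda)x}\,dx\,dt
        = \frac{\lambda}{s+\lambda}\bigl(1-f^*(s+\lambda)\bigr),\\
W^*(s) &= \frac{(s+\lambda)\,f^*(s+\lambda)}{s+\lambda\,f^*(s+\lambda)}.
\end{align*}
```

As a check, if T is exponential with rate \mu, then f^*(s+\lambda) = \mu/(\mu+s+\lambda) and the expression simplifies to W^*(s) = \mu/(s+\mu), so W is again exponential with rate \mu, as the memoryless property suggests.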


On Variable Selections In High-Dimensional Incomplete Data, Tao Sun Jun 2020

Major Papers

Modern statistics has entered the era of Big Data, wherein data sets are too large, high-dimensional, incomplete, and complex for most classical statistical methods. This analysis of Big Data first focuses on missing data. Drawing on the characteristics of medical high-throughput experiments, we compare several multiple imputation methods: multivariate imputation by chained equations (MICE), missing forest (missForest), and self-training selection (STS). A phenotypic data set of a common lung disease is assessed. Moreover, variable selection plays a pivotal role in the subsequent analysis by improving the interpretability and predictability of the model. Taking the Lasso-Poisson …


Level Crossing Simulation Of A Queueing Model, Zhanxuan Ding Jan 2019

Major Papers

Simulation of the level crossing method will be used to find approximations of the distribution of the workload for several queueing models. In particular, three different types of queueing models, with different methods of handling workload bound thresholds, will be considered. Simulation applied to workload bound thresholds is new work.


Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, Yujie Song Oct 2018

Major Papers

In this major paper, we study the influence of structural breaks in the financial market model with high-dimensional data. We present a model which is capable of detecting changes in factor loadings, determining the number of factors and detecting the break date. We consider both the known and the unknown break date cases and identify the type of instability. For the unknown break date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date and the existence of instability of factor loadings when the number of factors is constant. We …


Estimation In High-Dimensional Factor Models With Structural Instabilities, Wen Gao Oct 2018

Major Papers

In this major paper, we use high-dimensional models to analyze macroeconomic data that are influenced by a break point. In particular, we consider detecting the break point and study the changes in the number of factors and the factor loadings under structural instability.

Concretely, we propose two factor models that describe the pre- and post-break periods. We then treat the break point as either known or unknown. In both situations, we derive shrinkage estimators by minimizing the penalized least squares function and calculate the estimators of the numbers of pre- and post-break factors …


Exploring Quantitative Timed Up And Go Sensor Data With Statistical Learning Techniques, Anthony Wright Jan 2018

Major Papers

Injuries and hospitalizations due to accidental falls among seniors represent a major expense for the Canadian public health system. It is highly desirable to be able to predict risk of falls for senior individuals in order to place them in prevention programs. Recently, sensor technologies have been used to predict risk of falls and levels of frailty of individuals. A commonly used test for assessing risk of falls is known as QTUG (Quantitative "Timed Up and Go"). The QTUG data often consist of a small set of survey answers about the individuals' historic variables (e.g., number of falls in the …


Geometric Model Of Roots Of Stochastic Matrices, Yelyzaveta Chetina Jan 2018

Major Papers

In this paper, we examine the conditions under which discrete-time homogeneous Markov transition matrices have probability roots. A method based on a geometric interpretation of 2×2 Markov matrices is used to find regions within the unit square corresponding to probability matrices with zero, one, or multiple probability roots.
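For the simplest case, square roots of a 2×2 transition matrix, the zero, one, or multiple outcomes can be checked numerically. The sketch below is an illustration using a standard spectral decomposition, not the paper's geometric method, and it covers square roots only. Writing P = [[1-a, a], [b, 1-b]], the point (a, b) lies in the unit square, the second eigenvalue is 1-a-b, and each candidate root replaces that eigenvalue with one of its real square roots. The function name and the choice to reject the identity matrix are assumptions of this example.

```python
import math


def stochastic_square_roots(p, tol=1e-12):
    """Return all stochastic matrices r with r @ r == p, for a 2x2 stochastic p."""
    a, b = p[0][1], p[1][0]
    if a + b == 0:
        raise ValueError("the identity matrix is not handled in this sketch")
    lam = 1.0 - a - b  # second eigenvalue of p (the first is always 1)
    if lam < 0:
        return []  # lam has no real square root, so p has no real matrix root
    pi0, pi1 = b / (a + b), a / (a + b)  # stationary distribution of p
    roots = []
    # A set removes the duplicate candidate when lam == 0.
    for mu in {math.sqrt(lam), -math.sqrt(lam)}:
        # r = Pi + mu * (I - Pi), where both rows of Pi equal (pi0, pi1).
        r = [[pi0 + mu * pi1, pi1 * (1.0 - mu)],
             [pi0 * (1.0 - mu), pi1 + mu * pi0]]
        if all(x >= -tol for row in r for x in row):
            roots.append(r)
    return roots


print(len(stochastic_square_roots([[0.2, 0.8], [0.9, 0.1]])))  # a + b > 1
print(len(stochastic_square_roots([[0.9, 0.1], [0.2, 0.8]])))
print(len(stochastic_square_roots([[0.7, 0.3], [0.3, 0.7]])))
```

The three example matrices fall in different regions of the unit square: the first has a negative second eigenvalue and no root, while the other two differ in whether the candidate built from the negative square root has nonnegative entries.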