Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Statistics and Probability

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Regression

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Examining Model Complexity's Effects When Predicting Continuous Measures From Ordinal Labels, Mckade S. Thomas, May 2023

Many real-world problems require predicting ordinal variables, whose values are categories with a natural ordering. In many of these cases, however, the categorical nature of the ordinal data is not a desirable outcome. As such, regression models treat ordinal variables as continuous and do not bind their predictions to discrete categories. Prior research has found that these models are capable of learning useful information between the discrete levels of the ordinal labels they are trained on, but complex models may learn ordinal labels too closely, missing the information between levels. In this …


Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen, Aug 2022

A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.

The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …


Implementation And Application Of The Curds And Whey Algorithm To Regression Problems, John Kidd, May 2014

A common statistical problem is trying to predict two or more variables using a set of predictor variables. The simplest model for this situation is called multivariate linear regression. This method uses the same set of predictor variables to predict each of the response variables separately. This approach seems counter-intuitive, as any possible relationship between the variables being predicted is ignored.
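The point about separate prediction can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the simulated data, and the choice of NumPy are all assumptions made for illustration. It fits an ordinary least-squares model to two correlated responses jointly and then one response at a time, and checks that the coefficients agree, which is why the correlation between responses goes unused.

```python
import numpy as np

def fit_multivariate_ols(X, Y):
    """Least-squares coefficients for all responses at once (intercept in row 0)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    B, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return B  # shape: (n_predictors + 1, n_responses)

def fit_separate_ols(X, Y):
    """Fit one least-squares regression per response column, then stack the results."""
    X1 = np.column_stack([np.ones(len(X)), X])
    cols = [np.linalg.lstsq(X1, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
    return np.column_stack(cols)

# Hypothetical simulated data: 3 predictors, 2 responses with correlated errors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
noise = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=200)
true_B = np.array([[1.0, 0.5], [0.0, -1.0], [2.0, 0.3]])
Y = X @ true_B + noise

B_joint = fit_multivariate_ols(X, Y)
B_separate = fit_separate_ols(X, Y)

# The joint fit is numerically the same as fitting each response on its own,
# even though the two error terms are strongly correlated.
coefficients_match = np.allclose(B_joint, B_separate)
print(coefficients_match)
```

Because the two fits coincide, any gain from the relationship between responses has to come from a different estimator than plain least squares.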

Breiman and Friedman found a way to take advantage of relationships among the response variables to increase the accuracy of the predictions for each of the predicted variables with an algorithm they called Curds and Whey. It uses other statistical …


Linear Regression Of The Poisson Mean, Duane Steven Brown, May 1982

The purpose of this thesis was to compare two estimation procedures, the method of least squares and the method of maximum likelihood, on sample data obtained from a Poisson distribution. Point estimates of the slope and intercept of the regression line and point estimates of the mean squared error for both the slope and intercept were obtained. It is shown that least squares, the preferred method due to its simplicity, does yield results as good as maximum likelihood.

Also, confidence intervals were computed by Monte Carlo techniques and then were tested for accuracy. For the method of least squares, confidence …


Multicollinearity And The Estimation Of Regression Coefficients, John Charles Teed, May 1978

The precision of the estimates of the regression coefficients in a regression analysis is affected by multicollinearity. The effect of certain factors on multicollinearity and the estimates was studied. The response variables were the standard error of the regression coefficients and a standardized statistic that measures the deviation of the regression coefficient from the population parameter.

The estimates are not influenced by any one factor in particular, but rather some combination of factors. The larger the sample size, the better the precision of the estimates no matter how "bad" the other factors may be.
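The interplay between predictor correlation and sample size can be sketched with a small simulation. This is an illustrative assumption, not the design used in the thesis: the two-predictor model, the correlation values, the sample sizes, and the helper names are all hypothetical. It computes the usual least-squares standard errors and compares them across settings.

```python
import numpy as np

def coefficient_standard_errors(X, y):
    """Classical OLS standard errors: sqrt(diag(sigma^2 * (X'X)^-1)), intercept first."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    dof = X1.shape[0] - X1.shape[1]          # residual degrees of freedom
    sigma2 = resid @ resid / dof             # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(X1.T @ X1)  # covariance of the coefficient estimates
    return np.sqrt(np.diag(cov))

def simulate(n, rho, seed=0):
    """Draw two predictors with correlation rho and return the coefficient standard errors."""
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
    y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)
    return coefficient_standard_errors(X, y)

se_uncorrelated = simulate(n=200, rho=0.0)
se_collinear = simulate(n=200, rho=0.95)
se_collinear_large_n = simulate(n=2000, rho=0.95)

# Index 1 is the first slope. Strong correlation inflates its standard error,
# and a larger sample shrinks it again despite the same correlation.
print(se_uncorrelated[1], se_collinear[1], se_collinear_large_n[1])
```

In this sketch the standard error grows when the predictors are highly correlated and falls back as the sample size increases, in line with the abstract's remark about sample size.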

The standard error of the regression …