All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Multivariate Analysis

Full-Text Articles in Other Statistics and Probability

Tutorial For Using The Center For High Performance Computing At The University Of Utah And An Example Using Random Forest, Stephen Barton, Dec 2016

Random forests are memory-intensive machine learning algorithms, and most computers would fail to build models from datasets with millions of observations. Using the Center for High Performance Computing (CHPC) at the University of Utah and an airline on-time arrival dataset with 7 million observations from the U.S. Department of Transportation's Bureau of Transportation Statistics, we built 316 models by adjusting the depth of the trees and the randomness of each forest, and compared the accuracy of each and the time each took. Using this dataset, we discovered that substantial restrictions to the size of trees, observations allowed for each tree, and variables …
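
The comparison described in the abstract, many models built under different settings with accuracy and time recorded for each, can be illustrated with a small sweep harness. This is a minimal sketch and not the report's code: the parameter names (`max_depth`, `sample_fraction`, `max_features`), the grid values, and the `toy_train_and_score` stand-in are all hypothetical, and the stand-in returns a made-up score rather than training a real forest.

```python
import itertools
import time
from typing import Callable, Dict, List, Sequence


def sweep(train_and_score: Callable[..., float],
          grid: Dict[str, Sequence]) -> List[dict]:
    """Call train_and_score once per grid combination, timing each call."""
    names = list(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        start = time.perf_counter()
        accuracy = train_and_score(**params)
        elapsed = time.perf_counter() - start
        results.append({**params, "accuracy": accuracy, "seconds": elapsed})
    return results


def toy_train_and_score(max_depth: int, sample_fraction: float,
                        max_features: int) -> float:
    # Hypothetical placeholder: returns an arbitrary score, NOT a real result.
    # A real version would fit a random forest with these settings and
    # return its accuracy on held-out data.
    return min(1.0, 0.5 + 0.02 * max_depth
               + 0.1 * sample_fraction + 0.01 * max_features)


# Hypothetical grid; the settings used in the report are not given here.
GRID = {
    "max_depth": [2, 4, 8],            # limit on tree size
    "sample_fraction": [0.1, 0.5],     # share of observations per tree
    "max_features": [2, 4],            # variables considered per split
}

results = sweep(toy_train_and_score, GRID)
best = max(results, key=lambda row: row["accuracy"])
print(len(results), "models evaluated; best settings:", best)
```

Replacing `toy_train_and_score` with a function that fits an actual random forest would leave the harness unchanged, and each row of `results` would then hold the accuracy and wall-clock time for one model.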