Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistical Methodology

Utah State University

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

2018

Full-Text Articles in Physical Sciences and Mathematics

A Comparison of R, SAS, and Python Implementations of Random Forests, Breckell Soifua, Aug 2018

The Random Forest method is a useful machine learning tool developed by Leo Breiman. Many implementations exist across different programming languages, the most popular of which are in R, SAS, and Python. In this paper, we conduct a comprehensive comparison of these implementations with regard to accuracy, variable importance measurements, and timing. The comparison was done on a variety of real and simulated data sets with different classification difficulty levels, numbers of predictors, and sample sizes. It shows unexpectedly different results among the three implementations.
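
As a rough illustration of the three quantities the abstract compares (accuracy, variable importance, and timing), the sketch below measures each one for a single forest on simulated data. It is not taken from the paper: the abstract does not name the libraries it tested, so the use of scikit-learn here is an assumption, and the function name `evaluate_forest`, the data settings, and the use of impurity-based importance are all hypothetical choices made for illustration.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def evaluate_forest(n_samples=500, n_features=10, class_sep=1.0, seed=0):
    """Fit one forest on simulated data; return accuracy, importances, fit time.

    Hypothetical helper for illustration only, not the paper's procedure.
    """
    # Simulated data: class_sep controls how hard the classes are to separate,
    # n_features and n_samples vary the number of predictors and sample size.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=3,
        n_redundant=0,
        class_sep=class_sep,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    forest = RandomForestClassifier(n_estimators=100, random_state=seed)

    # Time only the model fit, not data generation or scoring.
    start = time.perf_counter()
    forest.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    return {
        "accuracy": forest.score(X_test, y_test),       # held-out accuracy
        "importances": forest.feature_importances_,     # impurity-based importance
        "fit_seconds": fit_seconds,
    }


result = evaluate_forest()
```

Calling `evaluate_forest` over a grid of `class_sep`, `n_features`, and `n_samples` values would give one row per setting; a comparison across languages would need the same data and comparable tuning parameters in each implementation.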