Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Applied Statistics

Williams Honors College, Honors Research Projects

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Ensemble Classification: An Analysis Of The Random Forest Model, Jarod Korn Jan 2024

Williams Honors College, Honors Research Projects

The random forest model, proposed by Dr. Leo Breiman in 2001, is an ensemble machine learning method for classification and regression. In this paper, we analyze the random forest model with a focus on how the model works, how it is applied in software, and how it performs on a set of data. To fully understand the model, we will introduce the concept of decision trees, give a summary of the CART model, explain in detail how the random forest model operates, discuss how the model is implemented in software, demonstrate the model by …


Decision Trees: Predicting Future Losses For Insurance Data, Amanda Lahrmann Jan 2018

Williams Honors College, Honors Research Projects

Big data is a term that has come into the spotlight for companies in recent years. Data analysis and business intelligence have become prominent sectors of companies and agencies. But what is big data? How has it impacted large companies and agencies? Why must it be embraced?

The best way to approach a big data set is to establish a question to answer. For this data set, the question that must be answered is “What variables cause a loss to occur?” To answer this question, we must first understand what is meant by a “loss”, and take a look …


Statistics-Bierce Library Study, Tyler J. Hushour Jan 2017

Williams Honors College, Honors Research Projects

This is a report on two surveys that I created and administered to students and faculty at Bierce Library who came to the Circulation Desk or the Tech Desk, along with other findings from periodically looking around the library to see where students like to study or hang out. There was a written survey given at the Circulation Desk, and a different survey given at the Tech Check-Out Desk. The project is for Melanie Smith-Farrell, the head of Access Services, and is based on a similar study Ian McCullough did in the science library. While this is …


Diversification And Market Neutral Portfolios In S&P500, Alan S. Agnew Jan 2016

Williams Honors College, Honors Research Projects

Our goal is to investigate strategies for dealing with the risks associated with holding assets in the stock market. We first address the risk of holding a specific stock through diversification. Later, we attempt to address market risk, which is the risk of the entire market going up and down. The data used in this project are the daily adjusted closing prices of stocks listed in the S&P500 index from January 3rd, 2000 to December 31st, 2015, and the data are processed using the statistical software R.

Sections 2 through 4 of this …


Black Cloud Randomization Test, Nicholas S. Vanni Jan 2016

Williams Honors College, Honors Research Projects

The Black Cloud Randomization Test looks at a nontraditional question and attempts to answer it using unique statistics. The purpose of this paper is to apply what has been learned throughout the years to a final project. Data for this project follow an emergency room’s on-call schedule, as well as the number of traumas that came in during each day shift. The project builds on what has already been learned and helps to open a different way of working with statistics. The project was coded in the R software. With different restrictions, there are …