Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Feature Investigation For Stock Returns Prediction Using XGBoost And Deep Learning Sentiment Classification, Seungho (Samuel) Lee Jan 2021

CMC Senior Theses

This paper attempts to quantify the predictive power of social media sentiment and financial data in stock prediction by utilizing a comprehensive set of stock-related fundamental and technical variables and social media sentiments. To conduct sentiment analysis, this study employs a pretrained finBERT model that provides three sentiment classifications and their respective softmax scores. The significance of these variables is then evaluated with XGBoost regression and the Shapley Additive exPlanations (SHAP) framework. By investigating feature importance, this study finds that statistical properties of sentiment variables provide stronger predictive power than a weighted sentiment score and that it is possible to quantify …
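
As a rough illustration of the two kinds of sentiment features contrasted in this abstract, the sketch below aggregates per-post softmax triples into a single weighted score and into a few summary statistics. The abstract does not define either feature set, so the weighting (positive minus negative probability), the chosen statistics, the function names, and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical softmax outputs for four posts about one stock on one day,
# ordered as (positive, negative, neutral). These numbers are made up.
posts = [
    (0.70, 0.10, 0.20),
    (0.20, 0.60, 0.20),
    (0.10, 0.10, 0.80),
    (0.50, 0.30, 0.20),
]


def weighted_score(post):
    """Assumed weighting: positive probability minus negative probability."""
    pos, neg, _neutral = post
    return pos - neg


def daily_features(posts):
    """Collapse per-post scores into one weighted mean plus summary statistics."""
    scores = [weighted_score(p) for p in posts]
    return {
        "weighted_mean": mean(scores),   # single weighted sentiment score
        "score_std": pstdev(scores),     # dispersion of sentiment across posts
        "score_min": min(scores),
        "score_max": max(scores),
        # share of posts whose most probable class is "positive"
        "share_positive": sum(p[0] == max(p) for p in posts) / len(posts),
    }


features = daily_features(posts)
```

Each key in `features` could then serve as one column in a regression design matrix, which is where a feature-importance method would compare them.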


An Evaluation Of Knot Placement Strategies For Spline Regression, William Klein Jan 2021

CMC Senior Theses

Regression splines have an established value for producing a quality fit with relatively low-degree polynomials. This paper explores the implications of adopting new methods for knot selection in tandem with established methodology from the current literature. Structural features of generated datasets, as well as residuals collected from sequential iterative models, are used to augment the equidistant knot selection process. From analyzing a simulated dataset and an application to the Racial Animus dataset, I find that a B-spline basis paired with equally spaced knots remains the best choice when data are evenly distributed, even when structural features of a dataset are known …
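
For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of a B-spline basis built on equally spaced knots, using the standard Cox-de Boor recursion. It is not the thesis's code; the function names, the cubic degree, and the choice of three interior knots on [0, 1] are assumptions made for illustration.

```python
def equidistant_knots(lo, hi, n_interior, degree=3):
    """Clamped knot vector with n_interior equally spaced interior knots."""
    step = (hi - lo) / (n_interior + 1)
    interior = [lo + step * (i + 1) for i in range(n_interior)]
    # Repeat each boundary knot degree + 1 times so the basis spans [lo, hi).
    return [lo] * (degree + 1) + interior + [hi] * (degree + 1)


def bspline_basis(i, degree, knots, x):
    """Value of the i-th B-spline basis function at x (Cox-de Boor recursion)."""
    if degree == 0:
        # Half-open intervals: x equal to the right boundary evaluates to 0.
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    left_width = knots[i + degree] - knots[i]
    if left_width > 0:
        left = (x - knots[i]) / left_width * bspline_basis(i, degree - 1, knots, x)
    right_width = knots[i + degree + 1] - knots[i + 1]
    if right_width > 0:
        right = ((knots[i + degree + 1] - x) / right_width
                 * bspline_basis(i + 1, degree - 1, knots, x))
    return left + right


def basis_row(x, knots, degree=3):
    """One row of the spline design matrix: every basis function evaluated at x."""
    n_basis = len(knots) - degree - 1
    return [bspline_basis(i, degree, knots, x) for i in range(n_basis)]


knots = equidistant_knots(0.0, 1.0, n_interior=3)
row = basis_row(0.4, knots)
```

Stacking `basis_row(x, knots)` for every observation gives a design matrix that can be passed to ordinary least squares. An alternative knot placement strategy would change only the `interior` list.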


Using Twitter API To Solve The GOAT Debate: Michael Jordan Vs. LeBron James, Jordan Trey Leonard Jan 2021

CMC Senior Theses

Using the Twitter API, I gather tweets and perform sentiment analysis on them to solve the GOAT debate among professional athletes, with the primary focus on comparing Michael Jordan and LeBron James. Athletes from the National Football League (NFL), the National Basketball Association (NBA), Major League Baseball (MLB), and National Collegiate Athletic Association (NCAA) Division I Men's and Women's Basketball were selected to compare how sentiment polarity varies across sports. Sentiment polarity is measured by labeling text as "positive", "neutral", or "negative", which allows us to determine which athlete or sport is most favored among the Twitter community when it comes …
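
The labeling step described in this abstract can be sketched as follows, assuming each tweet has already been assigned a numeric polarity score in [-1, 1]. The abstract does not say how scores are produced or where the cutoffs lie, so the 0.05 threshold, the function names, and the placeholder athletes and scores are all hypothetical.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def label_polarity(score, threshold=0.05):
    """Map a polarity score to a label; the threshold is an assumed cutoff."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def label_shares(scores):
    """Fraction of tweets that fall under each polarity label."""
    counts = Counter(label_polarity(s) for s in scores)
    return {label: counts[label] / len(scores) for label in LABELS}


# Made-up polarity scores for two placeholder athletes.
scores_by_athlete = {
    "Athlete A": [0.6, 0.0, -0.3, 0.4],
    "Athlete B": [0.1, -0.5, 0.0, -0.2],
}

shares = {name: label_shares(s) for name, s in scores_by_athlete.items()}
```

Comparing the "positive" share across athletes or sports is one simple way to read such labels; other summaries are equally possible.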


Information Prioritization: A Comparison Between Utility Maximizers And Probability Matchers, Yusuf Ismaeel Jan 2021

CMC Senior Theses

This thesis examines the differences between probability matchers and utility maximizers in their preferences for information sources in a lab environment. We consider the best source of information to be the most connected one. We estimated several linear probability model regressions along with logit regressions. We also attempted to correct for potential misclassification of the cognitive strategy by using instrumental variables. The results show that utility maximizers will almost always choose the most informed node. Probability matchers, on the other hand, do not exhibit such behavior, as the probability matching strategy …