Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Towards Algorithmic Justice: Human Centered Approaches To Artificial Intelligence Design To Support Fairness And Mitigate Bias In The Financial Services Sector, Jihyun Kim Jan 2024

CMC Senior Theses

Artificial Intelligence (AI) has positively transformed the financial services sector but has also introduced AI biases against protected groups, amplifying existing prejudices against marginalized communities. The financial decisions made by biased algorithms could have life-changing ramifications in applications such as lending and credit scoring. Human Centered AI (HCAI) is an emerging concept in which AI systems seek to augment, not replace, human abilities while preserving human control to ensure transparency, equity, and privacy. The evolving field of HCAI shares common ground with, and can be enhanced by, Human Centered Design principles in that they both put humans, the user, at …


Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu Jan 2023

CMC Senior Theses

This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, the results of this paper find that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …


Utilizing Machine Learning In Healthcare In An Ethical Fashion, Nishka Ayyar Jan 2023

CMC Senior Theses

This thesis paper explores the ethical considerations surrounding the use of machine learning (ML) solutions in healthcare. The background section discusses the basics of machine learning techniques and algorithms, and the increasing interest in their utilization in the healthcare sector. The paper then reviews and critically analyzes four studies that highlight concerns related to using ML in healthcare, including issues of bias, privacy, accountability, and transparency. Based on the analysis of these studies, the paper presents several recommendations for addressing these concerns. The paper concludes with a discussion on the potential benefits of using machine learning technology in healthcare. Ultimately, …


Predicting Outcomes Of El Clásico Using Random Forests And Extreme Gradient Boosting, Emanuel Jarquin Jan 2022

CMC Senior Theses

In the modern era, sports betting is becoming increasingly popular. This is especially true in the realm of soccer (or ‘football’ as it is known outside the United States). As a result, the concept of attempting to predict the outcomes of soccer matches using machine learning has garnered much attention in recent years. In this thesis, I utilize well-known machine learning techniques to predict the outcomes of El Clásico matchups and compare the predictive performance of these techniques. The predictive methods employed for this thesis are random forests using the party package in R and extreme gradient boosting using the …


Feature Investigation For Stock Returns Prediction Using Xgboost And Deep Learning Sentiment Classification, Seungho (Samuel) Lee Jan 2021

CMC Senior Theses

This paper attempts to quantify the predictive power of social media sentiment and financial data in stock prediction by utilizing a comprehensive set of stock-related fundamental and technical variables and social media sentiments. To conduct sentiment analysis, this study employs a pretrained finBERT model that provides three different sentiment classifications and their respective softmax scores. The significance of these variables is then evaluated with XGBoost regression and the Shapley Additive exPlanations (SHAP) framework. Through investigating feature importance, this study finds that statistical properties of sentiment variables provide stronger predictive power than a weighted sentiment score and that it is possible to quantify …


Measuring Machine Learning Model Uncertainty With Applications To Aerial Segmentation, Kevin James Cotton Jan 2021

CGU Theses & Dissertations

Machine learning model performance on both validation data and new data can be better measured and understood by leveraging uncertainty metrics at the time of prediction. These metrics can improve the model training process by indicating which training data need to be corrected and what part of the domain needs further annotation. The methods described have yet to reach mainstream adoption but show great potential. Here, we survey the field of uncertainty metrics and provide a robust framework for their application to aerial segmentation. Uncertainty is divided into two types: aleatoric and epistemic. Aleatoric uncertainty arises from variations in training …


How Machine Learning And Probability Concepts Can Improve Nba Player Evaluation, Harrison Miller Jan 2020

CMC Senior Theses

In this paper I break down a scholarly article, written by Sameer K. Deshpande and Shane T. Jensen, that proposed a new method to evaluate NBA players. The NBA, which stands for the National Basketball Association, is the highest-level professional basketball league in America. The authors proposed building a model that estimates how NBA players impact their team's chances of winning a game, using machine learning and probability concepts. I preface that by diving into these concepts and their mathematical backgrounds. These concepts include building a linear model using the ordinary least squares method, the bias …


Randomized Algorithms For Preconditioner Selection With Applications To Kernel Regression, Conner Dipaolo Jan 2019

HMC Senior Theses

The task of choosing a preconditioner M to use when solving a linear system Ax=b with iterative methods is often tedious and most methods remain ad-hoc. This thesis presents a randomized algorithm to make this chore less painful through use of randomized algorithms for estimating traces. In particular, we show that the preconditioner stability $\| I - M^{-1}A \|_F$, known to forecast preconditioner quality, can be computed in the time it takes to run a constant number of iterations of conjugate gradients through use of sketching methods. This is in spite of folklore which …
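
The stability quantity above can be estimated without ever forming the matrix product. The following is a minimal sketch, assuming a Hutchinson-style estimator with Rademacher probe vectors; this is a standard randomized trace estimator and not necessarily the algorithm developed in the thesis. It relies on the identity that the squared Frobenius norm of B equals tr(BᵀB), which is the expected value of ‖Bz‖² for random sign vectors z. The names `estimate_stability` and `apply_M_inv` are hypothetical, introduced only for illustration.

```python
import numpy as np

def estimate_stability(A, apply_M_inv, num_probes=200, seed=0):
    """Estimate ||I - M^{-1} A||_F using only matrix-vector products.

    A            : (n, n) array, the system matrix.
    apply_M_inv  : hypothetical callback returning M^{-1} v for a vector v.
    num_probes   : number of random sign vectors to average over.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: entries are +1 or -1 with equal probability.
        z = rng.choice([-1.0, 1.0], size=n)
        # r = (I - M^{-1} A) z, computed without forming M^{-1} A.
        r = z - apply_M_inv(A @ z)
        # E[||r||^2] equals the squared Frobenius norm.
        total += r @ r
    return np.sqrt(total / num_probes)
```

For a Jacobi (diagonal) preconditioner, `apply_M_inv` could be as simple as `lambda v: v / np.diag(A)`; each probe costs one product with A and one preconditioner application, so the total cost is comparable to a small, fixed number of iterations of an iterative solver.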


A Tacticians Guide To Conflict, Vol. 1: Advancing Explanations & Predictions Of Intrastate Conflict, Khaled Eid Jan 2019

CGU Theses & Dissertations

Intrastate conflict is an ever-evolving problem: causes, explanations, and predictions are increasingly murky as traditional methods of analysis focus on structural issues as precursors of conflict. Oftentimes these theories do not consider the underlying meso and micro dynamics that can provide vital insights into the phenomena. Tactical decision-makers are left using models that rely on highly aggregated, country-level data to create proper courses of action (COAs) to address or predict conflict. The shortcoming is that conflicts morph quite rapidly and structural variables can struggle to capture such dynamic changes. To address this, some tacticians are using big data …


Iterative Matrix Factorization Method For Social Media Data Location Prediction, Natchanon Suaysom Jan 2018

HMC Senior Theses

The locations from which users posted their tweets, as collected by social media companies, vary in accuracy, and some are missing. We want to use the tweets with the highest accuracy to help fill in the data for tweets with incomplete information. To test our algorithm, we used sets of social media data from a city and separated them into training sets, where we know all the information, and testing sets, where we intentionally pretend not to know the location. One prediction method that was used in (Dukler, Han and Wang, 2016) requires appending one-hot …


Daily Traffic Flow Pattern Recognition By Spectral Clustering, Matthew Aven Jan 2017

CMC Senior Theses

This paper explores the potential applications of existing spectral clustering algorithms to real-life problems through experiments on existing road traffic data. The analysis begins with an overview of previous unsupervised machine learning techniques and constructs an effective spectral clustering algorithm that demonstrates the analytical power of the method. The paper focuses on the spectral embedding method’s ability to project non-linearly separable, high-dimensional data into a more manageable space that allows for accurate clustering. The key step in this method involves solving a normalized eigenvector problem in order to construct an optimal representation of the original data.

While this …
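
The normalized eigenvector step described in this abstract can be illustrated with a small two-cluster example. This is a minimal sketch, assuming a Gaussian affinity with bandwidth `sigma` and a two-way split on the sign of the second eigenvector of the symmetric normalized Laplacian; it is not the algorithm constructed in the thesis, and the function name `spectral_bipartition` is hypothetical.

```python
import numpy as np

def spectral_bipartition(X, sigma=1.0):
    """Split the rows of X into two clusters with a spectral embedding.

    X     : (n, d) array of data points.
    sigma : assumed bandwidth of the Gaussian affinity.
    Returns an integer array of 0/1 cluster labels.
    """
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian (RBF) affinity matrix W.
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    # Degrees and the symmetric normalized Laplacian
    # L_sym = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(L_sym)
    # Rescale by D^{-1/2} to obtain the one-dimensional embedding.
    embedding = d_inv_sqrt * eigvecs[:, 1]
    # Threshold the embedding at zero to get two clusters.
    return (embedding > 0).astype(int)
```

For more than two clusters, a common extension is to keep several of the leading eigenvectors and run k-means on the embedded points; the sign threshold above is the simplest special case.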


Triple Non-Negative Matrix Factorization Technique For Sentiment Analysis And Topic Modeling, Alexander A. Waggoner Jan 2017

CMC Senior Theses

Topic modeling refers to the process of algorithmically sorting documents into categories based on some common relationship between the documents. This common relationship between the documents is considered the “topic” of the documents. Sentiment analysis refers to the process of algorithmically sorting a document into a positive or negative category depending on whether the document expresses a positive or negative opinion on its respective topic. In this paper, I consider the open problem of document classification into a topic category, as well as a sentiment category. This has a direct application to the retail industry where companies may want to scour …


Geographic Relevance For Travel Search: The 2014-2015 Harvey Mudd College Clinic Project For Expedia, Inc., Hannah Long Jan 2015

Scripps Senior Theses

The purpose of this Clinic project is to help Expedia, Inc. expand the search capabilities it offers to its users. In particular, the goal is to help the company respond to unconstrained search queries by generating a method to associate hotels and regions around the world with the higher-level attributes that describe them, such as “family-friendly” or “culturally-rich.” Our team utilized machine-learning algorithms to extract metadata from textual data about hotels and cities. We focused on two machine-learning models: decision trees and Latent Dirichlet Allocation (LDA). The first appeared to be a promising approach, but would require more resources …