Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 17 of 17

Full-Text Articles in Physical Sciences and Mathematics

Towards Algorithmic Justice: Human Centered Approaches To Artificial Intelligence Design To Support Fairness And Mitigate Bias In The Financial Services Sector, Jihyun Kim Jan 2024

CMC Senior Theses

Artificial Intelligence (AI) has positively transformed the financial services sector but has also introduced AI biases against protected groups, amplifying existing prejudices against marginalized communities. The financial decisions made by biased algorithms could cause life-changing ramifications in applications such as lending and credit scoring. Human Centered AI (HCAI) is an emerging concept where AI systems seek to augment, not replace, human abilities while preserving human control to ensure transparency, equity and privacy. The evolving field of HCAI shares a common ground with and can be enhanced by the Human Centered Design principles in that they both put humans, the user, at …


Utilizing Machine Learning In Healthcare In An Ethical Fashion, Nishka Ayyar Jan 2023

CMC Senior Theses

This thesis paper explores the ethical considerations surrounding the use of machine learning (ML) solutions in healthcare. The background section discusses the basics of machine learning techniques and algorithms, and the increasing interest in their utilization in the healthcare sector. The paper then reviews and critically analyzes four studies that highlight concerns related to using ML in healthcare, including issues of bias, privacy, accountability, and transparency. Based on the analysis of these studies, the paper presents several recommendations for addressing these concerns. The paper concludes with a discussion on the potential benefits of using machine learning technology in healthcare. Ultimately, …


Advanced Full-Text Search Based On Synonyms In Postgres, Joey Bodoia Jan 2022

CMC Senior Theses

This paper discusses the advanced full-text search queries based on synonyms that are supported in Chajda, a Postgres extension and corresponding Python library for highly multilingual full-text search in Postgres. The discussion includes the motivations for using advanced queries based on synonyms, examples of how to use these queries in Chajda, their current limitations, and performance testing.


Dynamic Nonlinear Gaussian Model For Inferring A Graph Structure On Time Series, Abhinuv Uppal Jan 2022

CMC Senior Theses

In many applications of graph analytics, the optimal graph construction is not always straightforward. I propose a novel algorithm to dynamically infer a graph structure on multiple time series by first imposing a state evolution equation on the graph and deriving the necessary equations to convert it into a maximum likelihood optimization problem. The state evolution equation guarantees that edge weights contain predictive power by construction. After running experiments on simulated data, it appears the required optimization is likely non-convex and does not generally produce results significantly better than randomly tweaking parameters, so it is not feasible to use in …


Studying Geometric Optical Illusions Through The Lens Of A Convolutional Neural Network, Nick Laberge Jan 2019

CMC Senior Theses

Geometrical optical illusions such as the Müller-Lyer illusion and the Ponzo illusion have been widely researched over the past 100+ years, yet researchers have not reached a consensus on why human perception is deceived by these illusions or which illusions are the results of the same effects. In this paper, I study these illusions through the lens of a convolutional neural network. First, I successfully train the network to correctly classify how a human would perceive a particular class of illusion (such as the Müller-Lyer illusion), then I test the network’s ability to generalize to illusions that it …


@Yourlocation: A Spatial Analysis Of Geotagged Tweets In The US, Ocean Mckinney Jan 2019

CMC Senior Theses

This project examines the spatial network properties observable from geo-located tweet data. Conventional explorations examine a variety of network attributes, but few employ spatial edge correlations in their analysis. Recent studies have demonstrated the improvements that these correlations contribute to drawing conclusions about network structure. This thesis expands upon social network research utilizing spatial edge correlations and presents processing and formatting techniques for JSON (JavaScript Object Notation) data.


Snap Scholar: The User Experience Of Engaging With Academic Research Through A Tappable Stories Medium, Ieva Burk Jan 2019

CMC Senior Theses

Despite the shift toward learning and consuming information on mobile devices, most academic research is still presented only in long-form text. The Stanford Scholar Initiative has explored the creation and consumption of academic research content through video. However, there has been another popular shift in how social media platforms and media outlets present information in the past few years. Snapchat and Instagram have introduced the concept of tappable “Stories” that have gained popularity in the realm of content consumption.

To accelerate the growth of the creation of these research talks, I propose an alternative to video: …


An Introduction To The Theory And Applications Of Bayesian Networks, Anant Jaitha Jan 2017

CMC Senior Theses

Bayesian networks are a means to study data. A Bayesian network gives structure to data by creating a graphical system to model the data. It then develops probability distributions over these variables. It explores variables in the problem space and examines the probability distributions related to those variables. It conducts statistical inference over those probability distributions to draw meaning from them. Bayesian networks are a good means of exploring a large set of data efficiently to make inferences. A number of real-world applications already exist and are being actively researched. This paper discusses the theory and applications of …


A New Frontier: But For Whom? An Analysis Of The Micro-Computer And Women’s Declining Participation In Computer Science, Eliana Keinan Jan 2017

CMC Senior Theses

Though women’s participation in science, technology, engineering, and mathematics (STEM) fields has greatly increased over the past 60 years, women’s participation in computer science peaked in the 1980s. The paper searches for key motivators for women entering computer science at the peak in order to isolate factors for the subsequent steep decline. A major finding of the paper is that having a computer at home is (weakly) statistically significant as a determinant for female students choosing to pursue computer science. This relationship is insignificant for students in other STEM and non-STEM fields. A final section of the paper examines employment …


Triple Non-Negative Matrix Factorization Technique For Sentiment Analysis And Topic Modeling, Alexander A. Waggoner Jan 2017

CMC Senior Theses

Topic modeling refers to the process of algorithmically sorting documents into categories based on some common relationship between the documents. This common relationship between the documents is considered the “topic” of the documents. Sentiment analysis refers to the process of algorithmically sorting a document into a positive or negative category depending on whether the document expresses a positive or negative opinion on its respective topic. In this paper, I consider the open problem of document classification into a topic category, as well as a sentiment category. This has a direct application to the retail industry where companies may want to scour …


Using A Data Warehouse As Part Of A General Business Process Data Analysis System, Amit Maor Jan 2016

CMC Senior Theses

Data analytics queries often involve aggregating over massive amounts of data, in order to detect trends in the data, make predictions about future data, and make business decisions as a result. As such, it is important that a database management system (DBMS) handling data analytics queries perform well when those queries involve massive amounts of data. A data warehouse is a DBMS which is designed specifically to handle data analytics queries.

This thesis describes the data warehouse Amazon Redshift, and how it was used to design a data analysis system for Laserfiche. Laserfiche is a software company that provides each …


The Future Of iOS Development: Evaluating The Swift Programming Language, Garrett Wells Jan 2015

CMC Senior Theses

Swift is a new programming language developed by Apple for creating iOS and Mac OS X applications. Intended to eventually replace Objective-C as Apple’s language of choice, Swift needs to convince developers to switch over to the new language. Apple has promised that Swift will be faster than Objective-C, as well as offer more modern language features, be very safe, and be easy to learn and use. In this thesis I test these claims by creating an iOS application entirely in Swift as well as benchmarking two different algorithms. I find that while Swift is faster than Objective-C, it does …


Exploring Algorithmic Musical Key Recognition, Nathan J. Levine Jan 2015

CMC Senior Theses

The following thesis outlines the goal and process of algorithmic musical key detection as well as the underlying music theory. This includes a discussion of signal-processing techniques intended to most accurately detect musical pitch, as well as a detailed description of the Krumhansl-Schmuckler (KS) key-finding algorithm. It also describes the Java-based implementation and testing process of a musical key-finding program based on the KS algorithm. This thesis provides an analysis of the results and a comparison with the original algorithm, ending with a discussion of the recommended direction of further development.
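
For readers unfamiliar with the KS algorithm named in this abstract, the core idea is to correlate a piece's pitch-class duration histogram against 24 rotated key profiles (12 major, 12 minor) and pick the best match. The following is a minimal Python sketch of that idea, not the thesis's Java implementation. The function names `pearson` and `estimate_key` are hypothetical, the profile values are the commonly cited Krumhansl-Kessler ratings rather than anything taken from the thesis, and the signal-processing step that would produce the histogram from audio is assumed to have happened already.

```python
from math import sqrt

# Commonly cited Krumhansl-Kessler key profiles, indexed by scale degree
# in semitones above the tonic (assumption: not taken from the thesis).
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a flat histogram carries no key information
    return cov / (dev_x * dev_y)


def estimate_key(durations):
    """Return (tonic_name, mode) for a 12-element pitch-class histogram.

    durations[pc] is the total duration of pitch class pc (0 = C, 1 = C#, ...).
    """
    best = None
    for mode, base in (("major", MAJOR), ("minor", MINOR)):
        for tonic in range(12):
            # Rotate the base profile so that index `tonic` holds the tonic weight.
            profile = [base[(pc - tonic) % 12] for pc in range(12)]
            r = pearson(durations, profile)
            if best is None or r > best[0]:
                best = (r, NAMES[tonic], mode)
    return best[1], best[2]
```

A histogram would typically be built by summing note durations per pitch class; passing it to `estimate_key` yields the tonic and mode whose profile correlates most strongly with it.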


A Look Into The Industry Of Video Games Past, Present, And Yet To Come, Chad Hadzinsky Jan 2014

CMC Senior Theses

Since its inception, the video game industry has been both a new medium for art and innovation and a major driving force in the advancement of many technologies. The often-overlooked video game industry has turned from a hobby to a multi-billion dollar industry in its short, forty-year life. People of all ages and genders across the world are playing video games at a higher clip than ever before. With so many new gamers and emerging technologies, it is an exciting time for the industry. The landscape is constantly changing and successful business models of the past …


Scalable Collaborative Filtering Recommendation Algorithms On Apache Spark, Walker Evan Casey Jan 2014

CMC Senior Theses

Collaborative filtering based recommender systems use information about a user's preferences to make personalized predictions about content, such as topics, people, or products, that they might find relevant. As the volume of accessible information and active users on the Internet continues to grow, it becomes increasingly difficult to compute recommendations quickly and accurately over a large dataset. In this study, we will introduce an algorithmic framework built on top of Apache Spark for parallel computation of the neighborhood-based collaborative filtering problem, which allows the algorithm to scale linearly with a growing number of users. We also investigate several different variants …


Colormoo: An Algorithmic Approach To Generating Color Palettes, Joshua Rael Jan 2014

CMC Senior Theses

Selecting one color can be done with relative ease, but this task becomes more difficult with each subsequent color. Colormoo is an online tool aimed at solving this problem. We implement three algorithms for generating color palettes based on a starting color. Data is collected for each palette that is generated. Our analysis reveals that two of the algorithms are preferred, but under different circumstances. Furthermore, we find that users prefer palettes containing colors that are compatible, but not too similar. With refined heuristics, we believe these techniques can be extended and applied beyond the field of graphic design alone.
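
The abstract does not say which three algorithms Colormoo implements, so the sketch below is only an illustration of the general idea of deriving a palette from a starting color. It assumes one simple technique, stepping the hue around the HSV color wheel while holding saturation and value fixed; the function name `hue_step_palette` and its parameters are hypothetical and are not drawn from the thesis.

```python
import colorsys


def hue_step_palette(base_rgb, count=5, step_degrees=30.0):
    """Build a palette by rotating the hue of a starting color.

    base_rgb is an (r, g, b) tuple with components in [0.0, 1.0].
    Returns `count` RGB tuples; the first is the starting color itself.
    This is an illustrative technique, not one of Colormoo's algorithms.
    """
    h, s, v = colorsys.rgb_to_hsv(*base_rgb)
    palette = []
    for i in range(count):
        # Hue is a fraction of a full turn, so wrap it back into [0, 1).
        hue = (h + i * step_degrees / 360.0) % 1.0
        palette.append(colorsys.hsv_to_rgb(hue, s, v))
    return palette
```

Small steps (for example 30 degrees) give neighboring hues, while larger steps (120 or 180 degrees) give strongly contrasting ones, which loosely relates to the abstract's finding that users prefer colors that are compatible but not too similar.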


A Machine Learning Approach To Diagnosis Of Parkinson’s Disease, Sumaiya F. Hashmi Jan 2013

CMC Senior Theses

I will investigate applications of machine learning algorithms to medical data, adaptations to differences in data collection, and the use of ensemble techniques.

Focusing on the binary classification problem of Parkinson’s Disease (PD) diagnosis, I will apply machine learning algorithms to a primary dataset consisting of voice recordings from healthy and PD subjects. Specifically, I will use Artificial Neural Networks, Support Vector Machines, and an Ensemble Learning algorithm to reproduce results from [MS12] and [GM09].

Next, I will adapt a secondary regression dataset of PD recordings and combine it with the primary binary classification dataset, testing various techniques to consolidate …