Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

The Document Similarity Network: A Novel Technique For Visualizing Relationships In Text Corpora, Dylan Baker Jan 2017

HMC Senior Theses

With the abundance of written information available online, it is useful to be able to automatically synthesize and extract meaningful information from text corpora. We present a unique method for visualizing relationships between documents in a text corpus. By using Latent Dirichlet Allocation to extract topics from the corpus, we create a graph whose nodes represent individual documents and whose edge weights indicate the distance between topic distributions in documents. These edge lengths are then scaled using multidimensional scaling techniques, such that more similar documents are clustered together. Applying this method to several datasets, we demonstrate that these graphs are …


Combinatorial Polynomial Hirsch Conjecture, Sam Miller Jan 2017

HMC Senior Theses

The Hirsch Conjecture states that for a d-dimensional polytope with n facets, the diameter of the graph of the polytope is at most n-d. This conjecture was disproven in 2010 by Francisco Santos Leal. However, a polynomial bound in n and d on the diameter of a polytope may still exist. Finding a polynomial bound would provide a worst-case scenario runtime for the Simplex Method of Linear Programming. However, working only with polytopes in higher dimensions can prove challenging, so other approaches are welcome. There are many equivalent formulations of the Hirsch Conjecture, one of which is the …


Evolving Art: Modifying Context Free Art With A Genetic Algorithm, Marina Kent Jan 2017

Scripps Senior Theses

Context Free Design Grammar (CFDG) is a programming language for defining recursive structures that can be used to create art. I use CFDG as a design space for genetic programming, experimenting with various options for crossover, mutation, and fitness. In this exploratory work, multiple generations are manually assessed to determine the usefulness of the mutation strategies and fitness functions. I find that simple value mutation and fitness that alters general program structure are not enough to produce an increase in interesting images in CFDG. I discuss these findings as well as future avenues of inquiry for genetic programming in artistic …


An Introduction To The Theory And Applications Of Bayesian Networks, Anant Jaitha Jan 2017

CMC Senior Theses

Bayesian networks are a means to study data. A Bayesian network gives structure to data by creating a graphical system to model the data. It explores variables in the problem space, develops probability distributions over those variables, and conducts statistical inference over the distributions to draw meaning from them. Bayesian networks are a good means to explore a large set of data efficiently to make inferences. A number of real-world applications already exist and are being actively researched. This paper discusses the theory and applications of …


A New Frontier: But For Whom? An Analysis Of The Micro-Computer And Women’s Declining Participation In Computer Science, Eliana Keinan Jan 2017

CMC Senior Theses

Though women’s participation in science, technology, engineering, and mathematics (STEM) fields has greatly increased over the past 60 years, women’s participation in computer science peaked in the 1980s. The paper searches for key motivators for women entering computer science at the peak in order to isolate factors for the subsequent steep decline. A major finding of the paper is that having a computer at home is (weakly) statistically significant as a determinant for female students choosing to pursue computer science. This relationship is insignificant for students in other STEM and non-STEM fields. A final section of the paper examines employment …


Triple Non-Negative Matrix Factorization Technique For Sentiment Analysis And Topic Modeling, Alexander A. Waggoner Jan 2017

CMC Senior Theses

Topic modeling refers to the process of algorithmically sorting documents into categories based on some common relationship between the documents. This common relationship between the documents is considered the “topic” of the documents. Sentiment analysis refers to the process of algorithmically sorting a document into a positive or negative category depending on whether this document expresses a positive or negative opinion on its respective topic. In this paper, I consider the open problem of document classification into a topic category, as well as a sentiment category. This has a direct application to the retail industry where companies may want to scour …