Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 8 of 8

Full-Text Articles in Physical Sciences and Mathematics

Machine Learning And Natural Language Processing For Crossword Puzzles, Finn Brennan Jan 2024

Senior Projects Spring 2024

Senior Project submitted to The Division of Science, Mathematics and Computing of Bard College.


Legislative Language For Success, Sanjana Gundala Jun 2022

Master's Theses

Legislative committee meetings are an integral part of the lawmaking process for local and state bills. The testimony presented during these meetings is a large factor in the outcome of the proposed bill. This research uses Natural Language Processing and Machine Learning techniques to analyze testimonies from California Legislative committee meetings from 2015-2016 in order to identify what aspects of a testimony make it successful. A testimony is considered successful if the alignment of the testimony matches the bill outcome (alignment is "For" and the bill passes or alignment is "Against" and the bill fails). The process of finding what …
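The success criterion stated in this abstract can be sketched as a small labeling function. This is only an illustration of the stated rule; the function name, argument names, and input normalization are assumptions and do not come from the thesis.

```python
def is_successful(alignment: str, bill_passed: bool) -> bool:
    """Return True when the testimony's alignment matches the bill outcome.

    Hypothetical helper: 'alignment' is expected to be "For" or "Against",
    and 'bill_passed' says whether the bill passed.
    """
    normalized = alignment.strip().lower()
    if normalized not in {"for", "against"}:
        raise ValueError(f"unknown alignment: {alignment!r}")
    # "For" + passed, or "Against" + failed, counts as successful.
    return (normalized == "for") == bill_passed
```

A label like this could serve as the target variable for a classifier over testimony features, although the abstract does not say which features or models were used.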


Towards A Computational Model Of Narrative On Social Media, Anne Bailey Jun 2022

Dartmouth College Undergraduate Theses

This thesis describes a variety of approaches to developing a computational model of narrative on social media. Our goal is to use such a narrative model to identify efforts to manipulate public opinion on social media platforms like Twitter. We present a model in which narratives in a collection of tweets are represented as a graph. Elements from each tweet that are relevant to potential narratives are made into nodes in the graph; for this thesis, we populate graph nodes with tweets' authors, hashtags, named entities (people, locations, organizations, etc.), and moral foundations (central moral values framing the discussion). Two …
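A minimal sketch of the graph representation described in this abstract follows. The node types (authors, hashtags, named entities, moral foundations) come from the abstract; everything else is an assumption made for illustration: the tweet dictionary layout, the pre-annotated `entities` and `moral_foundations` fields, and the choice to link elements that co-occur in the same tweet. The abstract is truncated before it describes how edges are formed.

```python
import re
from collections import defaultdict
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")

def tweet_nodes(tweet):
    """Collect (node_type, value) pairs for one tweet (hypothetical schema)."""
    nodes = {("author", tweet["author"])}
    nodes.update(("hashtag", h.lower()) for h in HASHTAG.findall(tweet["text"]))
    # Entities and moral foundations are assumed to be annotated upstream.
    nodes.update(("entity", e) for e in tweet.get("entities", []))
    nodes.update(("moral_foundation", m) for m in tweet.get("moral_foundations", []))
    return nodes

def build_graph(tweets):
    """Build an undirected weighted graph as nested dicts.

    Assumption: two nodes are linked when they appear in the same tweet,
    and the edge weight counts how many tweets they share.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for tweet in tweets:
        for a, b in combinations(sorted(tweet_nodes(tweet)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph
```

For example, two tweets by different authors that share a hashtag would both become neighbors of that hashtag node, which gives one simple way to see which accounts push the same elements.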


Cyberbullying Classification Based On Social Network Analysis, Anqi Wang May 2021

Master's Projects

With the popularity of social media platforms such as Facebook, Twitter, and Instagram, people widely share their opinions and comments over the Internet. Extensive use of social media has also caused a lot of problems. A representative problem is cyberbullying, which is a serious social problem, mostly among teenagers. Cyberbullying occurs when a social media user posts aggressive words or phrases to harass other users, which negatively affects their mental and social well-being. Additionally, it may ruin the reputation of that platform. We are considering the problem of detecting posts that are aggressive. Moreover, …


Analyses And Creation Of Author Stylized Text, Keith Carlson May 2021

Dartmouth College Ph.D Dissertations

Written text is one of the major ways that humans communicate their thoughts. A single thought can be expressed through many different combinations of words, and the writer must choose which they will use. We call the idea which is communicated the content of the message, and the particular words chosen to express the content, the style. The same content expressed in a different style may tell something useful about the author of the text (e.g., the author's identity), may be easier to understand for different audiences, or may evoke different emotions in the reader.

In this work we explore …


Identifying External Cross-References Using Natural Language Processing (NLP), Elham Rahmani Apr 2020

Electronic Thesis and Dissertation Repository

[Context and motivation] Software engineers build systems that need to be compliant with relevant regulations. These regulations are stated in authoritative documents from which regulatory requirements need to be elicited. Project contracts contain cross-references to these regulatory requirements in external documents. [Problem] Exploring and identifying the regulatory requirements in voluminous textual data is enormously time-consuming, and hence costly, and error-prone in sizable software projects. [Principal idea and novelty] We use Natural Language Processing (NLP), Pattern Recognition and Web Scraping techniques for automatically extracting external cross-references from contractual requirements and prepare a map for representing related external cross-references …


Indirect Relatedness, Evaluation, And Visualization For Literature Based Discovery, Sam Henry Jan 2019

Theses and Dissertations

The exponential growth of scientific literature is creating an increased need for systems to process and assimilate knowledge contained within text. Literature Based Discovery (LBD) is a well-established field that seeks to synthesize new knowledge from existing literature, but it has remained primarily in the theoretical realm rather than in real-world application. This lack of real-world adoption is due in part to the difficulty of LBD, but also to several solvable problems present in LBD today. Of these problems, the ones in most critical need of improvement are: (1) the over-generation of knowledge by LBD systems, (2) a …


CSC Senior Project: NLPStats, Michael Mease Mar 2013

Computer Science and Software Engineering

Natural Language Processing has recently increased in popularity. The field of authorship analysis, specifically, uses various characteristics of text quantified by markers. NLPStats serves as a tool designed to streamline marker extraction based on user needs. A flexible query system allows for custom marker requests, adjustment of result formatting, and preprocessing options. Furthermore, an efficiently designed structure ensures that users retrieve information quickly. As a whole, NLPStats enables anyone, regardless of NLP experience, to extract important information about the text of a document.