Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 7 of 7

Full-Text Articles in Physical Sciences and Mathematics

Stereotypes And Language Models: Understanding How Language Models Encode Stereotypes, Debiasing Language Models, And Examining How Stereotypes Affect Conversations, Brian C. Wang Jun 2023

Computer Science Senior Theses

This thesis describes a variety of approaches to examining how language models encode stereotypes (understanding stereotypes from a model point of view), debiasing language models, and using language models to understand how stereotypes affect conversations (understanding stereotypes from a conversational point of view). We present a novel approach to textual clue analysis that makes language models more interpretable. It combines an understanding of which stereotypes the internal structures of language models encoded during their initial training (via attention-based analysis) with an understanding of which textual clues are most relevant to identifying stereotypes for models trained to detect them (via SHAP-based analysis). We find that different pre-trained …


Sarcasm Detection In English And Arabic Tweets Using Transformer Models, Rishik Lad Jun 2023

Computer Science Senior Theses

This thesis describes our approach to detecting sarcasm and its various types in English and Arabic Tweets using deep learning methods. We attempted five problems: (1) detection of sarcasm in English Tweets, (2) detection of sarcasm in Arabic Tweets, (3) determining the subcategory of sarcastic speech for English Tweets, (4) determining which of two semantically equivalent English Tweets is sarcastic, and (5) determining which of two semantically equivalent Arabic Tweets is sarcastic. All tasks were framed as classification problems, and our contributions are threefold: (a) we developed an English binary classifier system with RoBERTa, …


Data-Optimized Spatial Field Predictions For Robotic Adaptive Sampling: A Gaussian Process Approach, Zachary Nathan May 2023

Computer Science Senior Theses

We introduce a framework that combines Gaussian Process models, robotic sensor measurements, and sampling data to predict spatial fields. In this context, a spatial field refers to the distribution of a variable throughout a specific area, such as temperature or pH variations over the surface of a lake. Whereas existing methods tend to analyze only the particular field(s) of interest, our approach optimizes predictions through the effective use of all available data. We validated our framework on several datasets, showing that errors can decline by up to two-thirds through the inclusion of additional colocated measurements. In support of adaptive sampling, …


Leveraging Context Patterns For Medical Entity Classification, Garrett Johnston Jun 2022

Computer Science Senior Theses

The ability of patients to understand health-related text is important for optimal health outcomes. A system that can automatically annotate medical entities could help patients better understand health-related text. Such a system would also accelerate manual data annotation for this low-resource domain as well as assist in downstream medical NLP tasks such as finding textual similarity, identifying conflicting medical advice, and aspect-based sentiment analysis. In this work, we investigate a state-of-the-art entity set expansion model, BootstrapNet, for the task of medical entity classification on a new dataset of medical advice text. We also propose EP SBERT, a simple model …


Symplectically Integrated Symbolic Regression Of Hamiltonian Dynamical Systems, Daniel Dipietro Jun 2022

Computer Science Senior Theses

Here we present Symplectically Integrated Symbolic Regression (SISR), a novel technique for learning physical governing equations from data. SISR employs a deep symbolic regression approach, using a multi-layer LSTM-RNN with mutation to probabilistically sample Hamiltonian symbolic expressions. Using symplectic neural networks, we develop a model-agnostic approach for extracting meaningful physical priors from the data that can be imposed on the fly into the RNN output, limiting its search space. Hamiltonians generated by the RNN are optimized and assessed using a fourth-order symplectic integration scheme; prediction performance is used to train the LSTM-RNN to generate increasingly better functions via a risk-seeking policy gradients …


Entity Based Sentiment Analysis For Textual Health Advice, Dae Lim Chung Apr 2022

Computer Science Senior Theses

This work explores entity-based sentiment analysis for textual health advice through deep learning. We fine-tuned a pretrained BERT model to analyze sentiment across five predetermined categories (food, medicine, disease, exercise, and vitality) for three sentiments: positive, negative, and neutral. An original annotated medical dataset from Dartmouth College’s Persist Lab was used to conduct the experiments. To tailor the data for entity-based sentiment analysis, we explored data transformation techniques to generate optimum training examples. During the experiments, we discovered that the wide variety …


Exploring The Use Of Social Media To Infer Relationships Between Demographics, Psychographics And Vaccine Hesitancy, Abhimanyu Kapur Jun 2021

Computer Science Senior Theses

The growing popularity of social media as a platform to obtain information and share one's opinions on various topics makes it a rich source of information for research. In this study, we aimed to develop a framework to infer relationships between the demographic and psychographic characteristics of a user and their opinion on a specific narrative: in this case, their stance on taking the COVID-19 vaccine. Twitter was the chosen platform due to its large U.S. user base and readily available data. Demographic traits included Race, Age, Gender, and Human-vs-Organization Status. Psychographic traits included the Big Five personality traits (Conscientiousness, …