Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Linguistics

Computer Science Senior Theses

2023

English

Full-Text Articles in Social and Behavioral Sciences

Investigating English-Language Dialect-Adjusted Models, Samiha Datta May 2023

Computer Science Senior Theses

This thesis describes several approaches to better understand how large language models interpret different dialects of the English language. Our goal is to consider multiple contexts of textual data and to analyze how English-language dialects are realized in them, as well as how a variety of machine learning techniques handle these differences. We focus on two genres of text data: news and social media. In the news context, we establish a dataset covering news articles from five countries and four US states and consider language modeling analysis, topic and sentiment distributions, and manual analysis before performing nine experiments and evaluating …