Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Articles 1 - 8 of 8

Full-Text Articles in Social and Behavioral Sciences

Mitigating Gender Bias In Neural Machine Translation Using Counterfactual Data, Alan Wong Sep 2020

Dissertations, Theses, and Capstone Projects

Recent advances in deep learning have greatly improved the ability of researchers to develop effective machine translation systems. In particular, the application of modern neural architectures, such as the Transformer, has achieved state-of-the-art BLEU scores in many translation tasks. However, even state-of-the-art neural machine translation models have been found to suffer from certain implicit biases, such as gender bias (Lu et al., 2019). In response, researchers have proposed various solutions: some inject missing gender information into models, while others modify the training data itself. We focus on mitigating …


Does The Word "Chien" Bark? Representation Learning In Neural Machine Translation Encoders, Emily Campbell Sep 2020

Dissertations, Theses, and Capstone Projects

This thesis presents experiments with using representation learning to explore how neural networks learn. Neural networks which take text as input create internal representations of the text during their training. Recent work has found that these representations can be used to perform other downstream linguistic tasks, such as part-of-speech (POS) tagging. This demonstrates that the neural networks are learning linguistic information and storing this information in the representations. We focus on the representations created by neural machine translation (NMT) models and whether they can be used in POS tagging. We train 5 NMT models including an auto-encoder. We extract the …


Inferring Research Fields In Administrative Records Using Text Data, Ekaterina Levitskaya Jun 2020

Dissertations, Theses, and Capstone Projects

The UMETRICS database (Universities: Measuring the Effects of Research on Innovation, Competitiveness, and Science) contains rich information on grants from sponsored federal and non-federal research for 32 universities over a 15-year period. It is hosted at IRIS (Institute for Research on Innovation and Science, University of Michigan) and serves as a valuable source of university administrative data; however, it does not contain information on research fields. Categorizing grant data by research field can help measure the results of investment in research and science and provide evidence for data-driven policy-making, yet administrative data often lack this type of categorization. In …


Genderlects In Social Media, Alina Korovatskaya Jun 2020

Dissertations, Theses, and Capstone Projects

Many studies have found significant differences in the ways men and women use language; some argue that these differences result from cultural differences, and others suggest that they are influenced by differences in social status and power between the genders. However, some of the major studies were concluded decades ago and do not reflect changes in gender relations in recent years. In this study, we analyze modern conversations on two social media platforms, Twitter and Reddit, to determine whether substantial differences between men’s and women’s use of language have persisted.


Doing Away With Defaults: Motivation For A Gradient Parameter Space, Katherine Howitt Jun 2020

Dissertations, Theses, and Capstone Projects

In this thesis, I propose a reconceptualization of the traditional syntactic parameter space of the principles and parameters framework (Chomsky, 1981). In lieu of binary parameter settings, parameter values exist on a gradient plane where a learner’s knowledge of their language is encoded in their confidence that a particular parametric target value, and thus grammatical construction of an encountered sentence, is likely to be licensed by their target grammar. First, I discuss other learnability models in the classic parameter space which lack either psychological plausibility, theoretical consistency, or some combination of the two. Then, I argue for the Gradient Parameter …


Computational Approaches To The Syntax–Prosody Interface: Using Prosody To Improve Parsing, Hussein M. Ghaly Feb 2020

Dissertations, Theses, and Capstone Projects

Prosody has strong ties with syntax, since prosody can be used to resolve some syntactic ambiguities. Syntactic ambiguities have been shown to negatively impact automatic syntactic parsing; hence, there is reason to believe that prosodic information can help improve parsing. This dissertation considers a number of approaches that aim to computationally examine the relationship between prosody and syntax of natural languages, while also addressing the role of syntactic phrase length, with the ultimate goal of using prosody to improve parsing.

Chapter 2 examines the effect of syntactic phrase length on prosody in double center-embedded sentences in French. Data collected …


Phonologically-Informed Speech Coding For Automatic Speech Recognition-Based Foreign Language Pronunciation Training, Anthony J. Vicario Feb 2020

Dissertations, Theses, and Capstone Projects

Automatic speech recognition (ASR) and computer-assisted pronunciation training (CAPT) systems used in foreign-language educational contexts are often not developed with the specific task of second-language acquisition in mind. Systems that are built for this task are often excessively targeted to one native language (L1) or a single phonemic contrast and are therefore burdensome to train. Current algorithms have been shown to provide erroneous feedback to learners and show inconsistencies between human and computer perception. These discrepancies have thus far hindered more extensive application of ASR in educational systems.

This thesis reviews the computational models of the human perception of American …


Ghost Peppers: Using Ensemble Models To Detect Professor Attractiveness Commentary On RateMyProfessors.com, Angie Waller Feb 2020

Dissertations, Theses, and Capstone Projects

In June 2018, RateMyProfessors.com (RMP), a popular website for students to leave professor reviews, removed a controversial feature known as the “chili pepper” which allowed students to rate their professors as “hot” or “not hot.” Though past research has rigorously analyzed the correlation of the chili pepper with higher ratings in other categories (Felton, Mitchell, and Stinson, 2004; Felton et al., 2008), none has measured the effect of the removal of the chili pepper on the text content submitted by students. While it is a positive step that the chili pepper has been removed, text commentary on teacher attractiveness persists …