Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

304 Full-Text Articles · 491 Authors · 37,428 Downloads · 43 Institutions

All Articles in Computational Linguistics

Genderlects In Social Media, Alina Korovatskaya 2020 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Many studies have found significant differences in the ways men and women use language; some argue that these differences are a result of cultural differences, while others suggest that they are influenced by differences in social status and power between the genders. However, some of the major studies were conducted decades ago and do not reflect changes in gender relations in recent years. In this study, we analyze modern conversations on two social media platforms, Twitter and Reddit, to determine whether substantial differences between men’s and women’s use of language persist.


Doing Away With Defaults: Motivation For A Gradient Parameter Space, Katherine Howitt 2020 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

In this thesis, I propose a reconceptualization of the traditional syntactic parameter space of the principles and parameters framework (Chomsky, 1981). In lieu of binary parameter settings, parameter values exist on a gradient plane, where a learner’s knowledge of their language is encoded in their confidence that a particular parametric target value, and thus the grammatical construction of an encountered sentence, is likely to be licensed by their target grammar. First, I discuss other learnability models in the classic parameter space, which lack psychological plausibility, theoretical consistency, or some combination of the two. Then, I argue for the Gradient ...


Chaprates, Brinly Xavier, Micole Amanda Marietta, Nidhi Vedantam 2020 Chapman University

Student Scholar Symposium Abstracts and Posters

On the Chapman campus, as students take and choose among various classes, there is a significant need for communication and feedback between students and their peers, professors, tutors, and study groups. With this in mind, we wanted to create an application that enables users from various majors not only to communicate easily and effectively with people in their field, but also to give and receive feedback on classes through a rating system. We believe that the application will aid students in a myriad of specific ways, including being involved in study groups and getting tutoring help, determining which classes ...


Ghost Peppers: Using Ensemble Models To Detect Professor Attractiveness Commentary On Ratemyprofessors.Com, Angie Waller 2020 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

In June 2018, RateMyProfessors.com (RMP), a popular website for students to leave professor reviews, removed a controversial feature known as the “chili pepper,” which allowed students to rate their professors as “hot” or “not hot.” Though past research has rigorously analyzed the correlation of the chili pepper with higher ratings in other categories, none has measured the effect of the removal of the chili pepper on the text content submitted by students. While it is a positive step that the chili pepper has been removed, text commentary on teacher attractiveness persists and is submitted to the ...
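
The abstract does not detail the ensemble here, so the following is only an illustrative sketch of a generic ensemble text classifier for flagging appearance-related commentary; the example comments, labels, estimators, and settings are invented for the sketch and are not the thesis's pipeline.

# Hypothetical sketch: soft-voting ensemble over two simple text classifiers.
# All data, labels, and model choices below are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline

comments = [
    "Great lectures, clear grading rubric.",         # 0 = not appearance-related
    "Honestly took the class because he's so hot.",  # 1 = appearance-related
    "Exams are fair and the homework helps.",        # 0
    "She is gorgeous but the tests are brutal.",     # 1
]
labels = [0, 1, 0, 1]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    VotingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("nb", MultinomialNB())],
        voting="soft",
    ),
)
model.fit(comments, labels)
print(model.predict(["The professor is really cute."]))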


Computational Approaches To The Syntax–Prosody Interface: Using Prosody To Improve Parsing, Hussein M. Ghaly 2020 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Prosody has strong ties with syntax, since prosody can be used to resolve some syntactic ambiguities. Syntactic ambiguities have been shown to negatively impact automatic syntactic parsing, hence there is reason to believe that prosodic information can help improve parsing. This dissertation considers a number of approaches that aim to computationally examine the relationship between prosody and syntax of natural languages, while also addressing the role of syntactic phrase length, with the ultimate goal of using prosody to improve parsing.

Chapter 2 examines the effect of syntactic phrase length on prosody in doubly center-embedded sentences in French. Data collected ...


Phonologically-Informed Speech Coding For Automatic Speech Recognition-Based Foreign Language Pronunciation Training, Anthony J. Vicario 2020 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Automatic speech recognition (ASR) and computer-assisted pronunciation training (CAPT) systems used in foreign-language educational contexts are often not developed with the specific task of second-language acquisition in mind. Systems that are built for this task are often excessively targeted to one native language (L1) or a single phonemic contrast and are therefore burdensome to train. Current algorithms have been shown to provide erroneous feedback to learners and show inconsistencies between human and computer perception. These discrepancies have thus far hindered more extensive application of ASR in educational systems.

This thesis reviews the computational models of the human perception of American ...


What Code-Switching Strategies Are Effective In Dialogue Systems?, Emily Ahn, Cecilia Jimenez, Yulia Tsvetkov, Alan Black 2020 University of Washington

Proceedings of the Society for Computation in Linguistics

Since most people in the world today are multilingual, code-switching is ubiquitous in spoken and written interactions. Paving the way for future adaptive, multilingual conversational agents, we incorporate linguistically-motivated strategies of code-switching into a rule-based goal-oriented dialogue system. We collect and release CommonAmigos, a corpus of 587 human-computer text conversations between our dialogue system and human users in mixed Spanish and English. From this new corpus, we analyze the amount of elicited code-switching, preferred patterns of user code-switching, and the impact of user demographics on code-switching. Based on these exploratory findings, we give recommendations for future effective code-switching dialogue systems ...


Preface: Scil 2020 Editors' Note, Allyson Ettinger, Gaja Jarosz, Max Nelson 2020 University of Chicago

Proceedings of the Society for Computation in Linguistics

No abstract provided.


The Stability Of Segmental Properties Across Genre And Corpus Types In Low-Resource Languages, Uriel Cohen Priva, Shiying Yang, Emily Strand 2020 Brown University

Proceedings of the Society for Computation in Linguistics

Are written corpora useful for phonological research? Word frequency lists for low-resource languages have become ubiquitous in recent years (Scannell, 2007). For many languages there is a direct correspondence between their written forms and their alphabets, but it is not clear whether written corpora can adequately represent language use. We use 15 low-resource languages and compare several information-theoretic properties across three corpus types. We show that despite differences in origin and genre, estimates in one corpus are highly correlated with estimates in other corpora.
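
The abstract does not list the specific information-theoretic properties, so the sketch below only illustrates the general workflow under stated assumptions: estimate one simple property, per-segment surprisal from a word frequency list, in two corpus types and correlate the estimates. The frequency lists are toy stand-ins, not the paper's data.

# Hedged sketch: per-segment surprisal from character unigram frequencies in
# two toy corpora, then a Pearson correlation of the two estimates.
import math
from collections import Counter

def segment_surprisal(freq_list):
    """freq_list: iterable of (word, count). Returns {segment: -log2 p}."""
    counts = Counter()
    for word, count in freq_list:
        for seg in word:              # orthographic segments as a stand-in
            counts[seg] += count
    total = sum(counts.values())
    return {seg: -math.log2(c / total) for seg, c in counts.items()}

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy frequency lists standing in for two corpus types of one language.
corpus_a_freqs = [("aba", 50), ("kata", 20), ("tana", 30), ("baka", 10)]
corpus_b_freqs = [("aba", 35), ("kata", 25), ("nata", 40), ("bata", 15)]

a = segment_surprisal(corpus_a_freqs)
b = segment_surprisal(corpus_b_freqs)
shared = sorted(set(a) & set(b))
print(pearson([a[s] for s in shared], [b[s] for s in shared]))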


Modeling Behavior In Truth Value Judgment Task Experiments, Brandon Waldon, Judith Degen 2020 Stanford University

Proceedings of the Society for Computation in Linguistics

Truth Value Judgment Task experiments (TVJTs) are a common means of investigating pragmatic competence, particularly with regard to scalar inference. We present a novel quantitative linking function from pragmatic competence to participant behavior on TVJTs, based upon a Bayesian probabilistic model of linguistic production. Our model captures a range of observed phenomena on TVJTs, including intermediate responses on a non-binary scale, population- and individual-level variation, participant endorsement of false utterances, and variation in response due to so-called scalar diversity.
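
The paper's actual linking function is not reproduced in this abstract; the sketch below assumes a Rational Speech Act-style softmax speaker as one possible Bayesian production model, purely to illustrate the flavor of model involved. The states, utterances, and parameters are invented.

# Hedged sketch of an RSA-flavored production model; the paper's linking
# function to TVJT responses may differ substantially.
import math

states = [0, 1, 2, 3]                 # how many of 3 objects have the property
utterances = ["none", "some", "all"]

def literal(utterance, state):
    """Literal (semantic) truth of an utterance in a state."""
    return {"none": state == 0, "some": state >= 1, "all": state == 3}[utterance]

def speaker_prob(utterance, state, rationality=3.0, cost=0.0):
    """P(utterance | state) for a softmax-rational speaker."""
    def utility(u):
        true_states = [s for s in states if literal(u, s)]
        if state not in true_states:
            return -math.inf
        # Log probability a literal listener assigns to the true state.
        return math.log(1.0 / len(true_states)) - cost
    scores = {u: math.exp(rationality * utility(u)) for u in utterances}
    total = sum(scores.values())
    return scores[utterance] / total

# Endorsement of "some" vs. "all" when in fact all 3 objects have the property:
print(speaker_prob("some", 3), speaker_prob("all", 3))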


Probing Rnn Encoder-Decoder Generalization Of Subregular Functions Using Reduplication, Max Nelson, Hossep Dolatian, Jonathan Rawski, Brandon Prickett 2020 University of Massachusetts Amherst

Proceedings of the Society for Computation in Linguistics

This paper examines the generalization abilities of encoder-decoder networks on a class of subregular functions characteristic of natural language reduplication. We find that, for the simulations we run, attention is a necessary and sufficient mechanism for learning generalizable reduplication. We examine attention alignment to connect RNN computation to a class of 2-way transducers.
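
As a hedged illustration of how such a generalization probe might be set up (not the authors' exact data, languages, or splits), one can generate total-reduplication training pairs over short stems and held-out pairs over longer, unseen stems:

# Illustrative data-generation sketch: total reduplication maps a stem to
# stem+stem. Alphabet, lengths, and sizes below are invented.
import random

ALPHABET = list("ptkbdgaeiou")

def reduplicate(stem):
    """Total reduplication: copy the whole stem."""
    return stem + stem

def make_pairs(n, min_len, max_len, seed=0):
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        stem = "".join(rng.choice(ALPHABET)
                       for _ in range(rng.randint(min_len, max_len)))
        pairs.append((stem, reduplicate(stem)))
    return pairs

# Train on short stems, test on longer unseen stems: a network that has truly
# learned the copying function should still map stem -> stem+stem.
train = make_pairs(1000, min_len=2, max_len=5)
test  = make_pairs(200,  min_len=6, max_len=8, seed=1)
print(train[0], test[0])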


Where New Words Are Born: Distributional Semantic Analysis Of Neologisms And Their Semantic Neighborhoods, Maria Ryskina, Ella Rabinovich, Taylor Berg-Kirkpatrick, David R. Mortensen, Yulia Tsvetkov 2020 Carnegie Mellon University

Proceedings of the Society for Computation in Linguistics

We perform statistical analysis of the phenomenon of neology, the process by which new words emerge in a language, using large diachronic corpora of English. We investigate the importance of two factors, semantic sparsity and frequency growth rates of semantic neighbors, formalized in the distributional semantics paradigm. We show that both factors are predictive of word emergence although we find more support for the latter hypothesis. Besides presenting a new linguistic application of distributional semantics, this study tackles the linguistic question of the role of language-internal factors (in our case, sparsity) in language change motivated by language-external factors (reflected in ...
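
One way such a factor could be formalized (offered only as an assumption about the general approach, with toy vectors standing in for corpus-derived embeddings) is semantic neighborhood density: the mean cosine similarity of a word to its nearest neighbors, with lower values indicating a sparser region of the space.

# Toy sketch of neighborhood density in a distributional space.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def neighborhood_density(word, vectors, k=3):
    """Mean similarity to the k nearest neighbors; lower = sparser region."""
    sims = sorted((cosine(vectors[word], vec)
                   for other, vec in vectors.items() if other != word),
                  reverse=True)
    return sum(sims[:k]) / k

toy_vectors = {
    "selfie":   [0.9, 0.1, 0.2],
    "photo":    [0.8, 0.2, 0.1],
    "camera":   [0.7, 0.3, 0.2],
    "senate":   [0.1, 0.9, 0.4],
    "election": [0.2, 0.8, 0.5],
}
print(neighborhood_density("selfie", toy_vectors))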


Mg Parsing As A Model Of Gradient Acceptability In Syntactic Islands, Aniello De Santo 2020 Stony Brook University

Proceedings of the Society for Computation in Linguistics

It is well-known that the acceptability judgments at the core of current syntactic theories are continuous. However, an open debate is whether the source of such gradience is situated in the grammar itself, or can be derived from extra-grammatical factors. In this paper, we propose the use of a top-down parser for Minimalist grammars (Stabler, 2013; Kobele et al., 2013; Graf et al., 2017), as a formal model of how gradient acceptability can arise from categorical grammars. As a test case, we target the acceptability judgments for island effects collected by Sprouse et al. (2012a).


Interpreting Verbal Irony: Linguistic Strategies And The Connection To The Type Of Semantic Incongruity, Debanjan Ghosh, Elena Musi, Kartikeya Upasani, Smaranda Muresan 2020 McGovern Institute for Brain Research, MIT

Proceedings of the Society for Computation in Linguistics

Human communication often involves the use of verbal irony or sarcasm, where the speakers usually mean the opposite of what they say. To better understand how verbal irony is expressed by the speaker and interpreted by the hearer, we conduct a crowdsourcing task: given an utterance expressing verbal irony, users are asked to verbalize their interpretation of the speaker's ironic message. We propose a typology of linguistic strategies for verbal irony interpretation and link it to various theoretical linguistic frameworks. We design computational models to capture these strategies and present empirical studies aimed to answer three questions: (1) what ...


Inflectional Networks: Graph-Theoretic Tools For Inflectional Typology, Andrea D. Sims 2020 The Ohio State University

Proceedings of the Society for Computation in Linguistics

The interpredictability of the inflected forms of lexemes is increasingly important to questions of morphological complexity and typology, but tools to quantify and visualize this aspect of inflectional organization are lacking, inhibiting effective cross-linguistic comparison. In this paper I use metrics from graph theory to describe and compare the organizational structure of inflectional systems. Graph theory offers a well-established toolbox for describing the properties of networks, making it ideal for this purpose. Comparison of nine languages reveals previously unobserved generalizations about the typological space of morphological systems. This is the first paper to apply graph-theoretic tools to the goal of ...
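
As a rough sketch of the general idea (not the paper's metrics, languages, or data), pairwise interpredictability between paradigm cells can be encoded as a weighted graph and summarized with standard graph statistics, here using networkx as one possible toolkit:

# Toy inflectional system: nodes are paradigm cells, edge weights are an
# invented "predictability" score between 0 and 1.
import networkx as nx

G = nx.Graph()
edges = [
    ("NOM.SG", "ACC.SG", 0.9),
    ("NOM.SG", "GEN.SG", 0.4),
    ("ACC.SG", "GEN.SG", 0.5),
    ("NOM.PL", "ACC.PL", 0.8),
    ("GEN.SG", "GEN.PL", 0.7),
]
for a, b, w in edges:
    G.add_edge(a, b, weight=w)

print("density:", nx.density(G))
print("avg clustering:", nx.average_clustering(G, weight="weight"))
print("degrees:", dict(G.degree()))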


Acquisition Of Inflectional Morphology In Artificial Neural Networks With Prior Knowledge, Katharina Kann 2020 New York University

Proceedings of the Society for Computation in Linguistics

How does knowledge of one language’s morphology influence learning of inflection rules in a second one? In order to investigate this question in artificial neural network models, we perform experiments with a sequence-to-sequence architecture, which we train on different combinations of eight source and three target languages. A detailed analysis of the model outputs suggests the following conclusions: (i) if source and target language are closely related, acquisition of the target language’s inflectional morphology constitutes an easier task for the model; (ii) knowledge of a prefixing (resp. suffixing) language makes acquisition of a suffixing (resp. prefixing) language’s ...


Modeling Conventionalization And Predictability Within Mwes At The Brain Level, Shohini Bhattasali, Murielle Popa-Fabre, Christophe Pallier, John Hale 2020 University of Maryland, College Park

Proceedings of the Society for Computation in Linguistics

While expressions have traditionally been binarized as compositional or noncompositional in linguistic theory, Multiword Expressions (MWEs) demonstrate finer-grained distinctions. Using Association Measures like Pointwise Mutual Information and Dice's Coefficient, MWEs can be characterized as having different degrees of conventionalization and predictability. Our goal is to investigate how these gradiences could reflect cognitive processes. In this study, fMRI recordings of naturalistic narrative comprehension are used to probe to what extent these computational measures, and the cognitive processes they could operationalize, are observable during on-line sentence processing. Our results show that Dice's Coefficient, representing lexical predictability, is a better predictor ...
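
For readers unfamiliar with the two association measures named above, a minimal worked example of how they are computed from corpus counts follows; the counts themselves are invented, not drawn from the study.

# Toy computation of PMI and Dice's Coefficient from bigram counts.
import math

N = 1_000_000           # total number of bigram tokens in the corpus
count_w1 = 2_000        # occurrences of word 1 (e.g., "kick")
count_w2 = 1_500        # occurrences of word 2 (e.g., "bucket")
count_w1w2 = 300        # co-occurrences of the two words as an expression

# Pointwise Mutual Information: log ratio of observed to expected co-occurrence.
pmi = math.log2((count_w1w2 / N) / ((count_w1 / N) * (count_w2 / N)))

# Dice's Coefficient: overlap of the two words' occurrences.
dice = 2 * count_w1w2 / (count_w1 + count_w2)

print(f"PMI  = {pmi:.2f}")
print(f"Dice = {dice:.3f}")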


A Principled Derivation Of Harmonic Grammar, Giorgio Magri 2020 CNRS

Proceedings of the Society for Computation in Linguistics

Phonologists focus on a few processes at a time. This practice is motivated by the intuition that phonological processes factorize into clusters with no interactions across clusters (e.g., obstruent voicing does not interact with vowel harmony). To formalize this intuition, we factorize a full-blown representation into under-specified representations, each encoding only the information needed by the corresponding phonological cluster. And we require a grammar for the original full-blown representations to factorize into grammars that handle the under-specified representations separately, independently of each other. Within a harmony-based implementation of constraint-based phonology, HG is shown to follow axiomatically from this ...
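
For background, the core mechanics of Harmonic Grammar itself (not the paper's factorization argument) can be illustrated in a few lines: each candidate's harmony is a weighted sum of its constraint violations, and the candidate with the highest harmony wins. The constraints, weights, and candidates below are toy examples.

# Toy Harmonic Grammar evaluation for underlying /bad/ with final devoicing.
weights = {"*VoicedObstruentCoda": 3.0, "Ident(voice)": 1.0}

candidates = {
    "bad": {"*VoicedObstruentCoda": 1, "Ident(voice)": 0},
    "bat": {"*VoicedObstruentCoda": 0, "Ident(voice)": 1},
}

def harmony(violations):
    # Harmony is the negated weighted sum of violation counts.
    return -sum(weights[c] * v for c, v in violations.items())

winner = max(candidates, key=lambda cand: harmony(candidates[cand]))
print({cand: harmony(v) for cand, v in candidates.items()}, "->", winner)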


Phonotactic Learning With Neural Language Models, Connor Mayer, Max Nelson 2020 University of California, Los Angeles

Proceedings of the Society for Computation in Linguistics

Computational models of phonotactics share much in common with language models, which assign probabilities to sequences of words. While state-of-the-art language models are implemented using neural networks, phonotactic models have not followed suit. We present several neural models of phonotactics, and show that they perform favorably when compared to existing models. In addition, they provide useful insights into the role of representations in phonotactic learning and generalization. This work provides a promising starting point for future modeling of human phonotactic knowledge.
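
As a point of reference only (the paper's models are neural, not count-based), the "language model over segments" framing can be illustrated with a smoothed bigram model that assigns log probabilities to phoneme strings; the training words below are invented.

# Count-based baseline sketch: add-one-smoothed bigram model over segments.
import math
from collections import Counter

training_words = ["blik", "brim", "klip", "trab", "plok"]

bigrams, unigrams = Counter(), Counter()
for w in training_words:
    padded = "#" + w + "#"            # word-boundary symbol
    unigrams.update(padded[:-1])
    bigrams.update(zip(padded, padded[1:]))

vocab = set("".join(training_words)) | {"#"}

def log_prob(word):
    """Add-one-smoothed log probability of a segment string."""
    padded = "#" + word + "#"
    lp = 0.0
    for prev, cur in zip(padded, padded[1:]):
        lp += math.log((bigrams[(prev, cur)] + 1) /
                       (unigrams[prev] + len(vocab)))
    return lp

# "blip" reuses attested transitions (#b, bl, li, ip, p#) and should score
# higher than "lbip", whose #l, lb, bi transitions are unattested.
print(log_prob("blip"), log_prob("lbip"))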


An Ibsp Description Of Sanskrit /N/-Retroflexion, Ayla Karakaş 2020 Stony Brook University

Proceedings of the Society for Computation in Linguistics

Graf and Mayer (2018) analyze the process of Sanskrit /n/-retroflexion (nati) from a subregular perspective. They show that nati, which might be the most complex phenomenon in segmental phonology, belongs to the class of input-output tier-based strictly local languages (IO-TSL). However, the generative capacity and linguistic relevance of IO-TSL are still largely unclear compared to other recent classes like the interval-based strictly piecewise languages (IBSP: Graf, 2017, 2018). This paper shows that IBSP has a much harder time capturing nati than IO-TSL does, due to two major shortcomings: namely, the requirement of an upper bound on relevant segments, and ...
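
For orientation, a highly simplified tier-based strictly local check (not the IO-TSL or IBSP analysis itself, and with an invented tier and constraint set) illustrates the projection-plus-local-constraint idea behind these subregular classes:

# Toy TSL-style check: project a tier of relevant segments and forbid certain
# adjacent pairs on that tier. Tier and constraints are invented stand-ins.
TIER = {"r", "s", "n", "N"}            # toy tier: trigger r, blocker s, targets n/N
FORBIDDEN_TIER_BIGRAMS = {("r", "n")}  # after a trigger, plain n is illicit

def tier_projection(word):
    return [seg for seg in word if seg in TIER]

def well_formed(word):
    tier = tier_projection(word)
    return all((a, b) not in FORBIDDEN_TIER_BIGRAMS
               for a, b in zip(tier, tier[1:]))

# "r...n" with nothing relevant in between violates the toy constraint; with a
# blocker s intervening on the tier, the string is accepted.
print(well_formed("ratan"))   # False: tier is [r, n]
print(well_formed("rastan"))  # True:  tier is [r, s, n]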

