Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons


429 Full-Text Articles 650 Authors 122,890 Downloads 48 Institutions

All Articles in Computational Linguistics



Generic Ab Initio, James A. Heilpern, Earl Kjar Brown, William G. Eggington, Zachary D. Smith 2022 Brigham Young University Law School


Buffalo Law Review

From comic conventions to disbanded dioceses, courts continue to struggle with a unique but puzzling question of trademark law. Federal law protects certain terms that refer to a product or service from a specific producer instead of to a product generally. Terms that refer to products are considered generic and cannot receive protection. Courts have also held that a term that was generic at the time the party adopted the mark cannot receive protection, even if the public later views it as being specific to a particular producer. But many marks were adopted decades or centuries ago. As a result ...


A Machine Learning Approach To Text-Based Sarcasm Detection, Lara I. Novic 2022 The Graduate Center, City University of New York


Dissertations, Theses, and Capstone Projects

Sarcasm and indirect language are commonplace for humans to produce and recognize but difficult for machines to detect. While artificial intelligence can accurately analyze sentiment and emotion in speech and text, it may struggle with insincere and sardonic content, although it is possible to train a machine to identify uttered and written sarcasm. This paper aims to detect sarcasm using logistic regression and a support vector machine (SVM) and compare their results to a baseline.

The models are trained on a Kaggle dataset of headlines from the satirical news website The Onion and the serious news website HuffPost (formerly ...
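The classification setup described above can be sketched with a minimal from-scratch logistic regression over bag-of-words features. This is an illustrative toy, not the thesis's implementation: the headlines and vocabulary below are invented stand-ins for the Kaggle data, and the actual study also compares against an SVM and a baseline.

```python
import math

# Toy headlines standing in for the Onion/HuffPost data (invented examples).
# Label 1 = sarcastic/satirical, 0 = serious news.
train = [
    ("scientists discover water is wet, nation stunned", 1),
    ("area man heroically eats entire pizza alone", 1),
    ("local bird wins award for being a bird", 1),
    ("senate passes new infrastructure bill", 0),
    ("storm expected to hit coast this weekend", 0),
    ("city council approves school budget", 0),
]

vocab = sorted({w for text, _ in train for w in text.split()})

def featurize(text):
    """Binary bag-of-words vector plus a bias term."""
    words = set(text.split())
    return [1.0] + [1.0 if w in words else 0.0 for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Stochastic gradient descent on the log-loss.
weights = [0.0] * (len(vocab) + 1)
for _ in range(300):
    for text, label in train:
        x = featurize(text)
        pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        err = pred - label
        weights = [w - 0.5 * err * xi for w, xi in zip(weights, x)]

def predict(text):
    score = sigmoid(sum(w * xi for w, xi in zip(weights, featurize(text))))
    return int(score > 0.5)

accuracy = sum(predict(t) == y for t, y in train) / len(train)
print(accuracy)   # → 1.0 (the toy data is linearly separable)
```

An SVM baseline differs mainly in the loss (hinge instead of log-loss) and the margin objective; with the same features the comparison the paper describes is a drop-in swap of the classifier.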


Yay…, 😉, And #Sarcasm: Exploring How Sarcasm Is Marked In Text-Based Cmc, Bronte G. Gordon 2022 Portland State University


University Honors Theses

Sarcasm is a complex phenomenon of indirect speech in which we intend a meaning different from that of the literal words we use. In face-to-face (FtF) settings, facial expressions, body language, and prosodic cues can be helpful indicators of sarcasm. Sarcasm becomes even harder to decipher when these physical cues are removed, as in any written setting. This paper explores what text strategies are used to mark sarcasm in text-based English-language communication online. Through a systematic literature review, the similarities and differences of irony and sarcasm were explored, as well as the issues these parallels and distinctions create in delineating ...


Covert Determiners In Appalachian English Narrative Declarative Sentences, William Oliver 2022 The Graduate Center, City University of New York


Dissertations, Theses, and Capstone Projects

In this thesis, I explore the syntax and semantics of covert determiners (Ds) in matrix subject determiner phrases (DPs) with definite specific interpretations. To conduct my investigation, I used the Audio-Aligned and Parsed Corpus of Appalachian English (AAPCAppE), a million-word Penn Treebank corpus, and the software CorpusSearch, a Java program that searches Penn Treebank corpora. My research shows that Appalachian English contains a linguistic phenomenon where speakers drop the D, replacing overt Ds with covert Ds, in definite specific DPs. For example, where Standard English speakers say The doctor came by horseback, Appalachian speakers may use a covert D in ...


“I Can See The Forest For The Trees”: Examining Personality Traits With Transformers, Alexander Moore 2022 Clemson University


All Dissertations

Our understanding of Personality and its structure is rooted in linguistic studies operating under the assumptions made by the Lexical Hypothesis: personality characteristics that are important to a group of people will at some point be codified in their language, with the number of encoded representations of a personality characteristic indicating their importance. Qualitative and quantitative efforts in the dimension reduction of our lexicon throughout the mid-20th century have played a vital role in the field’s eventual arrival at the widely accepted Five Factor Model (FFM). However, there are a number of presently unresolved conflicts regarding the breadth ...


Metaphor Detection In Poems In Misurata Arabic Sub-Dialect: An LSTM Model, Azza Abugharsa 2022 Montclair State University


Theses, Dissertations and Culminating Projects

Natural Language Processing (NLP) in Arabic is witnessing an increasing interest in investigating different topics in the field. One of the topics that have drawn attention is the automatic processing of Arabic figurative language. The focus in previous projects is on detecting and interpreting metaphors in comments from social media as well as phrases and/or headlines from news articles. The current project focuses on metaphor detection in poems written in the Misurata Arabic sub-dialect spoken in Misurata, located in the North African region. The dataset is initially annotated by a group of linguists, and their annotation is treated as ...


Toward Suicidal Ideation Detection With Lexical Network Features And Machine Learning, Ulya Bayram, William Lee, Daniel Santel, Ali Minai, Peggy Clark, Tracy Glauser, John Pestian 2022 Çanakkale Onsekiz Mart University


Northeast Journal of Complex Systems (NEJCS)

In this study, we introduce a new network feature for detecting suicidal ideation from clinical texts and conduct various additional experiments to enrich the state of knowledge. We evaluate statistical features with and without stopwords, use lexical networks for feature extraction and classification, and compare the results with standard machine learning methods using a logistic classifier, a neural network, and a deep learning method. We utilize three text collections. The first two contain transcriptions of interviews conducted by experts with suicidal subjects (n=161 patients who experienced severe ideation) and control subjects (n=153). The third collection consists of interviews conducted ...
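The "lexical network" idea in the abstract can be illustrated with a minimal co-occurrence network: words are nodes, and an edge links two words that appear in the same utterance. The utterances below are invented stand-ins for interview transcripts, and degree is just one simple feature one might read off such a network; the paper's actual feature is not specified here.

```python
from collections import defaultdict
from itertools import combinations

# Toy utterances (invented; stand-ins for clinical interview transcripts).
docs = [
    "i feel hopeless and tired",
    "i feel fine today",
    "everything feels hopeless lately",
]

# Undirected co-occurrence network: an edge joins two distinct words
# that occur together in at least one utterance.
edges = defaultdict(set)
for doc in docs:
    for a, b in combinations(set(doc.split()), 2):
        edges[a].add(b)
        edges[b].add(a)

# A simple per-word network feature: degree (number of neighbours).
degree = {word: len(nbrs) for word, nbrs in edges.items()}
print(degree["hopeless"])   # → 7
```

Feature vectors built this way (degree, clustering, centrality, and so on) can then feed any of the classifiers the abstract mentions.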


Representing Multiple Dependencies In Prosodic Structures, Kristine M. Yu 2022 University of Massachusetts - Amherst


Proceedings of the Society for Computation in Linguistics

Association of tones to prosodic trees was introduced in Pierrehumbert and Beckman (1988). This included: (i) tonal association to higher-level prosodic nodes such as intonational phrases, and (ii) multiple association of a tone to a higher-level prosodic node in addition to a tone bearing unit such as a syllable. Since then, these concepts have been broadly assumed in intonational phonology without much comment, even though Pierrehumbert and Beckman's (1988) stipulation that tones associated to higher-level prosodic nodes are peripherally realized does not fit all the empirical data. We show that peripherally-realized tones associated to prosodic nodes can be naturally ...


Learning Constraints On Wh-Dependencies By Learning How To Efficiently Represent Wh-Dependencies: A Developmental Modeling Investigation With Fragment Grammars, Niels Dickson, Lisa Pearl, Richard Futrell 2022 University of California, Irvine


Proceedings of the Society for Computation in Linguistics

It’s hotly contested how children learn constraints on wh-dependencies, called syntactic islands. When learning this knowledge, a prerequisite is knowing how to represent wh-dependencies so that constraints can be hypothesized over those representations. Previous work has explained disparate sets of syntactic island constraints by assuming different wh-dependency representations, without a unified dependency representation capturing all these constraints. We implement a modeled learner who learns a Fragment Grammar (FG) representation of wh-dependencies (a representation composed of potentially different-sized fragments that combine to form full dependencies) that best accounts for the input while being as compact as ...


Can Language Models Capture Syntactic Associations Without Surface Cues? A Case Study Of Reflexive Anaphor Licensing In English Control Constructions, Soo-Hwan Lee, Sebastian Schuster 2022 New York University


Proceedings of the Society for Computation in Linguistics

We examine GPT-2 (Radford et al., 2019), which is trained only on surface strings, to see whether or not the model makes correct predictions about the agreement patterns of a reflexive anaphor in English control constructions. Our findings show that GPT-2 struggles with transitive subject control constructions, but does well on transitive object control constructions. One reason might be that the model tries to associate the anaphor with the closest noun phrase. Moreover, while we find that a model with a larger number of parameters shows higher accuracy on the tasks related to subject control constructions, performance remains below chance.


Incremental Acquisition Of A Minimalist Grammar Using An Smt-Solver, Sagar Indurkhya 2022 Massachusetts Institute of Technology


Proceedings of the Society for Computation in Linguistics

We introduce a novel procedure that uses the Z3 SMT-solver, an interactive theorem prover, to incrementally infer a Minimalist Grammar (MG) from an input sequence of paired interface conditions, which corresponds to the primary linguistic data (PLD) a child is exposed to. The procedure outputs an MG lexicon, consisting of a set of (word, feature-sequence) pairings, that yields, for each entry in the PLD, a derivation that satisfies the listed interface conditions; the output MG lexicon corresponds to the Knowledge of Language that the child acquires from processing the PLD. We use the acquisition procedure to infer an MG lexicon ...


Masked Language Models Directly Encode Linguistic Uncertainty, Cassandra Jacobs, Ryan J. Hubbard, Kara D. Federmeier 2022 SUNY University at Buffalo


Proceedings of the Society for Computation in Linguistics

Large language models (LLMs) have recently been used as models of psycholinguistic processing, usually focusing on lexical or syntactic surprisal. However, this approach casts away representations of utterance meaning (e.g., hidden states), which are used by LLMs to predict upcoming words. The present work explores whether the hidden state representations of LLMs encode uncertainty relevant to human language processing. We specifically assess this possibility using sentences from Federmeier et al. (2007) that are either strongly or weakly predictive of a final word. Using a machine learning approach, we tested and confirmed that LLMs encode uncertainty in their hidden states.
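The probing logic in this abstract — train a classifier to recover a property (here, contextual constraint strength) from hidden state vectors — can be sketched with a deliberately tiny stand-in. The two-dimensional "hidden states" below are invented, and the nearest-centroid probe is a minimal substitute for whatever classifier the study actually used.

```python
# Toy "hidden state" vectors (invented) for strongly vs weakly
# constraining sentence contexts.
strong = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15]]
weak   = [[0.2, 0.9], [0.1, 0.8], [0.15, 0.85]]

def centroid(vecs):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vecs) for col in zip(*vecs)]

c_strong, c_weak = centroid(strong), centroid(weak)

def dist2(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def probe(state):
    """Classify a hidden state by its nearer class centroid."""
    return "strong" if dist2(state, c_strong) < dist2(state, c_weak) else "weak"

print(probe([0.7, 0.3]))   # → strong
```

If the probe classifies held-out states above chance, the hidden states carry information about the property — which is the shape of the paper's positive result.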


Learning Argument Structures With Recurrent Neural Network Grammars, Ryo Yoshida, Yohei Oseki 2022 The University of Tokyo


Proceedings of the Society for Computation in Linguistics

In targeted syntactic evaluations, the syntactic competence of LMs has been investigated through various syntactic phenomena, among which one of the important domains has been argument structure. Argument structures in head-initial languages have been exclusively tested in the previous literature, but may be readily predicted from lexical information of verbs, potentially overestimating the syntactic competence of LMs. In this paper, we explore whether argument structures can be learned by LMs in head-final languages, which could be more challenging given that argument structures must be predicted before encountering verbs during incremental sentence processing, so that the relative weight of syntactic information ...


Inferring Inferences: Relational Propositions For Argument Mining, Andrew Potter 2022 University of North Alabama


Proceedings of the Society for Computation in Linguistics

Inferential reasoning is an essential feature of argumentation. Therefore, a method for mining discourse for inferential structures would be of value for argument analysis and assessment. The logic of relational propositions is a procedure for rendering texts as expressions in propositional logic directly from their rhetorical structures. From rhetorical structures, relational propositions are defined, and from these propositions, logical expressions are then generated. There are, however, unsettled issues associated with Rhetorical Structure Theory (RST), some of which are problematic for inference mining. This paper takes a deep dive into some of these issues, with the aim of elucidating the problems ...


Learning Stress Patterns With A Sequence-To-Sequence Neural Network, Brandon Prickett, Joe Pater 2022 Linguistics Department, University of Massachusetts Amherst


Proceedings of the Society for Computation in Linguistics

We present the first application of modern neural networks to the well-studied task of learning word stress systems. We tested our adaptation of a sequence-to-sequence network on the Tesar and Smolensky test set of 124 "languages", showing that it acquires generalizable representations of stress patterns in a very high proportion of runs. We also show that it learns restricted lexically conditioned patterns, known as stress windows. The ability of this model to acquire lexical idiosyncrasies, which are very common in natural language systems, sets it apart from past, non-neural models tested on the Tesar and Smolensky data set.


Maxent Learners Are Biased Against Giving Probability To Harmonically Bounded Candidates, Charlie O'Hara 2022 University of Michigan


Proceedings of the Society for Computation in Linguistics

One of the major differences between MaxEnt Harmonic Grammar (Goldwater and Johnson, 2003) and Noisy Harmonic Grammar (Boersma and Pater, 2016) is that in MaxEnt harmonically bounded candidates are able to get some probability, whereas in most other constraint-based grammars they can never be output (Jesney, 2007). The probability given to harmonically bounded candidates is taken from other candidates, in some cases allowing MaxEnt to model grammars that subvert some of the universal implications that are true in NoisyHG (Anttila and Magri, 2018). Magri (2018) argues that the types of implicational universals that remain valid in MaxEnt are phonologically ...
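The MaxEnt property the abstract turns on — harmonically bounded candidates still receiving probability — follows directly from how MaxEnt converts harmony scores to probabilities. A minimal sketch, with invented constraint weights and violation profiles:

```python
import math

# Constraint weights (invented for illustration).
weights = {"Max": 3.0, "Dep": 2.0}

# Violation profiles. Candidate "c" incurs a strict superset of each
# rival's violations, so it is harmonically bounded: no weighting can
# ever make it the single best candidate.
candidates = {
    "a": {"Max": 1, "Dep": 0},
    "b": {"Max": 0, "Dep": 1},
    "c": {"Max": 1, "Dep": 1},  # harmonically bounded
}

def harmony(viols):
    """Weighted sum of violations (lower is better)."""
    return sum(weights[c] * n for c, n in viols.items())

# MaxEnt: p(cand) is proportional to exp(-harmony).
scores = {cand: math.exp(-harmony(v)) for cand, v in candidates.items()}
z = sum(scores.values())
probs = {cand: s / z for cand, s in scores.items()}

print(probs["c"] > 0)   # → True: the bounded candidate gets probability
```

In a categorical constraint-based grammar "c" could never surface; in MaxEnt its probability is merely small, and that probability is drawn away from the other candidates — which is exactly the leakage the paper analyzes.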


Horse Or Pony? Visual Typicality And Lexical Frequency Affect Variability In Object Naming, Eleonora Gualdoni, Thomas Brochhagen, Andreas Mädebach, Gemma Boleda 2022 Universitat Pompeu Fabra


Proceedings of the Society for Computation in Linguistics

Often we can use different names to refer to the same object (e.g., pony vs. horse) and naming choices vary among people. In the present study we explore factors that affect naming variation for visually presented objects. We analyse a large dataset of object naming with realistic images and focus on two factors: (a) the visual typicality of objects and their context for the names used by human annotators and (b) the lexical frequency of these names. We use a novel computational approach to estimate visual typicality by calculating the visual similarity of a given object (or context) and ...


The Interaction Between Cognitive Ease And Informativeness Shapes The Lexicons Of Natural Languages, Thomas Brochhagen, Gemma Boleda 2022 Universitat Pompeu Fabra


Proceedings of the Society for Computation in Linguistics

Lexical ambiguity is pervasive in language, and often systematic. Previous work shows that systematic ambiguities involve related meanings. This is attributed to cognitive pressure towards simplicity in language, as it makes lexicons easier to learn and use. The present study examines the interplay between this pressure and competing pressure for languages to support accurate information transfer. We hypothesize that ambiguity is shaped by a balance of the two pressures; and find support for this idea in data from over 1200 languages and 1400 meanings. Our results thus suggest that universal forces shape the lexicons of natural languages.


Typological Implications Of Tier-Based Strictly Local Movement, Thomas Graf 2022 Stony Brook University


Proceedings of the Society for Computation in Linguistics

Earlier work has shown that movement, which forms the backbone of Minimalist syntax, belongs in the subregular class of TSL-2 dependencies over trees. The central idea is that movement, albeit unbounded, boils down to local mother-daughter dependencies on a specific substructure called a tree tier. This reveals interesting parallels between syntax and phonology, but it also looks very different from the standard view of movement. One may wonder, then, whether the TSL-2 characterization is linguistically natural. I argue that this is indeed the case because TSL-2 furnishes a unified analysis of a variety of phenomena: multiple wh-movement, expletive constructions, the ...
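The tier-based strictly local (TSL-2) idea is easiest to see in the string setting from phonology, where it originated; the paper's tree-tier version replaces string adjacency with mother-daughter adjacency on the projected tier. A minimal string sketch, with an invented alphabet, tier, and banned factors (a toy sibilant-harmony pattern):

```python
# TSL-2 over strings: project the tier symbols, then ban certain
# adjacent pairs (2-factors) on the projected tier.
# Tier and banned factors are invented for illustration.
TIER = {"s", "S"}                   # only sibilants are tier-visible
BANNED = {("s", "S"), ("S", "s")}   # disagreeing adjacent sibilants

def tsl2_accepts(word):
    """Accept iff no banned 2-factor occurs on the tier projection."""
    tier = [seg for seg in word if seg in TIER]
    return all((a, b) not in BANNED for a, b in zip(tier, tier[1:]))

print(tsl2_accepts("satas"))   # → True  (tier "ss" is harmonic)
print(tsl2_accepts("sataS"))   # → False (tier "sS" is banned)
```

The dependency between the two sibilants is unbounded in the string but strictly local on the tier — the same "unbounded yet locally checkable" character the paper attributes to movement.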


Remodelling Complement Coercion Interpretation, Frederick G. Gietz, Barend Beekhuizen 2022 University of Toronto


Proceedings of the Society for Computation in Linguistics

Existing (experimental and computational) linguistic work uses participant paraphrases as a stand-in for event interpretation in complement coercion sentences (e.g. she finished the coffee > she finished drinking the coffee). We present crowdsourcing data and modelling that supports broadening this conception. In particular, our results suggest that sentences where many participants do not give a paraphrase, or where many different paraphrases are given, are informative about how complement coercion is interpreted in naturalistic contexts.

