Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

228 Full-Text Articles | 340 Authors | 191,144 Downloads | 63 Institutions

All Articles in Computational Linguistics

Skyler's Lunch, Noah Sherman, Autumn Boone, Hilaria Cruz 2024 University of Louisville

LING 590/Internet Language

Our class was studying the use of emojis across different platforms and wanted to explore how stories using emojis could impact young readers. Here, we try to translate the story of Skyler into emoji, providing translations along the way. We replace words completely with emoji, represent phrases with a few emoji, and use additional emoji to make sense of the content, including punctuation. In this book, we explore the character of Skyler, a picky eater who learns to eat the nutritious food that is good for them. In the end, they even get a reward!


Retórica Intercultural En El Discurso Académico Universitario: Las Funciones Retóricas De La Citación En Los Trabajos De Fin De Máster Escritos En Español Y En Inglés Por Hablantes Nativos Y No Nativos, David Sanchez-Jimenez 2024 City University of New York (CUNY)

Publications and Research

This research stems from an interest in cultural differences in citation practices in the master's thesis genre among native Spanish writers (Ee), non-native Filipino writers of Spanish (Fe), native Filipino writers of English (Fi), and American writers of English. A total of thirty-two (32) master's theses, eight (8) for each group, were analyzed. A quantitative and qualitative methodology was used to study this phenomenon, based on computerized textual analysis of the rhetorical functions of citations arranged in a typological classification that modified the outline proposed by Petrić (2007). The results obtained from …


How Do We Learn What We Cannot Say?, Daniel Yakubov 2024 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

The contributions of this thesis are two-fold. First, this thesis presents UDTube, an easily usable software package developed to perform morphological analysis in a multi-task fashion. This work shows the strong performance of UDTube versus the current state-of-the-art, UDPipe, across eight languages, primarily in the annotation of morphological features. The second contribution of this thesis is an exploration into the study of defectivity. UDTube is used to annotate a large amount of data in Greek and Russian, which is ultimately used to investigate the plausibility of Indirect Negative Evidence (INE), a popular approach to the acquisition of morphological defectivity. The reported …


Consonant (De)Gradation In Ingrian?, Andrea M. Harrison 2024 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

This paper will present a dual method toward data enrichment for low-resource languages. Using Yoyodyne -- a Fairseq-inspired neural library for small-vocabulary sequence-to-sequence generation -- a morphological generation task was tested across labeled data encompassing multiple stages of enrichment for the low-resource language Ingrian. Due to limitations in the available data for Ingrian, weighted finite-state transducers (WFSTs) were used to generate an expanded vocabulary via HFST's toolkit for Uralic languages, and GiellaLT, a source for FST-driven lexica for low-resource languages. Further stages of experimentation used labeled data from related, higher-resource languages (Finnish, Estonian) to encourage cross-lingual transfer in the interest …


The Ring Cycle: Journeying Through The Language Of Tolkien’s Third Age With Corpus Linguistics, Michael Livesey 2024 University of Sheffield

Journal of Tolkien Research

This article explores the journey taken by the One Ring across J.R.R. Tolkien’s Third Age writings. It employs a digital humanities approach to analyse linguistic patterns in Tolkien’s use of the word ring, across The Hobbit and The Lord of the Rings. Specifically, the article employs corpus linguistic methods to track shifts in the quantities and qualities of the Ring’s appearance across these texts. It uses techniques of keyness and collocation analysis to trace transformations in these quantities/qualities, including: a) the Ring’s transition from a central to a peripheral place in the Third Age’s narrative arc; and b) …


Guilty Machines: On Ab-Sens In The Age Of AI, Dylan Lackey, Katherine Weinschenk 2023 Virginia Commonwealth University

Critical Humanities

For Lacan, guilt arises in the sublimation of ab-sens (non-sense) into the symbolic comprehension of sen-absexe (sense without sex, sense in the deficiency of sexual relation), or in the maturation of language to sensibility through the effacement of sex. Though, as Slavoj Žižek himself points out in a recent article regarding ChatGPT, the split subject always misapprehends the true reason for guilt’s manifestation, such guilt at best provides a sort of evidence for the inclusion of the subject in the order of language, acting as a necessary, even enjoyable mark of the subject’s coherence (or, more importantly, the subject’s separation …


The Near-Synonymous Classifiers In Mandarin Chinese: Etymology, Modern Usage, And Possible Problems In L2 Classroom, Irina Kavokina 2023 University of Massachusetts Amherst

Masters Theses

Many Chinese classifiers are nearly synonymous: they can be used with the same head nouns without changing the meaning of the sentence. In other words, such classifiers can be used interchangeably or almost interchangeably. This poses a challenge for Chinese language learners, especially those whose native language lacks such a grammatical category. Another complication arises from the ambiguous English translations of many classifiers.

In this paper we investigate the collocation behavior of near-synonymous Chinese classifiers, focusing on their semantic nuances and interchangeability. Analyzing 6 pairs of classifiers — 栋 and 幢, 匹 and 头, 批 and …


Executive Order On The Safe, Secure, And Trustworthy Development And Use Of Artificial Intelligence, Joseph R. Biden 2023 United States Office of the President

Copyright, Fair Use, Scholarly Communication, etc.

Section 1. Purpose. Artificial intelligence (AI) holds extraordinary potential for both promise and peril. Responsible AI use has the potential to help solve urgent challenges while making our world more prosperous, productive, innovative, and secure. At the same time, irresponsible use could exacerbate societal harms such as fraud, discrimination, bias, and disinformation; displace and disempower workers; stifle competition; and pose risks to national security. Harnessing AI for good and realizing its myriad benefits requires mitigating its substantial risks. This endeavor demands a society-wide effort that includes government, the private sector, academia, and civil society.

My Administration places the highest urgency …


Towards Interpretable Machine Reading Comprehension With Mixed Effects Regression And Exploratory Prompt Analysis, Luca Del Signore 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

We investigate the properties of natural language prompts that determine their difficulty in machine reading comprehension tasks. While much work has been done benchmarking language model performance at the task level, there is considerably less literature focused on how individual task items can contribute to interpretable evaluations of natural language understanding. Such work is essential to deepening our understanding of language models and ensuring their responsible use as a key tool in human-machine communication. We perform an in-depth mixed effects analysis on the behavior of three major generative language models, comparing their performance on a large reading comprehension …


A Computational Analysis Of Volodymyr Zelenskyy's Public Diplomacy Discourse In Times Of Crisis, Amber Brittain-Hale 2023 Pepperdine University

Education Division Scholarship

In this study, we delve into the public diplomacy discourse of Ukrainian President Volodymyr Zelenskyy during the ongoing crisis of the Russo-Ukrainian War. We aim to conduct a computational analysis of Zelenskyy's English, Russian, and Ukrainian speeches, exploring the linguistic patterns and code-switching employed in his discourse. The study period encompasses Russia’s build-up to and full-scale invasion of Ukraine from May 2019 to May 30, 2023. This time frame is crucial as it captures the dynamic development of the crisis and the expansion of Zelenskyy's presidency, providing a unique context for analyzing his public diplomacy efforts. By utilizing Linguistic Inquiry …


Destined Failure, Chengjun Pan 2023 Rhode Island School of Design

Masters Theses

I attempt to examine the complex structure of human communication, explaining why it is bound to fail. By reproducing experienceable phenomena, I demonstrate how they can expose communication structure and reveal the limitations of our perception and symbolization. I divide the process of communication into six stages: input, detection, symbolization, dictionary, interpretation, and output. In this thesis, I examine the flaws and challenges that arise in the first five stages. I argue that reception acts as a filter and that understanding relies on a symbolic system that is full of redundancies. Therefore, every interpretation is destined to be a deviation.


The Sociolinguistics Of Code-Switching In Hong Kong’s Digital Landscape: A Mixed-Methods Exploration Of Cantonese-English Alternation Patterns On WhatsApp, Wilkinson Daniel Wong Gonzales, Yuen Man Tsang 2023 The Chinese University of Hong Kong

Journal of English and Applied Linguistics

This paper examines the prevalence of Cantonese-English code-mixing in Hong Kong through an under-researched digital medium. Prior research on this code-alternation practice has often been limited to exploring either the social or linguistic constraints of code-switching in spoken or written communication. Our study takes a holistic approach to analyzing code-switching in a hybrid medium that exhibits features of both spoken and written discourse. We specifically analyze the code-switching patterns of 24 undergraduates from a Hong Kong university on WhatsApp and examine how both social and linguistic factors potentially constrain these patterns. Utilizing a self-compiled sociolinguistic corpus as well as survey …


Topics For He But Not For She: Quantifying And Classifying Gender Bias In The Media, Tyler J. Lanni 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

In this study, we used computational techniques to analyze the language used in news articles to describe female and male politicians. Our corpus included 370 subtexts for male candidates and 374 subtexts for female candidates, gathered through the New York Times API. We conducted two experiments: an LDA topic analysis to explore the data, and a logistic regression to classify the subtexts as either male or female. Our analysis revealed some noteworthy findings that suggest the possibility of developing a gender bias classifier in the future. However, to create a more robust understanding of bias, additional research and data are …


Evaluating Neural Networks As Cognitive Models For Learning Quasi-Regularities In Language, Xiaomeng Ma 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Many aspects of language can be categorized as quasi-regular: the relationship between the inputs and outputs is systematic but allows many exceptions. Common domains that contain quasi-regularity include morphological inflection and grapheme-phoneme mapping. How humans process quasi-regularity has been debated for decades. This thesis implemented modern neural network models, transformer models, on two tasks: English past tense inflection and Chinese character naming, to investigate how transformer models perform quasi-regularity tasks. This thesis focuses on investigating to what extent the models' performances can represent human behavior. The results show that the transformers' performance is very similar to human behavior in many …


Neural Network Vs. Rule-Based G2P: A Hybrid Approach To Stress Prediction And Related Vowel Reduction In Bulgarian, Maria Karamihaylova 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

An effective grapheme-to-phoneme (G2P) conversion system is a critical element of speech synthesis. Rule-based systems were an early method for G2P conversion. In recent years, machine learning tools have been shown to outperform rule-based approaches in G2P tasks. We investigate neural network sequence-to-sequence modeling for the prediction of syllable stress and resulting vowel reductions in the Bulgarian language. We then develop a hybrid G2P approach which combines manually written grapheme-to-phoneme mapping rules with neural network-enabled syllable stress predictions by inserting stress markers in the predicted stress position of the transcription produced by the rule-based finite-state transducer. Finally, we apply vowel …


AI Approaches To Understand Human Deceptions, Perceptions, And Perspectives In Social Media, Chih-Yuan Li 2023 New Jersey Institute of Technology

Dissertations

Social media platforms have created a virtual space for sharing user-generated information and for connecting and interacting among users. However, there are research and societal challenges: 1) users generate and share disinformation; 2) it is difficult to understand citizens' perceptions or opinions expressed on a wide variety of topics; and 3) information overload and echo chamber problems arise without an overall understanding of the different perspectives taken by different people or groups.

This dissertation addresses these three research challenges with advanced AI and Machine Learning approaches. To address fake news, as deception about the facts, this dissertation presents Machine …


Predicting High-Cap Tech Stock Polarity: A Combined Approach Using Support Vector Machines And Bidirectional Encoders From Transformers, Ian L. Grisham 2023 East Tennessee State University

Electronic Theses and Dissertations

The abundance, accessibility, and scale of data have engendered an era where machine learning can quickly and accurately solve complex problems, identify complicated patterns, and uncover intricate trends. One research area where many have applied these techniques is the stock market. Yet, financial domains are influenced by many factors and are notoriously difficult to predict due to their volatile and multivariate behavior. However, the literature indicates that public sentiment data may exhibit significant predictive qualities and improve a model’s ability to predict intricate trends. In this study, momentum SVM classification accuracy was compared between datasets that did and did not …


Improving Sign Recognition With Phonology, Lee Kezar, Jesse Thomason, Zed Sevcikova Sehyr 2023 University of Southern California

Communication Sciences and Disorders Faculty Articles and Research

We use insights from research on American Sign Language (ASL) phonology to train models for isolated sign language recognition (ISLR), a step towards automatic sign language understanding. Our key insight is to explicitly recognize the role of phonology in sign production to achieve more accurate ISLR than existing work which does not consider sign language phonology. We train ISLR models that take in pose estimations of a signer producing a single sign to predict not only the sign but additionally its phonological characteristics, such as the handshape. These auxiliary predictions lead to a nearly 9% absolute gain in sign recognition …


Content-Based Unsupervised Fake News Detection On Ukraine-Russia War, Yucheol Shin, Yvan Sojdehei, Limin Zheng, Brad Blanchard 2023 Southern Methodist University

SMU Data Science Review

The Ukrainian-Russian war has garnered significant attention worldwide, with fake news obstructing the formation of public opinion and disseminating false information. This scholarly paper explores the use of unsupervised learning methods and the Bidirectional Encoder Representations from Transformers (BERT) to detect fake news in news articles from various sources. BERT topic modeling is applied to cluster news articles by their respective topics, followed by summarization to measure the similarity scores. The hypothesis posits that topics with larger variances are more likely to contain fake news. The proposed method was evaluated using a dataset of approximately 1000 labeled news articles related …


Single-Case Pilot Study For Longitudinal Analysis Of Referential Failures And Sentiment In Schizophrenic Speech From Client-Centered Psychotherapy Recordings, Travis A. Musich 2023 National Louis University

Dissertations

Though computational linguistic analyses have revealed the presence of distinctly characteristic language features in schizophrenic disordered speech, the relative stability of these language features in longitudinal samples is still unknown. This longitudinal pilot study analyzed schizophrenic disordered speech data from the archival therapy audio recordings of one patient spanning 23 years. End-to-end Neural Coreference Resolution software was used to analyze transcribed speech data from three therapy sessions to identify ambiguous pronouns, referred to as referential failures, which were reviewed and confirmed by multiple raters. Speech samples were analyzed using Google Cloud Natural Language API software for sentiment variables (i.e., score, …

