Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Articles 1 - 6 of 6

Full-Text Articles in Computational Linguistics

Predicting Stress In Russian Using Modern Machine-Learning Tools, John Schriner Sep 2022

Dissertations, Theses, and Capstone Projects

In the Russian language, stress on a word is determined via often complex patterns and rules. In this paper, after examining nearly a century of research in stress rules and methods in Russian, we turn to see if modern machine learning tools can aid in predicting stress. Using A.A. Zaliznyak’s dictionary grammar and over 300,000 word forms, we derived stress codes to aid in predicting which syllable primary stress falls on. We trained an LSTM neural network on the data and conducted eight experiments with added features such as lemma, part of speech, and morphology. While the model performed better …


Towards Explaining Variation In Entrainment, Andreas Weise Sep 2022

Dissertations, Theses, and Capstone Projects

Entrainment refers to the tendency of human speakers to adapt to their interlocutors to become more similar to them. This affects various dimensions and occurs in many contexts, allowing for rich applications in human-computer interaction. However, it is not exhibited by every speaker in every conversation but varies widely across features, speakers, and contexts, hindering broad application. This variation, whose guiding principles are poorly understood even after decades of entrainment research, is the subject of this thesis. We begin with a comprehensive literature review that serves as the foundation of our own work and provides a reference to guide future …


From Sesame Street To Beyond: Multi-Domain Discourse Relation Classification With Pretrained BERT, Isaac R. Raff Sep 2022

Dissertations, Theses, and Capstone Projects

Research in transfer learning has gained massive popularity in recent years. Pretrained language models have produced the most successful results, yielding high-quality neural networks capable of reliable inference across domains via transfer learning. This study expands on the domain transfer introduced by Ferracane et al. (2019), who explore neural methods for transfer learning of discourse parsing between a news source domain and a medical target domain, specifically from news articles to PubMed medical journal articles. Experiments in transfer learning in the current work expand to include three domains: Wall Street Journal articles previously annotated with …


Linguistic Abstractions In Children’s Very Early Utterances, Qihui Xu Sep 2022

Dissertations, Theses, and Capstone Projects

How early do children produce multiword utterances? Do children's early utterances reflect abstract syntactic knowledge or are they the result of data-driven learning? We examine this issue through corpus analysis, computational modeling, and adult simulation experiments. Chapter 1 investigates when children start producing multiword utterances; we use corpora to establish the development of multiword utterances and a probabilistic computational model to account for the quantitative change of early multiword utterances. We find that multiword utterances of different lengths appear early in acquisition and increase together, and the length growth pattern can be viewed as a probabilistic and dynamic process.

Chapter …


A Machine Learning Approach To Text-Based Sarcasm Detection, Lara I. Novic Jun 2022

Dissertations, Theses, and Capstone Projects

Sarcasm and indirect language are commonplace for humans to produce and recognize but difficult for machines to detect. While artificial intelligence can accurately analyze sentiment and emotion in speech and text, it may struggle with insincere and sardonic content. It is nonetheless possible to train a machine to identify spoken and written sarcasm. This paper aims to detect sarcasm using logistic regression and a support vector machine (SVM) and to compare their results to a baseline.

The models are trained on a Kaggle dataset containing headlines from the satirical news website The Onion and the serious news website HuffPost (formerly …


Covert Determiners In Appalachian English Narrative Declarative Sentences, William Oliver Jun 2022

Dissertations, Theses, and Capstone Projects

In this thesis, I explore the syntax and semantics of covert determiners (Ds) in matrix subject determiner phrases (DPs) with definite specific interpretations. To conduct my investigation, I used the Audio-Aligned and Parsed Corpus of Appalachian English (AAPCAppE), a million-word Penn Treebank corpus, and the software CorpusSearch, a Java program that searches Penn Treebank corpora. My research shows that Appalachian English contains a linguistic phenomenon where speakers drop the D, replacing overt Ds with covert Ds, in definite specific DPs. For example, where Standard English speakers say The doctor came by horseback, Appalachian speakers may use a covert D …