Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

228 Full-Text Articles 340 Authors 192,439 Downloads 63 Institutions

All Articles in Computational Linguistics

Covert Determiners In Appalachian English Narrative Declarative Sentences, William Oliver 2022 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

In this thesis, I explore the syntax and semantics of covert determiners (Ds) in matrix subject determiner phrases (DPs) with definite specific interpretations. To conduct my investigation, I used the Audio-Aligned and Parsed Corpus of Appalachian English (AAPCAppE), a million-word Penn Treebank corpus, and the software CorpusSearch, a Java program that searches Penn Treebank corpora. My research shows that Appalachian English contains a linguistic phenomenon where speakers drop the D, replacing overt Ds with covert Ds, in definite specific DPs. For example, where Standard English speakers say The doctor came by horseback, Appalachian speakers may use a covert D …


Integrating Cultural Knowledge Into Artificially Intelligent Systems: Human Experiments And Computational Implementations, Anurag Acharya 2022 Florida International University

FIU Electronic Theses and Dissertations

With the advancement of Artificial Intelligence, it seems as if every aspect of our lives is impacted by AI in one way or the other. As AI is used for everything from driving vehicles to criminal justice, it becomes crucial that it overcome any biases that might hinder its fair application. We are constantly trying to make AI be more like humans. But most AI systems so far fail to address one of the main aspects of humanity: our culture and the differences between cultures. We cannot truly consider AI to have understood human reasoning without understanding culture. So it …


“I Can See The Forest For The Trees”: Examining Personality Traits With Transformers, Alexander Moore 2022 Clemson University

All Dissertations

Our understanding of Personality and its structure is rooted in linguistic studies operating under the assumptions made by the Lexical Hypothesis: personality characteristics that are important to a group of people will at some point be codified in their language, with the number of encoded representations of a personality characteristic indicating their importance. Qualitative and quantitative efforts in the dimension reduction of our lexicon throughout the mid-20th century have played a vital role in the field’s eventual arrival at the widely accepted Five Factor Model (FFM). However, there are a number of presently unresolved conflicts regarding the breadth and …


Metaphor Detection In Poems In Misurata Arabic Sub-Dialect: An LSTM Model, Azza Abugharsa 2022 Montclair State University

Theses, Dissertations and Culminating Projects

Natural Language Processing (NLP) in Arabic is witnessing an increasing interest in investigating different topics in the field. One of the topics that have drawn attention is the automatic processing of Arabic figurative language. The focus in previous projects is on detecting and interpreting metaphors in comments from social media as well as phrases and/or headlines from news articles. The current project focuses on metaphor detection in poems written in the Misurata Arabic sub-dialect spoken in Misurata, located in the North African region. The dataset is initially annotated by a group of linguists, and their annotation is treated as the …


Toward Suicidal Ideation Detection With Lexical Network Features And Machine Learning, Ulya Bayram, William Lee, Daniel Santel, Ali Minai, Peggy Clark, Tracy Glauser, John Pestian 2022 Çanakkale Onsekiz Mart University

Northeast Journal of Complex Systems (NEJCS)

In this study, we introduce a new network feature for detecting suicidal ideation from clinical texts and conduct various additional experiments to enrich the state of knowledge. We evaluate statistical features with and without stopwords, use lexical networks for feature extraction and classification, and compare the results with standard machine learning methods using a logistic classifier, a neural network, and a deep learning method. We utilize three text collections. The first two contain transcriptions of interviews conducted by experts with suicidal (n=161 patients that experienced severe ideation) and control subjects (n=153). The third collection consists of interviews conducted by experts …


Prácticas Comunicativas Digitales Y Construcción De Subjetividades: El Uso Del Podcast En La Escuela, María Isabel Guevara Rodríguez 2022 Universidad de La Salle, Bogotá

Doctorado en Educación y Sociedad

No abstract provided.


Exploring The Personality Of Virtual Tutors In Conversational Foreign Language Practice, Johanna Dobbriner, Cathy Ennis, Robert J. Ross 2021 Technological University Dublin

Conference papers

Fluid interaction between virtual agents and humans requires the understanding of many issues of conversational pragmatics. One such issue is the interaction between communication strategy and personality. As a step towards developing models of personality-driven pragmatics policies, in this paper, we present our initial experiment to explore differences in user interaction with two contrasting avatar personalities. Each user saw a single personality in a video-call setting and gave feedback on the interaction. Our expectations, that a more extroverted, outgoing, positive personality would be a more successful tutor, were only partially confirmed. While this personality did induce longer conversations in …


Label Imputation For Homograph Disambiguation: Theoretical And Practical Approaches, Jennifer M. Seale 2021 City University of New York (CUNY)

Dissertations, Theses, and Capstone Projects

This dissertation presents the first implementation of label imputation for the task of homograph disambiguation using 1) transcribed audio, and 2) parallel, or translated, corpora. For label imputation from parallel corpora, a hypothesis of interlingual alignment between homograph pronunciations and text word forms is developed and formalized. Both audio and parallel corpora label imputation techniques are tested empirically in experiments that compare homograph disambiguation model performance using: 1) hand-labeled training data, and 2) hand-labeled training data augmented with label-imputed data. Regularized, multinomial logistic regression and pre-trained ALBERT, BERT, and XLNet language models fine-tuned as token classifiers are developed for homograph …


Detection And Morphological Analysis Of Novel Russian Loanwords, Yulia Spektor 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

This paper investigates recent English loanwords in Russian and explores ways in which computational methods can help further theoretical research. The goal of the study is two-fold: to find new, previously unattested loanwords borrowed over the last decade and to examine the rate of adaptation of the new borrowings, attested by the degree to which they conform to the constraints of the Russian language. First, we train a finite-state pipeline that combines character n-gram language models, which encode phonotactic and lexical properties of loanwords, with a binary classifier to detect loanwords. The model achieves state-of-the-art performance results during evaluation, surpassing …


From An Art To A Science: Features And Methodology In Computational Authorship Identification, Jonathan I. Manczur 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Nearly thirty years ago, the United States Supreme Court reevaluated the criteria for accepting forensic science and expert testimony, challenging Forensic Linguistics to assert itself as a reputable science. Much work has been produced in the interim to that end, but much still needs to be accomplished to satisfy the judicial standards. Computational linguistics has the potential to provide that necessary analytical framework. This paper’s intent is two-fold. First, there are two competing theories on the proper features necessary to identify an unknown author. Four features were drawn from the syntactic computational linguistics tradition and four from computational stylometry to …


Learning Phonology With Sequence-To-Sequence Neural Networks, Brandon Prickett 2021 University of Massachusetts Amherst

Doctoral Dissertations

This dissertation tests sequence-to-sequence neural networks to see whether they can simulate human phonological learning and generalization in a number of artificial language experiments. These experiments and simulations are organized into three chapters: one on opaque interactions, one on computational complexity in phonology, and one on reduplication. The first chapter focuses on two biases involving interactions that have been proposed in the past: a bias for transparent patterns and a bias for patterns that maximally utilize all of the processes in a language. The second chapter looks at harmony patterns of varying complexity to see whether both Formal Language Theory …


The Public Innovations Explorer: A Geo-Spatial & Linked-Data Visualization Platform For Publicly Funded Innovation Research In The United States, Seth Schimmel 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

The Public Innovations Explorer (https://sethsch.github.io/innovations-explorer/app/index.html) is a web-based tool created using Node.js, D3.js and Leaflet.js that can be used for investigating awards made by Federal agencies and departments participating in the Small Business Innovation Research (SBIR) and Small Business Technology Transfer (STTR) grant-making programs between 2008 and 2018. By geocoding the publicly available grants data from SBIR.gov, the Public Innovations Explorer allows users to identify companies performing publicly-funded innovative research in each congressional district and obtain dynamic district-level summaries of funding activity by agency and year. Applying spatial clustering techniques on districts' employment levels across major economic sectors provides users …


Predicting Stock Price Movements Using Sentiment And Subjectivity Analyses, Andrew Kirby 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

In a quick search online, one can find many tools that use information from news headlines to make predictions concerning the trajectory of a given stock. But what if we went further, looking instead into the text of the article, to extract this and other information? Here, the goal is to extract the sentence in which a stock ticker symbol is mentioned from a news article, then determine sentiment and subjectivity values from that sentence, and finally make a prediction on whether the value of that stock will go up in a 24-hour timespan. Bloomberg News …


PLPrepare: A Grammar Checker For Challenging Cases, Jacob Hoyos 2021 East Tennessee State University

Electronic Theses and Dissertations

This study investigates one of the Polish language’s most arbitrary cases: the genitive masculine inanimate singular. It collects and ranks several guidelines to help language learners discern its proper usage and also introduces a framework to provide detailed feedback regarding arbitrary cases. The study tests this framework by implementing and evaluating a hybrid grammar checker called PLPrepare. PLPrepare performs similarly to other grammar checkers and is able to detect genitive case usages and provide feedback based on a number of error classifications.


Introducing Señal, A Computational Tool For The Linguistic Analysis Of Spanish L2 Compositions, Falcon Restrepo-Ramos 2021 Minnesota State University, Mankato

World Languages & Cultures Department Publications

SEÑAL is a modular program that automates and facilitates the assessment of Spanish L2 compositions. The tool can extract syntactic and lexical information while also assessing grammar and Spanish L2 proficiency. A computational tool that can process written learners’ corpora and extract measures of language development has enormous practical value in Spanish and modern language departments alike.


Shifting The Perspectival Landscape: Methods For Encoding, Identifying, And Selecting Perspectives, Carolyn Jane Anderson 2021 University of Massachusetts Amherst

Doctoral Dissertations

This dissertation explores the semantics and pragmatics of perspectival expressions. Perspective, or point-of-view, encompasses an individual’s thoughts, perceptions, and location. Many expressions in natural language have components of their meanings that shift depending on whose perspective they are evaluated against. In this dissertation, I explore two sets of questions relating to perspective sensitivity. The first set of questions relates to how perspective is encoded in the semantics of perspectival expressions. The second set of questions relates to how conversation participants treat perspectival expressions: the speaker’s selection of a perspective and the listener’s identification of the speaker’s perspective. In Part I, …


An Interactive Visual Database For American Sign Language Reveals How Signs Are Organized In The Mind, Zed Sevcikova Sehyr, Ariel Goldberg, Karen Emmorey, Naomi Caselli 2021 Chapman University

Communication Sciences and Disorders Faculty Articles and Research

"We are four researchers who study psycholinguistics, linguistics, neuroscience and deaf education. Our team of deaf and hearing scientists worked with a group of software engineers to create the ASL-LEX database that anyone can use for free. We cataloged information on nearly 3,000 signs and built a visual, searchable and interactive database that allows scientists and linguists to work with ASL in entirely new ways."


Otrouha: A Corpus Of Arabic ETDs And A Framework For Automatic Subject Classification, Eman Abdelrahman, Fatimah Alotaibi, Edward A. Fox, Osman Balci 2021 Virginia Tech, Blacksburg

The Journal of Electronic Theses and Dissertations

Although the Arabic language is spoken by more than 300 million people and is one of the six official languages of the United Nations (UN), there has been less research done on Arabic text data (compared to English) in the realm of machine learning, especially in text classification. In the past decade, Arabic data such as news, tweets, etc. have begun to receive some attention. Although automatic text classification plays an important role in improving the browsability and accessibility of data, Electronic Theses and Dissertations (ETDs) have not received their fair share of attention, in spite of the huge number …


The ASL-LEX 2.0 Project: A Database Of Lexical And Phonological Properties For 2,723 Signs In American Sign Language, Zed Sevcikova Sehyr, Naomi Caselli, Ariel M. Cohen-Goldberg, Karen Emmorey 2021 Chapman University

Communication Sciences and Disorders Faculty Articles and Research

ASL-LEX is a publicly available, large-scale lexical database for American Sign Language (ASL). We report on the expanded database (ASL-LEX 2.0) that contains 2,723 ASL signs. For each sign, ASL-LEX now includes a more detailed phonological description, phonological density and complexity measures, frequency ratings (from deaf signers), iconicity ratings (from hearing non-signers and deaf signers), transparency (“guessability”) ratings (from non-signers), sign and videoclip durations, lexical class, and more. We document the steps used to create ASL-LEX 2.0 and describe the distributional characteristics for sign properties across the lexicon and examine the relationships among lexical and phonological properties of signs. Correlation …


A Computational Study In The Detection Of English–Spanish Code-Switches, Yohamy C. Polanco 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Code-switching is the linguistic phenomenon where a multilingual person alternates between two or more languages in a conversation, whether that be spoken or written. This thesis studies the automatic detection of code-switching occurring specifically between English and Spanish in two corpora.

Twitter and other social media sites have provided an abundance of linguistic data that is available to researchers to perform countless experiments. Collecting the data is fairly easy if a study is on monolingual text, but if a study requires code-switched data, this becomes a complication as APIs only accept one language as a parameter. This thesis focuses on …

