Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Articles 1 - 30 of 76

Full-Text Articles in Computational Linguistics

Guilty Machines: On Ab-Sens In The Age Of Ai, Dylan Lackey, Katherine Weinschenk Dec 2023

Critical Humanities

For Lacan, guilt arises in the sublimation of ab-sens (non-sense) into the symbolic comprehension of sen-absexe (sense without sex, sense in the deficiency of sexual relation), or in the maturation of language to sensibility through the effacement of sex. Though, as Slavoj Žižek himself points out in a recent article regarding ChatGPT, the split subject always misapprehends the true reason for guilt’s manifestation, such guilt at best provides a sort of evidence for the inclusion of the subject in the order of language, acting as a necessary, even enjoyable mark of the subject’s coherence (or, more importantly, the subject’s separation …


Executive Order On The Safe, Secure, And Trustworthy Development And Use Of Artificial Intelligence, Joseph R. Biden Oct 2023

Copyright, Fair Use, Scholarly Communication, etc.

Section 1. Purpose. Artificial intelligence (AI) holds extraordinary potential for both promise and peril. Responsible AI use has the potential to help solve urgent challenges while making our world more prosperous, productive, innovative, and secure. At the same time, irresponsible use could exacerbate societal harms such as fraud, discrimination, bias, and disinformation; displace and disempower workers; stifle competition; and pose risks to national security. Harnessing AI for good and realizing its myriad benefits requires mitigating its substantial risks. This endeavor demands a society-wide effort that includes government, the private sector, academia, and civil society.

My Administration places the highest urgency …


Evaluating Neural Networks As Cognitive Models For Learning Quasi-Regularities In Language, Xiaomeng Ma Jun 2023

Dissertations, Theses, and Capstone Projects

Many aspects of language can be categorized as quasi-regular: the relationship between the inputs and outputs is systematic but allows many exceptions. Common domains that contain quasi-regularity include morphological inflection and grapheme-phoneme mapping. How humans process quasi-regularity has been debated for decades. This thesis implemented modern neural network models, transformer models, on two tasks: English past tense inflection and Chinese character naming, to investigate how transformer models perform quasi-regularity tasks. This thesis focuses on investigating to what extent the models' performances can represent human behavior. The results show that the transformers' performance is very similar to human behavior in many …


Predicting High-Cap Tech Stock Polarity: A Combined Approach Using Support Vector Machines And Bidirectional Encoders From Transformers, Ian L. Grisham May 2023

Electronic Theses and Dissertations

The abundance, accessibility, and scale of data have engendered an era where machine learning can quickly and accurately solve complex problems, identify complicated patterns, and uncover intricate trends. One research area where many have applied these techniques is the stock market. Yet, financial domains are influenced by many factors and are notoriously difficult to predict due to their volatile and multivariate behavior. However, the literature indicates that public sentiment data may exhibit significant predictive qualities and improve a model’s ability to predict intricate trends. In this study, momentum SVM classification accuracy was compared between datasets that did and did not …


Chatgpt As Metamorphosis Designer For The Future Of Artificial Intelligence (Ai): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Abstract

Purpose: This research paper explores ChatGPT’s potential as an innovative design tool for the future development of artificial intelligence. Specifically, this conceptual investigation analyzes ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). The paper also analyzes the strengths and weaknesses of ChatGPT as a tool and identifies possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Simulating The Machine Translation Of Low-Resource Languages By Designing A Translator Between English And An Artificially Constructed Language, Michaela Snyder Jan 2023

Mahurin Honors College Capstone Experience/Thesis Projects

Natural language processing (NLP), or the use of computers to analyze natural language, is a field that relies heavily on syntax. It would seem intuitive that computers would thrive in this area due to their strict syntax requirements, but the syntax of natural languages leaves them unable to properly parse and generate sentences that seem normal to the average speaker. A subfield of NLP, machine translation, works mainly to computerize translation between different languages. Unfortunately, such translation is not without its weaknesses; language documentation is not created equal, and many low-resource languages—languages with relatively few kinds of documentation, most often …


Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (Craml), Stephen Meisenbacher, Peter Norlander Dec 2022

School of Business: Faculty Publications and Other Works

Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …


Towards Explaining Variation In Entrainment, Andreas Weise Sep 2022

Dissertations, Theses, and Capstone Projects

Entrainment refers to the tendency of human speakers to adapt to their interlocutors to become more similar to them. This affects various dimensions and occurs in many contexts, allowing for rich applications in human-computer interaction. However, it is not exhibited by every speaker in every conversation but varies widely across features, speakers, and contexts, hindering broad application. This variation, whose guiding principles are poorly understood even after decades of entrainment research, is the subject of this thesis. We begin with a comprehensive literature review that serves as the foundation of our own work and provides a reference to guide future …


Corrective Feedback Timing In Kanji Writing Instruction Apps, Phoenix Mulgrew Jun 2022

Honors Theses

The focus of this research paper is to determine the correct time to provide corrective feedback to people who are learning how to write Japanese kanji. To do this, we developed a system that is able to recognize Japanese kanji that is handwritten onto an iPad screen and check for errors such as wrong stroke order. Previous research has achieved success in developing similar systems, but this project is unique because the research question involves the timing of corrective feedback. In particular, we are looking at whether immediate or delayed corrective feedback results in better learning.


Integrating Cultural Knowledge Into Artificially Intelligent Systems: Human Experiments And Computational Implementations, Anurag Acharya May 2022

FIU Electronic Theses and Dissertations

With the advancement of Artificial Intelligence, it seems as if every aspect of our lives is impacted by AI in one way or the other. As AI is used for everything from driving vehicles to criminal justice, it becomes crucial that it overcome any biases that might hinder its fair application. We are constantly trying to make AI be more like humans. But most AI systems so far fail to address one of the main aspects of humanity: our culture and the differences between cultures. We cannot truly consider AI to have understood human reasoning without understanding culture. So it …


Toward Suicidal Ideation Detection With Lexical Network Features And Machine Learning, Ulya Bayram, William Lee, Daniel Santel, Ali Minai, Peggy Clark, Tracy Glauser, John Pestian Apr 2022

Northeast Journal of Complex Systems (NEJCS)

In this study, we introduce a new network feature for detecting suicidal ideation from clinical texts and conduct various additional experiments to enrich the state of knowledge. We evaluate statistical features with and without stopwords, use lexical networks for feature extraction and classification, and compare the results with standard machine learning methods using a logistic classifier, a neural network, and a deep learning method. We utilize three text collections. The first two contain transcriptions of interviews conducted by experts with suicidal (n=161 patients that experienced severe ideation) and control subjects (n=153). The third collection consists of interviews conducted by experts …


Exploring The Personality Of Virtual Tutors In Conversational Foreign Language Practice, Johanna Dobbriner, Cathy Ennis, Robert J. Ross Sep 2021

Conference papers

Fluid interaction between virtual agents and humans requires the understanding of many issues of conversational pragmatics. One such issue is the interaction between communication strategy and personality. As a step towards developing models of personality driven pragmatics policies, in this paper, we present our initial experiment to explore differences in user interaction with two contrasting avatar personalities. Each user saw a single personality in a video-call setting and gave feedback on the interaction. Our expectations, that a more extroverted outgoing positive personality would be a more successful tutor, were only partially confirmed. While this personality did induce longer conversations in …


Plprepare: A Grammar Checker For Challenging Cases, Jacob Hoyos May 2021

Electronic Theses and Dissertations

This study investigates one of the Polish language’s most arbitrary cases: the genitive masculine inanimate singular. It collects and ranks several guidelines to help language learners discern its proper usage and also introduces a framework to provide detailed feedback regarding arbitrary cases. The study tests this framework by implementing and evaluating a hybrid grammar checker called PLPrepare. PLPrepare performs similarly to other grammar checkers and is able to detect genitive case usages and provide feedback based on a number of error classifications.


Lulling Waters: A Poetry Reading For Real-Time Music Generation Through Emotion Mapping, Ashley Muniz, Toshihisa Tsuruoka Jul 2020

Electronic Literature Organization Conference 2020

Through a poetic narrative, “Lulling Waters” tells the story of a whale overcoming the loss of his mother, who passed away from ingesting plastic, as he attempts to escape from the polluted oceanic world. The live performance of this poem utilizes a software system called Soundwriter, which was developed with the goal of enriching the oral storytelling experience through music. This video demonstrates how Soundwriter’s real-time hybrid system was able to analyze “Lulling Waters” through its lexical and auditory features. Emotionally salient words were given ratings based on arousal, valence, and dominance while the emotionally charged prosodic features of the …


Poetry For Seers Or The Peruvian Visual Poetic Tradition In Front Of New Media, Michael Hurtado, Pamela Medina, Enrique García, Michael Prado Jul 2020

Electronic Literature Organization Conference 2020

Since the first decades of the twentieth century, Peruvian poetic tradition has been characterized by experimental uses of language. Among these possibilities, some works strained this medium through its link with the plastic arts, as in the poetry of José María Eguren, while others opted to play with the spatiality and visuality of the blank page, as in the work of Carlos Oquendo de Amat. However, it is not until the appearance of the poetry of César Vallejo, specifically with the poems of Trilce in 1922, that these ruptures force us to …


Automatic Learning Of Document Section Structure For Ontology-Based Semantic Search, Deya Banisakher Jul 2020

FIU Electronic Theses and Dissertations

Modeling natural human behavior in understanding written language is crucial for developing true artificial intelligence. For people, words convey certain semantic concepts, while documents represent an abstract concept: they are collections of text organized in some logical structure, that is, sentences, paragraphs, sections, and so on. Similar to words, these document structures are used to convey a logical flow of semantic concepts. Machines, however, only view words as spans of characters and documents as mere collections of free text, missing any underlying meanings behind words and the logical structure of those documents.

Automatic semantic concept detection is the process by which the …


Automatic Keyphrase Extraction From Russian-Language Scholarly Papers In Computational Linguistics, Yves Wienecke Jul 2020

University Honors Theses

The automatic extraction of keyphrases from scholarly papers is a necessary step for many Natural Language Processing (NLP) tasks, including text retrieval, machine translation, and text summarization. However, due to the different grammatical and semantic intricacies of languages, this is a highly language-dependent task. Many free and open source implementations of state-of-the-art keyphrase extraction techniques exist, but they are not adapted for processing Russian text. Furthermore, the multi-linguistic character of scholarly papers in the field of Russian computational linguistics and NLP introduces additional complexity to keyphrase extraction. This paper describes a free and open source program as a proof of …


Empirical Analysis Of Cbow And Skip Gram Nlp Models, Tejas Menon Jul 2020

University Honors Theses

CBOW and Skip Gram are two NLP techniques to produce word embedding models that are accurate and performant. They were invented in the seminal paper by T. Mikolov et al. and have since seen optimizations such as negative sampling and subsampling. This paper implements a fully-optimized version of these models using PyTorch and runs them through a toy sentiment/subject analysis. It is weakly observed that different corpus types affect the skew of word embeddings such that fictional corpora are better suited for sentiment analysis and non-fictional corpora for subject analysis.
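To make the difference between the two techniques concrete, the following is a minimal sketch of how training examples are typically formed for each. It is not taken from the paper: the function names `skipgram_pairs` and `cbow_examples`, the window size, and the example sentence are all illustrative assumptions, and the optimizations the abstract mentions (negative sampling, subsampling) are left out.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram: predict each context word from the center word.

    Returns (center, context) pairs. Hypothetical helper for illustration.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW: predict the center word from its surrounding context words.

    Returns (context_words, center) examples. Hypothetical helper for illustration.
    """
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:  # skip tokens that have no neighbours at all
            examples.append((context, center))
    return examples


sentence = "the cat sat on the mat".split()
sg = skipgram_pairs(sentence, window=1)
cb = cbow_examples(sentence, window=1)
```

With a window of 1, Skip-gram yields one pair per neighbouring word, while CBOW yields one example per token with all neighbours grouped as the input.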


The Stained Glass Of Knowledge: On Understanding Novice Mental Models Of Computing, Briana Christina Bettin Jan 2020

Dissertations, Master's Theses and Master's Reports

Learning to program can be a novel experience. The rigidity of programming can be at odds with beginning programmers' existing perceptions, and the concepts can feel entirely unfamiliar. These observations motivated this research, which explores two major questions: What factors influence how novices learn programming? and How can analogy be more appropriately leveraged in programming education?

This dissertation investigates the factors influencing novice programming through multiple methods. The CS1 classroom is observed as a "whole system", with consideration to the factors present in it that can influence the learning process. Learning's cognitive processes are elaborated to ground exploration into specifically …


Do It Like A Syntactician: Using Binary Gramaticality Judgements To Train Sentence Encoders And Assess Their Sensitivity To Syntactic Structure, Pablo Gonzalez Martinez Sep 2019

Dissertations, Theses, and Capstone Projects

The binary nature of grammaticality judgments and their use to access the structure of syntax are a staple of modern linguistics. However, computational models of natural language rarely make use of grammaticality in their training or application. Furthermore, developments in modern neural NLP have produced a myriad of methods that push the baselines in many complex tasks, but those methods are typically not evaluated from a linguistic perspective. In this dissertation I use grammaticality judgements with artificially generated ungrammatical sentences to assess the performance of several neural encoders and propose them as a suitable training target to make models learn …


Synthetic, Yet Natural: Properties Of Wordnet Random Walk Corpora And The Impact Of Rare Words On Embedding Performance, Filip Klubicka, Alfredo Maldonado, Abhijit Mahalunkar, John D. Kelleher Jul 2019

Conference papers

Creating word embeddings that reflect semantic relationships encoded in lexical knowledge resources is an open challenge. One approach is to use a random walk over a knowledge graph to generate a pseudo-corpus and use this corpus to train embeddings. However, the effect of the shape of the knowledge graph on the generated pseudo-corpora, and on the resulting word embeddings, has not been studied. To explore this, we use English WordNet, constrained to the taxonomic (tree-like) portion of the graph, as a case study. We investigate the properties of the generated pseudo-corpora, and their impact on the resulting embeddings. We find …
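The random-walk idea described in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: `TOY_TAXONOMY` is a made-up six-node tree, not WordNet, and `random_walk` and `pseudo_corpus` are hypothetical helpers rather than the authors' implementation. Each walk becomes one "sentence" of the pseudo-corpus.

```python
import random
from typing import Dict, List

# Hypothetical toy taxonomy: each node lists its neighbours (hypernym and hyponyms).
TOY_TAXONOMY: Dict[str, List[str]] = {
    "entity": ["animal", "plant"],
    "animal": ["entity", "dog", "cat"],
    "plant": ["entity", "tree"],
    "dog": ["animal"],
    "cat": ["animal"],
    "tree": ["plant"],
}


def random_walk(graph: Dict[str, List[str]], start: str, length: int,
                rng: random.Random) -> List[str]:
    """Walk `length` nodes from `start`, choosing a uniform random neighbour each step."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(graph[walk[-1]]))
    return walk


def pseudo_corpus(graph: Dict[str, List[str]], n_sentences: int, length: int,
                  seed: int = 0) -> List[List[str]]:
    """Generate `n_sentences` walks, each starting at a random node."""
    rng = random.Random(seed)  # seeded so the corpus is reproducible
    nodes = sorted(graph)
    return [random_walk(graph, rng.choice(nodes), length, rng)
            for _ in range(n_sentences)]


corpus = pseudo_corpus(TOY_TAXONOMY, n_sentences=5, length=4)
```

The resulting list of token sequences could then be passed to any embedding trainer that accepts tokenized sentences. In a tree-shaped graph, leaf nodes have a single neighbour, so the shape of the graph directly affects how often each word appears.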


The Design And Implementation Of Aida: Ancient Inscription Database And Analytics System, M Parvez Rashid Jul 2019

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

AIDA, the Ancient Inscription Database and Analytic system, can be used to translate and analyze the ancient Minoan language. The AIDA system currently stores three types of ancient Minoan inscriptions: Linear A, Cretan Hieroglyph and Phaistos Disk inscriptions. In addition, AIDA provides candidate syllabic values and translations of Minoan words and inscriptions into English. The AIDA system allows users to change these candidate phonetic assignments to the Linear A, Cretan Hieroglyph and Phaistos symbols. Hence the AIDA system provides scholars not only with a convenient online resource to browse Minoan inscriptions but also with an analysis tool to explore …


Application Of Boolean Logic To Natural Language Complexity In Political Discourse, Austin Taing Jan 2019

Theses and Dissertations--Computer Science

Press releases serve as a major influence on public opinion of a politician, since they are a primary means of communicating with the public and directing discussion. Thus, the public’s ability to digest them is an important factor for politicians to consider. This study employs several well-studied measures of linguistic complexity and proposes a new one to examine whether politicians change their language to become more or less difficult to parse in different situations. This study uses 27,500 press releases from the US Senate from 2004 to 2008 and examines election cycles and natural disasters, namely hurricanes, as situations where politicians’ language …


Perception & Perspective: An Analysis Of Discourse And Situational Factors In Reference Frame Selection, Robert J. Ross, Kavita E. Thomas Jun 2018

Conference papers

To integrate perception into dialogue, it is necessary to bind spatial language descriptions to reference frame use. To this end, we present an analysis of discourse and situational factors that may influence reference frame choice in dialogues. We show that factors including spatial orientation, task, self and other alignment, and dyad have an influence on reference frame use. We further show that a computational model to estimate reference frame based on these features provides results greater than both random and greedy reference frame selection strategies.


Multimodal Depression Detection: An Investigation Of Features And Fusion Techniques For Automated Systems, Michelle Renee Morales May 2018

Dissertations, Theses, and Capstone Projects

Depression is a serious illness that affects a large portion of the world’s population. Given the large effect it has on society, it is evident that depression is a serious health issue. This thesis evaluates, at length, how technology may aid in assessing depression. We present an in-depth investigation of features and fusion techniques for depression detection systems. We also present OpenMM: a novel tool for multimodal feature extraction. Lastly, we present novel techniques for multimodal fusion. The contributions of this work add considerably to our knowledge of depression detection systems and have the potential to improve future systems by …


Automatic Analysis Of Musical Lyrics, Joanna Gormley Apr 2018

Honors Senior Capstone Projects

Is music getting less sophisticated over time? That is the question which this study aims to answer, with the goal of improving upon previous analysis done on the topic. The blog posts which inspired this project lacked accuracy and dimensionality. Realizing that a larger data set of songs would make a significant difference in the precision of our analysis, we set out to design a piece of software constructed with the capability to analyze several thousand songs. Mimicking previous works which analyzed sophistication of music, the software focuses on the lyrics of songs. Three metrics were used in order to …


Does The Test Work? Evaluating A Web-Based Language Placement Test, Avizia Long, Sun-Young Shin, Kimberly Geeslin, Erik Willis Feb 2018

Faculty Publications

In response to the need for examples of test validation from which everyday language programs can benefit, this paper reports on a study that used Bachman’s (2005) assessment use argument (AUA) framework to examine evidence to support claims made about the intended interpretations and uses of scores based on a new web-based Spanish language placement test. The test, which consisted of 100 items distributed across five item types (sound discrimination, grammar, listening comprehension, reading comprehension, and vocabulary), was tested with 2,201 incoming first-year and transfer students at a large, Midwestern public university. Analyses of internal consistency and validity revealed the …


Losing Shahrazad: A Distant Reading Of 1001 Nights, Taysa Mohler Jan 2018

Senior Projects Spring 2018

This project is a distant reading analysis of seven 19th and 20th-century English translations of One Thousand and One Nights or The Arabian Nights. Through the use of computer programming and distant reading, it becomes clear that the Nights' frame tale is the carrier of the internal logic and generative power of the story cycle. Further, the frame tale expresses the Nights' self-representation, which serves to undermine the historical use of the Nights as synecdoche for the Orient. Therefore, the translators that remove the frame story from their versions further the Nights' use as an Orientalist object, …


Exploring The Functional And Geometric Bias Of Spatial Relations Using Neural Language Models, Simon Dobnik, Mehdi Ghanimifard, John D. Kelleher Jan 2018

Conference papers

The challenge for computational models of spatial descriptions for situated dialogue systems is the integration of information from different modalities. The semantics of spatial descriptions are grounded in at least two sources of information: (i) a geometric representation of space and (ii) the functional interaction of related objects. We train several neural language models on descriptions of scenes from a dataset of image captions and examine whether the functional or geometric bias of spatial descriptions reported in the literature is reflected in the estimated perplexity of these models. The results of these experiments have implications for the creation of …
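Since the abstract compares models by their estimated perplexity, a short reminder of how that quantity is computed may help. This is a generic sketch, not the authors' code: `perplexity` is a hypothetical helper, and the probability lists are invented numbers standing in for the per-token probabilities a language model would assign to a description.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity = exp of the average negative log-probability per token.

    `token_probs` holds the probability the model assigned to each observed token.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Invented example values: a model that finds a description more predictable
# assigns higher per-token probabilities and therefore gets lower perplexity.
expected_description = [0.5, 0.4, 0.5, 0.4]
surprising_description = [0.1, 0.05, 0.1, 0.05]

ppl_expected = perplexity(expected_description)
ppl_surprising = perplexity(surprising_description)
```

Lower perplexity means the model found the sequence more predictable, which is what makes it usable as a proxy for how strongly a model has picked up a given pattern.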


Cloud‐Based Text Analytics Harvesting, Cleaning And Analyzing Corporate Earnings Conference Calls, Michael Chuancai Zhang, Vikram Gazula, Dan Stone, Hong Xie Oct 2017

Commonwealth Computational Summit

No abstract provided.