Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

163 Full-Text Articles; 252 Authors; 19,786 Downloads; 39 Institutions

All Articles in Computational Linguistics

163 full-text articles. Page 1 of 8.

Generative Linguistics And Neural Networks At 60: Foundation, Friction, And Fusion, Joe Pater 2019 Selected Works

Joe Pater

The birthdate of both generative linguistics and neural networks can be taken as 1957, the year of the publication of foundational work by both Noam Chomsky and Frank Rosenblatt. This paper traces the development of these two approaches to cognitive science, from their largely autonomous early development in their first thirty years, through their collision in the 1980s around the past tense debate (Rumelhart and McClelland 1986, Pinker and Prince 1988), and their integration in much subsequent work up to the present. Although this integration has produced a considerable body of results, the continued general gulf between these two lines ...


Recursive Neural Networks For Semantic Sentence Representation, Liam S. Geron 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Semantic representation has a rich history rife with both complex linguistic theory and computational models. Though this history stretches back almost 50 years (Salton, 1971), recently the field has undergone an unexpected shift in paradigm thanks to the work of Mikolov et al. (2013a, 2013b), which has proven that vector-space semantic models can capture large amounts of semantic information. As of yet, these semantic representations are computed at the word level, and finding a semantic representation of a phrase is a much more difficult challenge. Mikolov et al. (2013a, 2013b) proved that their word vectors can be composed ...


Perception & Perspective: An Analysis Of Discourse And Situational Factors In Reference Frame Selection, Robert Ross, Kavita E. Thomas 2018 Dublin Institute of Technology

Conference papers

To integrate perception into dialogue, it is necessary to bind spatial language descriptions to reference frame use. To this end, we present an analysis of discourse and situational factors that may influence reference frame choice in dialogues. We show that factors including spatial orientation, task, self and other alignment, and dyad have an influence on reference frame use. We further show that a computational model to estimate reference frame based on these features provides results greater than both random and greedy reference frame selection strategies.


Multimodal Depression Detection: An Investigation Of Features And Fusion Techniques For Automated Systems, Michelle Renee Morales 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Depression is a serious illness that affects a large portion of the world’s population. Given the large effect it has on society, it is evident that depression is a serious health issue. This thesis evaluates, at length, how technology may aid in assessing depression. We present an in-depth investigation of features and fusion techniques for depression detection systems. We also present OpenMM: a novel tool for multimodal feature extraction. Lastly, we present novel techniques for multimodal fusion. The contributions of this work add considerably to our knowledge of depression detection systems and have the potential to improve future systems ...


Intergroup Variability In Personality Recognition, Arundhati Sengupta 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Automatic identification of personality in conversational speech has many applications in natural language processing, such as leader identification in a meeting, adaptive dialogue systems, and dating websites. However, the widespread acceptance of automatic personality recognition through lexical and vocal characteristics is limited by the variability of error rate in a general-purpose model among speakers from different demographic groups. While other work reports accuracy, we explored error rates of the automatic personality recognition task using classification models for different genders and native language groups (L1). We also present a statistical experiment showing the influence of gender and L1 on the relation ...


Describing Doggo-Speak: Features Of Doggo Meme Language, Jennifer Bivens 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Doggo-speak is a specialized way of writing most commonly associated with captions on Doggo memes, humorous images of dogs shared in online communities. This paper will explore linguistic features of Doggo-speak through analysis of social media posts by Doggo fan pages. It will use the discussed features as inputs to five machine learning classifiers and will show, through this classification task, that the discussed features are sufficient for distinguishing between Doggo-speak and more general English text.


Speech Perception In “Bubble” Noise: Korean Fricatives And Affricates By Native And Non-Native Korean Listeners, Jiyoung Choi 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

The current study examines acoustic cues used by second language learners of Korean to discriminate between Korean fricatives and affricates in noise and how these cues relate to those used by native Korean listeners. Stimuli consist of naturally-spoken consonant-vowel-consonant-vowel (CVCV) syllables: /sɑdɑ/, /s*ɑdɑ/, /tʃɑdɑ/, /tʃhɑdɑ/, and /tʃ*ɑdɑ/. In this experiment, the “bubble noise” methodology of Mandel et al. (2016) was used to identify the time-frequency locations of important cues in each utterance, i.e., where audibility of the location is significantly correlated with correct identification of the utterance in noise. Results show that non-native Korean listeners ...


Automatic Analysis Of Musical Lyrics, Joanna Gormley 2018 Merrimack College

Honors Senior Capstone Projects

Is music getting less sophisticated over time? That is the question which this study aims to answer, with the goal of improving upon previous analysis done on the topic. The blog posts which inspired this project lacked accuracy and dimensionality. Realizing that a larger data set of songs would make a significant difference in the precision of our analysis, we set out to design a piece of software constructed with the capability to analyze several thousand songs. Mimicking previous works which analyzed sophistication of music, the software focuses on the lyrics of songs. Three metrics were used in order to ...


Role Of Information Technology In Development Of Eritrean Language - ኣበርክቶ ቴክኖሎጂ ሓበሬታ ኣብ ምምዕባል ቋንቋታት ኤርትራ, Filmon Gebreyesus Ph.D 2018 Santa Clara University

Symposium on Eritrean Literature

Information technology affects every day of our lives, and social media in particular has become the main means of communication in our society. However, access to this current and ever-growing technology has always been limited to English, Arabic, or other languages, because our language has not kept pace with current technology.

Though there have been many efforts to develop application programs for Tigrigna and other languages to help us use our language, there are still many gaps that could be filled to achieve the competence of our languages. In ...


Innovative Implementation Of A Web-Based Rating System For Individualizing Online English Speaking Instruction, Hyejin Yang, Elena Cotos 2018 Sookmyung Women’s University

English Publications

The primary goal of computer-assisted language learning (CALL) in general, and of online language instruction in particular, is to create and evaluate language learning opportunities. To be effective, online language courses need to be guided by an integrated set of theoretical perspectives to second language acquisition (SLA), as well as by specific curricular goals, learning objectives and outcomes, appropriate tasks and necessary materials, and learners’ characteristics and abilities – to name a few factors that are essential in both online and face-to-face teaching (Xu & Morris, 2007). Doughty and Long (2003) articulate pedagogical principles for computer-enhanced language teaching, which highlight the importance ...


Does The Test Work? Evaluating A Web-Based Language Placement Test, Avizia Long, Sun-Young Shin, Kimberly Geeslin, Erik Willis 2018 Texas Tech University

Faculty Publications

In response to the need for examples of test validation from which everyday language programs can benefit, this paper reports on a study that used Bachman’s (2005) assessment use argument (AUA) framework to examine evidence to support claims made about the intended interpretations and uses of scores based on a new web-based Spanish language placement test. The test, which consisted of 100 items distributed across five item types (sound discrimination, grammar, listening comprehension, reading comprehension, and vocabulary), was tested with 2,201 incoming first-year and transfer students at a large, Midwestern public university. Analyses of internal consistency and validity ...


Detecting Language Impairments In Autism: A Computational Analysis Of Semi-Structured Conversations With Vector Semantics, Adam Goodkind, Michelle Lee, Gary E. Martin, Molly Losh, Klinton Bicknell 2018 Northwestern University

Proceedings of the Society for Computation in Linguistics

Many of the most significant impairments faced by individuals with autism spectrum disorder (ASD) relate to pragmatic (i.e. social) language. There is also evidence that pragmatic language differences may map to ASD-related genes. Therefore, quantifying the social-linguistic features of ASD has the potential to both improve clinical treatment and help identify gene-behavior relationships in ASD. Here, we apply vector semantics to transcripts of semi-structured interactions with children with both idiopathic and syndromic ASD. We find that children with ASD are less semantically similar to a gold standard derived from typically developing participants, and are more semantically variable. We show ...


A Bidirectional Mapping Between English And CNF-Based Reasoners, Steven Abney 2018 University of Michigan

Proceedings of the Society for Computation in Linguistics

If language is a transduction between sound and meaning, the target of semantic interpretation should be the meaning representation expected by general cognition. Automated reasoners provide the best available fully-explicit proxies for general cognition, and they commonly expect Clause Normal Form (CNF) as input. There is a well-known algorithm for converting from unrestricted predicate calculus to CNF, but it is not invertible, leaving us without a means to transduce CNF back to English. I present a solution, with possible repercussions for the overall framework of semantic interpretation.


Differentiating Phrase Structure Parsing And Memory Retrieval In The Brain, Shohini Bhattasali, John Hale, Christophe Pallier, Jonathan Brennan, Wen-Ming Luh, R. Nathan Spreng 2018 Cornell University

Proceedings of the Society for Computation in Linguistics

On some level, human sentence comprehension must involve both memory retrieval and structural composition. This study differentiates these two processes using neuroimaging data collected during naturalistic listening. Retrieval is formalized in terms of "multiword expressions" while structure-building is formalized in terms of bottom-up parsing. The results most strongly implicate Anterior Temporal regions for structure-building and Precuneus Cortex for memory retrieval.


Modeling The Complexity And Descriptive Adequacy Of Construction Grammars, Jonathan Dunn 2018 Illinois Institute of Technology

Proceedings of the Society for Computation in Linguistics

This paper uses the Minimum Description Length paradigm to model the complexity of CxGs (operationalized as the encoding size of a grammar) alongside their descriptive adequacy (operationalized as the encoding size of a corpus given a grammar). These two quantities are combined to measure the quality of potential CxGs against unannotated corpora, supporting discovery-device CxGs for English, Spanish, French, German, and Italian. The results show (i) that these grammars provide significant generalizations as measured using compression and (ii) that more complex CxGs with access to multiple levels of representation provide greater generalizations than single-representation CxGs.
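
The two quantities named in this abstract can be illustrated with a toy two-part description length: bits to write down the grammar plus bits to encode the corpus given that grammar. The sketch below is only an illustration under loudly stated assumptions: the function names, the fixed 8 bits per character, and the empirical-frequency code are all hypothetical and are not the paper's encoding scheme.

```python
import math
from collections import Counter


def grammar_bits(units, bits_per_symbol=8.0):
    """Toy grammar cost: spell out each unit at a fixed cost per character.

    The 8-bits-per-character figure is an assumption made for illustration.
    """
    return sum(len(unit) * bits_per_symbol for unit in units)


def corpus_bits(encoded_corpus):
    """Toy corpus cost: code length of the unit sequence under the
    empirical unit distribution, i.e. -sum(count * log2(count / total))."""
    counts = Counter(encoded_corpus)
    total = len(encoded_corpus)
    return -sum(c * math.log2(c / total) for c in counts.values())


def mdl_score(encoded_corpus):
    """Two-part score: L(grammar) + L(corpus | grammar); lower is better."""
    return grammar_bits(set(encoded_corpus)) + corpus_bits(encoded_corpus)


# The same four-character corpus "abab", segmented under two candidate grammars.
single_symbols = ["a", "b", "a", "b"]   # grammar with units {a, b}
one_chunk = ["ab", "ab"]                # grammar with the single unit {ab}

print(mdl_score(single_symbols), mdl_score(one_chunk))
```

In this toy case the chunked grammar costs the same to write down but makes the corpus cheaper to encode, so it receives the lower total score.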


Phonologically Informed Edit Distance Algorithms For Word Alignment With Low-Resource Languages, Richard T. McCoy, Robert Frank 2018 Johns Hopkins University

Proceedings of the Society for Computation in Linguistics

We present three methods for weighting edit distance algorithms based on linguistic information. These methods base their penalties on (i) phonological features, (ii) distributional character embeddings, or (iii) differences between cognate words. We also introduce a novel method for evaluating edit distance through the task of low-resource word alignment by using edit-distance neighbors in a high-resource pivot language to inform alignments from the low-resource language. At this task, the cognate-based scheme outperforms our other methods and the Levenshtein edit distance baseline, showing that NLP applications can benefit from information about cross-linguistic phonological patterns.
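
As a rough illustration of what weighting an edit distance means, the sketch below runs the standard dynamic-programming edit distance with a pluggable substitution cost. It is a minimal sketch and not the authors' implementation: `TOY_FEATURES`, `feature_cost`, and the overlap-based penalty are hypothetical stand-ins for a real phonological feature system, and insertions and deletions keep a flat cost of 1.

```python
from typing import Callable


def weighted_edit_distance(
    source: str,
    target: str,
    sub_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Edit distance where the substitution penalty comes from sub_cost."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost          # delete all of source[:i]
    for j in range(1, cols):
        dist[0][j] = j * indel_cost          # insert all of target[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,                      # deletion
                dist[i][j - 1] + indel_cost,                      # insertion
                dist[i - 1][j - 1] + sub_cost(source[i - 1], target[j - 1]),
            )
    return dist[-1][-1]


def levenshtein_cost(a: str, b: str) -> float:
    """Baseline: every mismatch costs 1."""
    return 0.0 if a == b else 1.0


# Hypothetical toy feature table, for illustration only.
TOY_FEATURES = {
    "p": {"labial", "stop"},
    "b": {"labial", "stop", "voiced"},
    "t": {"coronal", "stop"},
    "d": {"coronal", "stop", "voiced"},
    "s": {"coronal", "fricative"},
    "z": {"coronal", "fricative", "voiced"},
}


def feature_cost(a: str, b: str) -> float:
    """Penalty = share of features on which the two segments differ."""
    if a == b:
        return 0.0
    fa, fb = TOY_FEATURES.get(a), TOY_FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0                           # unknown segment: flat cost
    return len(fa ^ fb) / len(fa | fb)


print(weighted_edit_distance("kitten", "sitting", levenshtein_cost))
print(weighted_edit_distance("pat", "bat", feature_cost))
print(weighted_edit_distance("pat", "sat", feature_cost))
```

Under the toy table, "pat" is closer to "bat" (a voicing difference only) than to "sat", whereas the unweighted baseline treats both pairs as equally distant.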


Conditions On Abruptness In A Gradient-Ascent Maximum Entropy Learner, Elliott Moreton 2018 University of North Carolina, Chapel Hill

Proceedings of the Society for Computation in Linguistics

When does a gradual learning rule yield gradual learning performance? This paper studies a gradient-ascent Maximum Entropy phonotactic learner, as applied to two-alternative forced-choice performance expressed as log-odds. The main result is that slow initial performance cannot accelerate later if the initial weights are near zero, but can if they are not. Stated another way, abruptness in this learner is an effect of transfer, either from Universal Grammar in the form of an initial weighting, or from previous learning in the form of an acquired weighting.


Using Rhetorical Topics For Automatic Summarization, Natalie M. Schrimpf 2018 Yale University

Proceedings of the Society for Computation in Linguistics

Summarization involves finding the most important information in a text in order to convey the meaning of the document. In this paper, I present a method for using topic information to influence which content is selected for a summary. Texts are divided into topics using rhetorical information that creates a partition of a text into a sequence of non-overlapping topics. To investigate the effect of this topic structure, I compare the output of summarizing an entire text without topics to summarizing individual topics and combining them into a complete summary. The results show that the use of these rhetorical topics ...


The Organization Of Lexicons: A Cross-Linguistic Analysis Of Monosyllabic Words, Shiying Yang, Chelsea Sanker, Uriel Cohen Priva 2018 Brown University

Proceedings of the Society for Computation in Linguistics

Lexicons utilize a fraction of licit structures. Different theories predict that lexicons prioritize either contrastiveness or structural economy. Study 1 finds that the monosyllabic lexicon of Mandarin is no more distinctive than a randomly sampled baseline using the phonological inventory. Study 2 finds that the lexicons of Mandarin and American English have fewer phonotactically complex words than the random baseline: words tend not to have multiple low-probability components. This suggests that phonological constraints can have superadditive penalties for combined violations, consistent with e.g. Albright (ms.).


Dependency Length Minimization And Lexical Frequency In Prepositional Phrase Ordering In English, Zoey Liu, Kenji Sagae 2018 University of California, Davis

Proceedings of the Society for Computation in Linguistics

Previous research has shown cross-linguistically that the human language parser prefers constituent orders that minimize the distance between syntactic heads and their dependents, but the interaction between dependency length minimization (DLM) and other factors governing linear word ordering is still unknown. We examine the effects of DLM, lexical frequency, and the traditional rule of Manner before Place before Time (MPT) in ordering of prepositional phrase (PP) adjuncts in English using corpora in different language genres annotated with syntactic structure. While MPT and DLM were consistently predictive of PP ordering in our analysis, lexical frequency information was sensitive to language genre.

