Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

156 Full-Text Articles; 244 Authors; 19,786 Downloads; 39 Institutions

All Articles in Computational Linguistics

Intergroup Variability In Personality Recognition, Arundhati Sengupta 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Automatic identification of personality in conversational speech has many applications in natural language processing, such as leader identification in a meeting, adaptive dialogue systems, and dating websites. However, the widespread acceptance of automatic personality recognition through lexical and vocal characteristics is limited by the variability of the error rate of a general-purpose model among speakers from different demographic groups. While other work reports accuracy, we explored the error rates of the automatic personality recognition task using classification models for different genders and native language (L1) groups. We also present a statistical experiment showing the influence of gender and L1 on the relation ...


Multimodal Depression Detection: An Investigation Of Features And Fusion Techniques For Automated Systems, Michelle Renee Morales 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Depression is a serious illness that affects a large portion of the world’s population. Given the large effect it has on society, it is evident that depression is a serious health issue. This thesis evaluates, at length, how technology may aid in assessing depression. We present an in-depth investigation of features and fusion techniques for depression detection systems. We also present OpenMM: a novel tool for multimodal feature extraction. Lastly, we present novel techniques for multimodal fusion. The contributions of this work add considerably to our knowledge of depression detection systems and have the potential to improve future systems ...


Describing Doggo-Speak: Features Of Doggo Meme Language, Jennifer Bivens 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

Doggo-speak is a specialized way of writing most commonly associated with captions on Doggo memes, humorous images of dogs shared in online communities. This paper will explore linguistic features of Doggo-speak through analysis of social media posts by Doggo fan pages. It will use the discussed features as inputs to five machine learning classifiers and will show, through this classification task, that the discussed features are sufficient for distinguishing between Doggo-speak and more general English text.


Speech Perception In “Bubble” Noise: Korean Fricatives And Affricates By Native And Non-Native Korean Listeners, Jiyoung Choi 2018 The Graduate Center, City University of New York

All Dissertations, Theses, and Capstone Projects

The current study examines acoustic cues used by second language learners of Korean to discriminate between Korean fricatives and affricates in noise and how these cues relate to those used by native Korean listeners. Stimuli consist of naturally-spoken consonant-vowel-consonant-vowel (CVCV) syllables: /sɑdɑ/, /s*ɑdɑ/, /tʃɑdɑ/, /tʃʰɑdɑ/, and /tʃ*ɑdɑ/. In this experiment, the “bubble noise” methodology of Mandel et al. (2016) was used to identify the time-frequency locations of important cues in each utterance, i.e., where audibility of the location is significantly correlated with correct identification of the utterance in noise. Results show that non-native Korean listeners ...


Automatic Analysis Of Musical Lyrics, Joanna Gormley 2018 Merrimack College

Honors Senior Capstone Projects

Is music getting less sophisticated over time? That is the question which this study aims to answer, with the goal of improving upon previous analysis done on the topic. The blog posts which inspired this project lacked accuracy and dimensionality. Realizing that a larger data set of songs would make a significant difference in the precision of our analysis, we set out to design a piece of software constructed with the capability to analyze several thousand songs. Mimicking previous works which analyzed sophistication of music, the software focuses on the lyrics of songs. Three metrics were used in order to ...


Role Of Information Technology In Development Of Eritrean Language - ኣበርክቶ ቴክኖሎጂ ሓበሬታ ኣብ ምምዕባል ቋንቋታት ኤርትራ, Filmon Gebreyesus Ph.D 2018 Santa Clara University

Symposium on Eritrean Literature

Information technology affects every day of our lives, and social media in particular has become the main means of communication in our society. However, access to this ever-growing technology has always been limited to English, Arabic, or other languages, because our language has not kept pace with current technology.

Though there have been many efforts to develop application programs for Tigrigna and other languages to help us use our language, there are still many gaps that could be filled to achieve the competence of our languages. In ...


Detecting Language Impairments In Autism: A Computational Analysis Of Semi-Structured Conversations With Vector Semantics, Adam Goodkind, Michelle Lee, Gary E. Martin, Molly Losh, Klinton Bicknell 2018 Northwestern University

Proceedings of the Society for Computation in Linguistics

Many of the most significant impairments faced by individuals with autism spectrum disorder (ASD) relate to pragmatic (i.e. social) language. There is also evidence that pragmatic language differences may map to ASD-related genes. Therefore, quantifying the social-linguistic features of ASD has the potential to both improve clinical treatment and help identify gene-behavior relationships in ASD. Here, we apply vector semantics to transcripts of semi-structured interactions with children with both idiopathic and syndromic ASD. We find that children with ASD are less semantically similar to a gold standard derived from typically developing participants, and are more semantically variable. We show ...
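As a rough illustration of the kind of comparison the abstract describes (similarity of a speaker's language to a reference derived from other speakers), here is a minimal sketch using cosine similarity against a centroid vector. The function names and the two-dimensional toy vectors are assumptions made for illustration; the abstract does not specify the embedding model or how the gold standard was built.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical toy vectors standing in for per-speaker transcript embeddings.
reference_speakers = [[1.0, 0.0], [0.0, 1.0]]
gold_standard = centroid(reference_speakers)
similarity = cosine([1.0, 0.0], gold_standard)  # roughly 0.707 for these toy values
```

A lower similarity to the reference centroid, and a wider spread of similarities across a group, would correspond to the "less semantically similar" and "more semantically variable" findings in this simplified picture.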


A Bidirectional Mapping Between English And CNF-Based Reasoners, Steven Abney 2018 University of Michigan

Proceedings of the Society for Computation in Linguistics

If language is a transduction between sound and meaning, the target of semantic interpretation should be the meaning representation expected by general cognition. Automated reasoners provide the best available fully-explicit proxies for general cognition, and they commonly expect Clause Normal Form (CNF) as input. There is a well-known algorithm for converting from unrestricted predicate calculus to CNF, but it is not invertible, leaving us without a means to transduce CNF back to English. I present a solution, with possible repercussions for the overall framework of semantic interpretation.


Differentiating Phrase Structure Parsing And Memory Retrieval In The Brain, Shohini Bhattasali, John Hale, Christophe Pallier, Jonathan Brennan, Wen-Ming Luh, R. Nathan Spreng 2018 Cornell University

Proceedings of the Society for Computation in Linguistics

On some level, human sentence comprehension must involve both memory retrieval and structural composition. This study differentiates these two processes using neuroimaging data collected during naturalistic listening. Retrieval is formalized in terms of "multiword expressions" while structure-building is formalized in terms of bottom-up parsing. The results most strongly implicate Anterior Temporal regions for structure-building and Precuneus Cortex for memory retrieval.


Modeling The Complexity And Descriptive Adequacy Of Construction Grammars, Jonathan Dunn 2018 Illinois Institute of Technology

Proceedings of the Society for Computation in Linguistics

This paper uses the Minimum Description Length paradigm to model the complexity of CxGs (operationalized as the encoding size of a grammar) alongside their descriptive adequacy (operationalized as the encoding size of a corpus given a grammar). These two quantities are combined to measure the quality of potential CxGs against unannotated corpora, supporting discovery-device CxGs for English, Spanish, French, German, and Italian. The results show (i) that these grammars provide significant generalizations as measured using compression and (ii) that more complex CxGs with access to multiple levels of representation provide greater generalizations than single-representation CxGs.
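In the usual two-part statement of Minimum Description Length, the two quantities named above are added and the preferred grammar is the one with the smallest total. The notation below is an assumption for illustration ($G$ for a candidate grammar, $D$ for the corpus, $L(\cdot)$ for encoding size in bits); the abstract says only that the quantities are "combined", so the simple sum is the textbook form rather than a confirmed detail of the paper.

```latex
\hat{G} \;=\; \arg\min_{G} \,\bigl[\, L(G) \;+\; L(D \mid G) \,\bigr]
```

Here $L(G)$ corresponds to grammar complexity and $L(D \mid G)$ to descriptive adequacy: a grammar that is too small leaves the corpus expensive to encode, while a grammar that is too large pays for its own size.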


Phonologically Informed Edit Distance Algorithms For Word Alignment With Low-Resource Languages, Richard T. McCoy, Robert Frank 2018 Johns Hopkins University

Proceedings of the Society for Computation in Linguistics

We present three methods for weighting edit distance algorithms based on linguistic information. These methods base their penalties on (i) phonological features, (ii) distributional character embeddings, or (iii) differences between cognate words. We also introduce a novel method for evaluating edit distance through the task of low-resource word alignment by using edit-distance neighbors in a high-resource pivot language to inform alignments from the low-resource language. At this task, the cognate-based scheme outperforms our other methods and the Levenshtein edit distance baseline, showing that NLP applications can benefit from information about cross-linguistic phonological patterns.
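To make the idea of weighting edit distance concrete, here is a minimal sketch of the Levenshtein dynamic program with a pluggable substitution cost. It is not the authors' implementation: the function names, the fixed insertion/deletion cost, and the toy voicing-based cost are all assumptions chosen for illustration.

```python
from typing import Callable


def weighted_edit_distance(
    a: str,
    b: str,
    sub_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Edit distance where substitutions are priced by sub_cost(x, y)."""
    m, n = len(a), len(b)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i * indel_cost  # delete every character of a[:i]
    for j in range(1, n + 1):
        dist[0][j] = j * indel_cost  # insert every character of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,  # deletion
                dist[i][j - 1] + indel_cost,  # insertion
                dist[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution or match
            )
    return dist[m][n]


def levenshtein_cost(x: str, y: str) -> float:
    """Baseline: every mismatch costs 1."""
    return 0.0 if x == y else 1.0


# Hypothetical feature-based cost: segments differing only in voicing are cheap to swap.
VOICING_PAIRS = {frozenset("pb"), frozenset("td"), frozenset("kg")}


def toy_feature_cost(x: str, y: str) -> float:
    if x == y:
        return 0.0
    if frozenset((x, y)) in VOICING_PAIRS:
        return 0.5
    return 1.0
```

With these toy costs, "pat" and "bat" are at distance 0.5 rather than 1.0, which is the sense in which linguistic information changes which words count as neighbors.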


Conditions On Abruptness In A Gradient-Ascent Maximum Entropy Learner, Elliott Moreton 2018 University of North Carolina, Chapel Hill

Proceedings of the Society for Computation in Linguistics

When does a gradual learning rule yield gradual learning performance? This paper studies a gradient-ascent Maximum Entropy phonotactic learner, as applied to two-alternative forced-choice performance expressed as log-odds. The main result is that slow initial performance cannot accelerate later if the initial weights are near zero, but can if they are not. Stated another way, abruptness in this learner is an effect of transfer, either from Universal Grammar in the form of an initial weighting, or from previous learning in the form of an acquired weighting.
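For readers unfamiliar with the setup, the standard log-linear (Maximum Entropy) quantities behind this description can be sketched as follows. The notation is assumed rather than taken from the paper: $w$ is the weight vector, $f(x)$ the feature (constraint) vector of a form $x$, $\eta$ a learning rate, and higher $w \cdot f(x)$ is taken to mean higher probability. The middle expression equals the two-alternative forced-choice log-odds only under the further assumption that choices follow the ratio $P_w(a) / (P_w(a) + P_w(b))$.

```latex
P_w(x) = \frac{\exp\bigl(w \cdot f(x)\bigr)}{\sum_{x'} \exp\bigl(w \cdot f(x')\bigr)}
\qquad
\log \frac{P_w(a)}{P_w(b)} = w \cdot \bigl(f(a) - f(b)\bigr)
\qquad
w \leftarrow w + \eta \,\bigl( \mathbb{E}_{\text{data}}[f] - \mathbb{E}_{P_w}[f] \bigr)
```

Under this notation, $w = 0$ gives log-odds of zero for every pair of alternatives, so the initial weighting determines where on the performance curve learning begins.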


Using Rhetorical Topics For Automatic Summarization, Natalie M. Schrimpf 2018 Yale University

Proceedings of the Society for Computation in Linguistics

Summarization involves finding the most important information in a text in order to convey the meaning of the document. In this paper, I present a method for using topic information to influence which content is selected for a summary. Texts are divided into topics using rhetorical information that creates a partition of a text into a sequence of non-overlapping topics. To investigate the effect of this topic structure, I compare the output of summarizing an entire text without topics to summarizing individual topics and combining them into a complete summary. The results show that the use of these rhetorical topics ...


The Organization Of Lexicons: A Cross-Linguistic Analysis Of Monosyllabic Words, Shiying Yang, Chelsea Sanker, Uriel Cohen Priva 2018 Brown University

Proceedings of the Society for Computation in Linguistics

Lexicons utilize a fraction of licit structures. Different theories predict that lexicons prioritize either contrastiveness or structural economy. Study 1 finds that the monosyllabic lexicon of Mandarin is no more distinctive than a randomly sampled baseline using the phonological inventory. Study 2 finds that the lexicons of Mandarin and American English have fewer phonotactically complex words than the random baseline: Words tend not to have multiple low-probability components. This suggests that phonological constraints can have superadditive penalties for combined violations, consistent with e.g. Albright (ms.).


Dependency Length Minimization And Lexical Frequency In Prepositional Phrase Ordering In English, Zoey Liu, Kenji Sagae 2018 University of California, Davis

Proceedings of the Society for Computation in Linguistics

Previous research has shown cross-linguistically that the human language parser prefers constituent orders that minimize the distance between syntactic heads and their dependents, but the interaction between dependency length minimization (DLM) and other factors governing linear word ordering is still unknown. We examine the effects of DLM, lexical frequency, and the traditional rule of Manner before Place before Time (MPT) in ordering of prepositional phrase (PP) adjuncts in English using corpora in different language genres annotated with syntactic structure. While MPT and DLM were consistently predictive of PP ordering in our analysis, lexical frequency information was sensitive to language genre.
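Dependency length can be computed directly from a list of head positions, which makes the DLM preference easy to see on a small example. The sketch below is an illustration only: the function name, the example sentence, and the choice to treat the preposition as the head of its phrase are assumptions, not details from the paper.

```python
from typing import List


def total_dependency_length(heads: List[int]) -> int:
    """Sum of |dependent index - head index| over all words; -1 marks the root."""
    return sum(abs(i - h) for i, h in enumerate(heads) if h >= 0)


# "talked to Mary about the new plan" (short PP first)
# indices: talked=0 to=1 Mary=2 about=3 the=4 new=5 plan=6
short_first = [-1, 0, 1, 0, 6, 6, 3]

# "talked about the new plan to Mary" (long PP first)
# indices: talked=0 about=1 the=2 new=3 plan=4 to=5 Mary=6
long_first = [-1, 0, 4, 4, 1, 0, 5]

short_first_length = total_dependency_length(short_first)  # 11
long_first_length = total_dependency_length(long_first)    # 13
```

Placing the shorter phrase closer to the verb gives the smaller total, which is the ordering that dependency length minimization favors.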


A Structural Theory Of Derivations, Zachary Stone 2018 University of Maryland, College Park

Proceedings of the Society for Computation in Linguistics

No abstract provided.


T-Orders Across Categorical And Probabilistic Constraint-Based Phonology, Arto Tapani Anttila, Giorgio Magri 2018 Stanford University

Proceedings of the Society for Computation in Linguistics

No abstract provided.


Quantitatively Assessing The Development Of Adjective Ordering Preferences Using Child-Directed And Child-Produced Speech Corpora, Galia Bar-Sever, Rachael Lee, Gregory Scontras, Lisa Pearl 2018 University of California, Irvine

Proceedings of the Society for Computation in Linguistics

No abstract provided.


How Far Can VOT Take Us? Voicing Categorization With And Without The Use Of VOT, Abigail Benecke, Joseph Toscano 2018 Villanova University

Proceedings of the Society for Computation in Linguistics

Voice-onset time (VOT) is an extremely reliable cue to word-initial stop voicing, such that VOT alone may be sufficient as a voicing cue. To test this, 35 potential cues were measured and used to train logistic regression classifiers, asking whether VOT is sufficient, whether other cues increase categorization accuracy, and whether, without VOT, other cues produce listener-level accuracy. Results show that human-like performance was never achieved without VOT or with VOT alone. Models using a cue-integration approach (additively combining multiple cues) offered the closest performance to human listeners. Thus, VOT appears to be necessary, but not sufficient, for voicing judgments.
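A logistic regression classifier combines cues additively inside its linear predictor, which is what makes it a natural fit for a cue-integration model. The sketch below shows that structure with a small hand-written trainer. Everything specific in it is an assumption for illustration: the two cues (VOT and onset f0), the made-up measurements, the learning rate, and the function names are hypothetical and are not the study's data or code.

```python
import math
from typing import List, Sequence, Tuple


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score each column so cues on different scales are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                   lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Batch gradient descent on the mean logistic loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """1 = voiceless, 0 = voiced; the cues add up inside the linear predictor."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0


# Made-up toy measurements: (VOT in ms, onset f0 in Hz). Not real data.
TOY_CUES = [(5, 180), (12, 190), (18, 185), (25, 200),
            (45, 210), (60, 220), (70, 205), (85, 225)]
TOY_LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

X_std = standardize(TOY_CUES)
weights, bias = train_logistic(X_std, TOY_LABELS)
predictions = [predict(weights, bias, x) for x in X_std]
```

Dropping a column from the input is the analogue of testing a model "without VOT" or "with VOT alone"; on real measurements the comparison of interest is how each variant's accuracy relates to listener accuracy.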


Logical Metonymy In A Distributional Model Of Sentence Comprehension, Emmanuele Chersoni, Alessandro Lenci, Philippe Blache 2018 Aix-Marseille University

Proceedings of the Society for Computation in Linguistics

No abstract provided.

