Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

224 Full-Text Articles | 342 Authors | 104,370 Downloads | 63 Institutions

All Articles in Computational Linguistics

Application Of Boolean Logic To Natural Language Complexity In Political Discourse, Austin Taing 2019 University of Kentucky

Theses and Dissertations--Computer Science

Press releases serve as a major influence on public opinion of a politician, since they are a primary means of communicating with the public and directing discussion. Thus, the public’s ability to digest them is an important factor for politicians to consider. This study employs several well-studied measures of linguistic complexity and proposes a new one to examine whether politicians change their language to become more or less difficult to parse in different situations. This study uses 27,500 press releases issued by the US Senate between 2004 and 2008 and examines election cycles and natural disasters, namely hurricanes, as situations where politicians’ language …


Non-Manual Articulators In Irish Sign Language Verbs: An Analysis With Data Mining Association Rules, Robert G. Smith, Markus Hofmann 2018 Technological University Dublin

Conference Papers

The Signs of Ireland (SOI) corpus (Leeson et al., 2006) deploys a complex multi-tiered temporal data structure. The process of manually analyzing such data is laborious and cannot eliminate bias, and important patterns can often go completely unnoticed. In addition, because of the complex nature of the grammatical structures contained in the corpus, identifying complex linguistic associations or patterns across tiers is simply too intricate a task for a human to carry out in an acceptable timeframe. This work explores the application of data mining techniques to a set of multi-tiered temporal data from the SOI corpus. Building …


Recursive Neural Networks For Semantic Sentence Representation, Liam S. Geron 2018 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Semantic representation has a rich history rife with both complex linguistic theory and computational models. Though this history stretches back almost 50 years (Salton, 1971), the field has recently undergone an unexpected paradigm shift thanks to the work of Mikolov et al. (2013a, 2013b), which showed that vector-space semantic models can capture large amounts of semantic information. As of yet, these semantic representations are computed at the word level, and finding a semantic representation of a phrase is a much more difficult challenge. Mikolov et al. (2013a, 2013b) showed that their word vectors can be composed arithmetically to …
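
As a rough illustration of what arithmetic composition of word vectors can look like, the sketch below runs the widely cited analogy "king - man + woman" over a toy vocabulary. The three-dimensional vectors and the `analogy` helper are invented for this example and are not taken from the thesis or from any trained model; real embeddings have hundreds of dimensions.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration (an assumption,
# not output from any trained embedding model).
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c).

    The three query words are excluded from the candidates, as is
    conventional for this kind of analogy test.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


result = analogy("king", "man", "woman")
```

With these hand-picked vectors the nearest remaining word is "queen"; with real embeddings the outcome depends on the training corpus and the model.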


Advanced Recurrent Network-Based Hybrid Acoustic Models For Low Resource Speech Recognition, Jian Kang, Wei-Qiang Zhang, Wei-Wei Liu, Jia Liu, Michael T. Johnson 2018 Tsinghua University, China

Electrical and Computer Engineering Faculty Publications

Recurrent neural networks (RNNs) have shown an ability to model temporal dependencies. However, the problem of exploding or vanishing gradients has limited their application. In recent years, long short-term memory RNNs (LSTM RNNs) have been proposed to solve this problem and have achieved excellent results. Bidirectional LSTM (BLSTM), which uses both preceding and following context, has shown particularly good performance. However, the computational requirements of BLSTM approaches are quite heavy, even when implemented efficiently with GPU-based high performance computers. In addition, because the output of LSTM units is bounded, there is often still a vanishing gradient issue over multiple layers. …


Perception & Perspective: An Analysis Of Discourse And Situational Factors In Reference Frame Selection, Robert J. Ross, Kavita E. Thomas 2018 Technological University Dublin

Conference papers

To integrate perception into dialogue, it is necessary to bind spatial language descriptions to reference frame use. To this end, we present an analysis of discourse and situational factors that may influence reference frame choice in dialogues. We show that factors including spatial orientation, task, self and other alignment, and dyad have an influence on reference frame use. We further show that a computational model to estimate reference frame based on these features provides results greater than both random and greedy reference frame selection strategies.


Intergroup Variability In Personality Recognition, Arundhati Sengupta 2018 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Automatic identification of personality in conversational speech has many applications in natural language processing, such as leader identification in a meeting, adaptive dialogue systems, and dating websites. However, the widespread acceptance of automatic personality recognition through lexical and vocal characteristics is limited by the variability of the error rate of a general-purpose model among speakers from different demographic groups. While other work reports accuracy, we explored error rates of the automatic personality recognition task using classification models for different genders and native language groups (L1). We also present a statistical experiment showing the influence of gender and L1 on the relation …


Multimodal Depression Detection: An Investigation Of Features And Fusion Techniques For Automated Systems, Michelle Renee Morales 2018 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Depression is a serious illness that affects a large portion of the world’s population. Given the large effect it has on society, it is evident that depression is a serious health issue. This thesis evaluates, at length, how technology may aid in assessing depression. We present an in-depth investigation of features and fusion techniques for depression detection systems. We also present OpenMM: a novel tool for multimodal feature extraction. Lastly, we present novel techniques for multimodal fusion. The contributions of this work add considerably to our knowledge of depression detection systems and have the potential to improve future systems by …


Speech Perception In “Bubble” Noise: Korean Fricatives And Affricates By Native And Non-Native Korean Listeners, Jiyoung Choi 2018 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

The current study examines acoustic cues used by second language learners of Korean to discriminate between Korean fricatives and affricates in noise and how these cues relate to those used by native Korean listeners. Stimuli consist of naturally spoken consonant-vowel-consonant-vowel (CVCV) syllables: /sɑdɑ/, /s*ɑdɑ/, /tʃɑdɑ/, /tʃʰɑdɑ/, and /tʃ*ɑdɑ/. In this experiment, the “bubble noise” methodology of Mandel et al. (2016) was used to identify the time-frequency locations of important cues in each utterance, i.e., where audibility of the location is significantly correlated with correct identification of the utterance in noise. Results show that non-native Korean listeners can discriminate between …


Describing Doggo-Speak: Features Of Doggo Meme Language, Jennifer Bivens 2018 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Doggo-speak is a specialized way of writing most commonly associated with captions on Doggo memes, humorous images of dogs shared in online communities. This paper will explore linguistic features of Doggo-speak through analysis of social media posts by Doggo fan pages. It will use the discussed features as inputs to five machine learning classifiers and will show, through this classification task, that the discussed features are sufficient for distinguishing between Doggo-speak and more general English text.


Nevertheless, She Persisted: A Linguistic Analysis Of The Speech Of Elizabeth Warren, 2007-2017, Matthew Jennings 2018 East Tennessee State University

Undergraduate Honors Theses

A breakout star among American progressives in the recent past, Elizabeth Warren has quickly gone from a law professor to a leading figure in Democratic politics. This paper analyzes Warren’s speech from before her time as a political figure to the present using the quantitative textual methodology established by Jones (2016) in order to see if Warren’s speech supports Jones’s assertion that masculine speech is the language of power. Ratios of feminine to masculine markers ultimately indicate that despite her increasing political sway, Warren’s speech becomes increasingly feminine instead. However, despite associations of feminine speech with weakness, Warren’s speech scores …


Automatic Analysis Of Musical Lyrics, Joanna Gormley 2018 Merrimack College

Honors Senior Capstone Projects

Is music getting less sophisticated over time? That is the question which this study aims to answer, with the goal of improving upon previous analysis done on the topic. The blog posts which inspired this project lacked accuracy and dimensionality. Realizing that a larger data set of songs would make a significant difference in the precision of our analysis, we set out to design a piece of software constructed with the capability to analyze several thousand songs. Mimicking previous works which analyzed sophistication of music, the software focuses on the lyrics of songs. Three metrics were used in order to …


Role Of Information Technology In Development Of Eritrean Language - ኣበርክቶ ቴክኖሎጂ ሓበሬታ ኣብ ምምዕባል ቋንቋታት ኤርትራ, Filmon Gebreyesus Ph.D 2018 Santa Clara University

Symposium on Eritrean Literature

Information technology affects every day of our lives, and social media in particular has become the main means of communication in our society. However, access to this current and ever-growing technology has always been limited to using it in English, Arabic, or other languages, because our language has not kept pace with current technology.

Though there have been many efforts to develop application programs for Tigrigna and other languages to help us use our language, there are still many gaps that could be filled to achieve the competence of our languages. In light …


Does The Test Work? Evaluating A Web-Based Language Placement Test, Avizia Long, Sun-Young Shin, Kimberly Geeslin, Erik Willis 2018 Texas Tech University

Faculty Publications

In response to the need for examples of test validation from which everyday language programs can benefit, this paper reports on a study that used Bachman’s (2005) assessment use argument (AUA) framework to examine evidence to support claims made about the intended interpretations and uses of scores based on a new web-based Spanish language placement test. The test, which consisted of 100 items distributed across five item types (sound discrimination, grammar, listening comprehension, reading comprehension, and vocabulary), was tested with 2,201 incoming first-year and transfer students at a large, Midwestern public university. Analyses of internal consistency and validity revealed the …


Losing Shahrazad: A Distant Reading Of 1001 Nights, Taysa Mohler 2018 Bard College

Senior Projects Spring 2018

This project is a distant reading analysis of seven 19th and 20th-century English translations of One Thousand and One Nights or The Arabian Nights. Through the use of computer programming and distant reading, it becomes clear that the Nights' frame tale is the carrier of the internal logic and generative power of the story cycle. Further, the frame tale expresses the Nights' self-representation, which serves to undermine the historical use of the Nights as synecdoche for the Orient. Therefore, the translators that remove the frame story from their versions further the Nights' use as an Orientalist object, …


Exploring The Functional And Geometric Bias Of Spatial Relations Using Neural Language Models, Simon Dobnik, Mehdi Ghanimifard, John D. Kelleher 2018 University of Gothenburg, Sweden

Conference papers

The challenge for computational models of spatial descriptions for situated dialogue systems is the integration of information from different modalities. The semantics of spatial descriptions are grounded in at least two sources of information: (i) a geometric representation of space and (ii) the functional interaction of related objects. We train several neural language models on descriptions of scenes from a dataset of image captions and examine whether the functional or geometric bias of spatial descriptions reported in the literature is reflected in the estimated perplexity of these models. The results of these experiments have implications for the creation of …
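
For readers unfamiliar with the measure, perplexity is the exponential of the average negative log-probability a language model assigns to each token, so lower values mean the model found the text less surprising. The sketch below computes it from a list of per-token probabilities; the `perplexity` helper and the probability values are hypothetical and do not come from the paper's models or data.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the model's probability for each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token, then exponentiate.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Hypothetical per-token probabilities from two imaginary models scoring
# the same four-token description (illustrative values only).
uncertain_model = [0.25, 0.25, 0.25, 0.25]
confident_model = [0.9, 0.8, 0.95, 0.85]

ppl_uncertain = perplexity(uncertain_model)
ppl_confident = perplexity(confident_model)
```

A model that assigns every token probability 0.25 has perplexity 4, the same as guessing uniformly among four options, while the more confident model scores lower.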


A Markedly Different Approach: Investigating PIE Stops Using Modern Empirical Methods, Phillip Barnett 2018 University of Kentucky

Theses and Dissertations--Linguistics

In this thesis, I investigate a decades-old problem found in the stop system of Proto-Indo-European (PIE). More specifically, I will be investigating the paucity of */b/ in the forms reconstructed for the ancient, hypothetical language. As cross-linguistic evidence and phonological theory alone have fallen short of providing a satisfactory answer, herein will I employ modern empirical methods of linguistic investigation, namely laboratory phonology experiments and computational database analysis. Following Byrd 2015, I advocate for an examination of synchronic phenomena and behavior as a method for investigating diachronic change.

In Chapter 1, I present an overview of the various proposed phonological …


#Hashtags: A Look At The Evaluative Roles Of Hashtags On Twitter, Leah Rose Schaede 2018 University of Kentucky

Theses and Dissertations--Linguistics

Social media has become a large part of today’s pop culture and of keeping up with what is going on, not only in our social circles but around the world. It has given many people a platform to unite their causes, build fandoms, and share their commentary with the world. One tool for grouping posts together or commenting on a thought is the hashtag. In this paper I explore the evaluative roles of hashtags in social media discourse, specifically on Twitter. I use a sample of randomly selected tweets that I collected and compiled myself from the Twitter API stream. I …


Cloud‐Based Text Analytics Harvesting, Cleaning And Analyzing Corporate Earnings Conference Calls, Michael Chuancai Zhang, Vikram Gazula, Dan Stone, Hong Xie 2017 University of Kentucky

Commonwealth Computational Summit

No abstract provided.


Cloud-Based Text Analytics: Harvesting, Cleaning And Analyzing Corporate Earnings Conference Calls, Michael Chuancai Zhang, Vikram Gazula, Dan Stone, Hong Xie 2017 University of Kentucky

Commonwealth Computational Summit

Does management language cohesion in earnings conference calls matter to the capital market? As part of the research on this question, and taking advantage of modern information technology, this project:

  • harvested 115,882 earnings conference call transcripts from SeekingAlpha.com
  • parsed and structured 89,988 transcripts using regular expressions in Stata
  • analyzed 179,976 text files using Amazon Elastic Compute Cloud (Amazon EC2), which saved almost 2 years (675 days) of the project time

As this project is related to big data, text analytics, and big computing, it may be a good case to show how we can benefit from modern …


A Sentiment Analysis Of Language & Gender Using Word Embedding Models, Ellyn Rolleston Keith 2017 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Since Robin Lakoff started the conversation around language and gender with her 1975 essay “Language and Woman’s Place,” extensive work has been done on analyzing sociolinguistics associated with gender. While much work has been done on the differences between how men and women use language, there is less research to be found on language about women as opposed to language about men. In this work, I build a word embedding model from a corpus of Wikipedia film summaries and use this model to create lists of words associated with men and words associated with women. I then use sentiment analysis …

