Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Articles 1 - 11 of 11

Full-Text Articles in Computational Linguistics

The Near-Synonymous Classifiers In Mandarin Chinese: Etymology, Modern Usage, And Possible Problems In L2 Classroom, Irina Kavokina Nov 2023

Masters Theses

Many Chinese classifiers are nearly synonymous: they can be used with the same head nouns without changing the meaning of the sentence. In other words, such classifiers can be used interchangeably or almost interchangeably. This poses a challenge for Chinese language learners, especially those who lack such a grammatical category in their own native language. Another complication arises from the ambiguous English translations of many classifiers.

In this paper we investigate the collocation behavior of near-synonymous Chinese classifiers, focusing on their semantic nuances and interchangeability. Analyzing 6 pairs of classifiers — 栋 and 幢, 匹 and 头, 批 and …


A Computational Analysis Of Volodymyr Zelenskyy's Public Diplomacy Discourse In Times Of Crisis, Amber Brittain-Hale Jul 2023

Education Division Scholarship

In this study, we delve into the public diplomacy discourse of Ukrainian President Volodymyr Zelenskyy during the ongoing crisis of the Russo-Ukrainian War. We aim to conduct a computational analysis of Zelenskyy's English, Russian, and Ukrainian speeches, exploring the linguistic patterns and code-switching employed in his discourse. The study period encompasses Russia’s build-up to and full-scale invasion of Ukraine from May 2019 to May 30, 2023. This time frame is crucial as it captures the dynamic development of the crisis and the expansion of Zelenskyy's presidency, providing a unique context for analyzing his public diplomacy efforts. By utilizing Linguistic Inquiry …


‘A Category Of Their Own’: Quantitative Methods In The Use Of Pile-Sort Data In Perceptual Dialectology, Zachary Ty Gill Jan 2023

Theses and Dissertations--Linguistics

The purpose of this study is to investigate how Mississippi Gulf Coast Creoles perceive language differences in their home area. A pile-sort task was carried out in which respondents were given stacks of cards with local communities written on them and instructed to stack together the regions where people “talk the same.” Once the piles were made, the fieldworker discussed their sortings with the respondents. The stacks were analyzed by means of a hierarchical agglomerative cluster analysis and non-parametric multidimensional scaling with k-means cluster analysis overlays to extract the perceived dialect areas. The groupings reveal that respondent strategies are based …


Otrouha: A Corpus Of Arabic Etds And A Framework For Automatic Subject Classification, Eman Abdelrahman, Fatimah Alotaibi, Edward A. Fox, Osman Balci Mar 2021

The Journal of Electronic Theses and Dissertations

Although the Arabic language is spoken by more than 300 million people and is one of the six official languages of the United Nations (UN), there has been less research done on Arabic text data (compared to English) in the realm of machine learning, especially in text classification. In the past decade, Arabic data such as news, tweets, etc. have begun to receive some attention. Although automatic text classification plays an important role in improving the browsability and accessibility of data, Electronic Theses and Dissertations (ETDs) have not received their fair share of attention, in spite of the huge number …


A Computational Study In The Detection Of English–Spanish Code-Switches, Yohamy C. Polanco Feb 2021

Dissertations, Theses, and Capstone Projects

Code-switching is the linguistic phenomenon where a multilingual person alternates between two or more languages in a conversation, whether that be spoken or written. This thesis studies the automatic detection of code-switching occurring specifically between English and Spanish in two corpora.

Twitter and other social media sites have provided an abundance of linguistic data that is available to researchers to perform countless experiments. Collecting the data is fairly easy if a study is on monolingual text, but if a study requires code-switched data, this becomes a complication as APIs only accept one language as a parameter. This thesis focuses on …


A Lexical Frequency Analysis Of Irish Sign Language, Robert G Smith, Markus Hofmann Sep 2020

Other Resources

Word frequency has a significant impact on language acquisition and fluency. It is often a point of reference for the teaching and assessing of a language and indeed, as a control for psycholinguistic studies. This paper presents the results of the first objective frequency analysis of lexical tokens from the Signs of Ireland corpus. We investigate the frequency of fully lexical, partly lexical and non-lexical signs in Irish Sign Language as they are presented in the corpus. We confirm the accuracy of the lexical gloss frequency data with a supplementary corpus subset that is tagged for grammatical class and additional …


Non-Manual Articulators In Irish Sign Language Verbs: An Analysis With Data Mining Association Rules, Robert G. Smith, Markus Hofmann Nov 2018

Conference Papers

The Signs of Ireland (SOI) corpus (Leeson et al., 2006) deploys a complex multi-tiered temporal data structure. The process of manually analyzing such data is laborious and cannot eliminate bias, and important patterns can often go completely unnoticed. In addition, as a result of the complex nature of the grammatical structures contained in the corpus, identifying complex linguistic associations or patterns across tiers is simply too intricate a task for a human to carry out in an acceptable timeframe. This work explores the application of data mining techniques on a set of multi-tiered temporal data from the SOI corpus. Building …


Emotional Facial Expressions In Synthesised Sign Language Avatars: A Manual Evaluation., Robert G Smith, Brian Nolan Oct 2015

Other Resources

This research explores and evaluates the contribution that facial expressions might have regarding improved comprehension and acceptability in sign language avatars. Focusing specifically on Irish Sign Language (ISL), the Deaf (the uppercase “D” in the word “Deaf” indicates Deaf as a culture as opposed to “deaf” as a medical condition) community’s responsiveness to sign language avatars is examined. The hypothesis of this research is as follows: augmenting an existing avatar with the seven widely accepted universal emotions identified by Ekman (Basic emotions: handbook of cognition and emotion. Wiley, London, 2005) to achieve underlying facial expressions will make that avatar more human-like …


Alternative Translation Approach – Part I: "Labor Division", Ludvig Glavati Mar 2014

Ludvig Glavati

No abstract provided.


Misheard Me Oronyminator: Using Oronyms To Validate The Correctness Of Frequency Dictionaries, Jennifer G. Hughes Jun 2013

Master's Theses

In the field of speech recognition, an algorithm must learn to tell the difference between "a nice rock" and "a gneiss rock". These identical-sounding phrases are called oronyms. Word frequency dictionaries are often used by speech recognition systems to help resolve phonetic sequences with more than one possible orthographic phrase interpretation, by looking up which oronym of the root phonetic sequence contains the most-common words.

Our paper demonstrates a technique used to validate word frequency dictionary values. We chose to use frequency values from the UNISYN dictionary, which tallies each word on a per-occurrence basis, using a proprietary text corpus, …
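
The frequency lookup described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the scoring rule (sum of log counts), and the toy counts are all assumptions made for illustration, and the counts are not UNISYN values.

```python
import math

# Hypothetical toy counts, invented for illustration (not UNISYN values).
TOY_FREQ = {"a": 1000, "nice": 120, "rock": 60, "gneiss": 1}


def score(phrase, freq):
    """Sum of log(count + 1) over the words; unseen words contribute 0.

    The scoring rule is an assumption; the abstract only says the system
    prefers the oronym that contains the most-common words.
    """
    return sum(math.log(freq.get(w, 0) + 1) for w in phrase.lower().split())


def pick_oronym(candidates, freq):
    """Return the candidate phrase whose words are most frequent overall."""
    return max(candidates, key=lambda p: score(p, freq))


best = pick_oronym(["a gneiss rock", "a nice rock"], TOY_FREQ)
print(best)  # prints "a nice rock"
```

With these toy counts the more common word "nice" outweighs "gneiss", so the lookup prefers "a nice rock". A frequency dictionary with inaccurate counts would bias this choice, which is why validating the dictionary values matters.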


Enterprise Users And Web Search Behavior, April Ann Lewis May 2010

Masters Theses

This thesis describes an analysis of user web query behavior associated with Oak Ridge National Laboratory’s (ORNL) Enterprise Search System (hereafter, ORNL Intranet). The ORNL Intranet provides users a means to search all kinds of data stores for relevant business and research information using a single query. The Global Intranet Trends for 2010 Report suggests the biggest current obstacle for corporate intranets is “findability and Siloed content”. Intranets differ from internets in the way they create, control, and share content, which can often make it difficult and sometimes impossible for users to find information. Stenmark (2006) first noted studies of corporate …