Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

2014

Articles 1 - 14 of 14

Full-Text Articles in Computational Linguistics

An Empirical Study Of Semantic Similarity In WordNet And Word2Vec, Abram Handler Dec 2014

University of New Orleans Theses and Dissertations

This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others, with Word2Vec returning more hypernyms, synonyms and hyponyms than meronyms or holonyms. It also shows the probability that neighbors separated by a given cosine distance in Word2Vec are semantically related in WordNet. This result both adds to our understanding of the still-unknown Word2Vec and helps to benchmark new semantic tools built from word vectors.
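As a rough illustration of what "neighbors separated by a given cosine distance" means, the sketch below computes cosine distance between word vectors and ranks neighbors by it. The three-dimensional vectors and the helper names (`cosine_similarity`, `cosine_distance`, `nearest_neighbors`) are made up for this example; they are not taken from the thesis or from a trained Word2Vec model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined here as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def nearest_neighbors(word, vectors, k=2):
    """Return the k words closest to `word`, sorted by cosine distance."""
    target = vectors[word]
    distances = [
        (other, cosine_distance(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(distances, key=lambda pair: pair[1])[:k]


# Toy vectors, invented for illustration only (not real embeddings).
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for neighbor, distance in nearest_neighbors("dog", TOY_VECTORS):
        print(f"{neighbor}: cosine distance {distance:.3f}")
```

With real embeddings, each neighbor pair could then be checked against a lexical database to estimate how often pairs at a given distance are related, which is the kind of comparison the abstract describes.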


Computational Communication Intelligence: Exploring Linguistic Manifestation And Social Dynamics In Online Communication, Xiaoxi Xu Nov 2014

Doctoral Dissertations

We now live in an age of online communication. As social media becomes an integral part of our life, online communication becomes an essential life skill. In this dissertation, we aim to understand how people effectively communicate online. We research components of success in online communication and present scientific methods to study the skill of effective communication. This research advances the state of the art in machine learning and communication studies. For communication studies, we pioneer the study of a communication phenomenon we call Communication Intelligence in online interactions. We create a theory about communication intelligence that measures participants’ ten high-order …


Computational Modeling Of Learning Biases In Stress Typology, Robert D. Staubs Nov 2014

Doctoral Dissertations

This dissertation demonstrates a strong connection between the frequency of stress patterns and their relative learnability under a wide class of learning algorithms. These frequency results follow from hypotheses about the learner's available representations and the distribution of input data. Such hypotheses are combined with a model of learning to derive distinctions between classes of stress patterns, addressing frequency biases not modeled by traditional generative theory. I present a series of results for error-driven learners of constraint-based grammars. These results are shown both for single learners and learners in an iterated learning model. First, I show that with general n …


Predicting Survey Responses: How And Why Semantics Shape Survey Statistics On Organizational Behaviour, Ketil Arnulf, Kai R. Larsen, Øyvind Martinsen, Chih How Bong Sep 2014

Kai R.T. Larsen

Some disciplines in the social sciences rely heavily on collecting survey responses to detect empirical relationships among variables. We explored whether these relationships were a priori predictable from the semantic properties of the survey items, using language processing algorithms which are now available as new research methods. Language processing algorithms were used to calculate the semantic similarity among all items in state-of-the-art surveys from Organisational Behaviour research. These surveys covered areas such as transformational leadership, work motivation and work outcomes. This information was used to explain and predict the response patterns from real subjects. Semantic algorithms explained 60–86% of the …


The Role Of Emotional And Facial Expression In Synthesised Sign Language Avatars, Robert G Smith Sep 2014

Other Resources

This thesis explores the role that underlying emotional facial expressions might have with regard to understandability in sign language avatars. Focusing specifically on Irish Sign Language (ISL), we examine the Deaf community’s requirement for a visual-gestural language as well as some linguistic attributes of ISL which we consider fundamental to this research. Unlike spoken language, visual-gestural languages such as ISL have no standard written representation. Given this, we compare current methods of written representation for signed languages as we consider which, if any, is the most suitable transcription method for the medical receptionist dialogue corpus. A growing body of work …


Identification Of Informativeness In Text Using Natural Language Stylometry, Rushdi Shams Aug 2014

Electronic Thesis and Dissertation Repository

In this age of information overload, one experiences a rapidly growing over-abundance of written text. To assist with handling this bounty, this plethora of texts is now widely used to develop and optimize statistical natural language processing (NLP) systems. Surprisingly, the use of more fragments of text to train these statistical NLP systems may not necessarily lead to improved performance. We hypothesize that those fragments that help the most with training are those that contain the desired information. Therefore, determining informativeness in text has become a central issue in our view of NLP. Recent developments in this field have spawned …


The Effect Of Sensor Errors In Situated Human-Computer Dialogue, Niels Schütte, John D. Kelleher, Brian Mac Namee Aug 2014

Conference papers

Errors in perception are a problem for computer systems that use sensors to perceive the environment. If a computer system is engaged in dialogue with a human user, these problems in perception lead to problems in the dialogue. We present two experiments: one in which participants interact through dialogue with a robot with perfect perception to fulfil a simple task, and a second in which the robot is affected by sensor errors. We compare the resulting dialogues to determine whether the sensor problems have an impact on dialogue success.


Cosine Similarity For Article Section Classification: Using Structured Abstracts As A Proxy For An Annotated Corpus, Arthur T. Bugorski Jun 2014

Electronic Thesis and Dissertation Repository

During the last decade, the amount of research published in biomedical journals has grown significantly and at an accelerating rate. To fully explore all of this literature, new tools and techniques are needed for both information retrieval and processing. One such tool is the identification and extraction of key claims. In an effort to work toward claim-extraction, we aim to identify the key areas in the body of the article referred to by text in the abstract. In this project, our work is preliminary to that goal in that we attempt to match specific clauses in the abstract with …


Predicting Music Genre Preferences Based On Online Comments, Andrew J. Sinclair Jun 2014

Master's Theses

Communication Accommodation Theory (CAT) states that individuals adapt to each other’s communicative behaviors. This adaptation is called “convergence.” In this work we explore the convergence of writing styles of users of the online music distribution platform SoundCloud.com. In order to evaluate our system we created a corpus of over 38,000 comments retrieved from SoundCloud in April 2014. The corpus represents comments from 8 distinct musical genres: Classical, Electronic, Hip Hop, Jazz, Country, Metal, Folk, and World. Our corpus contains: short comments, frequent misspellings, little sentence structure, hashtags, emoticons, and URLs. We adapt techniques used by researchers analyzing other …


Alternative Translation Approach – Part I: "Labor Division", Ludvig Glavati Mar 2014

Ludvig Glavati

No abstract provided.


CECL: A New Baseline And A Non-Compositional Approach For The SICK Benchmark, Yves Bestgen Jan 2014

Yves Bestgen

This paper describes the two procedures for determining the semantic similarities between sentences submitted for the SemEval 2014 Task 1. MeanMaxSim, an unsupervised procedure, is proposed as a new baseline to assess the efficiency gain provided by compositional models. It outperforms a number of other baselines by a wide margin. Compared to the word-overlap baseline, it has the advantage of taking into account the distributional similarity between words that are also involved in compositional models. The second procedure aims at building a predictive model using as predictors MeanMaxSim and (transformed) lexical features describing the differences between each sentence of a …


Quantifying The Development Of Phraseological Competence In L2 English Writing: An Automated Approach, Yves Bestgen, Sylviane Granger Jan 2014

Yves Bestgen

Based on the large body of research that shows phraseology to be pervasive in language, this study aims to assess the role played by phraseological competence in the development of L2 writing proficiency and text quality assessment. We propose to use CollGram, a technique that assigns to each pair of contiguous words (bigrams) in a learner text two association scores (mutual information and t-score) computed on the basis of a large reference corpus, the Corpus of Contemporary American English. Applied to the Michigan State University Corpus of second language writing, CollGram shows a longitudinal decrease in the use of collocations …
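To make the idea of scoring bigrams against a reference corpus concrete, here is a minimal sketch. It uses the common textbook definitions of pointwise mutual information and t-score; whether CollGram computes them in exactly this way is an assumption, and the function names and toy token lists are hypothetical.

```python
import math
from collections import Counter


def reference_bigram_scores(reference_tokens):
    """Map each reference-corpus bigram to (mutual information, t-score).

    Assumed formulas, with O the observed bigram count and
    E = f(w1) * f(w2) / N the count expected under independence:
        MI = log2(O / E)
        t  = (O - E) / sqrt(O)
    """
    n = len(reference_tokens)
    unigram_counts = Counter(reference_tokens)
    bigram_counts = Counter(zip(reference_tokens, reference_tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigram_counts.items():
        expected = unigram_counts[w1] * unigram_counts[w2] / n
        mi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (mi, t_score)
    return scores


def score_learner_text(learner_tokens, scores):
    """Look up each contiguous learner bigram; unseen bigrams map to None."""
    return [
        (bigram, scores.get(bigram))
        for bigram in zip(learner_tokens, learner_tokens[1:])
    ]


if __name__ == "__main__":
    # Tiny invented "reference corpus" and "learner text" for illustration.
    reference = "a b a b c".split()
    learner = "a b c a".split()
    table = reference_bigram_scores(reference)
    for bigram, pair in score_learner_text(learner, table):
        print(bigram, pair)
```

A real setup would compute the table once from a large reference corpus and would likely apply frequency thresholds before trusting either score; those details are left out here.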


Position Class Preclusion: A Computational Resolution Of Mutually Exclusive Affix Positions, Rebecca O. Hale Jan 2014

Theses and Dissertations--Linguistics

In Paradigm Function Morphology, it is usual to model affix position classes with an ordered sequence of inflectional rule blocks. Each rule block determines how (or whether) a particular affix position is filled. In this model, competition among inflectional rules is assumed to be limited to members of the same rule block; thus, the appearance of an affix in one position cannot be precluded by the appearance of an affix in another position. I present evidence that apparently disconfirms this restriction and suggests that a more general conception of rule competition is necessary. The data appear to imply that an …


Perception Based Misunderstandings In Human-Computer Dialogues, Niels Schütte, John D. Kelleher, Brian Mac Namee Jan 2014

Articles

In a situated dialogue, misunderstandings may arise if the participants perceive or interpret the environment in different ways. In human-computer dialogue this may be due to sensor errors. We present an experiment system and a series of experiments in which we investigate this problem.