Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Articles 1 - 7 of 7

Full-Text Articles in Computational Linguistics

Alternative Translation Approach – Part I: "Labor Division", Ludvig Glavati Mar 2014

Ludvig Glavati

No abstract provided.


CECL: A New Baseline And A Non-Compositional Approach For The SICK Benchmark, Yves Bestgen Jan 2014

Yves Bestgen

This paper describes the two procedures submitted to SemEval 2014 Task 1 for determining the semantic similarity between sentences. MeanMaxSim, an unsupervised procedure, is proposed as a new baseline for assessing the efficiency gain provided by compositional models. It outperforms a number of other baselines by a wide margin. Compared to the word-overlap baseline, it has the advantage of taking into account the distributional similarity between words, which is also used in compositional models. The second procedure aims at building a predictive model using as predictors MeanMaxSim and (transformed) lexical features describing the differences between each sentence of a …


Quantifying The Development Of Phraseological Competence In L2 English Writing: An Automated Approach, Yves Bestgen, Sylviane Granger Jan 2014

Yves Bestgen

Based on the large body of research that shows phraseology to be pervasive in language, this study aims to assess the role played by phraseological competence in the development of L2 writing proficiency and text quality assessment. We propose to use CollGram, a technique that assigns to each pair of contiguous words (bigrams) in a learner text two association scores (mutual information and t-score) computed on the basis of a large reference corpus, the Corpus of Contemporary American English. Applied to the Michigan State University Corpus of second language writing, CollGram shows a longitudinal decrease in the use of collocations …
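The two association scores named in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions, MI = log2(O/E) and t = (O − E)/√O, where O is the observed bigram frequency and E = f(w1)·f(w2)/N is the frequency expected under independence. The abstract does not give the exact formulas used by CollGram, so the function name `bigram_association` and the toy corpus are hypothetical and stand in for a large reference corpus.

```python
import math
from collections import Counter


def bigram_association(tokens):
    """Map each contiguous bigram to (mutual information, t-score).

    Assumption: standard definitions, not necessarily those of CollGram.
    O = observed bigram count, E = f(w1) * f(w2) / N.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (mi, t_score)
    return scores


# Toy reference corpus (hypothetical); a real one would hold millions of tokens.
reference = "a b a b c".split()
for bigram, (mi, t_score) in bigram_association(reference).items():
    print(bigram, round(mi, 3), round(t_score, 3))
```

In this sketch the scores are computed from the reference corpus only; scoring a learner text would then amount to looking up each of its bigrams in the resulting table.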


Relation Between Harappan And Brahmi Scripts, Subhajit Kumar Ganguly Jan 2013

Subhajit Kumar Ganguly

Around 45 of the Harappan signs found make up almost 100 percent of the inscriptions, in some form or other, as said earlier. Of these 45 signs, around 40 are readily distinguishable. These form an almost exclusive and unique set. The primary signs are seen to have many variants, as in Brahmi. Many of these give a vivid picture of their evolution, depending upon the factors of time, place and usefulness. Even minor adjustments in such signs, depending upon these factors, are noteworthy. Many of the signs in this list …


Maximizing Classification Accuracy In Native Language Identification, Scott Jarvis, Yves Bestgen, Steve Pepper Jan 2013

Yves Bestgen

This paper reports our contribution to the 2013 NLI Shared Task. The purpose of the task was to train a machine-learning system to identify the native-language affiliations of 1,100 texts written in English by nonnative speakers as part of a high-stakes test of general academic English proficiency. We trained our system on the new TOEFL11 corpus, which includes 11,000 essays written by nonnative speakers from 11 native-language backgrounds. Our final system used an SVM classifier with over 400,000 unique features consisting of lexical and POS n-grams occurring in at least two texts in the training set. Our system …
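The feature-selection criterion mentioned in this abstract (n-grams occurring in at least two training texts) can be sketched as a document-frequency filter. This is an illustration only: the helper names `word_ngrams` and `select_features`, the n-gram orders, and the toy texts are hypothetical, and the SVM classifier itself is not shown.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def select_features(texts, max_n=2, min_docs=2):
    """Keep n-grams (orders 1..max_n) found in at least min_docs texts."""
    doc_freq = Counter()
    for tokens in texts:
        seen = set()  # count each n-gram once per text
        for n in range(1, max_n + 1):
            seen.update(word_ngrams(tokens, n))
        doc_freq.update(seen)
    return {gram for gram, df in doc_freq.items() if df >= min_docs}


# Toy training set (hypothetical); tokens could equally be POS tags.
training_texts = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["a", "dog"],
]
print(sorted(select_features(training_texts)))
```

Counting each n-gram once per text, rather than once per occurrence, is what makes the threshold apply to the number of texts.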


Evaluation Automatique De Textes Et Cohésion Lexicale, Yves Bestgen Jan 2012

Yves Bestgen

(Article in French). Automatic essay grading is growing in popularity because of its importance in education and, particularly, in foreign language learning. While several efficient systems have been developed over the last fifteen years, almost none of them takes the discourse level into account. Recently, a few studies proposed filling this gap with automatic indexes of lexical cohesion obtained from Latent Semantic Analysis, but the results were disappointing. Based on a well-known model of writing expertise, the present study proposes a new index of cohesion derived from work on the thematic segmentation …


The Low Entropy Conjecture: The Challenges Of Modern Irish Nominal Declension, Robert Malouf, Farrell Ackerman Jan 2011

Robert Malouf

No abstract provided.