Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Yves Bestgen

2013

Full-Text Articles in Computational Linguistics

Maximizing Classification Accuracy In Native Language Identification, Scott Jarvis, Yves Bestgen, Steve Pepper Jan 2013

Yves Bestgen

This paper reports our contribution to the 2013 NLI Shared Task. The purpose of the task was to train a machine-learning system to identify the native-language affiliations of 1,100 texts written in English by nonnative speakers as part of a high-stakes test of general academic English proficiency. We trained our system on the new TOEFL11 corpus, which includes 11,000 essays written by nonnative speakers from 11 native-language backgrounds. Our final system used an SVM classifier with over 400,000 unique features consisting of lexical and POS n-grams occurring in at least two texts in the training set. Our system …
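
The feature-selection criterion mentioned in the abstract (keeping only n-grams that occur in at least two training texts) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy sentences, the whitespace tokenization, the restriction to word unigrams and bigrams, and the binary feature values are all assumptions made for illustration. POS n-grams and the SVM training step are omitted.

```python
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return the set of all 1..n_max-grams (as tuples) found in a token list."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }


def build_vocabulary(texts, n_max=2, min_df=2):
    """Keep only n-grams that occur in at least min_df different texts."""
    doc_freq = Counter()
    for text in texts:
        # Each text contributes at most 1 to an n-gram's document frequency.
        doc_freq.update(ngrams(text.lower().split(), n_max))
    return sorted(g for g, count in doc_freq.items() if count >= min_df)


def vectorize(text, vocabulary, n_max=2):
    """Binary feature vector: 1 if the n-gram appears in the text, else 0."""
    present = ngrams(text.lower().split(), n_max)
    return [1 if g in present else 0 for g in vocabulary]


# Hypothetical toy "training set"; the real corpus is TOEFL11.
toy_texts = [
    "i am agree with this statement",
    "i am agree that cars are useful",
    "cars are useful in my opinion",
]

vocab = build_vocabulary(toy_texts)
vectors = [vectorize(t, vocab) for t in toy_texts]
```

Vectors built this way could then be passed to any linear classifier; the document-frequency threshold simply discards n-grams seen in a single text, which cannot generalize beyond that text.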