Full-Text Articles in Computational Linguistics
Phonologically Informed Edit Distance Algorithms For Word Alignment With Low-Resource Languages, Richard T. McCoy, Robert Frank
We present three methods for weighting edit distance algorithms based on linguistic information. These methods base their penalties on (i) phonological features, (ii) distributional character embeddings, or (iii) differences between cognate words. We also introduce a novel method for evaluating edit distance through the task of low-resource word alignment by using edit-distance neighbors in a high-resource pivot language to inform alignments from the low-resource language. At this task, the cognate-based scheme outperforms our other methods and the Levenshtein edit distance baseline, showing that NLP applications can benefit from information about cross-linguistic phonological patterns.
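To illustrate the general idea of a linguistically weighted edit distance (not the paper's actual penalty scheme or feature inventory), here is a minimal sketch in which the substitution cost shrinks when two symbols share phonological features; the feature table and weights below are hypothetical placeholders:

```python
# Toy phonological feature sets for a few phonemes (illustrative values only).
FEATURES = {
    "p": {"labial", "stop", "voiceless"},
    "b": {"labial", "stop", "voiced"},
    "t": {"alveolar", "stop", "voiceless"},
    "a": {"vowel", "low"},
    "i": {"vowel", "high"},
}

def sub_cost(a, b):
    """Substitution penalty proportional to phonological feature mismatch."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a, set()), FEATURES.get(b, set())
    union = fa | fb
    if not union:
        return 1.0  # fall back to plain Levenshtein cost for unknown symbols
    return len(fa ^ fb) / len(union)  # Jaccard distance over feature sets

def weighted_edit_distance(s, t):
    """Standard dynamic-programming edit distance with feature-based substitution."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)
    for j in range(1, n + 1):
        d[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,       # deletion
                d[i][j - 1] + 1.0,       # insertion
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),
            )
    return d[m][n]
```

Under this toy table, substituting "p" for "b" (one feature apart) costs 0.5, so `weighted_edit_distance("pat", "bat")` is cheaper than a full Levenshtein substitution, which is the kind of phonological sensitivity the weighting schemes aim for.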
Jabberwocky Parsing: Dependency Parsing With Lexical Noise, Jungo Kasai, Robert Frank
Parsing models have long benefited from the use of lexical information, and indeed current state-of-the-art neural network models for dependency parsing achieve substantial improvements by benefiting from distributed representations of lexical information. At the same time, humans can easily parse sentences with unknown or even novel words, as in Lewis Carroll’s poem Jabberwocky. In this paper, we carry out jabberwocky parsing experiments, exploring how robust a state-of-the-art neural network parser is to the absence of lexical information. We find that current parsing models, at least under usual training regimens, are in fact overly dependent on lexical information, and perform …
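The experimental idea can be sketched as follows: jabberwocky-style input is simulated by replacing open-class (content) words with unknown-word placeholders while keeping function words and part-of-speech information intact, forcing a parser to rely on structure rather than lexical identity. This is an illustrative sketch, not the paper's exact setup; the tag set and placeholder format are assumptions:

```python
# Assumed open-class tag set (Universal Dependencies-style UPOS tags).
OPEN_CLASS = {"NOUN", "VERB", "ADJ", "ADV"}

def jabberwockify(tokens, tags):
    """Replace each open-class token with a tag-typed unknown-word placeholder,
    leaving closed-class (function) words untouched."""
    return [
        f"<UNK-{tag}>" if tag in OPEN_CLASS else tok
        for tok, tag in zip(tokens, tags)
    ]

# Example: a Jabberwocky-like fragment with gold POS tags.
tokens = ["the", "toves", "did", "gyre"]
tags = ["DET", "NOUN", "AUX", "VERB"]
noised = jabberwockify(tokens, tags)
# noised keeps "the" and "did" but masks the content words.
```

A parser trained on normal text can then be evaluated on such masked input to measure how much of its accuracy depends on seeing real word forms.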