Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

250 Full-Text Articles; 445 Authors; 37,428 Downloads; 39 Institutions

All Articles in Computational Linguistics

Word Learning As Category Formation, Spencer Caplan 2018 University of Pennsylvania

Proceedings of the Society for Computation in Linguistics

A fundamental question in word learning is how children, given only evidence about which objects a word has previously referred to, are able to generalize to the total class (Smith, 1979; Xu and Tenenbaum, 2007). For example, how does a child come to know that 'poodle' picks out only a specific subset of dogs rather than the whole class, and vice versa? Here we present a computational model of word learning that accounts for a wide range of previously conflicting experimental findings.


A Structural Theory Of Derivations, Zachary Stone 2018 University of Maryland, College Park

Proceedings of the Society for Computation in Linguistics

No abstract provided.


Differentiating Phrase Structure Parsing And Memory Retrieval In The Brain, Shohini Bhattasali, John Hale, Christophe Pallier, Jonathan Brennan, Wen-Ming Luh, R. Nathan Spreng 2018 Cornell University

Proceedings of the Society for Computation in Linguistics

On some level, human sentence comprehension must involve both memory retrieval and structural composition. This study differentiates these two processes using neuroimaging data collected during naturalistic listening. Retrieval is formalized in terms of "multiword expressions" while structure-building is formalized in terms of bottom-up parsing. The results most strongly implicate Anterior Temporal regions for structure-building and Precuneus Cortex for memory retrieval.


Modeling The Complexity And Descriptive Adequacy Of Construction Grammars, Jonathan Dunn 2018 Illinois Institute of Technology

Proceedings of the Society for Computation in Linguistics

This paper uses the Minimum Description Length paradigm to model the complexity of CxGs (operationalized as the encoding size of a grammar) alongside their descriptive adequacy (operationalized as the encoding size of a corpus given a grammar). These two quantities are combined to measure the quality of potential CxGs against unannotated corpora, supporting discovery-device CxGs for English, Spanish, French, German, and Italian. The results show (i) that these grammars provide significant generalizations as measured using compression and (ii) that more complex CxGs with access to multiple levels of representation provide greater generalizations than single-representation CxGs.
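The two-part score described in this abstract (encoding size of the grammar plus encoding size of the corpus given the grammar) can be illustrated with a toy calculation. The sketch below is not the paper's encoding scheme. It assumes a hypothetical grammar represented as a set of token sequences, a fixed cost of log2(vocab_size) bits per symbol, one flag bit per encoded unit, and greedy longest-match parsing; the names `grammar_bits`, `corpus_bits`, and `mdl_score` are introduced here for illustration only.

```python
import math

def grammar_bits(grammar, vocab_size):
    """Bits to write the grammar down: every symbol costs log2(vocab_size)."""
    return sum(len(c) for c in grammar) * math.log2(vocab_size)

def corpus_bits(corpus, grammar, vocab_size):
    """Bits to encode the corpus given the grammar (greedy longest match)."""
    # Ignore empty constructions and try the longest ones first.
    constructions = sorted((c for c in grammar if c), key=len, reverse=True)
    # Pointer into the grammar; costs nothing when there is only one entry.
    pointer_bits = math.log2(len(constructions)) if len(constructions) > 1 else 0.0
    literal_bits = math.log2(vocab_size)
    bits, i = 0.0, 0
    while i < len(corpus):
        match = next(
            (c for c in constructions if tuple(corpus[i:i + len(c)]) == c), None
        )
        if match:
            bits += 1 + pointer_bits  # 1 flag bit + pointer to a construction
            i += len(match)
        else:
            bits += 1 + literal_bits  # 1 flag bit + literal token
            i += 1
    return bits

def mdl_score(corpus, grammar, vocab_size):
    """Two-part MDL score: smaller is better."""
    return grammar_bits(grammar, vocab_size) + corpus_bits(corpus, grammar, vocab_size)

# Toy data (assumed for illustration): one phrase repeated eight times.
corpus = ["give", "me", "the", "book"] * 8
no_grammar = set()
one_construction = {("give", "me", "the", "book")}

print(mdl_score(corpus, no_grammar, vocab_size=4))
print(mdl_score(corpus, one_construction, vocab_size=4))
```

Under these assumptions the grammar with one construction yields the smaller total, because the bits spent on storing the construction are repaid by cheaper encoding of each repetition. That is the sense in which compression can serve as a measure of generalization.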


Sound Analogies With Phoneme Embeddings, Miikka P. Silfverberg, Lingshuang Mao, Mans Hulden 2018 University of Colorado

Proceedings of the Society for Computation in Linguistics

Vector space models of words in NLP (word embeddings) have recently been shown to reliably encode semantic information, offering capabilities such as solving proportional analogy tasks like man:woman::king:queen. We study how well these distributional properties carry over to similarly learned phoneme embeddings, and whether phoneme vector spaces align with articulatory distinctive features, using several methods of obtaining such continuous-space representations. We demonstrate a statistically significant correlation between distinctive feature spaces and vector spaces learned with word-context PPMI+SVD and word2vec, showing that many distinctive feature contrasts are implicitly present in phoneme distributions. Furthermore, these distributed representations allow us ...


Conditions On Abruptness In A Gradient-Ascent Maximum Entropy Learner, Elliott Moreton 2018 University of North Carolina, Chapel Hill

Proceedings of the Society for Computation in Linguistics

When does a gradual learning rule yield gradual learning performance? This paper studies a gradient-ascent Maximum Entropy phonotactic learner, as applied to two-alternative forced-choice performance expressed as log-odds. The main result is that slow initial performance cannot accelerate later if the initial weights are near zero, but can if they are not. Stated another way, abruptness in this learner is an effect of transfer, either from Universal Grammar in the form of an initial weighting, or from previous learning in the form of an acquired weighting.
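The kind of learner this abstract refers to can be sketched in a few lines. The example below is a generic gradient-ascent Maximum Entropy learner, not the paper's model or its experiments. It assumes a hypothetical two-form candidate set with a single feature, an assumed target distribution, and an assumed learning rate; all names (`maxent_probs`, `gradient_step`, `log_odds`, `train`) are illustrative.

```python
import math

def maxent_probs(weights, features):
    """P(x) proportional to exp(w . f(x)) over a finite candidate set."""
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in features]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gradient_step(weights, features, observed, rate):
    """One gradient-ascent step on the log-likelihood:
    w += rate * (observed feature expectation - model feature expectation)."""
    probs = maxent_probs(weights, features)
    new_weights = []
    for k, w in enumerate(weights):
        target = sum(q * fv[k] for q, fv in zip(observed, features))
        expected = sum(p * fv[k] for p, fv in zip(probs, features))
        new_weights.append(w + rate * (target - expected))
    return new_weights

def log_odds(weights, feats_a, feats_b):
    """Two-alternative forced-choice log-odds of choosing a over b."""
    return sum(w * (a - b) for w, a, b in zip(weights, feats_a, feats_b))

def train(weights, features, observed, rate, steps):
    for _ in range(steps):
        weights = gradient_step(weights, features, observed, rate)
    return weights

# Toy setup (assumed): form A carries the feature, form B does not,
# and A makes up 90% of the learner's input.
FEATURES = [(1.0,), (0.0,)]
OBSERVED = [0.9, 0.1]

final = train([0.0], FEATURES, OBSERVED, rate=0.5, steps=500)
print(log_odds(final, FEATURES[0], FEATURES[1]))
```

In this toy case the log-odds of A over B approach log(0.9/0.1) as training proceeds. Changing the initial weight vector passed to `train` is one way to explore how the starting point affects the shape of the learning curve.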


Using Rhetorical Topics For Automatic Summarization, Natalie M. Schrimpf 2018 Yale University

Proceedings of the Society for Computation in Linguistics

Summarization involves finding the most important information in a text in order to convey the meaning of the document. In this paper, I present a method for using topic information to influence which content is selected for a summary. Texts are divided into topics using rhetorical information that creates a partition of a text into a sequence of non-overlapping topics. To investigate the effect of this topic structure, I compare the output of summarizing an entire text without topics to summarizing individual topics and combining them into a complete summary. The results show that the use of these rhetorical topics ...


Decomposing Phonological Transformations In Serial Derivations, Andrew Lamont 2018 University of Massachusetts, Amherst

Proceedings of the Society for Computation in Linguistics

While most phonological transformations have been shown to be subsequential, there are tonal processes that do not belong to any subregular class, thereby making it difficult to identify a tighter bound on the complexity of phonological processes than the regular languages. This paper argues that a tighter bound obtains from examining the way transformations are computed: when derived in serial, phonological processes can be decomposed into iterated subsequential maps.


Phonologically Informed Edit Distance Algorithms For Word Alignment With Low-Resource Languages, Richard T. McCoy, Robert Frank 2018 Johns Hopkins University

Proceedings of the Society for Computation in Linguistics

We present three methods for weighting edit distance algorithms based on linguistic information. These methods base their penalties on (i) phonological features, (ii) distributional character embeddings, or (iii) differences between cognate words. We also introduce a novel method for evaluating edit distance through the task of low-resource word alignment by using edit-distance neighbors in a high-resource pivot language to inform alignments from the low-resource language. At this task, the cognate-based scheme outperforms our other methods and the Levenshtein edit distance baseline, showing that NLP applications can benefit from information about cross-linguistic phonological patterns.
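Weighting scheme (i), penalties based on phonological features, can be illustrated with a small sketch. This is not the authors' implementation. The feature table below is a hypothetical three-feature toy, the substitution cost is assumed to be the fraction of differing features, and symbols missing from the table fall back to the plain Levenshtein cost of 1.

```python
# Hypothetical binary feature table for a handful of consonants.
FEATURES = {
    "p": {"voiced": 0, "labial": 1, "nasal": 0},
    "b": {"voiced": 1, "labial": 1, "nasal": 0},
    "m": {"voiced": 1, "labial": 1, "nasal": 1},
    "t": {"voiced": 0, "labial": 0, "nasal": 0},
    "d": {"voiced": 1, "labial": 0, "nasal": 0},
    "n": {"voiced": 1, "labial": 0, "nasal": 1},
}

def substitution_cost(a, b):
    """Fraction of features on which a and b differ (assumed weighting)."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # unknown symbol: fall back to the Levenshtein cost
    return sum(fa[k] != fb[k] for k in fa) / len(fa)

def weighted_edit_distance(s, t, indel_cost=1.0):
    """Standard dynamic program, keeping only the previous row."""
    prev = [j * indel_cost for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        curr = [i * indel_cost]
        for j, b in enumerate(t, 1):
            curr.append(min(
                prev[j] + indel_cost,                  # delete a
                curr[j - 1] + indel_cost,              # insert b
                prev[j - 1] + substitution_cost(a, b)  # substitute or match
            ))
        prev = curr
    return prev[-1]

print(weighted_edit_distance("pat", "bat"))  # p/b differ in one feature
print(weighted_edit_distance("pat", "nat"))  # p/n differ in all three
```

With this weighting, "pat" is closer to "bat" than to "nat", whereas unweighted Levenshtein distance treats both pairs as equally distant.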


Double Trouble: The Problem Of Construal In Semantic Annotation Of Adpositions, Jena D. Hwang, Archna Bhatia, Na-Rae Han, Tim O'Gorman, Vivek Srikumar, Nathan Schneider 2018 Institute for Human and Machine Cognition

Proceedings of the Society for Computation in Linguistics

We consider the semantics of prepositions, revisiting a broad-coverage annotation scheme used for annotating all preposition tokens in a 55,000-word corpus of English. In an attempt to resolve problematic cases in English and apply the scheme to adpositions and case markers in other languages, we reconsider the assumption that an adposition’s lexical contribution is equivalent to the role/relation that it mediates, embracing the potential for construal to manage complexity and avoid sense proliferation. We suggest a framework to represent both the scene role and the adposition's lexical function, and discuss how it would allow for a ...


Quantitatively Assessing The Development Of Adjective Ordering Preferences Using Child-Directed And Child-Produced Speech Corpora, Galia Bar-Sever, Rachael Lee, Gregory Scontras, Lisa Pearl 2018 University of California, Irvine

Proceedings of the Society for Computation in Linguistics

No abstract provided.


Subregular Complexity Across Speech And Sign, Jon Rawski 2018 Stony Brook University

Proceedings of the Society for Computation in Linguistics

No abstract provided.


A Bayesian Investigation Of Factors Shaping The Network Structure Of Inflection Class Systems, Jeff Parker, Robert Reynolds, Andrea D. Sims 2018 Brigham Young University

Proceedings of the Society for Computation in Linguistics

No abstract provided.


Imdlawn Tashlhiyt Berber Syllabification Is Quantifier-Free, Kristina Strother-Garcia 2018 University of Delaware

Proceedings of the Society for Computation in Linguistics

Imdlawn Tashlhiyt Berber (ITB) is unusual due to its tolerance of non-vocalic syllabic nuclei. Rule-based and constraint-based accounts of ITB syllabification do not directly address the question of how complex the process is. Model theory and formal logic allow for comparison of complexity across different theories of phonology by identifying the computational power (or expressivity) of linguistic formalisms in a grammar-independent way. With these tools, I develop a mathematical formalism for representing ITB syllabification using Quantifier-Free (QF) logic, one of the least powerful logics known. This result indicates that ITB syllabification is relatively simple from a computational standpoint and that ...


Logical Metonymy In A Distributional Model Of Sentence Comprehension, Emmanuele Chersoni, Alessandro Lenci, Philippe Blache 2018 Aix-Marseille University

Proceedings of the Society for Computation in Linguistics

No abstract provided.


Exploring The Functional And Geometric Bias Of Spatial Relations Using Neural Language Models, Simon Dobnik, Mehdi Ghanimifard, John Kelleher 2018 University of Gothenburg, Sweden

Conference papers

The challenge for computational models of spatial descriptions for situated dialogue systems is the integration of information from different modalities. The semantics of spatial descriptions are grounded in at least two sources of information: (i) a geometric representation of space and (ii) the functional interaction of related objects. We train several neural language models on descriptions of scenes from a dataset of image captions and examine whether the functional or geometric bias of spatial descriptions reported in the literature is reflected in the estimated perplexity of these models. The results of these experiments have implications for the creation of ...


Losing Shahrazad: A Distant Reading Of 1001 Nights, Taysa Mohler 2018 Bard College

Senior Projects Spring 2018

This project is a distant reading analysis of seven 19th- and 20th-century English translations of One Thousand and One Nights or The Arabian Nights. Through the use of computer programming and distant reading, it becomes clear that the Nights' frame tale is the carrier of the internal logic and generative power of the story cycle. Further, the frame tale expresses the Nights' self-representation, which serves to undermine the historical use of the Nights as synecdoche for the Orient. Therefore, the translators that remove the frame story from their versions further the Nights' use as an Orientalist object, and take the ...


OneStopEnglish Corpus: A New Corpus For Automatic Readability Assessment And Text Simplification, Sowmya Vajjala, Ivana Lucic 2018 Iowa State University

English Conference Papers, Posters and Proceedings

This paper describes the collection and compilation of the OneStopEnglish corpus of texts written at three reading levels, and demonstrates its usefulness through two applications: automatic readability assessment and automatic text simplification. The corpus consists of 189 texts, each in three versions (567 in total). The corpus is now freely available under a CC BY-SA 4.0 license, and we hope that it will foster further research on the topics of readability assessment and text simplification.


A Markedly Different Approach: Investigating PIE Stops Using Modern Empirical Methods, Phillip Barnett 2018 University of Kentucky

Theses and Dissertations--Linguistics

In this thesis, I investigate a decades-old problem found in the stop system of Proto-Indo-European (PIE). More specifically, I investigate the paucity of */b/ in the forms reconstructed for the ancient, hypothetical language. As cross-linguistic evidence and phonological theory alone have fallen short of providing a satisfactory answer, I employ modern empirical methods of linguistic investigation, namely laboratory phonology experiments and computational database analysis. Following Byrd 2015, I advocate for an examination of synchronic phenomena and behavior as a method for investigating diachronic change.

In Chapter 1, I present an overview of the various proposed phonological ...


#Hashtags: A Look At The Evaluative Roles Of Hashtags On Twitter, Leah Rose Schaede 2018 University of Kentucky

Theses and Dissertations--Linguistics

Social media has become a large part of today’s pop culture and of keeping up with what is going on, not only in our social circles but around the world. It has given many a platform to unite their causes, build fandoms, and share their commentary with the world. One tool for grouping posts together or commenting on a thought is the hashtag. In this paper, I explore the evaluative roles of hashtags in social media discourse, specifically on Twitter. I use a sample of randomly selected tweets from the Twitter API stream I collected and compiled myself ...

