Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Articles 211 - 231 of 231

Full-Text Articles in Computational Linguistics

Frequency Based Incremental Attribute Selection For Gre., John D. Kelleher Jan 2007

Conference papers

The DIT system uses an incremental greedy search to generate descriptions, similar to the incremental algorithm described in (Dale and Reiter, 1995). The selection of the next attribute to be tested for inclusion in the description is ordered by the absolute frequency of each attribute in the training corpus. Attributes are selected in descending order of frequency (i.e. the attribute that occurred most frequently in the training corpus is selected first). Where two or more attributes have the same frequency of occurrence, the first attribute found with that frequency is selected. The type attribute is always included in the description. …
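
The ordering rule in this abstract can be illustrated with a minimal Python sketch. It is not the DIT system's code: the function names, the dictionary representation of objects, and the inclusion test (keep an attribute only if it rules out at least one remaining distractor, in the style of Dale and Reiter's incremental algorithm) are assumptions made for illustration.

```python
from collections import Counter


def attribute_order(training_descriptions):
    """Rank attributes by absolute corpus frequency, most frequent first.

    training_descriptions is a hypothetical corpus format: one list of
    attribute names per human-written description.
    """
    counts = Counter()
    for description in training_descriptions:
        counts.update(description)
    # Counter keeps first-seen order and sorted() is stable, so attributes
    # with equal frequency stay in the order they were first found.
    return sorted(counts, key=lambda attr: -counts[attr])


def describe(target, distractors, order, always=("type",)):
    """Greedy incremental search over attributes in the given order."""
    # The type attribute is always included, as stated in the abstract.
    description = {a: target[a] for a in always if a in target}
    remaining = [d for d in distractors
                 if all(d.get(a) == v for a, v in description.items())]
    for attr in order:
        if not remaining:
            break  # the target is already uniquely identified
        if attr in description or attr not in target:
            continue
        # Assumed inclusion test: the attribute must exclude a distractor.
        if any(d.get(attr) != target[attr] for d in remaining):
            description[attr] = target[attr]
            remaining = [d for d in remaining if d.get(attr) == target[attr]]
    return description
```

With a toy corpus in which type occurs four times, colour three times, and size twice, the search tests colour before size and stops as soon as no distractor remains.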


A Classifier To Evaluate Language Specificity In Medical Documents, Trudi Miller '08, Gondy A. Leroy, Samir Chatterjee, Jie Fan, Brian Thoms '09 Jan 2007

CGU Faculty Publications and Research

Consumer health information written by health care professionals is often inaccessible to the consumers it is written for. Traditional readability formulas examine syntactic features like sentence length and number of syllables, ignoring the target audience's grasp of the words themselves. The use of specialized vocabulary disrupts the understanding of patients with low reading skills, causing a decrease in comprehension. A naive Bayes classifier for three levels of increasing medical terminology specificity (consumer/patient, novice health learner, medical professional) was created with a lexicon generated from a representative medical corpus. Ninety-six percent accuracy in classification was attained. The classifier was then applied …
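
As a rough illustration of the kind of classifier the abstract names, the following Python sketch implements a multinomial naive Bayes model with add-one smoothing over three specificity levels. The class name, the smoothing choice, and the toy lexicon in the usage note are all assumptions; the authors' lexicon, features, and training procedure are not given in this listing.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with additive smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant, an assumed default

    def fit(self, documents, labels):
        # documents: list of token lists; labels: one class label per document.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(documents, labels):
            self.word_counts[label].update(words)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, words):
        total_docs = sum(self.class_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            log_prob = math.log(self.class_counts[label] / total_docs)
            for w in words:
                if w in self.vocab:  # words outside the lexicon are ignored
                    log_prob += math.log((counts[w] + self.alpha) / denom)
            return log_prob

        return max(self.class_counts, key=score)
```

Trained on three invented one-line documents, one per level (for example "tummy ache", "stomach pain", "abdominal dyspepsia"), the model assigns a new text to the level whose vocabulary it most resembles.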


Active Learning For Part-Of-Speech Tagging: Accelerating Corpus Annotation, Deryle W. Lonsdale, Eric K. Ringger, Peter J. Mcclanahan, Robbie A. Haertel, George Busby, Marc A. Carmen, James Carroll, Kevin Seppi Jan 2007

Faculty Publications

In the construction of a part-of-speech annotated corpus, we are constrained by a fixed budget. A fully annotated corpus is required, but we can afford to label only a subset. We train a Maximum Entropy Markov Model tagger from a labeled subset and automatically tag the remainder. This paper addresses the question of where to focus our manual tagging efforts in order to deliver an annotation of highest quality. In this context, we find that active learning is always helpful. We focus on Query by Uncertainty (QBU) and Query by Committee (QBC) and report on experiments with several baselines and …
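
Query by Uncertainty can be sketched in a few lines of Python. This is a generic illustration, not the experimental setup of the paper: the pool format, the use of summed per-token entropy as the uncertainty score, and the function names are assumptions. In practice the tag distributions would come from the trained tagger.

```python
import math


def token_entropy(distribution):
    """Shannon entropy (in nats) of one token's predicted tag distribution."""
    return -sum(p * math.log(p) for p in distribution.values() if p > 0)


def sentence_uncertainty(tag_distributions):
    """Assumed sentence score: sum of the per-token entropies."""
    return sum(token_entropy(d) for d in tag_distributions)


def query_by_uncertainty(pool, batch_size):
    """Return the ids of the sentences the tagger is least sure about.

    pool maps a sentence id to a list of per-token tag distributions,
    each a dict from tag to probability (hypothetical format).
    """
    ranked = sorted(pool, key=lambda sid: sentence_uncertainty(pool[sid]),
                    reverse=True)
    return ranked[:batch_size]
```

The selected sentences are sent to the human annotator, the tagger is retrained on the enlarged labeled set, and the loop repeats until the budget is spent.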


Proceedings Of The 4th Acl-Sigsem Workshop On Prepositions At Acl-2007., Fintan Costello, John D. Kelleher, Martin Volk Jan 2007

Conference papers

This volume contains the papers presented at the Fourth ACL-SIGSEM Workshop on Prepositions. This workshop is endorsed by the ACL Special Interest Group on Semantics (ACL-SIGSEM), and is hosted in conjunction with ACL 2007, taking place on 28th June, 2007 in Prague, the Czech Republic.


Multilingual Phoneme Models For Rapid Speech Processing System Development, Eric G. Hansen Sep 2006

Theses and Dissertations

Current speech recognition systems tend to be developed only for commercially viable languages. The resources needed for a typical speech recognition system include hundreds of hours of transcribed speech for acoustic models and 10 to 100 million words of text for language models; both of these requirements can be costly in time and money. The goal of this research is to facilitate rapid development of speech systems to new languages by using multilingual phoneme models to alleviate requirements for large amounts of transcribed speech. The GlobalPhone database, which contains transcribed speech from 15 languages, is used as source data …


Xnl-Soar, Incremental Parsing, And The Minimalist Program, Deryle W. Lonsdale, Lareina Hingson, Jamison Cooper-Leavitt, David W. Casbeer, Rebecca Madsen Mar 2006

Faculty Publications

Minimalist Principles (Chomsky 1995)

Hierarchy of Projections (Adger 2003)

Features play a central role

NP, VP symmetry including shells


Speech Recognition Using The Mellin Transform, Jesse R. Hornback Mar 2006

Theses and Dissertations

The purpose of this research was to improve performance in speech recognition. Specifically, a new approach was investigated by applying an integral transform known as the Mellin transform (MT) on the output of an auditory model to improve the recognition rate of phonemes through the scale-invariance property of the Mellin transform. Scale-invariance means that as a time-domain signal is subjected to dilations, the distribution of the signal in the MT domain remains unaffected. An auditory model was used to transform speech waveforms into images representing how the brain "sees" a sound. The MT was applied and features were extracted. The …
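
The scale-invariance property mentioned in the abstract follows from the standard definition of the Mellin transform. The short derivation below uses the textbook definition; the symbols f, g, a, and ω are introduced here for illustration and are not taken from the thesis.

```latex
\begin{align*}
\mathcal{M}\{f\}(s) &= \int_0^\infty f(t)\, t^{s-1}\, dt \\
\text{Let } g(t) &= f(at), \quad a > 0, \text{ and substitute } u = at: \\
\mathcal{M}\{g\}(s) &= \int_0^\infty f(at)\, t^{s-1}\, dt
  = \int_0^\infty f(u) \left(\frac{u}{a}\right)^{s-1} \frac{du}{a}
  = a^{-s}\, \mathcal{M}\{f\}(s) \\
\text{For } s = i\omega: \quad
\left|\mathcal{M}\{g\}(i\omega)\right| &= \left|a^{-i\omega}\right| \left|\mathcal{M}\{f\}(i\omega)\right|
  = \left|\mathcal{M}\{f\}(i\omega)\right|
\end{align*}
```

A dilation of the time axis therefore changes only the phase of the transform along the imaginary axis, leaving its magnitude unchanged.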


A Computational Model Of The Referential Semantics Of Projective Prepositions, John D. Kelleher, Josef Van Genabith Jan 2006

Conference papers

In this paper we present a framework for interpreting locative expressions containing the prepositions in front of and behind. These prepositions have different semantics in the viewer-centred and intrinsic frames of reference (Vandeloise, 1991). We define a model of their semantics in each frame of reference. The basis of these models is a novel parameterized continuum function that creates a 3-D spatial template. In the intrinsic frame of reference the origin used by the continuum function is assumed to be known a priori and object occlusion does not impact on the applicability rating of a point in the spatial template. …


Proximity In Context: An Empirically Grounded Computational Model Of Proximity For Processing Topological Spatial Expression., John D. Kelleher, Geert-Jan Kruijff, Fintan Costello Jan 2006

Conference papers

The paper presents a new model for context-dependent interpretation of linguistic expressions about spatial proximity between objects in a natural scene. The paper discusses novel psycholinguistic experimental data that tests and verifies the model. The model has been implemented, and enables a conversational robot to identify objects in a scene through topological spatial relations (e.g. "X near Y"). The model can help motivate the choice between topological and projective prepositions.


Incremental Generation Of Spatial Referring Expressions In Situated Dialogue, John D. Kelleher, Geert-Jan Kruijff Jan 2006

Conference papers

This paper presents an approach to incrementally generating locative expressions. It addresses the issue of combinatorial explosion inherent in the construction of relational context models by: (a) contextually defining the set of objects in the context that may function as a landmark, and (b) sequencing the order in which spatial relations are considered using a cognitively motivated hierarchy of relations, and visual and discourse salience.


Automatic Creation Of Web Services From Extraction Ontologies, Deryle W. Lonsdale, Cui Tao, Yihong Ding Jan 2006

Faculty Publications

The Semantic Web promises to provide timely, targeted access to user-specified information online. Though standardized services exist for performing this work, specifying these services is too complex for most people. Annotating these services is also problematic. A similar situation exists for traditional information extraction, where ontologies are increasingly used to specify information used by various extraction methods. The approach we introduce in this paper involves converting such ontologies into executable Java code. These APIs act individually or compositionally as services for Semantic Web extraction.


An Operator-Based Account Of Semantic Processing, Deryle W. Lonsdale, C. Anton Rytting Jan 2006

Faculty Publications

This paper explores issues of psychological plausibility in modeling natural language understanding within Soar, a symbolic cognitive model. It focuses on constructing syntactic and semantic representations in simulated real time, with particular emphasis on word sense disambiguation (WSD). We discuss (i) what level of WSD should be modeled and (ii) how to use resources such as WordNet to inform these models. A preliminary model of coarse-grained WSD is included to show how syntactic, semantic, and other knowledge sources interact in Soar. Finally, we explore issues of interleaving, learning, and integrating other WSD approaches with Soar's native model of learning.


Resolving Automatic Prepositional Phrase Attachments By Non-Statistical Means, Deryle W. Lonsdale, Michael B. Manookin Jan 2004

Faculty Publications

Prepositional-phrase attachment is a topic of active research in the field of computational linguistics. Properly attaching prepositional phrases to their pertinent constituent proves straightforward for humans, but inferring these attachments in a cognitive modeling system becomes difficult. For example, in the sentence, ‘Ralph threw the frisbee to John,’ the prepositional phrase ‘to John’ will attach to the verb phrase ‘threw’. In another example, ‘Joe saw the dog with fur,’ the prepositional phrase ‘with fur’ will attach directly to the noun phrase ‘the dog.’ Humans would have little difficulty resolving these examples, but for computers this would be difficult.


Integrating Perception, Language And Problem Solving In A Cognitive Agent For A Mobile Robot., Deryle W. Lonsdale, D. Paul Benjamin, Damian M. Lyons Jan 2004

Faculty Publications

We are implementing a unified cognitive architecture for a mobile robot. Our goal is to endow a robot agent with the full range of cognitive abilities, including perception, use of natural language, learning and the ability to solve complex problems. The perspective of this work is that an architecture based on a unified theory of robot cognition has the best chance of attaining human-level performance.

This agent architecture is an integration of three theories: a theory of cognition embodied in the Soar system, the RS formal model of sensorimotor activity and an algebraic theory of decomposition and reformulation.

These three …


Combining Learning Approaches For Incremental On-Line Parsing, Deryle W. Lonsdale, Michael B. Manookin Jan 2004

Faculty Publications

This paper discusses the integration of two different machine learning approaches to modeling language, NL-Soar and analogical modeling (AM). The resulting hybrid system is capable of functionality that is not possible when using only one of the systems in isolation. After a brief introduction of each system, an explanation is given of how AM is used to provide information useful to NL-Soar for two tasks. Examples are given, and related issues are outlined.


Nl-Soar Update, Deryle W. Lonsdale Jun 2003

Faculty Publications

No abstract provided.


Nl-Soar And Lg-Soar: Ongoing Work, Deryle W. Lonsdale Jun 2002

Faculty Publications

Goals:

Expand Soar knowledge and explore possible uses on-campus

Provide and support an NL capability to the Soar research community

Toolkits, resources, knowledge repositories

Carry out research into the cognitive modeling of linguistic performance


Dialog Act Modeling For Automatic Tagging And Recognition Of Conversational Speech, Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, Paul Taylor, Rachel Martin, Carol Van Ess-Dykema, Marie Meteer Sep 2000

Integrated Engineering Department Publications

We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Back channel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue acts are modeled via a dialogue act n-gram. The statistical dialogue grammar …
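
One simplified way to picture decoding under such a model is a Viterbi search in which each hidden state stands for a dialogue act, transitions come from a dialogue act bigram, and each utterance contributes a likelihood per act. The Python sketch below makes those assumptions explicit; the function name, the input format, and all probabilities in the usage note are invented for illustration and do not reproduce the authors' model or its lexical and prosodic components.

```python
import math


def viterbi(likelihoods, start, bigram):
    """Most probable dialogue act sequence under a bigram HMM.

    likelihoods: one dict per utterance, act -> P(evidence | act)
    start:       act -> P(act) for the first utterance
    bigram:      previous act -> {act: P(act | previous act)}
    All probabilities are assumed to be strictly positive.
    """
    acts = list(start)
    # Log-probability of the best path ending in each act at the first step.
    best = {a: math.log(start[a]) + math.log(likelihoods[0][a]) for a in acts}
    backpointers = []
    for evidence in likelihoods[1:]:
        previous_best, best, pointers = best, {}, {}
        for act in acts:
            prev = max(acts, key=lambda p: previous_best[p]
                       + math.log(bigram[p][act]))
            best[act] = (previous_best[prev] + math.log(bigram[prev][act])
                         + math.log(evidence[act]))
            pointers[act] = prev
        backpointers.append(pointers)
    # Trace the best path backwards from the most probable final act.
    path = [max(acts, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With two toy acts, a first utterance whose evidence favors Question, and a bigram in which a Question is usually followed by a Statement, the decoder labels an ambiguous second utterance as a Statement because of the sequence constraint.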


The Variable Elision Of Unstressed Vowels In European Portuguese: A Case Study, David James Silva Dec 1993

David Silva

European varieties of Portuguese exhibit a process whereby unstressed vowels, particularly schwa, optionally undergo elision: an item such as idade ‘age’ can be realized as [ida'd] and para Maria ‘for Maria’ may surface as [prɐmɐrí'ɐ]. While previous research in the study of phonological variation of this sort has typically focused on syntactic, morphological, functional, and segmental factors as the primary linguistic conditions for accurately characterizing variable processes (Guy 1980; Poplack & Walter 1986, among many others), less work has been done investigating the role of prosodic factors in this respect. Yet if one believes (along with Nespor and Vogel 1986, …


Concept Association, Sally Yeates Sedelow Jan 1989

Journal of the Arkansas Academy of Science

The complement to decomposition in scientific research is composition. In human language computing, composition is achieved by way of semantic association and the generation of strings of entities. That generation of strings takes place progressively: e.g., strings of symbols (words), strings of strings (sentences), strings of strings of strings (paragraphs), etc. The mathematical (topological, graph-theoretic) analysis of Roget's Thesaurus (1962) has opened a door onto a broad vista of potential achievements in such areas as artificial intelligence and expert systems, through the analysis of concept association, or concept composition.


A Comparison Of Norm-Referenced, Traditional, And Computer-Assisted Language Assessments, Michel P. Helmke Jan 1987

Masters Theses

Current literature in the field of communication disorders suggests that traditional norm-referenced tests may yield erroneous or misleading information regarding a child's level of language acquisition. Additional research suggests that the most valid and reliable technique for determining a client's level of linguistic expertise is language sampling and analysis. Language sampling and analysis has traditionally been rejected as a means of evaluation, especially for the school-age child, due to the length of time necessary to complete such analyses. In recent years, language sampling and analysis techniques have been redesigned as computer software application programs. Computer software application programs may significantly …