Digital Commons Network

Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences

University of Massachusetts Amherst

Selected Works

2012

Articles 1 - 12 of 12

Full-Text Articles in Entire DC Network

Proceedings Of The OSS 2012 Doctoral Consortium, Klaas-Jan Stol, Charles M. Schweik, Imed Hammouda Sep 2012

Charles M. Schweik

Papers accepted (and revised) by doctoral students who participated in the Open Source Systems (OSS) 2012 Doctoral Consortium, Hammamet, Tunisia.


Combining Joint Models For Biomedical Event Extraction, David McClosky, Sebastian Riedel, Mihai Surdeanu, Andrew McCallum, Christopher D. Manning Jun 2012

Andrew McCallum

Background: We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the best scoring event structure. Our primary focus is on stacking where the predictions from the Stanford system are used as features in the UMass system. For comparison, we look at simpler model combination techniques such as intersection and union which require only the outputs from each system and combine them directly. Results: First, we find that stacking substantially improves performance while …
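
As a rough illustration of the three combination strategies the abstract names, the sketch below reduces event structures to sets of (trigger, argument, role) tuples; the function and variable names are hypothetical, and the real systems operate on much richer structured outputs.

```python
# Minimal sketch of stacking, union, and intersection for combining two
# event-extraction systems.  Events are simplified to (trigger, argument,
# role) tuples purely for illustration.

def union_combine(events_a, events_b):
    """Keep every event predicted by either system."""
    return set(events_a) | set(events_b)

def intersection_combine(events_a, events_b):
    """Keep only events both systems agree on."""
    return set(events_a) & set(events_b)

def stacked_features(base_features, other_system_events, candidate):
    """Stacking: add an indicator feature telling one model whether the
    other system also predicted the candidate event."""
    features = dict(base_features)
    features["other_system_predicts"] = 1.0 if candidate in other_system_events else 0.0
    return features

if __name__ == "__main__":
    umass = {("binding", "IL-2", "Theme"), ("expression", "p53", "Theme")}
    stanford = {("binding", "IL-2", "Theme")}
    print(union_combine(umass, stanford))
    print(intersection_combine(umass, stanford))
    print(stacked_features({"bias": 1.0}, stanford, ("binding", "IL-2", "Theme")))
```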


Identifying Independence In Relational Models, Marc Maier, David Jensen Jun 2012

David Jensen

The rules of d-separation provide a framework for deriving conditional independence facts from model structure. However, this theory only applies to simple directed graphical models. We introduce relational d-separation, a theory for deriving conditional independence in relational models. We provide a sound, complete, and computationally efficient method for relational d-separation, and we present empirical results that demonstrate effectiveness.
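
For readers unfamiliar with the propositional case the abstract builds on, here is a minimal, self-contained sketch of checking ordinary d-separation on a small DAG via the standard ancestral-graph-plus-moralization construction; the relational extension the paper introduces is not shown, and the example graph is illustrative.

```python
# Sketch of ordinary d-separation: build the ancestral subgraph of the query
# nodes, moralize it, delete the conditioning set, and test reachability.

from collections import deque

def ancestors(dag, nodes):
    """All nodes with a directed path into `nodes`, plus `nodes` themselves."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        node = stack.pop()
        for parent, children in dag.items():
            if node in children and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def d_separated(dag, xs, ys, zs):
    """True iff every path between xs and ys is blocked given zs."""
    relevant = ancestors(dag, set(xs) | set(ys) | set(zs))
    # Moral graph over the ancestral subgraph: connect parents to children
    # and co-parents to each other, dropping edge directions.
    undirected = {n: set() for n in relevant}
    for parent, children in dag.items():
        if parent not in relevant:
            continue
        for child in (c for c in children if c in relevant):
            undirected[parent].add(child)
            undirected[child].add(parent)
    for node in relevant:
        parents = [p for p in relevant if node in dag.get(p, ())]
        for i, p in enumerate(parents):
            for q in parents[i + 1:]:
                undirected[p].add(q)
                undirected[q].add(p)
    # Remove the conditioning set and check whether ys is reachable from xs.
    blocked = set(zs)
    seen = set(xs) - blocked
    frontier = deque(seen)
    while frontier:
        node = frontier.popleft()
        if node in ys:
            return False
        for neighbour in undirected.get(node, ()):
            if neighbour not in seen and neighbour not in blocked:
                seen.add(neighbour)
                frontier.append(neighbour)
    return True

if __name__ == "__main__":
    # X -> Z <- Y is a collider: X and Y are marginally independent,
    # but dependent once we condition on Z.
    dag = {"X": {"Z"}, "Y": {"Z"}, "Z": set()}
    print(d_separated(dag, {"X"}, {"Y"}, set()))   # True
    print(d_separated(dag, {"X"}, {"Y"}, {"Z"}))   # False
```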


Topic Models Conditioned On Arbitrary Features With Dirichlet-Multinomial Regression, David Mimno, Andrew McCallum Jun 2012

Andrew McCallum

Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.
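
A minimal sketch (not the authors' implementation) of the core DMR idea: each document's Dirichlet hyperparameters over topics are a log-linear function of its observed metadata features, alpha[d, t] = exp(x_d · lambda_t). The feature names and weights below are made up for illustration.

```python
# Feature-conditioned Dirichlet prior for a DMR-style topic model.

import numpy as np

def dmr_alpha(doc_features, lambdas):
    """doc_features: (num_docs, num_features); lambdas: (num_topics, num_features).
    Returns per-document Dirichlet parameters of shape (num_docs, num_topics)."""
    return np.exp(doc_features @ lambdas.T)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two documents described by [bias, author_feature, venue_feature].
    x = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0]])
    lam = rng.normal(scale=0.5, size=(4, 3))   # 4 topics, 3 features
    alpha = dmr_alpha(x, lam)
    # Sample document-topic proportions from the feature-conditioned prior.
    theta = np.vstack([rng.dirichlet(a) for a in alpha])
    print(alpha.round(3))
    print(theta.round(3))
```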


Inference By Minimizing Size, Divergence, Or Their Sum, Sebastian Riedel, David A. Smith, Andrew McCallum Jan 2012

Andrew McCallum

We speed up marginal inference by ignoring factors that do not significantly contribute to overall accuracy. In order to pick a suitable subset of factors to ignore, we propose three schemes: minimizing the number of model factors under a bound on the KL divergence between pruned and full models; minimizing the KL divergence under a bound on factor count; and minimizing the weighted sum of KL divergence and factor count. All three problems are solved using an approximation of the KL divergence that can be calculated in terms of marginals computed on a simple seed graph. Applied to synthetic image …
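
The following sketch makes the three objectives concrete under a simplifying assumption: each factor's contribution to the KL divergence between pruned and full models is treated as a precomputed, additive number (in the paper these are approximated from marginals on a seed graph). The factor names and numbers are illustrative.

```python
# Three factor-pruning objectives, assuming per-factor KL contributions
# are already available and additive.

def prune_min_factors(kl_contrib, kl_budget):
    """Keep as few factors as possible while the KL cost of the dropped
    factors stays within kl_budget: drop the cheapest factors first."""
    order = sorted(kl_contrib, key=kl_contrib.get)
    dropped, cost = set(), 0.0
    for f in order:
        if cost + kl_contrib[f] <= kl_budget:
            dropped.add(f)
            cost += kl_contrib[f]
    return [f for f in kl_contrib if f not in dropped]

def prune_min_kl(kl_contrib, max_factors):
    """Keep at most max_factors factors, choosing those whose removal
    would hurt the approximation most."""
    order = sorted(kl_contrib, key=kl_contrib.get, reverse=True)
    return order[:max_factors]

def prune_weighted(kl_contrib, weight):
    """Minimize weight * (#factors kept) + KL of dropped factors:
    keep a factor exactly when its KL contribution exceeds the weight."""
    return [f for f, c in kl_contrib.items() if c > weight]

if __name__ == "__main__":
    contrib = {"f1": 0.50, "f2": 0.02, "f3": 0.01, "f4": 0.20}
    print(prune_min_factors(contrib, kl_budget=0.05))   # ['f1', 'f4']
    print(prune_min_kl(contrib, max_factors=2))         # ['f1', 'f4']
    print(prune_weighted(contrib, weight=0.05))         # ['f1', 'f4']
```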


Unsupervised Relation Discovery With Sense Disambiguation, Limin Yao, Sebastian Riedel, Andrew McCallum Jan 2012

Andrew McCallum

To discover relation types from text, most methods cluster shallow or syntactic patterns of relation mentions, but consider only one possible sense per pattern. In practice this assumption is often violated. In this paper we overcome this issue by inducing clusters of pattern senses from feature representations of patterns. In particular, we employ a topic model to partition entity pairs associated with patterns into sense clusters using local and global features. We merge these sense clusters into semantic relations using hierarchical agglomerative clustering. We compare against several baselines: a generative latent-variable model, a clustering method that does not disambiguate between …
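
A toy sketch of the final merging step the abstract mentions: hierarchical agglomerative clustering over pattern-sense vectors to form relation clusters. The sense vectors below are invented bag-of-feature counts; in the paper they are induced by a topic model over entity pairs.

```python
# Merge pattern senses into relations with average-link HAC on cosine distance.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Each row: one pattern sense, described by counts over shared entity-pair features.
sense_vectors = np.array([
    [5, 0, 1, 0],   # "X, head of Y"       (leadership sense)
    [4, 1, 0, 0],   # "X leads Y"
    [0, 5, 0, 1],   # "X was born in Y"
    [0, 4, 1, 0],   # "X, a native of Y"
], dtype=float)

tree = linkage(sense_vectors, method="average", metric="cosine")
relations = fcluster(tree, t=0.5, criterion="distance")
print(relations)   # e.g. [1 1 2 2]: two discovered relations
```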


A Discriminative Hierarchical Model For Fast Coreference At Large Scale, Michael Wick, Sameer Singh, Andrew McCallum Jan 2012

Andrew McCallum

Methods that measure compatibility between mention pairs are currently the dominant approach to coreference. However, they suffer from a number of drawbacks, including difficulty scaling to large numbers of mentions and limited representational power. As the severity of these drawbacks continues to increase with the growing demand for more data, the need to replace pairwise approaches with a more expressive, highly scalable alternative is becoming increasingly urgent. In this paper we propose a novel discriminative hierarchical model that recursively structures entities into trees. These trees succinctly summarize the mentions, providing a highly compact, information-rich structure for reasoning about entities and …
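
A minimal sketch, under loose assumptions, of the data structure described: entity trees whose nodes keep compact summaries of the mentions beneath them, so a candidate mention is scored against a node summary rather than against every mention pairwise. The bag-of-token summary and overlap score are stand-ins for the paper's richer features and discriminative scoring.

```python
# Entity-tree nodes with aggregated summaries for cheap compatibility scoring.

from collections import Counter

class EntityNode:
    def __init__(self):
        self.children = []
        self.summary = Counter()   # aggregate of all mentions below this node

    def add_mention(self, tokens):
        self.summary.update(tokens)

    def add_child(self, child):
        self.children.append(child)
        self.summary.update(child.summary)

    def compatibility(self, tokens):
        # Cheap overlap score between a candidate mention and this node's summary.
        overlap = sum(self.summary[t] for t in tokens)
        return overlap / (sum(self.summary.values()) or 1)

if __name__ == "__main__":
    mccallum = EntityNode()
    for mention in (["a", "mccallum"], ["andrew", "mccallum"]):
        mccallum.add_mention(mention)
    jensen = EntityNode()
    for mention in (["d", "jensen"], ["david", "jensen"]):
        jensen.add_mention(mention)
    root = EntityNode()            # higher-level node summarizing both subtrees
    root.add_child(mccallum)
    root.add_child(jensen)
    new_mention = ["andrew", "mccallum"]
    # Route the new mention to the more compatible subtree of the root.
    best = max(root.children, key=lambda node: node.compatibility(new_mention))
    print(best is mccallum, round(best.compatibility(new_mention), 3))
```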


MAP Inference In Chains Using Column Generation, David Belanger, Alexandre Passos, Sebastian Riedel, Andrew McCallum Jan 2012

Andrew McCallum

Linear chains and trees are basic building blocks in many applications of graphical models. Although exact inference in these models can be performed by dynamic programming, this computation can still be prohibitively expensive with non-trivial target variable domain sizes due to the quadratic dependence on this size. Standard message-passing algorithms for these problems are inefficient because they compute scores on hypotheses for which there is strong negative local evidence. For this reason there has been significant previous interest in beam search and its variants; however, these methods provide only approximate inference. This paper presents new efficient exact inference algorithms based …
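
To make the quadratic cost concrete, here is the exact dynamic-programming baseline the abstract starts from: Viterbi MAP decoding on a linear chain, whose inner loop considers every pair of adjacent labels (the work that column generation avoids by scoring only a subset of label pairs). The scores are random placeholders.

```python
# Exact Viterbi MAP decoding for a linear chain; note the quadratic inner step.

import numpy as np

def viterbi(unary, transition):
    """unary: (seq_len, num_labels); transition: (num_labels, num_labels).
    Returns the highest-scoring label sequence."""
    seq_len, num_labels = unary.shape
    score = unary[0].copy()
    backptr = np.zeros((seq_len, num_labels), dtype=int)
    for t in range(1, seq_len):
        # Quadratic step: score every (previous label, current label) pair.
        cand = score[:, None] + transition + unary[t][None, :]
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    labels = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        labels.append(int(backptr[t, labels[-1]]))
    return labels[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    print(viterbi(rng.normal(size=(6, 5)), rng.normal(size=(5, 5))))
```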


Learning To Speed Up MAP Decoding With Column Generation, D. Belanger, A. Passos, S. Riedel, Andrew McCallum Jan 2012

Andrew McCallum

In this paper, we show how the connections between max-product message passing and linear programming relaxations allow for a more efficient exact algorithm for the MAP problem. Our proposed algorithm uses column generation to pass messages only on a small subset of the possible assignments to each variable, while guaranteeing to find the exact solution. This algorithm is three times faster than Viterbi decoding for part-of-speech tagging on WSJ data and as fast as beam search with a beam of size two, while remaining exact. The empirical performance of column generation depends on how quickly we can rule …
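
For contrast with the exact decoders above, the sketch below implements the approximate baseline the abstract compares against: beam search over the same chain scores with a configurable beam width. All inputs are random placeholders.

```python
# Approximate chain decoding with beam search over unary and transition scores.

import numpy as np

def beam_search(unary, transition, beam_width=2):
    """Keep only the beam_width best partial label sequences at each position."""
    seq_len, num_labels = unary.shape
    beam = [(float(unary[0, y]), [y]) for y in range(num_labels)]
    beam = sorted(beam, reverse=True)[:beam_width]
    for t in range(1, seq_len):
        expanded = []
        for score, seq in beam:
            prev = seq[-1]
            for y in range(num_labels):
                expanded.append((score + transition[prev, y] + unary[t, y], seq + [y]))
        beam = sorted(expanded, reverse=True)[:beam_width]
    return beam[0]   # (approximate best score, label sequence)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    print(beam_search(rng.normal(size=(6, 5)), rng.normal(size=(5, 5)), beam_width=2))
```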


Monte Carlo MCMC: Efficient Inference By Approximate Sampling, Sameer Singh, Michael Wick, Andrew McCallum Jan 2012

Andrew McCallum

Conditional random fields and other graphical models have achieved state-of-the-art results in a variety of tasks such as coreference, relation extraction, data integration, and parsing. Increasingly, practitioners are using models with more complex structure---higher tree-width, larger fan-out, more features, and more data---rendering even approximate inference methods such as MCMC inefficient. In this paper we propose an alternative MCMC sampling scheme in which transition probabilities are approximated by sampling from the set of relevant factors. We demonstrate that our method converges more quickly than a traditional MCMC sampler for both marginal and MAP inference. In an author coreference …
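
A rough sketch of the sampling idea as described, on a toy chain of binary variables with pairwise agreement factors: a Metropolis-Hastings-style step estimates a proposal's score change from a random subset of the factors it touches rather than from all of them. The model, constants, and acceptance rule are illustrative, not the paper's.

```python
# Toy MCMC sampler that scores proposals with a sampled subset of factors.

import math, random

random.seed(0)
N = 50                     # number of binary variables in the chain
COUPLING = 1.0             # weight of each pairwise agreement factor
state = [random.randint(0, 1) for _ in range(N)]

def touched_factors(i):
    """Indices of the pairwise factors adjacent to variable i."""
    return [j for j in (i - 1, i) if 0 <= j < N - 1]

def factor_score(j, s):
    return COUPLING if s[j] == s[j + 1] else -COUPLING

def mcmc_step(sample_fraction=0.5):
    i = random.randrange(N)
    factors = touched_factors(i)
    k = max(1, int(len(factors) * sample_fraction))
    subset = random.sample(factors, k)           # the "Monte Carlo" part
    proposal = list(state)
    proposal[i] = 1 - proposal[i]
    # Estimate the log-score change from the sampled factors only, rescaled.
    delta = sum(factor_score(j, proposal) - factor_score(j, state) for j in subset)
    delta *= len(factors) / k
    if delta >= 0 or random.random() < math.exp(delta):
        state[i] = proposal[i]

for _ in range(5000):
    mcmc_step()
print(sum(state), "of", N, "variables set to 1")
```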


Monte Carlo MCMC: Efficient Inference By Sampling Factors, Sameer Singh, Michael Wick, Andrew McCallum Jan 2012

Andrew McCallum

Discriminative graphical models such as conditional random fields and Markov logic networks have achieved state-of-the-art results in a variety of NLP and IE tasks, including coreference and relation extraction. Increasingly, automated knowledge extraction is demanding models with more complex structure—higher tree-width, larger fan-out, more features, more data—rendering even approximate inference methods such as MCMC inefficient. In this paper we propose a new MCMC sampling scheme where transition probabilities are approximated. We demonstrate that our method converges more quickly than a traditional MCMC sampler for both marginal and MAP inference. For a task of author coreference …


Probabilistic Databases Of Universal Schema, Limin Yao, Sebastian Riedel, Andrew McCallum Jan 2012

Andrew McCallum

In data integration we transform information from a source into a target schema. A general problem in this task is loss of fidelity and coverage: the source expresses more knowledge than can fit into the target schema, or knowledge that is hard to fit into any schema at all. This problem is taken to an extreme in information extraction (IE) where the source is natural language. To address this issue, one can either automatically learn a latent schema emergent in text (a brittle and ill-defined task), or manually extend schemas. We propose instead to store data in a probabilistic database …