Open Access. Powered by Scholars. Published by Universities.®


Andrew McCallum, University of Massachusetts Amherst, 2009

Articles 1 - 11 of 11

Full-Text Articles in Physical Sciences and Mathematics

Efficient Methods For Topic Model Inference On Streaming Document Collections, Limin Yao, David Mimno, Andrew McCallum, Jan 2009

Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model given a set of training documents requires approximate inference techniques that are computationally expensive. With today's large-scale, constantly expanding document collections, it is useful to be able to infer topic distributions for new documents without retraining the model. In this paper, we empirically evaluate the performance of several methods for topic inference in previously unseen documents, including methods based on Gibbs sampling, variational inference, and a new method inspired by text classification. The classification-based inference …
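To make the Gibbs-sampling baseline concrete, here is a minimal sketch (not the paper's implementation; all names are hypothetical) that infers a topic distribution for a single new document while holding the trained topic-word distributions fixed, which is exactly what avoids retraining:

```python
import random

def infer_topics(doc, phi, alpha=0.1, iters=50, seed=0):
    """Gibbs sampling over topic assignments for one new document,
    with the trained topic-word distributions phi (K x vocab) held
    fixed. doc is a list of word ids; returns a smoothed topic
    distribution for the document."""
    rng = random.Random(seed)
    K = len(phi)
    z = [rng.randrange(K) for _ in doc]      # random initial assignments
    counts = [0] * K
    for t in z:
        counts[t] += 1
    for _ in range(iters):
        for i, w in enumerate(doc):
            counts[z[i]] -= 1                # remove word i's assignment
            # sampling weight for topic k: (doc count + prior) * P(w | k)
            weights = [(counts[k] + alpha) * phi[k][w] for k in range(K)]
            r = rng.random() * sum(weights)
            k, acc = K - 1, 0.0
            for j, wt in enumerate(weights):
                acc += wt
                if r < acc:
                    k = j
                    break
            z[i] = k
            counts[k] += 1
    norm = len(doc) + K * alpha
    return [(counts[k] + alpha) / norm for k in range(K)]
```

With two topics concentrated on different words, a document dominated by topic 0's words should yield a distribution peaked on topic 0.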


Towards Theoretical Bounds For Resource-Bounded Information Gathering For Correlation Clustering, Pallika Kanani, Andrew McCallum, Ramesh Sitaraman, Jan 2009

Resource-bounded Information Gathering for Correlation Clustering deals with designing efficient methods for obtaining and incorporating information from external sources to improve the accuracy of clustering tasks. In this paper, we formulate the problem and some specific goals, and lay the foundation for a better theoretical understanding of this framework. We address the challenging problem of analytically quantifying the effect of changing a single edge weight on the partitioning of the entire graph, under some simplifying assumptions, hence demonstrating a method to calculate the expected reduction in error. Our analysis of different query selection criteria provides a formal way of comparing different heuristics. …
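To make the setting concrete, here is a minimal sketch (hypothetical names, not the paper's analysis) of a signed correlation-clustering score and the effect of perturbing a single edge weight while holding the partition fixed; the paper goes further and analyzes how the optimal partition itself shifts:

```python
def score(weights, clusters):
    """Signed correlation-clustering score: the sum of edge weights
    between co-clustered pairs. Positive weights reward merging two
    nodes; negative weights reward separating them."""
    label = {}
    for cid, members in enumerate(clusters):
        for m in members:
            label[m] = cid
    return sum(w for (u, v), w in weights.items() if label[u] == label[v])

def edge_change_effect(weights, clusters, edge, delta):
    """Effect on the score of perturbing one edge weight by delta,
    with the partition fixed: delta if the endpoints are
    co-clustered, zero otherwise."""
    new = dict(weights)
    new[edge] = new.get(edge, 0.0) + delta
    return score(new, clusters) - score(weights, clusters)
```

For example, with edges {("a","b"): 2.0, ("a","c"): -1.0} and clusters [{"a","b"}, {"c"}], only a change to the ("a","b") weight moves the score; an external query that sharpens a cross-cluster edge weight has no effect until it flips the partition.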


Bi-Directional Joint Inference For Entity Resolution And Segmentation Using Imperatively-Defined Factor Graphs, Sameer Singh, Karl Schultz, Andrew McCallum, Jan 2009

There has been growing interest in using joint inference across multiple subtasks as a mechanism for avoiding the cascading accumulation of errors in traditional pipelines. Several recent papers demonstrate joint inference between the segmentation of entity mentions and their de-duplication; however, these approaches have various weaknesses: inference information flows only in one direction, the number of uncertain hypotheses is severely limited, or the subtasks are only loosely coupled. This paper presents a highly-coupled, bi-directional approach to joint inference based on efficient Markov chain Monte Carlo sampling in a relational conditional random field. The model is specified with our new probabilistic programming …


Semi-Supervised Learning Of Dependency Parsers Using Generalized Expectation Criteria, Gregory Druck, Gideon Mann, Andrew McCallum, Jan 2009

In this paper, we propose a novel method for semi-supervised learning of non-projective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g., a noun's parent is often a verb). Model parameters are estimated using a generalized expectation (GE) objective function that penalizes the mismatch between model predictions and linguistic expectation constraints. In a comparison with two prominent "unsupervised" learning methods that require indirect biasing toward the correct syntactic structure, we show that GE can attain better accuracy with as few as 20 intuitive constraints. We also present positive experimental results on longer sentences in multiple languages.
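The penalty term of such a GE objective can be sketched in miniature (hypothetical names; a toy scalar constraint rather than the paper's parser features): the model's expected value of a constraint feature over unlabeled data is pushed toward a target expectation stated as prior knowledge.

```python
def ge_penalty(model_probs, target, strength=1.0):
    """GE-style penalty: squared mismatch between the model's average
    probability that a constraint holds (e.g. "this noun's parent is
    a verb") over unlabeled instances and the target expectation
    supplied as linguistic prior knowledge."""
    model_exp = sum(model_probs) / len(model_probs)
    return strength * (model_exp - target) ** 2
```

For instance, `ge_penalty([0.5, 0.7], 0.9)` is about 0.09, and the penalty vanishes as the model's expectation approaches the target; minimizing it alongside the supervised likelihood is what lets a handful of constraints bias learning.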


Inference And Learning In Large Factor Graphs With Adaptive Proposal Distributions And A Rank-Based Objective, Khashayar Rohanimanesh, Michael Wick, Andrew McCallum, Jan 2009

Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results in tasks such as identity uncertainty and information integration. However, inference and learning in these models is notoriously difficult. This paper formalizes, analyzes and proves convergence for the SampleRank algorithm, which learns extremely efficiently by calculating approximate parameter estimation gradients from each proposed MCMC jump. Next we present a parameterized, adaptive proposal distribution, which greatly increases the number of accepted jumps. We combine these methods in experiments on a real-world information extraction problem and demonstrate that the adaptive proposal distribution requires …
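A single SampleRank-style update can be sketched as follows (a toy rendering with hypothetical names, not the paper's formal algorithm): whenever the model's ranking of the current and proposed configurations disagrees with the objective's ranking, the weights take a perceptron-like step toward the objectively better configuration, so learning happens at every MCMC jump rather than after full inference.

```python
def samplerank_update(theta, feats_cur, feats_prop, obj_cur, obj_prop, lr=0.1):
    """One SampleRank-style update. theta maps feature names to
    weights; feats_* are sparse feature vectors for the current and
    proposed configurations; obj_* are their objective scores (e.g.
    accuracy against the truth). Update only on ranking disagreement."""
    def dot(w, f):
        return sum(w[k] * f.get(k, 0.0) for k in w)
    model_prefers_prop = dot(theta, feats_prop) > dot(theta, feats_cur)
    obj_prefers_prop = obj_prop > obj_cur
    if model_prefers_prop != obj_prefers_prop:
        better, worse = ((feats_prop, feats_cur) if obj_prefers_prop
                         else (feats_cur, feats_prop))
        # perceptron-like step toward the objectively better configuration
        for k in theta:
            theta[k] += lr * (better.get(k, 0.0) - worse.get(k, 0.0))
    return theta
```

When model and objective already agree on the ranking, the weights are left untouched, which is what makes the per-jump updates cheap.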


An Entity Based Model For Coreference Resolution, Michael Wick, Aron Culotta, Khashayar Rohanimanesh, Andrew McCallum, Jan 2009

Recently, many advanced machine learning approaches have been proposed for coreference resolution; however, all of the discriminatively-trained models reason over mentions, rather than entities. That is, they do not explicitly contain variables indicating the "canonical" values for each attribute of an entity (e.g., name, venue, title, etc.). This canonicalization step is typically implemented as a post-processing routine to coreference resolution prior to adding the extracted entity to a database. In this paper, we propose a discriminatively-trained model that jointly performs coreference resolution and canonicalization, enabling features over hypothesized entities. We validate our approach on two different coreference problems: newswire anaphora …
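Canonicalization as a post-processing step, the baseline the paper moves beyond, can be sketched as a simple vote over coreferent mentions (names here are hypothetical):

```python
from collections import Counter

def canonicalize(mentions):
    """Toy post-hoc canonicalization: for each attribute of a
    hypothesized entity, take the most frequent non-empty value
    among its coreferent mentions, breaking ties in favor of the
    longer string."""
    canonical = {}
    attrs = {a for m in mentions for a in m}
    for a in attrs:
        values = [m[a] for m in mentions if m.get(a)]
        if values:
            counts = Counter(values)
            canonical[a] = max(values, key=lambda v: (counts[v], len(v)))
    return canonical
```

The paper's point is that such canonical values are better treated as variables inside the model, so that features over the hypothesized entity (not just mention pairs) can inform coreference decisions.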


Active Learning By Labeling Features, Gregory Druck, Burr Settles, Andrew McCallum, Jan 2009

Methods that learn from prior information about input features such as generalized expectation (GE) have been used to train accurate models with very little effort. In this paper, we propose an active learning approach in which the machine solicits "labels" on features rather than instances. In both simulated and real user experiments on two sequence labeling tasks we show that our active learning method outperforms passive learning with features as well as traditional active learning with instances. Preliminary experiments suggest that novel interfaces which intelligently solicit labels on multiple features facilitate more efficient annotation.


FACTORIE: Probabilistic Programming Via Imperatively Defined Factor Graphs, Andrew McCallum, Karl Schultz, Sameer Singh, Jan 2009

Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. The power in relational models is in their repeated structure and tied parameters; at issue is how to define these structures in a powerful and flexible way. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an imperative language to express various aspects of model structure, inference, and learning. By combining the traditional, declarative, statistical semantics of factor graphs with imperative definitions of their construction and operation, we …


Generalized Expectation Criteria For Bootstrapping Extractors Using Record-Text Alignment, Kedar Bellare, Andrew McCallum, Jan 2009

Traditionally, machine learning approaches for information extraction require human annotated data that can be costly and time-consuming to produce. However, in many cases, there already exists a database (DB) with schema related to the desired output, and records related to the expected input text. We present a conditional random field (CRF) that aligns tokens of a given DB record and its realization in text. The CRF model is trained using only the available DB and unlabeled text with generalized expectation criteria. An annotation of the text induced from inferred alignments is used to train an information extractor. We evaluate our …


Alternating Projections For Learning With Expectation Constraints, Kedar Bellare, Gregory Druck, Andrew McCallum, Jan 2009

We present an objective function for learning with unlabeled data that utilizes auxiliary expectation constraints. We optimize this objective function using a procedure that alternates between information and moment projections. Our method provides an alternate interpretation of the posterior regularization framework (Graca et al., 2008), maintains uncertainty during optimization unlike constraint-driven learning (Chang et al., 2007), and is more efficient than generalized expectation criteria (Mann & McCallum, 2008). Applications of this framework include minimally supervised learning, semisupervised learning, and learning with constraints that are more expressive than the underlying model. In experiments, we demonstrate comparable accuracy to generalized expectation criteria …
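One half of such an alternating scheme, the information projection onto a single moment constraint, has a convenient form: exponentially tilt the distribution and solve for the tilting parameter. A minimal sketch under that assumption (hypothetical names; one scalar constraint, not the paper's full alternating procedure):

```python
import math

def i_project(p, f, target, lo=-20.0, hi=20.0, iters=60):
    """I-projection of a discrete distribution p onto the constraint
    set {q : E_q[f] = target}. The KL(q || p) minimizer has the form
    q(x) proportional to p(x) * exp(lam * f(x)); since E_q[f] is
    monotone increasing in lam, we can solve for lam by bisection."""
    def tilt(lam):
        w = [pi * math.exp(lam * fi) for pi, fi in zip(p, f)]
        z = sum(w)
        return [wi / z for wi in w]
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        q = tilt(mid)
        if sum(qi * fi for qi, fi in zip(q, f)) < target:
            lo = mid
        else:
            hi = mid
    return tilt((lo + hi) / 2.0)
```

For example, projecting the uniform distribution over two outcomes onto the constraint E[f] = 0.8 with f = (0, 1) yields approximately (0.2, 0.8); the alternating scheme interleaves such projections with projections back onto the model family.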


Training Factor Graphs With Reinforcement Learning For Efficient MAP Inference, Michael Wick, Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum, Jan 2009

Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blow-up, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these sampling-based methods suffer from local minima: the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL). Rather than setting parameters to maximize the likelihood of the …
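The local-minimum problem described above is what a Metropolis-style acceptance rule mitigates blindly; a minimal sketch (hypothetical names; not the paper's learned RL policy) shows downward jumps being accepted probabilistically so the chain can cross a scoring valley:

```python
import math
import random

def mcmc_map_search(score, propose, init, steps=2000, temp=2.0, seed=0):
    """Metropolis-style MAP search: a jump that lowers the score is
    still accepted with probability exp(delta / temp), letting the
    chain pass through lower-scoring configurations on its way to a
    better optimum. The paper's contribution is to *learn* when such
    downward jumps are fruitful instead of accepting them blindly."""
    rng = random.Random(seed)
    cur = best = init
    for _ in range(steps):
        nxt = propose(cur, rng)
        delta = score(nxt) - score(cur)
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            cur = nxt
            if score(cur) > score(best):
                best = cur
    return best
```

On a one-dimensional score with a local optimum at position 0 and the global optimum at position 10, greedy ascent from 0 is stuck immediately, while this sampler can cross the intervening valley.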