Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 17 of 17

Full-Text Articles in Physical Sciences and Mathematics

A Continuous-Time Model Of Topic Co-Occurrence Trends, Wei Li, Xuerui Wang, Andrew McCallum Jan 2006


Recent work in statistical topic models has investigated richer structures to capture either temporal or inter-topic correlations. This paper introduces a topic model that combines the advantages of two recently proposed models: (1) The Pachinko Allocation model (PAM), which captures arbitrary topic correlations with a directed acyclic graph (DAG), and (2) the Topics over Time model (TOT), which captures time-localized shifts in topic prevalence with a continuous distribution over timestamps. Our model can thus capture not only temporal patterns in individual topics, but also the temporal patterns in their co-occurrences. We present results on a research paper corpus, showing interesting …


Pachinko Allocation: DAG-Structured Mixture Models Of Topic Correlations, Wei Li, Andrew McCallum Jan 2006


Latent Dirichlet allocation (LDA) and other related topic models are increasingly popular tools for summarization and manifold discovery in discrete data. However, LDA does not capture correlations between topics. In this paper, we introduce the pachinko allocation model (PAM), which captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). The leaves of the DAG represent individual words in the vocabulary, while each interior node represents a correlation among its children, which may be words or other interior nodes (topics). PAM provides a flexible alternative to recent work by Blei and Lafferty (2006), which captures …


Combining Generative And Discriminative Methods For Pixel Classification With Multi-Conditional Learning, B. Michael Kelm, Chris Pal, Andrew McCallum Jan 2006


It is possible to broadly characterize two approaches to probabilistic modeling in terms of generative and discriminative methods. Provided with sufficient training data, the discriminative approach is expected to yield superior accuracy compared to the analogous generative model, since no modeling power is expended on the marginal distribution of the features. Conversely, if the model is accurate, the generative approach can perform better with less data. In general, it is less vulnerable to overfitting and allows one to more easily specify meaningful priors on the model parameters. We investigate multi-conditional learning--a method combining the merits of both approaches. Through …


Tractable Learning And Inference With High-Order Representations, Aron Culotta, Andrew McCallum Jan 2006


Representing high-order interactions in data often results in large models with an intractable number of hidden variables. In these models, inference and learning must operate without instantiating the entire set of variables. This paper presents a Metropolis-Hastings sampling approach to address this issue, and proposes new methods to discriminatively estimate the proposal and target distribution of the sampler using a ranking function over configurations. We demonstrate our approach on the task of paper and author deduplication, showing that our method enables complex, advantageous representations of the data while maintaining tractable learning and inference procedures.
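The sampling machinery this abstract builds on can be sketched generically. Note the sketch below is plain Metropolis-Hastings with a symmetric Gaussian proposal on a toy continuous target; the paper's contribution, a discriminatively learned ranking function for the proposal and target, is not reproduced here.

```python
import math
import random

def metropolis_hastings(log_target, init, n_steps, step_size=0.5, seed=0):
    """Generic Metropolis-Hastings with a symmetric Gaussian proposal.

    log_target: unnormalized log-density of the target distribution.
    Because the proposal is symmetric, the acceptance ratio reduces to
    the Metropolis rule: accept with probability min(1, p(x')/p(x)).
    """
    rng = random.Random(seed)
    x = init
    log_p = log_target(x)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step_size)        # propose a local move
        log_p_new = log_target(x_new)
        if math.log(rng.random()) < log_p_new - log_p:  # accept / reject
            x, log_p = x_new, log_p_new
        samples.append(x)
    return samples

# Example: sample from a standard normal (log-density up to a constant).
samples = metropolis_hastings(lambda x: -0.5 * x * x, init=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
```

The appeal for large structured models is visible even in this toy: only the current and proposed configurations are ever evaluated, so the full set of hidden variables never needs to be instantiated.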


Group And Topic Discovery From Relations And Their Attributes, Xuerui Wang, Natasha Mohanty, Andrew McCallum Jan 2006


We present a probabilistic generative model of entity relationships and their attributes that simultaneously discovers groups among the entities and topics among the corresponding textual attributes. Block-models of relationship data have been studied in social network analysis for some time. Here we cluster in several modalities at once, incorporating the attributes (here, words) associated with certain relationships. Significantly, joint inference allows the discovery of topics to be guided by the emerging groups, and vice-versa. We present experimental results on two large data sets: sixteen years of bills put before the U.S. Senate, comprising their corresponding text and voting records, …


Learning Field Compatibilities To Extract Database Records From Unstructured Text, Michael Wick, Aron Culotta, Andrew McCallum Jan 2006


Named-entity recognition systems extract typed entities, such as people, organizations, and locations, from unstructured text. Rather than extract these fields in isolation, in this paper we present a record extraction system that clusters fields together into records (i.e. database tuples). We construct a probabilistic model of the compatibility of field values, then employ graph partitioning algorithms to partition fields into cohesive records. We also investigate compatibility functions over sets of fields, rather than simply pairs of fields, to examine how higher representational power can impact performance. We apply our techniques to the task of extracting contact records …
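The pipeline can be sketched with a toy greedy partitioner. Everything below is a hypothetical stand-in: the token-overlap compatibility function and the greedy merge rule are illustrative, not the learned model or the graph partitioning algorithms the paper uses.

```python
def greedy_partition(fields, compat, threshold=0.0):
    """Greedily assemble extracted fields into records.

    fields: list of extracted field strings.
    compat: function (field_a, field_b) -> score; positive = compatible.
    Each field joins the existing record with the highest average
    compatibility, or starts a new record if nothing beats threshold.
    """
    records = []
    for f in fields:
        best, best_score = None, threshold
        for rec in records:
            score = sum(compat(f, g) for g in rec) / len(rec)
            if score > best_score:
                best, best_score = rec, score
        if best is not None:
            best.append(f)
        else:
            records.append([f])
    return records

# Hypothetical compatibility: fields agree if they share any token.
def compat(a, b):
    return 1.0 if set(a.split()) & set(b.split()) else -1.0

recs = greedy_partition(
    ["Alice Amherst", "phone Amherst", "Bob Boston", "fax Boston"], compat)
# Two records emerge: the Amherst fields and the Boston fields.
```

Scoring a candidate field against a whole record (the `sum(...) / len(rec)` line) is where set-level, rather than purely pairwise, compatibility functions would plug in.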


CC Prediction With Graphical Models, Chris Pal, Andrew McCallum Jan 2006


We address the problem of suggesting who to add as an additional recipient (i.e. cc, or carbon copy) for an email under composition. We address the problem using graphical models for words in the body and subject line of the email as well as the recipients given so far on the email. The problem of cc prediction is closely related to the problem of expert finding in an organization. We show that graphical models present a variety of solutions to these problems. We present results using naively structured models and introduce a powerful new modeling tool: plated factor graphs.


Multi-Conditional Learning For Joint Probability Models With Latent Variables, Chris Pal, Xuerui Wang, Michael Kelm, Andrew McCallum Jan 2006


We introduce Multi-Conditional Learning, a framework for optimizing graphical models based neither on joint likelihood nor on conditional likelihood, but on a product of several marginal conditional likelihoods, each relying on common sets of parameters from an underlying joint model and predicting different subsets of variables conditioned on other subsets. When applied to undirected models with latent variables, such as the Harmonium, this approach can result in powerful, structured latent variable representations that combine some of the advantages of conditional random fields with the unsupervised clustering ability of popular topic models, such as latent Dirichlet allocation and its successors. …


On Discriminative And Semi-Supervised Dimensionality Reduction, Chris Pal, Michael Kelm, Xuerui Wang, Greg Druck, Andrew McCallum Jan 2006


We are interested in using the goal of making predictions to influence dimensionality reduction procedures. A number of new methods are emerging that aim to combine attributes of generative and discriminative approaches to data modeling, and new approaches to semi-supervised learning have also been emerging. We present and apply new methods to non-linear and richly structured problems, comparing and contrasting models designed for computer vision with those designed for text processing, and discuss essential properties that need to be preserved when reducing dimensionality.


Integrating Probabilistic Extraction Models And Data Mining To Discover Relations And Patterns In Text, Aron Culotta, Andrew McCallum, Jonathon Betz Jan 2006


In order for relation extraction systems to obtain human-level performance, they must be able to incorporate relational patterns inherent in the data (for example, that one's sister is likely one's mother's daughter, or that children are likely to attend the same college as their parents). Hand-coding such knowledge can be time-consuming and inadequate. Additionally, there may exist many interesting, unknown relational patterns that both improve extraction performance and provide insight into text. We describe a probabilistic extraction model that provides mutual benefits to both ``top-down'' relational pattern discovery and ``bottom-up'' relation extraction.


Joint Group And Topic Discovery From Relations And Text, Andrew McCallum, Xuerui Wang, Natasha Mohanty Jan 2006


We present a probabilistic generative model of entity relationships and textual attributes; the model simultaneously discovers groups among the entities and topics among the corresponding text. Block models of relationship data have been studied in social network analysis for some time; here, however, we cluster in multiple modalities at once. Significantly, joint inference allows the discovery of groups to be guided by the emerging topics, and vice-versa. We present experimental results on two large data sets: sixteen years of bills put before the U.S. Senate, comprising their corresponding text and voting records, and 43 years of similar data from the …


Multi-Conditional Learning: Generative/Discriminative Training For Clustering And Classification, Andrew McCallum, Chris Pal, Greg Druck, Xuerui Wang Jan 2006


This paper presents multi-conditional learning (MCL), a training criterion based on a product of multiple conditional likelihoods. When combining the traditional conditional probability of "label given input" with a generative probability of "input given label", the latter acts as a surprisingly effective regularizer. When applied to models with latent variables, MCL combines the structure-discovery capabilities of generative topic models, such as latent Dirichlet allocation and the exponential family harmonium, with the accuracy and robustness of discriminative classifiers, such as logistic regression and conditional random fields. We present results on several standard text data sets showing significant reductions in classification error …
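The criterion described here can be written compactly; the notation below is ours, not taken from the paper:

```latex
% Multi-conditional criterion: a weighted product of conditional
% likelihoods sharing the parameters \theta of one joint model.
\mathcal{O}(\theta) \;=\; \prod_{i=1}^{m} p\!\left(A_i \mid B_i;\, \theta\right)^{\alpha_i}
```

For the classification case described above, this specializes to $p(y \mid x;\theta)\, p(x \mid y;\theta)^{\alpha}$ with $\alpha > 0$: the generative factor is the term that acts as the regularizer, and $\alpha$ controls its strength.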


Topics Over Time: A Non-Markov Continuous-Time Model Of Topical Trends, Xuerui Wang, Andrew McCallum Jan 2006


This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. Unlike other recent work that relies on Markov assumptions or discretization of time, here each topic is associated with a continuous distribution over timestamps, and for each generated document, the mixture distribution over topics is influenced by both word co-occurrences and the document's timestamp. Thus, the meaning of a particular topic can be relied upon as constant, but the topics' occurrence and correlations change significantly over time. We present results on nine months of personal email, …
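The key modeling idea, each topic owning a continuous distribution over timestamps, can be sketched as a fragment of the generative story. The Beta parameterization over normalized timestamps matches the spirit of the model, but the specific parameter values and the helper names here are illustrative, not fit to any corpus.

```python
import random

# Each topic z gets a Beta distribution over timestamps normalized to
# [0, 1]. Illustrative parameters: topic 0 peaks early in the
# collection's timespan, topic 1 peaks late.
topic_time = {0: (2.0, 8.0), 1: (8.0, 2.0)}

def generate_token_and_time(theta, rng):
    """Draw one (topic, timestamp) pair.

    theta: a document's multinomial over topics (dict topic -> prob).
    The timestamp is drawn from the chosen topic's Beta; at inference
    time this ties topic assignments to document dates without any
    Markov assumption or discretization of time.
    """
    z = rng.choices(list(theta), weights=list(theta.values()))[0]
    a, b = topic_time[z]
    return z, rng.betavariate(a, b)

rng = random.Random(0)
draws = [generate_token_and_time({0: 0.9, 1: 0.1}, rng) for _ in range(1000)]
topic0_times = [t for z, t in draws if z == 0]
avg_t = sum(topic0_times) / len(topic0_times)  # near 2/(2+8) = 0.2
```

Because the timestamp density is continuous, a topic's prevalence can rise and fall at arbitrary points in time while the topic's word distribution, its meaning, stays fixed.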


Corrective Feedback And Persistent Learning For Information Extraction, Aron Culotta, Trausti Kristjansson, Andrew McCallum, Paul Viola Jan 2006


To successfully embed statistical machine learning models in real world applications, two post-deployment capabilities must be provided: (1) the ability to solicit user corrections and (2) the ability to update the model from these corrections. We refer to the former capability as corrective feedback and the latter as persistent learning. While these capabilities have a natural implementation for simple classification tasks such as spam filtering, we argue that a more careful design is required for structured classification tasks. One example of a structured classification task is information extraction, in which raw text is analyzed to automatically populate a database. In …


Practical Markov Logic Containing First-Order Quantifiers With Application To Identity Uncertainty, Aron Culotta, Andrew McCallum Jan 2006


Markov logic is a highly expressive language recently introduced to specify the connectivity of a Markov network using first-order logic. While Markov logic is capable of constructing arbitrary first-order formulae over the data, the complexity of these formulae is often limited in practice because of the size and connectivity of the resulting network. In this paper, we present approximate inference and estimation methods that incrementally instantiate portions of the network as needed to enable first-order existential and universal quantifiers in Markov logic networks. When applied to the problem of identity uncertainty, this approach results in a conditional probabilistic model that …
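As an illustration of the kind of formula involved (a standard identity-uncertainty example, not necessarily one used in this paper), transitivity of coreference is a universally quantified weighted clause:

```latex
% Weighted first-order formula (weight w); in a Markov logic network,
% each grounding that is satisfied contributes w to the log-potential.
w:\quad \forall x, y, z\;\;
  \mathrm{SameEntity}(x, y) \,\land\, \mathrm{SameEntity}(y, z)
  \;\Rightarrow\; \mathrm{SameEntity}(x, z)
```

Grounding this naively over $n$ mentions instantiates on the order of $n^3$ clauses, which is exactly the blow-up that motivates instantiating portions of the network incrementally, as needed.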


First-Order Probabilistic Models For Coreference Resolution, Aron Culotta, Michael Wick, Robert Hall, Andrew McCallum Jan 2006


Traditional noun phrase coreference resolution systems represent features only of pairs of noun phrases. In this paper, we propose a machine learning method that enables features over sets of noun phrases, resulting in a first-order probabilistic model for coreference. We outline a set of approximations that make this approach practical, and apply our method to the ACE coreference dataset, achieving an 11% error reduction over a comparable method that only considers features of pairs of noun phrases. This result demonstrates an example of how a powerful representation language can be incorporated into a probabilistic model and be scaled efficiently.
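The jump from pairwise to first-order representations can be illustrated with a toy feature. Both functions below are hypothetical stand-ins (head words are crudely approximated by last tokens); they are not features from the paper.

```python
def pairwise_head_match(m1, m2):
    """Pairwise feature: do two mentions share a head word?
    (Head crudely approximated as the last token, for illustration.)"""
    return m1.split()[-1].lower() == m2.split()[-1].lower()

def cluster_head_agreement(mentions):
    """First-order feature over a *set* of mentions: the fraction of
    the cluster sharing its majority head word. It scores an entire
    candidate entity at once, rather than one mention pair at a time."""
    heads = [m.split()[-1].lower() for m in mentions]
    top = max(set(heads), key=heads.count)
    return heads.count(top) / len(heads)

# 2 of 3 mentions in this candidate cluster share the head "powell".
score = cluster_head_agreement(["Mr. Powell", "Secretary Powell", "he"])
```

A model with only pairwise features must reconcile possibly conflicting pair decisions after the fact; set-level features let the model score whole candidate clusters directly, at the cost of the approximations the abstract mentions.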


Bibliometric Impact Measures Leveraging Topic Analysis, Gideon S. Mann, David Mimno, Andrew McCallum Jan 2006


Measurements of the impact and history of research literature provide a useful complement to scientific digital library collections. Bibliometric indicators have been extensively studied, mostly in the context of journals. However, journal-based metrics poorly capture topical distinctions in fast-moving fields, and are increasingly problematic in the context of open-access publishing. Recent developments in latent topic models have produced promising results for automatic sub-field discovery. The fine-grained, faceted topics produced by such models provide a clearer view of the topical divisions of a body of research literature and the interactions between those divisions. We demonstrate the usefulness of topic models …