Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Physical Sciences and Mathematics

University of Massachusetts Amherst

2007

Artificial Intelligence

Full-Text Articles in Entire DC Network

Semi-Supervised Classification With Hybrid Generative/Discriminative Methods, Gregory Druck, Chris Pal, Xiaoping Zhu, Andrew McCallum, Jan 2007

Andrew McCallum

In this paper, we study semi-supervised learning using hybrid generative/discriminative methods. Specifically, we compare two recently proposed frameworks for combining generative and discriminative classifiers and apply them to semi-supervised classification. In both cases we explore the tradeoff between maximizing a discriminative likelihood of labeled data and a generative likelihood of unlabeled data. While prominent semi-supervised learning methods assume low-density regions between classes or are subject to generative modeling assumptions, hybrid generative/discriminative methods allow semi-supervised learning in the presence of strongly overlapping classes and reduce the risk of modeling structure in the unlabeled data that is irrelevant for the specific …


Generalized Component Analysis For Text With Heterogeneous Attributes, Xuerui Wang, Chris Pal, Andrew McCallum, Jan 2007

Andrew McCallum

We present a class of richly structured, undirected hidden variable models suitable for simultaneously modeling text along with other attributes encoded in different modalities. Our model generalizes techniques such as Principal Component Analysis to heterogeneous data types. In contrast to other approaches, this framework allows modalities such as words, authors and timestamps to be captured in their natural, probabilistic encodings. We demonstrate the effectiveness of our framework on the task of author prediction from 13 years of the NIPS conference proceedings and for a recipient prediction task using a 10-month academic email archive of a researcher. Our approach should be …