Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Learning From Labeled Features Using Generalized Expectation Criteria, Gregory Druck, Gideon Mann, Andrew McCallum
Andrew McCallum
It is difficult to apply machine learning to new domains because we often lack labeled problem instances. In this paper, we provide a solution to this problem that leverages domain knowledge in the form of affinities between input features and classes. For example, in a baseball vs. hockey text classification problem, even without any labeled data, we know that the presence of the word "puck" is a strong indicator of hockey. We refer to this type of domain knowledge as a labeled feature. We propose a method for training discriminative probabilistic models with labeled features and unlabeled …
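The idea of a labeled feature can be illustrated with a minimal sketch. It assumes, since the abstract above is truncated and does not state the objective, that the constraint is scored as a KL divergence between a reference class distribution for the feature and the model's average predicted distribution over unlabeled documents containing that feature. The function names, the reference distribution for "puck", and the toy uniform model are all hypothetical.

```python
import math

def expected_label_distribution(docs, feature, predict_proba):
    """Average the model's predicted class distribution over the
    unlabeled documents that contain `feature`."""
    selected = [d for d in docs if feature in d]
    if not selected:
        raise ValueError("no document contains the feature")
    n_classes = len(predict_proba(selected[0]))
    totals = [0.0] * n_classes
    for d in selected:
        for k, p in enumerate(predict_proba(d)):
            totals[k] += p
    return [t / len(selected) for t in totals]

def kl_divergence(reference, predicted, eps=1e-12):
    """KL(reference || predicted); terms with zero reference mass are skipped."""
    return sum(r * math.log(r / max(p, eps))
               for r, p in zip(reference, predicted) if r > 0)

# Hypothetical unlabeled corpus: each document is a set of words.
docs = [{"puck", "ice", "goal"}, {"bat", "inning"}, {"puck", "stick"}]

# Assumed reference distribution over (baseball, hockey) for the word "puck".
reference = [0.1, 0.9]

# Toy stand-in for an untrained model: uniform over both classes.
def uniform_model(doc):
    return [0.5, 0.5]

predicted = expected_label_distribution(docs, "puck", uniform_model)
penalty = kl_divergence(reference, predicted)
```

Under these assumptions, training would adjust the model so that `penalty` shrinks, pulling its predictions on "puck" documents toward the reference distribution without using any labeled instances.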
Rapid Development Of Hindi Named Entity Recognition Using Conditional Random Fields And Feature Induction, Wei Li, Andrew McCallum
Andrew McCallum
This paper describes our application of Conditional Random Fields (CRFs) with feature induction to a Hindi named entity recognition task. With only five days of development time and little knowledge of the language, we automatically discover relevant features by providing a large array of lexical tests and using feature induction to construct the features that most increase conditional likelihood. To reduce overfitting, we use a combination of a Gaussian prior and early stopping based on the results of 10-fold cross-validation.
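The two overfitting controls mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian prior is taken to be zero-mean with a single variance `sigma_sq`, which turns into a quadratic penalty on the weights, and the stopping point is taken to be the training iteration with the best mean held-out score across folds. Both function names and the example scores are hypothetical.

```python
def penalized_objective(log_likelihood, weights, sigma_sq):
    """Conditional log-likelihood minus the penalty implied by a
    zero-mean Gaussian prior with variance `sigma_sq` on every weight."""
    return log_likelihood - sum(w * w for w in weights) / (2.0 * sigma_sq)

def best_stopping_iteration(fold_scores):
    """fold_scores[f][t] is the held-out score of fold f after iteration t+1.
    Returns the 1-based iteration with the highest mean score across folds."""
    n_iters = min(len(scores) for scores in fold_scores)
    means = [sum(scores[t] for scores in fold_scores) / len(fold_scores)
             for t in range(n_iters)]
    return max(range(n_iters), key=means.__getitem__) + 1

# Made-up scores for two folds over three training iterations.
stop_at = best_stopping_iteration([[0.5, 0.7, 0.6], [0.4, 0.8, 0.7]])
```

With 10-fold cross-validation, `fold_scores` would hold ten such lists, and the final model would be trained on all the data for `stop_at` iterations.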