Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Rapid Development Of Hindi Named Entity Recognition Using Conditional Random Fields And Feature Induction, Wei Li, Andrew McCallum
Andrew McCallum
This paper describes our application of Conditional Random Fields (CRFs) with feature induction to a Hindi named entity recognition task. With only five days of development time and little knowledge of the language, we automatically discover relevant features by providing a large array of lexical tests and using feature induction to construct the features that most increase conditional likelihood. To reduce overfitting, we combine a Gaussian prior with early stopping based on the results of 10-fold cross-validation.
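The overfitting controls named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a binary logistic regression stands in for the CRF, plain gradient ascent stands in for whatever optimizer the paper used, and every name and value here (`k_fold_indices`, `pick_stopping_iteration`, `make_toy_data`, `sigma2`, `lr`, `max_iters`) is a hypothetical chosen for illustration. The Gaussian prior appears as the penalty term sum(w_j^2) / (2 * sigma2) subtracted from the log-likelihood, and the stopping point is the iteration whose held-out log-likelihood, summed over 10 folds, is highest.

```python
import math
import random


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most 1."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(w, data):
    """Conditional log-likelihood of the labels under weights w."""
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def heldout_curve(train, heldout, dim, sigma2, lr, max_iters):
    """Run gradient ascent on the penalized objective
    log-likelihood - sum(w_j^2) / (2 * sigma2)
    and record the held-out log-likelihood after every iteration."""
    w = [0.0] * dim
    curve = []
    for _ in range(max_iters):
        # Gradient of the Gaussian-prior penalty term.
        grad = [-wi / sigma2 for wi in w]
        # Gradient of the log-likelihood term.
        for x, y in train:
            err = y - sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j in range(dim):
                grad[j] += err * x[j]
        # Step size scaled by the training-set size (an assumption).
        w = [wi + lr * g / len(train) for wi, g in zip(w, grad)]
        curve.append(log_likelihood(w, heldout))
    return curve


def pick_stopping_iteration(data, dim, k=10, sigma2=10.0, lr=0.5, max_iters=50):
    """Return the 1-based iteration count with the best held-out
    log-likelihood summed across the k folds."""
    totals = [0.0] * max_iters
    for fold in k_fold_indices(len(data), k):
        held = set(fold)
        train = [d for i, d in enumerate(data) if i not in held]
        heldout = [data[i] for i in fold]
        curve = heldout_curve(train, heldout, dim, sigma2, lr, max_iters)
        for t, value in enumerate(curve):
            totals[t] += value
    best = max(range(max_iters), key=lambda t: totals[t])
    return best + 1


def make_toy_data(n, seed=0):
    """Hypothetical synthetic data: a bias feature plus two noisy inputs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [1.0, rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 1 if x[1] + 0.5 * x[2] + rng.gauss(0.0, 0.3) > 0 else 0
        data.append((x, y))
    return data
```

Calling `pick_stopping_iteration(make_toy_data(100), dim=3)` returns the number of training iterations to use when retraining on all the data. In a real sequence-labeling setting the folds would be drawn over sentences or documents rather than individual tokens, which is an assumption the abstract does not spell out.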