Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Off To A Good Start: Using Clustering To Select The Initial Training Set In Active Learning, Rong Hu, Brian Mac Namee, Sarah Jane Delany (Jan 2010)

Conference papers

Active learning (AL) is used in textual classification to alleviate the cost of labelling documents for training. An important issue in AL is the selection of a representative sample of documents to label for the initial training set that seeds the process, and clustering techniques have been used successfully for this purpose. However, the clustering techniques used are non-deterministic, which causes inconsistent behaviour in the AL process. In this paper we first illustrate the problems associated with using non-deterministic clustering for initial training set selection in AL. We then examine the performance of three deterministic clustering techniques for this task …
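
To make the idea of deterministic initial-set selection concrete, here is a minimal sketch. The abstract is truncated before it names the three deterministic techniques, so this is not the authors' method: it uses farthest-first traversal as a stand-in, with a bag-of-words representation and cosine distance as assumed choices. The function names (`bag_of_words`, `cosine_distance`, `select_initial_set`) are hypothetical. The only point illustrated is that, with no random initialisation, the same corpus always yields the same documents to label.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Term-frequency vector for one document (assumed representation)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def select_initial_set(docs, k):
    """Pick up to k document indices to label, with no randomness.

    Stand-in technique (an assumption, not taken from the paper):
    farthest-first traversal seeded with the document closest to the
    corpus centroid. Ties are broken by lowest index, because min() and
    max() return the first extreme element they encounter.
    """
    vectors = [bag_of_words(d) for d in docs]
    if not vectors or k <= 0:
        return []

    # Corpus centroid as summed term counts (scale does not affect cosine).
    centroid = Counter()
    for v in vectors:
        centroid.update(v)

    first = min(range(len(docs)),
                key=lambda i: cosine_distance(vectors[i], centroid))
    selected = [first]

    while len(selected) < min(k, len(docs)):
        def gap(i):
            # Distance from candidate i to its nearest already-selected doc.
            return min(cosine_distance(vectors[i], vectors[j]) for j in selected)

        candidates = [i for i in range(len(docs)) if i not in selected]
        selected.append(max(candidates, key=gap))

    return selected
```

Calling `select_initial_set(docs, k)` twice on the same list returns the same indices, which is the property a randomly initialised clustering method does not guarantee. The selected documents would then be labelled and used as the seed training set for the AL loop.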