Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Technological University Dublin

Series

2006

Case based reasoning

Full-Text Articles in Physical Sciences and Mathematics

Feature-Based and Feature-Free Textual CBR: A Comparison in Spam Filtering, Sarah Jane Delany, Derek Bridge, Jan 2006

Conference papers

Spam filtering is a text classification task to which Case-Based Reasoning (CBR) has been successfully applied. We describe the ECUE system, which classifies emails using a feature-based form of textual CBR. Then, we describe an alternative way to compute the distances between cases in a feature-free fashion, using a distance measure based on text compression. This distance measure has the advantages of having no set-up costs and being resilient to concept drift. We report an empirical comparison, which shows the feature-free approach to be more accurate than the feature-based system. These results are fairly robust over different compression algorithms in …
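
To illustrate the idea of a feature-free, compression-based distance between cases, here is a minimal sketch. It uses the normalized compression distance (NCD) with Python's zlib as the compressor and a simple k-nearest-neighbour vote. These choices are assumptions for illustration only: the abstract does not say which measure or compressors were used, and the names `compressed_size`, `ncd`, and `classify` are hypothetical, not part of ECUE.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    Assumption: NCD stands in for whichever compression-based measure
    the paper uses. Smaller values mean the texts share more content.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: str, cases: list[tuple[str, str]], k: int = 1) -> str:
    """Label a query by majority vote among its k nearest cases.

    `cases` is a list of (text, label) pairs. No features are extracted:
    the raw text is compared directly through the compressor.
    """
    ranked = sorted(cases, key=lambda case: ncd(query, case[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy case base, invented for this example.
    case_base = [
        ("cheap pills buy now limited offer click here to claim "
         "your free prize today", "spam"),
        ("the project review meeting is moved to monday morning, "
         "agenda attached", "ham"),
    ]
    query = ("limited offer: cheap pills, click here to claim "
             "your free prize today")
    print(classify(query, case_base, k=1))
```

Because the distance is computed on raw text, there is no feature selection or vocabulary to build, and a new email can be added to the case base without retraining. That is one way to read the "no set-up costs" claim in the abstract.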