Open Access. Powered by Scholars. Published by Universities.®

Library and Information Science Commons

Syracuse University

School of Information Studies - Faculty Scholarship

2005

Document classification

Full-Text Articles in Library and Information Science

Leveraging One-Class SVM and Semantic Analysis to Detect Anomalous Content, Ozgur Yilmazel, Svetlana Symonenko, Niranjan Balasubramanian, Elizabeth D. Liddy, Jan 2005

Experiments were conducted to test several hypotheses on methods for improving document classification for the malicious insider threat problem within the Intelligence Community. Bag-of-words (BOW) representations of documents were compared with Natural Language Processing (NLP)-based representations in both the typical and one-class classification problems, using the Support Vector Machine algorithm. Results show that the NLP features significantly improved classifier performance over the BOW approach in terms of both precision and recall, while using far fewer features. The one-class algorithm using NLP features demonstrated robustness when tested on new domains.
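
For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of a bag-of-words representation and of the precision and recall measures. It is illustrative only: the abstract does not describe the authors' tokenization, feature extraction, or evaluation code, so the function names `bag_of_words` and `precision_recall` and the simple letters-only tokenizer are assumptions made for this example.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return token counts for a document (hypothetical tokenizer:
    lowercase the text and keep runs of ASCII letters only)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


def precision_recall(predicted, actual):
    """Compute (precision, recall) for the positive class.

    predicted and actual are equal-length lists of booleans,
    where True marks a document labeled as the positive class.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if a and not p)    # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

A BOW vector of this kind records only word counts and discards word order and meaning, which is the baseline the NLP-based features were compared against.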