Open Access. Powered by Scholars. Published by Universities.®

Theory and Algorithms Commons

2012

Research Collection School Of Computing and Information Systems

One-pass clustering

Full-Text Articles in Theory and Algorithms

An Improved K-Nearest-Neighbor Algorithm For Text Categorization, Shengyi Jiang, Guansong Pang, Meiling Wu, Limin Kuang Jan 2012

Text categorization is an important tool for managing and organizing the rapidly growing volume of text data. Many text categorization algorithms have been explored in the literature, such as KNN, Naive Bayes and Support Vector Machine. KNN text categorization is an effective but computationally inefficient classification method. In this paper, we propose an improved KNN algorithm for text categorization, which builds the classification model by combining a constrained one-pass clustering algorithm with KNN text categorization. Empirical results on three benchmark corpora show that our algorithm can substantially reduce the number of text similarity computations and outperform state-of-the-art KNN, Naive Bayes and Support Vector Machine classifiers. …
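
The general idea in the abstract, clustering the training documents in a single pass and then running KNN against the clusters instead of every document, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the constraint, the similarity measure, or the voting rule, so the code below assumes that the constraint means each cluster holds documents of one label only, that similarity is cosine over sparse term-weight dictionaries, and that neighbors vote with their similarity as weight. The names `cosine`, `build_clusters`, `classify`, and the `threshold` and `k` parameters are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict: term -> weight)."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_clusters(docs, labels, threshold):
    """Scan the training set once; assumed constraint: one label per cluster."""
    clusters = []  # each cluster: {"label": str, "sum": dict, "size": int}
    for vec, label in zip(docs, labels):
        best, best_sim = None, threshold
        for c in clusters:
            if c["label"] != label:
                continue  # never merge documents with different labels
            # Cosine is scale-invariant, so the summed vector stands in
            # for the centroid without dividing by the cluster size.
            sim = cosine(vec, c["sum"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            # No same-label cluster is similar enough: start a new one.
            clusters.append({"label": label, "sum": dict(vec), "size": 1})
        else:
            for t, w in vec.items():
                best["sum"][t] = best["sum"].get(t, 0.0) + w
            best["size"] += 1
    return clusters


def classify(vec, clusters, k):
    """KNN over cluster representatives with similarity-weighted voting."""
    scored = sorted(
        ((cosine(vec, c["sum"]), c["label"]) for c in clusters),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for sim, label in scored:
        votes[label] += sim
    return max(votes, key=votes.get)


# Toy data, invented purely for illustration.
train = [
    {"ball": 1.0, "goal": 1.0},
    {"ball": 1.0, "team": 1.0},
    {"stock": 1.0, "market": 1.0},
    {"stock": 1.0, "bank": 1.0},
]
labels = ["sports", "sports", "finance", "finance"]
clusters = build_clusters(train, labels, threshold=0.4)
prediction = classify({"goal": 1.0, "team": 1.0}, clusters, k=1)
```

In this sketch, a query is compared against one representative per cluster rather than against every training document, which is where the reduction in similarity computations would come from; a lower `threshold` yields fewer, coarser clusters and fewer comparisons at classification time.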