Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Technological University Dublin

2016

Clustering

Full-Text Articles in Engineering

Empirical Comparative Analysis Of 1-Of-K Coding And K-Prototypes In Categorical Clustering, Fei Wang, Hector Franco, John Pugh, Robert J. Ross, Sep 2016

Conference papers

Clustering is a fundamental machine learning application that partitions data into homogeneous groups. K-means and its variants are the most widely used class of clustering algorithms today. However, the original k-means algorithm can only be applied to numeric data. Categorical data must first be converted into numeric data through 1-of-K coding, which itself causes many problems. K-prototypes, another clustering algorithm derived from k-means, can handle categorical data by adopting a different notion of distance. In this paper, we systematically compare these two methods through an experimental analysis. Our analysis shows that K-prototypes is more …
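The two notions of distance contrasted in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the toy records, the helper names, and the use of simple matching dissimilarity (the mismatch count commonly associated with K-prototypes) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Vocabulary = List[Tuple[int, str]]


def build_vocabulary(records: Sequence[Sequence[str]]) -> Vocabulary:
    """Collect every (attribute index, category) pair seen in the data, sorted."""
    return sorted({(j, value) for row in records for j, value in enumerate(row)})


def one_of_k_encode(row: Sequence[str], vocabulary: Vocabulary) -> List[int]:
    """1-of-K coding: one binary column per (attribute, category) pair."""
    return [1 if row[j] == value else 0 for j, value in vocabulary]


def squared_euclidean(a: Sequence[int], b: Sequence[int]) -> int:
    """Squared Euclidean distance, the measure k-means minimises on numeric data."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def matching_dissimilarity(a: Sequence[str], b: Sequence[str]) -> int:
    """Simple matching dissimilarity: the number of attributes that differ.
    Assumed here as the categorical distance; the abstract does not name it."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical toy data: two categorical attributes (colour, size).
records = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "large"),
]

vocabulary = build_vocabulary(records)
encoded = [one_of_k_encode(row, vocabulary) for row in records]

for i in range(len(records)):
    for j in range(i + 1, len(records)):
        print(
            records[i], records[j],
            "1-of-K squared Euclidean:", squared_euclidean(encoded[i], encoded[j]),
            "matching dissimilarity:", matching_dissimilarity(records[i], records[j]),
        )
```

In this toy setting the 1-of-K representation needs four binary columns for two attributes, and each mismatching attribute contributes two units of squared Euclidean distance. The sketch compares only the distance measures; it does not reproduce the clustering experiments reported in the paper.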