Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Artificial Intelligence and Robotics

Singapore Management University

Research Collection School Of Computing and Information Systems

Series

2020

Applied statistics

Nearest Centroid: A Bridge Between Statistics And Machine Learning, Manoj Thulasidas Dec 2020

To guide our students of machine learning in their statistical thinking, we need conceptually simple and mathematically defensible algorithms. In this paper, we present the Nearest Centroid (NC) algorithm as a pedagogical tool that combines the key concepts behind two foundational algorithms: K-Means clustering and K-Nearest Neighbors (k-NN). In NC, we compute the centroid (as defined in the K-Means algorithm) of the observations belonging to each class in our training data set, and we use its distance from a new observation (similar to k-NN) for class prediction. Using this obvious extension, we will illustrate how the concepts of …
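The idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the paper's own code: the function names `fit_centroids` and `predict`, the toy data, and the choice of Euclidean distance are all assumptions made for the example, since the abstract does not specify a distance metric.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Return {label: centroid}, where each centroid is the per-feature mean
    of the training observations that carry that label."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest.
    Euclidean distance is an assumption made for this sketch."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Hypothetical toy data: two well-separated classes in two dimensions.
X_train = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5], [8.0, 8.0], [8.5, 9.0], [9.0, 8.5]]
y_train = ["a", "a", "a", "b", "b", "b"]

centroids = fit_centroids(X_train, y_train)
print(predict(centroids, [2.0, 2.0]))  # nearest to the centroid of class "a"
print(predict(centroids, [7.0, 9.0]))  # nearest to the centroid of class "b"
```

Training reduces to one mean per class, which mirrors the centroid update in K-Means, while prediction is a distance comparison in the spirit of k-NN, with one representative point per class instead of k neighbors.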