Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2002

Computer Science Faculty Publications

Missing data

Articles 1 - 1 of 1

Full-Text Articles in Physical Sciences and Mathematics

A Pseudo Nearest-Neighbor Approach For Missing Data Recovery On Gaussian Random Data Sets, Xiaolu Huang, Qiuming Zhu, Nov 2002

Computer Science Faculty Publications

Missing data handling is an important preparation step for most data discrimination or mining tasks. Inappropriate treatment of missing data may cause large errors or false results. In this paper, we study the effect of a missing data recovery method, namely the pseudo-nearest-neighbor substitution approach, on Gaussian-distributed data sets that represent typical cases in data discrimination and data mining applications. The error rate of the proposed recovery method is evaluated by comparing the clustering results of the recovered data sets to the clustering results obtained on the originally complete data sets. The results are also compared with …
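
The evaluation described in the abstract (recover the missing values, cluster the recovered data, and compare against the clustering of the complete data) can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define the pseudo-nearest-neighbor rule, so plain nearest-neighbor substitution is used as a stand-in. The use of k-means, the cluster centers, the missing rate, and every function name below are assumptions made for illustration only.

```python
import math
import random


def make_gaussian_clusters(centers, n_per_cluster, sigma, rng):
    """Draw n_per_cluster points around each center with isotropic Gaussian noise."""
    points, labels = [], []
    for label, center in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append([rng.gauss(c, sigma) for c in center])
            labels.append(label)
    return points, labels


def knock_out(points, missing_rate, rng):
    """Copy the data and blank at most one coordinate per row (None marks missing)."""
    damaged = [row[:] for row in points]
    for row in damaged:
        if rng.random() < missing_rate:
            row[rng.randrange(len(row))] = None
    return damaged


def nearest_neighbor_fill(damaged):
    """Stand-in recovery: copy each missing value from the closest complete row,
    with distance measured on the observed coordinates only."""
    complete = [row for row in damaged if None not in row]
    filled = []
    for row in damaged:
        if None not in row:
            filled.append(row[:])
            continue
        observed = [i for i, v in enumerate(row) if v is not None]
        donor = min(complete, key=lambda c: sum((c[i] - row[i]) ** 2 for i in observed))
        filled.append([donor[i] if v is None else v for i, v in enumerate(row)])
    return filled


def assign_to_nearest(points, centers):
    """Label each point with the index of its closest center."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k])) for p in points]


def kmeans(points, init_centers, iterations=20):
    """Basic k-means with fixed initial centers, so that labels from two runs
    started at the same centers can be compared directly (a simplification)."""
    centers = [c[:] for c in init_centers]
    for _ in range(iterations):
        labels = assign_to_nearest(points, centers)
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign_to_nearest(points, centers)


def disagreement_rate(labels_a, labels_b):
    """Fraction of points placed in different clusters by the two labelings."""
    return sum(a != b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def run_experiment(seed=0, missing_rate=0.2):
    """Cluster complete and recovered data, then report how often they disagree."""
    rng = random.Random(seed)
    centers = [[0.0, 0.0], [6.0, 6.0]]  # hypothetical, well-separated clusters
    original, _ = make_gaussian_clusters(centers, 100, 1.0, rng)
    recovered = nearest_neighbor_fill(knock_out(original, missing_rate, rng))
    return disagreement_rate(kmeans(original, centers), kmeans(recovered, centers))
```

Calling `run_experiment()` returns the share of points whose cluster changes after recovery, which plays the role of the error rate in this sketch. Starting both k-means runs from the same centers sidesteps the label-matching problem; a fuller comparison would align cluster labels explicitly.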