Open Access. Powered by Scholars. Published by Universities.®

Biomedical Informatics Commons

Computer Science ETDs

Theses/Dissertations

2023

Full-Text Articles in Biomedical Informatics

Machine Learning Methods For Computational Phenotyping Using Patient Healthcare Data With Noisy Labels, Praveen Kumar Feb 2023

Positive and Unlabeled (PU) learning problems abound in real-world applications. In healthcare informatics, patients diagnosed with a specific disease are labeled positive, but undiagnosed patients cannot be assumed negative. PU learning can improve classification performance and estimate the positive fraction, α, among unlabeled samples. However, algorithms that rely on the Selected Completely At Random (SCAR) assumption are inadequate when that assumption fails (e.g., when severe cases are overrepresented) or when class imbalance is substantial. This dissertation presents and evaluates new algorithms to overcome these limitations. The proposed methods outperform the state of the art for α-estimation, enhance classification performance, …
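To make the role of α concrete, the following is a minimal sketch of the SCAR setting only, not the dissertation's algorithms. It assumes every positive case is labeled with the same probability regardless of its features. The names `prior` (overall fraction of positives), `label_freq` (probability that a positive is labeled), `alpha_under_scar`, and `simulate_alpha` are hypothetical, introduced here for illustration.

```python
import random


def alpha_under_scar(prior: float, label_freq: float) -> float:
    """Positive fraction among unlabeled samples when positives are
    labeled completely at random with probability label_freq (SCAR)."""
    unlabeled_pos = prior * (1.0 - label_freq)   # positives left unlabeled
    unlabeled_total = 1.0 - prior * label_freq   # everyone who is not labeled
    return unlabeled_pos / unlabeled_total


def simulate_alpha(prior: float, label_freq: float,
                   n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo check: draw n samples, label positives at random,
    and count the share of positives among the unlabeled ones."""
    rng = random.Random(seed)
    unlabeled = 0
    unlabeled_pos = 0
    for _ in range(n):
        is_pos = rng.random() < prior
        # Under SCAR, labeling depends only on the class, not on severity.
        is_labeled = is_pos and rng.random() < label_freq
        if not is_labeled:
            unlabeled += 1
            unlabeled_pos += is_pos
    return unlabeled_pos / unlabeled


if __name__ == "__main__":
    print(alpha_under_scar(0.3, 0.5))
    print(simulate_alpha(0.3, 0.5))
```

When labeling instead depends on features such as disease severity, the labeled positives are no longer a representative sample of all positives. That is the failure case the abstract describes.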