Open Access. Powered by Scholars. Published by Universities.®

Medicine and Health Sciences Commons

Oncology

GW Research Days 2016 - 2020

Conference

2019

Full-Text Articles in Medicine and Health Sciences

NCI Multi-Omics Mislabeling Challenge: A Machine Learning Approach, Yeshwant Chillakuru, Arjun Panda, Sindhu Kubendran, Norman Lee, Apr 2019

Sample mislabeling is a pervasive problem in biomedical research, especially in large-scale multi-omics studies, where it introduces errors and can lead to false conclusions. The Food and Drug Administration (FDA) and the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (NCI-CPTAC) have launched a data science challenge to address this problem. We developed a novel machine-learning-based approach that combines traditional machine learning with knowledge drawn from the cancer genomics literature to identify mislabeled tumors in the NCI-CPTAC Multi-omics Mislabeling Challenge.

The training data contained one tumor sample from each of 80 patients, with each sample annotated with features for gender, microsatellite instability (MSI) status, and …