Medicine and Health Sciences | Open Access Articles

Deep Active Learning For Classifying Cancer Pathology Reports, Kevin De Angeli, Shang Gao, Mohammed Alawad, Hong‑Jun Yoon, Noah Schaefferkoetter, Xiao‑Cheng Wu, Eric B. Durbin, Jennifer Doherty, Antoinette Stroup, Linda Coyle, Lynne Penberthy, Georgia Tourassi Mar 2021

Kentucky Cancer Registry Faculty Publications

Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model.

Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. …
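As an illustration of the general idea of active learning, the sketch below implements least-confidence sampling, one common query strategy. The listing does not name the 11 strategies the study evaluates, so this is not presented as one of them. The function name `least_confidence_query` and the example probabilities are hypothetical.

```python
from typing import List, Sequence


def least_confidence_query(probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k unlabelled samples whose top predicted
    class probability is lowest (least-confidence sampling)."""
    # Confidence of each sample is the probability of its most likely class.
    confidence = [max(p) for p in probs]
    # Rank indices by ascending confidence; sorted() is stable, so ties
    # keep their original order.
    ranked = sorted(range(len(probs)), key=lambda i: confidence[i])
    return ranked[:k]


# Hypothetical softmax outputs for four unlabelled reports and three classes.
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.34, 0.33, 0.33],
]

# Indices of the two reports to send to an annotator next.
to_label = least_confidence_query(pool_probs, k=2)
```

In a full loop, the selected reports would be labelled, moved from the pool to the training set, and the classifier retrained before the next query round.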

Limitations Of Transformers On Clinical Text Classification, Shang Gao, Mohammed Alawad, Michael Todd Young, John Gounley, Noah Schaefferkoetter, Hong-Jun Yoon, Xiao-Cheng Wu, Eric B. Durbin, Jennifer Doherty, Antoinette Stroup, Linda Coyle, Georgia D. Tourassi Feb 2021

Kentucky Cancer Registry Faculty Publications

Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state-of-the-art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures (a word-level convolutional neural network and a hierarchical self-attention network) and show that BERT often cannot beat these simpler baselines when classifying …
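One generic way to apply a length-limited model to a long document is to split the text into overlapping windows, classify each window, and average the per-window class probabilities. The sketch below shows that idea only. The truncated abstract does not describe the four methods the authors introduce, so this should not be read as any of them. The names `split_into_windows` and `mean_pool`, and the window and stride values, are assumptions for illustration.

```python
from typing import List


def split_into_windows(tokens: List[str], window: int, stride: int) -> List[List[str]]:
    """Split a token list into windows of at most `window` tokens,
    advancing `stride` tokens each step (stride < window gives overlap)."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window])
        # Stop once the current window reaches the end of the document.
        if start + window >= len(tokens):
            break
        start += stride
    return windows


def mean_pool(chunk_probs: List[List[float]]) -> List[float]:
    """Average per-window class probabilities into one document-level vector."""
    n = len(chunk_probs)
    return [sum(column) / n for column in zip(*chunk_probs)]


# Hypothetical 10-token document, window of 4 tokens, stride of 3.
doc_tokens = [f"tok{i}" for i in range(10)]
chunks = split_into_windows(doc_tokens, window=4, stride=3)

# Hypothetical per-window outputs from a two-class classifier.
doc_probs = mean_pool([[0.2, 0.8], [0.6, 0.4]])
```

Averaging treats every window as equally informative; max-pooling or a learned aggregation layer are alternative choices with different trade-offs.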

Using Case-Level Context To Classify Cancer Pathology Reports, Shang Gao, Mohammed Alawad, Noah Schaefferkoetter, Lynne Penberthy, Xiao-Cheng Wu, Eric B. Durbin, Linda Coyle, Arvind Ramanathan, Georgia Tourassi May 2020

Kentucky Cancer Registry Faculty Publications

Individual electronic health records (EHRs) and clinical reports are often part of a larger sequence; for example, a single patient may generate multiple reports over the trajectory of a disease. In applications such as cancer pathology reports, it is necessary not only to extract information from individual reports, but also to capture aggregate information regarding the entire cancer case based on case-level context from all reports in the sequence. In this paper, we introduce a simple modular add-on for capturing case-level context that is designed to be compatible with most existing deep learning architectures for text classification on individual reports. We …
