Open Access. Powered by Scholars. Published by Universities.®

Medicine and Health Sciences Commons

Public Health Education and Promotion

Journal

COVID-19

SMU Data Science Review

Articles 1 - 1 of 1

Full-Text Articles in Medicine and Health Sciences

A Novel Methodology To Identify The Primary Topics Contained Within The Covid-19 Research Corpus, Allen Crane, Brock Freidrich, William Fehlman, Igor Frolow, Daniel W. Engels, Aug 2020

In this paper, we present a novel framework and system for identifying the primary research topics within a corpus of related publications and for classifying individual publications according to those topics, and we report the results of applying the framework and system to the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a corpus of published peer-reviewed and pre-peer-reviewed articles related to the coronavirus that causes COVID-19. Using machine learning techniques, such as Non-negative Matrix Factorization for Natural Language Processing and a Bayesian classifier, we developed a novel framework and system that automatically extracts sparse and meaningful …
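As a rough illustration of the kind of topic extraction the abstract mentions, the following is a minimal sketch of Non-negative Matrix Factorization applied to a term-count matrix. It is not the authors' implementation: the four toy documents, the choice of two topics, the function name `nmf`, and the use of Lee-Seung multiplicative updates are all assumptions made for this example, and the Bayesian classification step is not shown. A real pipeline would more likely rely on a numerical library than on pure Python lists.

```python
import random

# Hypothetical toy corpus (not taken from CORD-19): two documents about
# transmission and two about vaccines.
docs = [
    "virus transmission airborne droplets transmission",
    "virus airborne droplets transmission masks",
    "vaccine trial antibody response vaccine",
    "vaccine antibody trial response dose",
]

# Build a document-by-term count matrix V.
vocab = sorted({word for doc in docs for word in doc.split()})
V = [[doc.split().count(term) for term in vocab] for doc in docs]


def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def nmf(V, k, iters=300, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x k) and H (k x m) using
    multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


k = 2  # number of topics, chosen by hand for this toy corpus
W, H = nmf(V, k)

# Assign each document to the topic with the largest weight in its row of W.
dominant = [max(range(k), key=lambda t: row[t]) for row in W]

# List the three highest-weighted terms for each topic (rows of H).
for t in range(k):
    ranked = sorted(zip(H[t], vocab), reverse=True)
    print(f"topic {t}:", [term for _, term in ranked[:3]])
print("dominant topic per document:", dominant)
```

Each row of `H` acts as a topic (a non-negative weighting over terms) and each row of `W` gives a document's loading on those topics, so taking the largest entry in a row of `W` yields a simple per-document topic label. Such labels could then serve as training targets for a separate classifier.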