Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Computer Engineering

2017

Virginia Commonwealth University

Full-Text Articles in Databases and Information Systems

Parsing MetaMap Files In Hadoop, Amy Olex, Alberto Cano, Bridget T. McInnes Jan 2017

Computer Science Publications

The UMLS::Association CUICollector module identifies bigrams of UMLS Concept Unique Identifiers (CUIs) and their frequencies in a biomedical text corpus. CUICollector was re-implemented in Hadoop MapReduce to improve the algorithm's speed, flexibility, and scalability. In an evaluation against the serial module, the Hadoop implementation produced equivalent results and achieved a 28x speedup on a single-node Hadoop system.
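
The map-and-reduce shape of CUI bigram counting can be illustrated with a minimal Python sketch. This is not the authors' Hadoop code or the CUICollector module. It assumes the CUIs have already been extracted from MetaMap output into one ordered list per utterance, and it treats a bigram as a pair of adjacent CUIs, which may differ from the definition the module uses. The function names and the CUI values are hypothetical placeholders.

```python
from collections import defaultdict


def map_bigrams(cuis):
    """Map step: emit ((first, second), 1) for each adjacent pair of CUIs."""
    for first, second in zip(cuis, cuis[1:]):
        yield (first, second), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each bigram key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def count_cui_bigrams(utterances):
    """Run the map step over every utterance, then reduce the results."""
    emitted = (pair for cuis in utterances for pair in map_bigrams(cuis))
    return reduce_counts(emitted)


# Placeholder CUIs for illustration only; they are not taken from the paper.
utterances = [
    ["C0000001", "C0000002", "C0000003"],
    ["C0000001", "C0000002"],
]
counts = count_cui_bigrams(utterances)
for bigram, frequency in sorted(counts.items()):
    print(bigram, frequency)
```

In a real Hadoop job the map and reduce steps would run as separate tasks, with the framework grouping the emitted pairs by key between them. Here a single in-memory dictionary stands in for that shuffle.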