Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Virginia Commonwealth University

Databases and Information Systems

Full-Text Articles in Computer Engineering

Parsing MetaMap Files in Hadoop, Amy Olex, Alberto Cano, Bridget T. McInnes, Jan 2017

Computer Science Publications

The UMLS::Association CUICollector module identifies bigrams of UMLS Concept Unique Identifiers (CUIs) and counts their frequencies in a biomedical text corpus. CUICollector was re-implemented in Hadoop MapReduce to improve the algorithm's speed, flexibility, and scalability. In evaluation, the Hadoop implementation produced results equivalent to those of the serial module and achieved a 28x speedup on a single-node Hadoop system.
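
To illustrate the general idea of counting CUI bigrams in a map/reduce style, here is a minimal in-memory sketch. It is not the authors' implementation and does not use Hadoop or parse MetaMap output. It assumes the input has already been reduced to ordered lists of CUIs, one CUI per position, and the function names (`map_bigrams`, `reduce_counts`, `count_cui_bigrams`) and the CUI strings in the usage example are hypothetical placeholders.

```python
from itertools import groupby
from operator import itemgetter


def map_bigrams(cuis):
    """Map step: emit ((first, second), 1) for each adjacent pair of CUIs."""
    for first, second in zip(cuis, cuis[1:]):
        yield (first, second), 1


def reduce_counts(pairs):
    """Reduce step: sort by key (standing in for the shuffle) and sum counts."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, sum(count for _, count in group)


def count_cui_bigrams(sequences):
    """Run the map step over every sequence, then reduce to bigram frequencies."""
    emitted = [pair for seq in sequences for pair in map_bigrams(seq)]
    return dict(reduce_counts(emitted))


if __name__ == "__main__":
    # Placeholder identifiers in CUI format, used only for illustration.
    corpus = [
        ["C0000001", "C0000002", "C0000003"],
        ["C0000001", "C0000002"],
    ]
    for bigram, freq in count_cui_bigrams(corpus).items():
        print(bigram, freq)
```

In a real Hadoop job the map and reduce functions would run as separate tasks over file splits, with the framework handling the sort and shuffle that `sorted` and `groupby` stand in for here.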