Open Access. Powered by Scholars. Published by Universities.®
Databases and Information Systems Commons™
Full-Text Articles in Databases and Information Systems
Parsing MetaMap Files In Hadoop, Amy Olex, Alberto Cano, Bridget T. McInnes
Computer Science Publications
The UMLS::Association CUICollector module identifies UMLS Concept Unique Identifier (CUI) bigrams and their frequencies in a biomedical text corpus. CUICollector was re-implemented in Hadoop MapReduce to improve algorithm speed, flexibility, and scalability. In evaluation, the Hadoop implementation produced results equivalent to those of the serial module and achieved a 28x speedup on a single-node Hadoop system.
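
The bigram-counting idea in the abstract can be illustrated with a minimal map/reduce-style sketch. This is not the authors' Hadoop code or the CUICollector module. It assumes the CUIs have already been extracted from the MetaMap output into one ordered list per utterance, and it treats a bigram as a pair of adjacent CUIs. The function names `map_bigrams` and `reduce_counts` and the placeholder CUIs are hypothetical.

```python
from collections import Counter
from itertools import chain


def map_bigrams(cuis):
    """Map step: emit ((first, second), 1) for each adjacent CUI pair."""
    return [((a, b), 1) for a, b in zip(cuis, cuis[1:])]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each bigram key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


# Hypothetical toy input: placeholder CUIs, one ordered list per utterance.
utterances = [
    ["C0000001", "C0000002", "C0000003"],
    ["C0000001", "C0000002"],
]

# Run the map step on every utterance, then reduce all emitted pairs together.
bigram_counts = reduce_counts(
    chain.from_iterable(map_bigrams(u) for u in utterances)
)
```

Because each utterance is mapped independently and the reduce step only sums counts per key, the work can be split across mappers and reducers, which is the property a MapReduce implementation relies on.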