Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Chapman University

Cloud Computing

Articles 1 - 1 of 1

Full-Text Articles in Physical Sciences and Mathematics

High Performance Computing Markov Models Using Hadoop Mapreduce, Matthew Shaffer Sep 2014

e-Research: A Journal of Undergraduate Work

In this paper, I explain how I used Markov models, a probability modeling tool, together with the Hadoop MapReduce parallel programming platform to quickly and efficiently analyze documents and build a probability model of them. I explain what Markov models are, give a brief overview of MapReduce, explain why Markov models can be used for document analysis, walk through the code of my modeling program, and examine the performance of various MapReduce platforms and techniques in analyzing documents.
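
To make the idea concrete, here is a minimal sketch of how a first-order, word-level Markov model of a set of documents could be built in a map/reduce style. It is not the paper's code: it runs in plain Python rather than on Hadoop, and the function names (`map_phase`, `reduce_phase`, `to_probabilities`, `build_model`), the whitespace tokenization, and the choice of a first-order word model are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Mapper (hypothetical): emit ((current_word, next_word), 1) per adjacent pair."""
    words = document.lower().split()  # assumption: simple whitespace tokenization
    for current, following in zip(words, words[1:]):
        yield (current, following), 1


def reduce_phase(pairs):
    """Reducer (hypothetical): sum the emitted counts for each word pair."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return counts


def to_probabilities(counts):
    """Normalize pair counts into transition probabilities P(next | current)."""
    totals = defaultdict(int)
    for (current, _), n in counts.items():
        totals[current] += n
    model = defaultdict(dict)
    for (current, following), n in counts.items():
        model[current][following] = n / totals[current]
    return dict(model)


def build_model(documents):
    """Run the map step over every document, then reduce and normalize."""
    emitted = (pair for doc in documents for pair in map_phase(doc))
    return to_probabilities(reduce_phase(emitted))


if __name__ == "__main__":
    model = build_model(["the cat sat", "the cat ran", "the dog sat"])
    for word, transitions in sorted(model.items()):
        print(word, transitions)
```

Because each document is mapped independently and the counts are only combined in the reduce step, the map work could in principle be spread across many machines, which is the property a platform such as Hadoop MapReduce exploits.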