Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

CSE Conference and Workshop Papers

2012

Middleware

Full-Text Articles in Physical Sciences and Mathematics

HOG: Distributed Hadoop MapReduce on the Grid, Chen He, Derek J. Weitzel, David Swanson, Ying Lu, Nov 2012

CSE Conference and Workshop Papers

MapReduce is a powerful data processing platform for commercial and academic applications. In this paper, we build Hadoop On the Grid (HOG), a novel Hadoop MapReduce framework that runs on the Open Science Grid, which spans multiple institutions across the United States. Unlike previous MapReduce platforms, which run on dedicated environments such as clusters or clouds, HOG provides a free, elastic, and dynamic MapReduce environment on the opportunistic resources of the grid. In HOG, we improve Hadoop’s fault tolerance for wide area data analysis by mapping data centers across the U.S. to virtual racks and creating multi-institution failure …
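The idea of treating each data center as a virtual rack can be illustrated with Hadoop's script-based rack awareness. The sketch below is not taken from the paper and makes several assumptions: that the mapping is done through a topology script (configured via `net.topology.script.file.name`, or `topology.script.file.name` in Hadoop 1.x), that a worker's site can be read from the last two labels of its hostname, and that the helper name `rack_for_host` and the example hostnames are purely hypothetical. Hadoop calls such a script with one or more hostnames or IP addresses as arguments and expects one rack path per argument on standard output.

```python
import sys

# Hadoop's conventional rack name for hosts that cannot be placed.
DEFAULT_RACK = "/default-rack"


def rack_for_host(host: str) -> str:
    """Map a worker hostname to a virtual rack named after its site domain.

    Assumption (not from the paper): the last two DNS labels identify the
    institution, so every node at one site lands in the same virtual rack.
    """
    labels = host.strip().rstrip(".").lower().split(".")
    # An IP address or a short hostname carries no usable site information.
    if len(labels) < 3 or all(label.isdigit() for label in labels):
        return DEFAULT_RACK
    return "/" + ".".join(labels[-2:])


def main(argv):
    # Print one rack path per argument, in the order the arguments arrived.
    for host in argv:
        print(rack_for_host(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With a mapping of this kind, Hadoop's rack-aware block placement would spread replicas across racks, which here means across institutions, so losing a whole site need not lose every copy of a block. Whether HOG derives the site from hostnames or from some other source is not stated in the abstract.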