Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Hog: Distributed Hadoop Mapreduce On The Grid, Chen He, Derek J. Weitzel, David Swanson, Ying Lu
CSE Conference and Workshop Papers
MapReduce is a powerful data processing platform for commercial and academic applications. In this paper, we build Hadoop On the Grid (HOG), a novel Hadoop MapReduce framework that runs on the Open Science Grid, which spans multiple institutions across the United States. Unlike previous MapReduce platforms, which run on dedicated environments such as clusters or clouds, HOG provides a free, elastic, and dynamic MapReduce environment on the opportunistic resources of the grid. In HOG, we improve Hadoop’s fault tolerance for wide-area data analysis by mapping data centers across the U.S. to virtual racks and creating multi-institution failure …
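
The virtual-rack idea in the abstract can be illustrated with Hadoop's rack awareness, which can call a user-supplied topology script (configured through `net.topology.script.file.name` in recent Hadoop releases) that receives hostnames or IP addresses as arguments and prints one rack path per argument. The following is only a minimal sketch of that general mechanism, not the authors' implementation: the domain suffixes, rack names, and the `rack_for` helper are hypothetical and chosen purely for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a Hadoop topology script that treats each site as one virtual rack."""
import sys

# Hypothetical mapping from a site's DNS suffix to a virtual rack path.
# Real site names would come from the grid deployment, not from this sketch.
SITE_RACKS = {
    "site-a.example.edu": "/site-a",
    "site-b.example.edu": "/site-b",
}

# Hosts that match no known site (including bare IP addresses) land here.
DEFAULT_RACK = "/default-rack"


def rack_for(host: str) -> str:
    """Return the virtual rack for a hostname, matched by domain suffix."""
    host = host.lower().rstrip(".")
    for suffix, rack in SITE_RACKS.items():
        if host == suffix or host.endswith("." + suffix):
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # Hadoop passes one or more hosts as arguments and reads back
    # one rack path per argument, separated by whitespace.
    for arg in sys.argv[1:]:
        print(rack_for(arg))
```

Under this assumption, every worker at one institution reports the same rack, so Hadoop's rack-aware replica placement would tend to spread block replicas across institutions rather than within a single site.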