Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons



LSU Doctoral Dissertations

2013

MapReduce

Articles 1 - 2 of 2

Full-Text Articles in Physical Sciences and Mathematics

On-The-Fly Tracing For Data-Centric Computing: Parallelization, Workflow And Applications, Lei Jiang Jan 2013


LSU Doctoral Dissertations

As data-centric computing becomes the trend in science and engineering, more and more hardware systems, as well as middleware frameworks, are emerging to handle the intensive computations associated with big data. At the programming level, it is crucial to have corresponding programming paradigms for dealing with big data. Although MapReduce is now a well-known programming model for data-centric computing, in which explicit parallelization is replaced entirely by partitioning the computing task across the data, not all programs, particularly those built on statistical computing and data mining algorithms with interdependent computations, can be refactored in this fashion. On the other hand, many traditional automatic parallelization …
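The MapReduce idea the abstract describes, parallelism obtained by partitioning the data rather than restructuring the algorithm, can be sketched in a few lines of plain Python (an illustrative toy, not the dissertation's framework; the function names and the word-count task are hypothetical):

```python
# Toy MapReduce: parallelism comes from splitting the input into chunks,
# not from rewriting the algorithm itself.
from collections import defaultdict

def map_phase(chunk):
    # Map: emit (word, 1) pairs for one partition of the input.
    return [(word, 1) for line in chunk for word in line.split()]

def reduce_phase(pairs):
    # Reduce: aggregate counts per key across all mapper outputs.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

def map_reduce(data, n_partitions=2):
    # Partition the data; each chunk could run on a separate worker.
    chunks = [data[i::n_partitions] for i in range(n_partitions)]
    intermediate = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(intermediate)

result = map_reduce(["big data big", "data centric computing"])
# result["big"] == 2, result["data"] == 2
```

Each chunk is independent, which is exactly why this pattern parallelizes so easily, and also why algorithms with interdependent steps resist being recast this way.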


A Hybrid Framework Of Iterative MapReduce And MPI For Molecular Dynamics Applications, Shuju Bai Jan 2013


LSU Doctoral Dissertations

Developing platforms for large-scale data processing has been of great interest to scientists. Hadoop is a widely used computational platform: it provides fault-tolerant distributed data storage through HDFS (the Hadoop Distributed File System) and performs fault-tolerant distributed data processing in parallel through the MapReduce framework. Quite often, actual computations require multiple MapReduce cycles, which calls for chained MapReduce jobs. However, Hadoop's design is poor at addressing problems with iterative structures. In many iterative problems, some invariant data is required by every MapReduce cycle. The same data is uploaded to the Hadoop file system in …
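The chained-iteration pattern the abstract criticizes can be sketched as follows (a hypothetical toy in plain Python, using a simple 1-D clustering step as the per-cycle job; the names and the example task are assumptions, not the dissertation's framework):

```python
# Sketch of iterative MapReduce: each loop body is one full MapReduce job.
# The dataset `points` is invariant across cycles, yet a naive Hadoop chain
# would re-stage it on every iteration, which is the overhead the work targets.

def map_assign(points, centroids):
    # Map: tag each point with the index of its nearest centroid.
    return [(min(range(len(centroids)), key=lambda i: abs(p - centroids[i])), p)
            for p in points]

def reduce_update(pairs, k):
    # Reduce: recompute each centroid as the mean of its assigned points.
    groups = {i: [] for i in range(k)}
    for idx, p in pairs:
        groups[idx].append(p)
    return [sum(g) / len(g) if g else 0.0 for g in groups.values()]

def iterative_mapreduce(points, centroids, cycles=5):
    for _ in range(cycles):
        # One MapReduce cycle; `points` never changes between cycles.
        pairs = map_assign(points, centroids)
        centroids = reduce_update(pairs, len(centroids))
    return centroids

centroids = iterative_mapreduce([1.0, 1.2, 8.0, 8.4], [0.0, 10.0])
```

Because `points` is identical in every cycle, repeatedly shipping it to the file system wastes I/O; caching such invariant data across cycles is the motivation behind iterative MapReduce frameworks.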