Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

LSU Doctoral Dissertations

Computer Sciences

Hadoop

Full-Text Articles in Physical Sciences and Mathematics

Performance Improvement Of Distributed Computing Framework And Scientific Big Data Analysis, Praveenkumar Kondikoppa Jan 2014

LSU Doctoral Dissertations

Analyzing Big Data to gain better insights has been a focus of researchers in recent years. Traditional desktop computers and database management systems may not be suitable for efficient and timely analysis, because such analysis requires massively parallel processing. Distributed computing frameworks are being explored as a viable solution. For example, Google proposed MapReduce, which is becoming a de facto computing architecture for Big Data solutions. However, scheduling in MapReduce is coarse-grained and remains a challenge. For the MapReduce scheduler configured over distributed clusters, we identify two issues: data locality disruption and …


A Hybrid Framework Of Iterative MapReduce And MPI For Molecular Dynamics Applications, Shuju Bai Jan 2013

LSU Doctoral Dissertations

Developing platforms for large-scale data processing has been of great interest to scientists. Hadoop is a widely used computational platform: HDFS (Hadoop Distributed File System) provides fault-tolerant distributed data storage, and the MapReduce framework provides fault-tolerant distributed data processing in parallel. Actual computations often require multiple MapReduce cycles, which calls for chained MapReduce jobs. However, Hadoop's design handles problems with iterative structure poorly. In many iterative problems, some invariant data is required by every MapReduce cycle. The same data is uploaded to the Hadoop file system in …