Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons


Articles 1 - 3 of 3

Full-Text Articles in Computer Engineering

Distributed Knowledge Discovery For Diverse Data, Hossein Hamooni Jul 2017


Computer Science ETDs

In the era of new technologies, computer scientists deal with massive datasets hundreds of terabytes in size. Smart cities, social networks, health care systems, large sensor networks, etc. are constantly generating new data. Extracting knowledge from big datasets is non-trivial because traditional data mining algorithms are impractical to run at that scale. Distributed systems help address this problem, but they introduce new challenges in designing scalable algorithms. The transition from traditional algorithms to ones that can run on a distributed platform should be done carefully. Researchers should design the modern distributed algorithms based on the …


Comparative Analysis Of Big Data Analytics Software In Assessing Sample Data, Soly Mathew Biju, Alex Mathew Jun 2017


Journal of International Technology and Information Management

Over the last few years, big data has emerged as an important topic of discussion in most firms owing to its ability to create, store and process content at a reasonable price. Big data consists of advanced tools and techniques to process large volumes of data in organisations. Investment in big data analytics has almost become a necessity in large firms, particularly multinational companies, for its unique benefits, especially in predicting and identifying various trends. Some of the most popular big data analytics software used today are MapReduce, Hive and Tableau, while the framework Hadoop enables easy …


Hadoop Framework Implementation And Performance Analysis On A Cloud, Göksu Zekiye Özen, Mehmet Tekerek, Rayimbek Sultanov Jan 2017


Turkish Journal of Electrical Engineering and Computer Sciences

The Hadoop framework uses the MapReduce programming paradigm to process big data by distributing data across a cluster and aggregating the results. MapReduce is one of the methods used to process big data hosted on large clusters. In this method, jobs are divided into small pieces and distributed over nodes. Parameters such as the method of distribution over nodes, the number of jobs run in parallel, and the number of nodes in the cluster affect the execution time of jobs. The aim of this paper is to determine how the numbers of nodes, maps, and reduces affect the performance of …