Full-Text Articles in Computer Engineering
Scalable Sentiment Analytics, Aslan Bakirov, Kevser Nur Çoğalmiş, Ahmet Bulut
Turkish Journal of Electrical Engineering and Computer Sciences
Spark has become a widely popular analytics framework that provides an implementation of the equally popular MapReduce programming model. Hadoop is an Apache Software Foundation framework for processing large datasets on a cluster of computers using the MapReduce programming model. Mahout is an Apache Software Foundation project for building scalable machine learning libraries, and it includes built-in machine learning classifiers. In this paper, we show how to build a simple text classifier on Spark, Apache Hadoop, and Apache Mahout for extracting sentiments from a collection containing millions of text documents. Using a collection of 7 million …
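As a rough illustration of how a text classifier fits the MapReduce model named in the abstract, the sketch below counts words per sentiment label with a map step and a reduce step, then scores new text with those counts. It is a minimal single-process sketch in plain Python, not the authors' Spark, Hadoop, or Mahout code. The choice of a Naive Bayes classifier with Laplace smoothing, the toy corpus `DOCS`, and every function name are assumptions made for illustration.

```python
import math
from collections import defaultdict
from functools import reduce

# Hypothetical toy corpus; the paper's document collection is not reproduced here.
DOCS = [
    ("pos", "great phone love it"),
    ("pos", "love the great screen"),
    ("neg", "terrible battery hate it"),
    ("neg", "hate the terrible screen"),
]

def map_doc(record):
    """Map step: emit ((label, word), 1) for every token of one document."""
    label, text = record
    return [((label, word), 1) for word in text.split()]

def reduce_counts(acc, pair):
    """Reduce step: sum the counts that share the same (label, word) key."""
    key, count = pair
    acc[key] += count
    return acc

def train(docs):
    """Run map and reduce over all documents and count documents per label."""
    emitted = (pair for doc in docs for pair in map_doc(doc))
    word_counts = reduce(reduce_counts, emitted, defaultdict(int))
    label_counts = defaultdict(int)
    for label, _ in docs:
        label_counts[label] += 1
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the highest Naive Bayes log-score (assumed model)."""
    vocab = {word for (_, word) in word_counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(c for (l, _), c in word_counts.items() if l == label)
        score = math.log(n_docs / total_docs)  # log prior
        for word in text.split():
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            seen = word_counts.get((label, word), 0)
            score += math.log((seen + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

On a real cluster the map and reduce functions would run in parallel over partitions of the collection; here they run sequentially so that the data flow stays visible.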
A MapReduce-Based Distributed SVM Algorithm for Binary Classification, Ferhat Özgür Çatak, Mehmet Erdal Balaban
Turkish Journal of Electrical Engineering and Computer Sciences
Although the support vector machine (SVM) algorithm generalizes well to unseen examples after the training phase and achieves a small loss value, the algorithm is not suitable for real-life classification and regression problems. SVMs cannot handle training datasets that contain hundreds of thousands of examples. In previous studies on distributed machine-learning algorithms, the SVM was trained in a costly and preconfigured computer environment. In this research, we present a MapReduce-based distributed parallel SVM training algorithm for binary classification problems. This work shows how to distribute optimization problems over cloud computing systems with the MapReduce technique. In the second …
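To make the idea of distributing SVM training with MapReduce more concrete, the sketch below trains one linear SVM per data partition in a map step and combines the partial models in a reduce step. It is not the authors' algorithm, whose details are cut off in the abstract above. The combining rule used here (plain averaging of the weight vectors), the hinge-loss subgradient descent trainer, the hyperparameters, the toy `PARTITIONS`, and all function names are assumptions chosen to keep the example short and dependency-free.

```python
# Hypothetical toy data: each partition holds (features, label) pairs, label in {+1, -1}.
PARTITIONS = [
    [((2.0, 1.5), 1), ((1.5, 2.5), 1), ((-2.0, -1.0), -1), ((-1.0, -2.5), -1)],
    [((3.0, 2.0), 1), ((1.0, 1.0), 1), ((-1.5, -1.5), -1), ((-3.0, -2.0), -1)],
]

def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_partition(partition, epochs=50, lr=0.1, lam=0.01):
    """Map step: fit a linear SVM on one partition by hinge-loss subgradient descent."""
    dim = len(partition[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in partition:
            margin = y * (dot(w, x) + b)
            # Subgradient of the L2 regularizer shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Point lies inside the margin or is misclassified: move toward it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def reduce_models(models):
    """Reduce step (assumed rule): average the per-partition weights and biases."""
    n = len(models)
    dim = len(models[0][0])
    w = [sum(m[0][i] for m in models) / n for i in range(dim)]
    b = sum(m[1] for m in models) / n
    return w, b

def predict(model, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1

# Each call to train_partition could run on a separate worker in a real cluster.
model = reduce_models([train_partition(p) for p in PARTITIONS])
```

Because each partition is trained independently, no worker needs the full dataset in memory, which is the property that makes a MapReduce formulation attractive for large training sets. Averaging is only one possible reduce rule and may lose accuracy compared with training on all data at once.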