Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

2015

Big Data

Full-Text Articles in Computer Sciences

Data To Decisions For Cyberspace Operations, Steve Stone Dec 2015

Military Cyber Affairs

In 2011, the United States (U.S.) Department of Defense (DOD) named cyberspace a new operational domain. The U.S. Cyber Command and the Military Services are working to make the cyberspace environment a suitable place for achieving national objectives and enabling military command and control (C2). To effectively conduct cyberspace operations, DOD requires data and analysis of the Mission, Network, and Adversary. However, the DOD’s current data processing and analysis capabilities do not meet mission needs within critical operational timelines. This paper presents a summary of the data processing and analytics necessary to effectively conduct cyberspace operations.


Energy Forecasting For Event Venues: Big Data And Prediction Accuracy, Katarina Grolinger, Alexandra L'Heureux, Miriam A.M. Capretz, Luke Seewald Dec 2015

Electrical and Computer Engineering Publications

Advances in sensor technologies and the proliferation of smart meters have resulted in an explosion of energy-related data sets. These Big Data have created opportunities for development of new energy services and a promise of better energy management and conservation. Sensor-based energy forecasting has been researched in the context of office buildings, schools, and residential buildings. This paper investigates sensor-based forecasting in the context of event-organizing venues, which present an especially difficult scenario due to large variations in consumption caused by the hosted events. Moreover, the significance of the data set size, specifically the impact of temporal granularity, on energy …


Exploring The Role Of Sentiments In Identification Of Active And Influential Bloggers, Mohammad Alghobiri, Umer Ishfaq, Hikmat Ullah Khan, Tahir Afzal Malik Nov 2015

UBT International Conference

The social Web provides opportunities for the public to have social interactions and online discussions. The large number of users on social web sites creates a high volume of data. This leads to the emergence of Big Data, which focuses on computational analysis of data to reveal patterns and associations relating to human interactions. Such analyses have vast applications in various fields, such as understanding human behaviors, studying cultural influence, and promoting online marketing. Blogs are one of the social web channels that offer a way to discuss various topics. Finding the top bloggers has been a …


The Importance Of Big Data Analytics, Eljona Proko Nov 2015

UBT International Conference

Identified as a leading trend in IT, Big Data has gained global attention. Advances in data analytics are changing the way businesses compete, enabling them to make faster and better decisions based on real-time analysis. Big Data introduces a new set of challenges. Three characteristics define Big Data: volume, variety, and velocity. Big Data requires tools and methods that can be applied to analyze and extract patterns from large-scale data. Companies generate enormous volumes of polystructured data from the Web, social network posts, sensors, mobile devices, emails, and many other sources. Companies need a cost-effective, massively scalable solution for capturing, storing, and analyzing …


Data Management In Cloud Environments: NoSQL And NewSQL Data Stores, Katarina Grolinger, Wilson A. Higashino, Abhinav Tiwari, Miriam A.M. Capretz May 2015

Wilson A Higashino

Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the …


Welcome To The Machine: Privacy And Workplace Implications Of Predictive Analytics, Robert Sprague Apr 2015

Robert Sprague

Predictive analytics use a method known as data mining to identify trends, patterns, or relationships among data, which can then be used to develop a predictive model. Data mining itself relies upon big data, which is “big” not solely because of its size but also because its analytical potential is qualitatively different. “Big data” analysis allows organizations, including government and businesses, to combine diverse digital datasets and then use statistics and other data mining techniques to extract from them both hidden information and surprising correlations. These data are not necessarily tracking transactional records of atomized behavior, such as the purchasing …


DCMS: A Data Analytics And Management System For Molecular Simulation, Meryem Berrada Mar 2015

USF Tampa Graduate Theses and Dissertations

Although Molecular Simulation (MS) systems are a major research tool in multiple scientific and engineering fields, there is still a lack of systems for effective data management and fast data retrieval and processing. This is mainly due to the nature of MS, which generates a very large amount of data: a system usually encompasses millions of pieces of data, and one query usually runs over tens of thousands of time frames. For this purpose, we designed and developed a new application, DCMS (A data Analytics and Management System for molecular Simulation), that intends to speed up the process …


Performance Comparison Of Two Data Mining Algorithms On Big Data Platforms, Md Rajiur Rahman Raju Jan 2015

Wayne State University Theses

In this Big Data era, the need for performing large-scale computations is evident. A better understanding of the most suitable platforms which can efficiently run these computations is needed. In this thesis, we attempt to compare four such big data platforms, namely Hadoop, Spark, GPU, and Multicore CPU. We compare these platforms using two prominent data mining algorithms, namely K-means clustering and K-nearest neighbour classification, and discuss specific implementation-level details. We provide several insights into the best possible implementations of these algorithms and systematically compare the benefits and drawbacks of each of these platforms. We conduct experiments by varying data …


Unsupervised Learning And Image Classification In High Performance Computing Cluster, Itauma Itauma Jan 2015

Wayne State University Theses

Feature learning and object classification in machine learning have become very active research areas in recent decades. Identifying good features has various benefits for object classification with respect to reducing the computational cost and increasing the classification accuracy. In addition, many research studies have focused on the use of Graphics Processing Units (GPUs) to improve the training time for machine learning algorithms. This study proposes the use of an alternative platform, called High Performance Computing Cluster (HPCC), to handle unsupervised feature learning and image and speech classification, and to reduce the computational cost.

HPCC is a Big Data processing and …


Privacy Preserving Data Mining For Numerical Matrices, Social Networks, And Big Data, Lian Liu Jan 2015

Theses and Dissertations--Computer Science

Motivated by increasing public awareness of possible abuse of confidential information, which is considered as a significant hindrance to the development of e-society, medical and financial markets, a privacy preserving data mining framework is presented so that data owners can carefully process data in order to preserve confidential information and guarantee information functionality within an acceptable boundary.

First, among the many privacy-preserving methodologies, data perturbation methods are a popular group of techniques for achieving a balance between data utility and information privacy: they add a noise signal, following a statistical distribution, to an original numerical matrix. With the help …


Behavior-Based Anomaly Detection On Big Data, Hyunjoo Kim, Jonghyun Kim, Ikkyun Kim, Tai-Myung Chung Jan 2015

Australian Information Security Management Conference

Recently, targeted cyber-attacks such as APTs (Advanced Persistent Threats) have been growing rapidly as a social and national threat. An APT is an intelligent cyber-attack that clandestinely infiltrates the target organization or enterprise using various methods and causes considerable damage by making a final attack after long-term and thorough preparation. These attacks threaten cyber environments such as the Internet by infecting the devices in them with malicious code, and by destroying those devices or seizing their access privileges. Detecting these attacks requires collecting and analysing data from various sources (network, host, security equipment, and devices) over a long period. Therefore, …


Value Oriented Big Data Processing With Applications, Krishnaprasad Thirunarayan Jan 2015

Kno.e.sis Publications

We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. To handle Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle Variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of heterogeneity of data formats and media. To handle Velocity, we seek to use continuous semantics capability to dynamically create event- or situation-specific models and recognize relevant new concepts, entities and …


Browser Based Visualization For Parameter Spaces Of Big Data Using Client-Server Model, Kurtis M. Glendenning Jan 2015

Browse all Theses and Dissertations

Visualization is an important task in data analytics, as it allows researchers to view abstract patterns within the data instead of reading through extensive raw data. The ability to interact with visualizations is essential, since it lets researchers explore the data intuitively and find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art techniques, such as Node.js, to split the visualization rendering …


CEPSim: A Simulator For Cloud-Based Complex Event Processing, Wilson Higashino, Miriam Capretz, Luiz Bittencourt Dec 2014

Wilson A Higashino

As one of the Vs defining Big Data, data velocity brings many new challenges to traditional data processing approaches. The adoption of cloud environments in complex event processing (CEP) systems is a recent architectural style that aims to overcome these challenges. Validating cloud-based CEP systems at the required Big Data scale, however, is often a laborious, error-prone, and expensive task. This article presents CEPSim, a new simulator that has been developed to facilitate this validation process. CEPSim extends CloudSim, an existing cloud simulator, with an application model based on directed acyclic graphs that is used to represent continuous CEP queries. …


Data, Analytics And Community-Based Organizations: Transforming Data To Decisions For Community Development, Michael P. Johnson Jr. Dec 2014

Michael P. Johnson

The past ten years have seen a revolution in two disciplines related to operations and strategy design. “Big Data” has transformed the theory and practice of producing and selling goods and services through methods associated with computer science and information technology. “Analytics” has popularized primarily quantitative models and methods by which organizations and systems can measure multiple aspects of performance. As these fields rely on information technology to collect, store, process and share data, we refer to the collection of knowledge and applications associated with Big Data and analytics as “data analytics and information technology.” The impacts of data analytics …