Open Access. Powered by Scholars. Published by Universities.®
- Institution
- Western University (6)
- Selected Works (4)
- University of South Florida (4)
- Louisiana State University (3)
- University for Business and Technology in Kosovo (3)
- California Polytechnic State University, San Luis Obispo (2)
- Embry-Riddle Aeronautical University (2)
- Singapore Management University (2)
- TÜBİTAK (2)
- Wright State University (2)
- Kennesaw State University (1)
- Nova Southeastern University (1)
- University of New Orleans (1)
- University of North Florida (1)
- University of Tennessee, Knoxville (1)
- Publication Year
- Publication
- Electrical and Computer Engineering Publications (3)
- Electronic Thesis and Dissertation Repository (3)
- USF Tampa Graduate Theses and Dissertations (3)
- Browse all Theses and Dissertations (2)
- Katarina Grolinger (2)
- LSU Doctoral Dissertations (2)
- Research Collection School Of Computing and Information Systems (2)
- Turkish Journal of Electrical Engineering and Computer Sciences (2)
- UBT International Conference (2)
- CCE Theses and Dissertations (1)
- College of Engineering Summer Undergraduate Research Program (1)
- Doctoral Dissertations (1)
- International Journal of Aviation, Aeronautics, and Aerospace (1)
- International Journal of Business and Technology (1)
- Journal of Digital Forensics, Security and Law (1)
- LSU Master's Theses (1)
- Master of Science in Computer Science Theses (1)
- Master's Theses (1)
- Military Cyber Affairs (1)
- UNF Graduate Theses and Dissertations (1)
- University of New Orleans Theses and Dissertations (1)
- Wilson A Higashino (1)
- Zhengyu Yang (1)
- Publication Type
Articles 1 - 30 of 35
Full-Text Articles in Engineering
Building A Benchmark For Industrial Iot Application, Pranay K. Tiru, Soma Tummala
College of Engineering Summer Undergraduate Research Program
In this project, we developed a robust means of processing and displaying large volumes of IoT data using several cutting-edge, industry-standard technologies. Our data pipeline integrates physical sensors that send environmental readings such as temperature, humidity, and pressure. Once generated, the data are collected at an MQTT broker, streamed through a Kafka cluster, processed within a Spark cluster, and stored in a Cassandra database.
To test the robustness of the pipeline, we also created virtual sensors. These allowed us to send an immense amount of data, which was not feasible with the physical sensors alone. …
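The full pipeline above needs a live broker, Kafka, Spark, and Cassandra, but the "virtual sensor" idea is easy to illustrate on its own. The sketch below (with a hypothetical topic layout and field names, not the project's actual schema) generates synthetic environmental readings and serializes them into the (topic, payload) pairs an MQTT client would publish:

```python
import json
import random

def make_reading(sensor_id):
    """Generate one synthetic environmental reading (a 'virtual sensor' sample)."""
    return {
        "sensor_id": sensor_id,
        "temperature_c": round(random.uniform(15.0, 35.0), 2),
        "humidity_pct": round(random.uniform(20.0, 90.0), 2),
        "pressure_hpa": round(random.uniform(980.0, 1040.0), 2),
    }

def to_mqtt_message(reading, topic_prefix="sensors/env"):
    """Serialize a reading into the (topic, payload) pair an MQTT client could publish."""
    topic = f"{topic_prefix}/{reading['sensor_id']}"
    return topic, json.dumps(reading)

topic, payload = to_mqtt_message(make_reading("virtual-01"))
print(topic)  # sensors/env/virtual-01
```

In a real deployment each pair would be handed to an MQTT client library (e.g. paho-mqtt's `client.publish(topic, payload)`) and picked up downstream by the Kafka bridge.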
Parallel Algorithms For Scalable Graph Mining: Applications On Big Data And Machine Learning, Naw Safrin Sattar
University of New Orleans Theses and Dissertations
Parallel computing plays a crucial role in processing large-scale graph data. Complex network analysis is an exciting area of research with applications in many scientific domains, e.g., sociology, biology, online media, and recommendation systems. Graph mining poses diverse problems drawn from many domains of daily life. Due to advances in data and computing technologies, graph data are growing at an enormous rate; for example, the number of links in social networks grows every millisecond. Machine/deep learning plays a significant role in technological accomplishments for working with big data in modern …
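As a minimal illustration of the parallel graph-mining idea, the sketch below computes vertex degrees by splitting an edge list across workers and merging partial counts. Threads stand in here for the processes or MPI ranks a genuinely large-scale job would use; this is illustrative, not the dissertation's algorithm:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_degrees(edge_chunk):
    """Partial degree counts for one chunk of the edge list."""
    c = Counter()
    for u, v in edge_chunk:
        c[u] += 1
        c[v] += 1
    return c

def parallel_degrees(edges, workers=4):
    """Split the edge list across workers and merge the partial counts.
    Real large-scale graph mining would use processes or MPI ranks; threads
    keep this sketch self-contained."""
    chunks = [edges[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        partials = ex.map(count_degrees, chunks)
    total = Counter()
    for p in partials:
        total.update(p)
    return dict(total)

print(parallel_degrees([(0, 1), (0, 2), (1, 2), (2, 3)]))  # {0: 2, 1: 2, 2: 3, 3: 1}
```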
Learning Analytics For The Formative Assessment Of New Media Skills, Negar Shabihi
Electronic Thesis and Dissertation Repository
Recent theories of education have shifted learning environments toward student-centred education. The advancement of technology and the need for skilled individuals in different areas have also led to the introduction of new media skills. Along with new pedagogies and content, these changes require new forms of assessment. However, assessment, as the core of learning, has not been modified as much as other educational aspects. Hence, much attention is required to develop assessment methods based on current educational requirements. To address this gap, we conducted two data-driven systematic literature reviews to characterize the existing state of the field in the …
A Method For Monitoring Operating Equipment Effectiveness With The Internet Of Things And Big Data, Carl D. Hays Iii
Master's Theses
The purpose of this paper was to take the Overall Equipment Effectiveness (OEE) productivity formula used in plant manufacturing and adapt it to measure productivity for forklifts. Productivity for a forklift was defined as being available and picking up and moving containers at port locations in Seattle and Alaska. This research takes performance measures from plant manufacturing and applies them to mobile equipment in order to establish the most effective means of analyzing reliability and productivity. We used the Internet of Things to collect data on fifteen forklift trucks in three different locations and analyzed the data over a six-month period to …
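The OEE formula the paper adapts is the standard product of three factors: availability, performance, and quality. A minimal sketch (the 0.90/0.80/0.99 figures are made-up illustrations, not the study's measurements):

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness: the product of the three factors,
    each expressed as a fraction in [0, 1]."""
    return availability * performance * quality

# e.g. a forklift available for 90% of its shifts, moving containers at
# 80% of its rated pace, with 99% of moves completed without error:
print(round(oee(0.90, 0.80, 0.99), 3))  # 0.713
```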
A New Distributed Anomaly Detection Approach For Log IDS Management Based On Deep Learning, Murat Koca, Muhammed Ali Aydin, Ahmet Sertbaş, Abdül Halim Zaim
Turkish Journal of Electrical Engineering and Computer Sciences
Today, with the rapid increase of data, the security of big data has become more important than ever for managers. However, traditional infrastructure systems cannot cope with the avalanche of big data being created. In addition, because existing database systems increase licensing costs per transaction, organizations using information technologies are shifting to free and open-source solutions. For this reason, we propose an anomaly attack detection model on the Apache Hadoop Distributed File System (HDFS), which stands out in open-source big data analytics, and Apache Spark, which stands out for its speed in analysis, to reduce …
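A deep-learning detector is beyond a short sketch, but the general shape of log-based anomaly detection on HDFS logs can be illustrated with a toy statistical stand-in: count events per block id and flag outliers. The log format, token convention, and z-score threshold below are assumptions for illustration, not the paper's model:

```python
from collections import Counter
from statistics import mean, stdev

def flag_anomalous_blocks(log_lines, threshold=2.0):
    """Toy stand-in for a learned detector: count log events per HDFS block id
    (tokens starting with 'blk_') and flag blocks whose event count is a
    z-score outlier relative to the other blocks."""
    counts = Counter()
    for line in log_lines:
        for tok in line.split():
            if tok.startswith("blk_"):
                counts[tok] += 1
    mu, sigma = mean(counts.values()), stdev(counts.values())
    return {b for b, c in counts.items() if sigma and abs(c - mu) / sigma > threshold}
```

A real system would replace the z-score rule with a model trained on sequences of log events, as the paper proposes.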
Design, Development, And Performance Analysis Of Distributed Least Square Twin Support Vector Machine For Binary Classification, Bakshi Rohit Prasad, Sonali Agarwal
Turkish Journal of Electrical Engineering and Computer Sciences
Machine learning (ML) on Big Data has gone beyond the capacity of traditional machines and technologies. ML for large-scale datasets is the current focus of researchers. Most ML algorithms suffer primarily from memory constraints, complex computation, and scalability issues. The least square twin support vector machine (LSTSVM) technique is an extended version of the support vector machine (SVM). It is much faster than SVM and is widely used for classification tasks. However, when applied to large-scale datasets having millions or billions of samples and/or a large number of classes, it causes computational and storage bottlenecks. This paper …
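For context, the standard (non-distributed) LSTSVM formulation due to Kumar and Gopal fits two non-parallel hyperplanes, one per class, each obtained from a system of linear equations rather than a quadratic program; the paper's contribution is distributing such computations. A sketch of the two primal problems, with $A$ and $B$ holding the samples of the two classes and $e_1$, $e_2$ vectors of ones:

```latex
% LSTSVM primal problems (one hyperplane per class):
\min_{w_1, b_1}\; \tfrac{1}{2}\,\lVert A w_1 + e_1 b_1 \rVert^2
  + \tfrac{c_1}{2}\,\lVert \xi \rVert^2
  \quad \text{s.t.} \quad -(B w_1 + e_2 b_1) + \xi = e_2,

\min_{w_2, b_2}\; \tfrac{1}{2}\,\lVert B w_2 + e_2 b_2 \rVert^2
  + \tfrac{c_2}{2}\,\lVert \eta \rVert^2
  \quad \text{s.t.} \quad (A w_2 + e_1 b_2) + \eta = e_1.
```

A new point $x$ is assigned to the class $k$ whose hyperplane is nearer, i.e. minimizing $|x^\top w_k + b_k| / \lVert w_k \rVert$.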
A Study On The Improvement Of Data Collection In Data Centers And Its Analysis On Deep Learning-Based Applications, Dipak Kumar Singh
LSU Doctoral Dissertations
Big data are usually stored in data center networks for processing and analysis through various cloud applications. Such applications are collections of data-intensive jobs which often involve many parallel flows and are network-bound in the distributed environment. The recent networking abstraction coflow, which lets the data-parallel programming paradigm express its communication requirements, has opened new opportunities in network scheduling for such applications. Therefore, I propose a coflow-based network scheduling algorithm, Coflourish, to improve the job completion time of such data-parallel applications in the presence of increased background traffic that mimics a cloud environment infrastructure. It outperforms …
Design And Implementation Of Anomaly Detections For User Authentication Framework, Iman Abu Sulayman
Electronic Thesis and Dissertation Repository
Anomaly detection is quickly becoming a very significant tool for a variety of applications such as intrusion detection, fraud detection, fault detection, system health monitoring, and event detection in IoT devices. An application that lacks a strong implementation of anomaly detection is user trait modeling for user authentication purposes. User trait models expose an up-to-date representation of the user so that changes in their interests, their learning progress, or their interactions with the system are noticed and interpreted. The reason behind the lack of adoption in user trait modeling is the need for a continuous flow of high-volume data, that is …
Similarity-Based Chained Transfer Learning For Energy Forecasting With Big Data, Yifang Tian, Ljubisa Sehovac, Katarina Grolinger
Electrical and Computer Engineering Publications
Smart meter popularity has resulted in the ability to collect big energy data and has created opportunities for large-scale energy forecasting. Machine Learning (ML) techniques commonly used for forecasting, such as neural networks, involve computationally intensive training typically with data from a single building or a single aggregated load to predict future consumption for that same building or aggregated load. With hundreds of thousands of meters, it becomes impractical or even infeasible to individually train a model for each meter. Consequently, this paper proposes Similarity-Based Chained Transfer Learning (SBCTL), an approach for building neural network-based models for many meters by …
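One way to make the "similarity-based chaining" idea concrete is a greedy ordering of meters by profile similarity, so each trained model warm-starts the most similar untrained meter. The sketch below uses Pearson correlation between historical load profiles as the similarity measure; this is an illustrative stand-in, not necessarily SBCTL's exact procedure:

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length load profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def chain_order(profiles, seed):
    """Greedy transfer chain: start from `seed`, repeatedly move to the most
    similar not-yet-trained meter, so each model warm-starts the next one."""
    remaining = set(profiles) - {seed}
    order, current = [seed], seed
    while remaining:
        nxt = max(remaining, key=lambda m: pearson(profiles[current], profiles[m]))
        order.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return order
```

In the paper's setting, each position in the chain would train a neural forecaster initialized from the previous meter's weights instead of from scratch.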
Big Five Technologies In Aeronautical Engineering Education: Scoping Review, Ruth Martinez-Lopez
International Journal of Aviation, Aeronautics, and Aerospace
The constant demands that technology creates in aerospace engineering also influence education. The identification of technologies with practical application in aerospace engineering is of current interest to decision makers in both universities and industry. A social network approach enhances this scoping review of the research literature to identify the main topics involving the Big Five technologies in aerospace engineering education. The conceptual structure of the dataset (n=447) was analyzed from different approaches: at the macro level, a comparison of the digital technologies identified by cluster analysis with the number of co-words set at 3 and 8, and a keyword central structure …
Automatic Identification Of Animals In The Wild: A Comparative Study Between C-Capsule Networks And Deep Convolutional Neural Networks., Joel Kamdem Teto, Ying Xie
Master of Science in Computer Science Theses
The evolution of machine learning and computer vision in technology has driven many improvements and innovations across several domains. We see it applied to credit decisions, insurance quotes, malware detection, fraud detection, email composition, and any other area with enough information to allow the machine to learn patterns. Over the years, the number of sensors, cameras, and cognitive pieces of equipment placed in the wilderness has been growing exponentially. However, the human resources to turn these data into something meaningful are not growing at the same rate. For instance, a team of scientist volunteers took 8.4 years, …
Exploring The Role Of Sentiments In Identification Of Active And Influential Bloggers, Mohammad Alghobiri, Umer Ishfaq, Hikmat Ullah Khan, Tahir Afzal Malik
International Journal of Business and Technology
The social Web provides opportunities for the public to have social interactions and online discussions. The large number of online users of social web sites creates a high volume of data. This has led to the emergence of Big Data analysis, which focuses on computational analysis of data to reveal patterns and associations relating to human interactions. Such analyses have vast applications in various fields, such as understanding human behavior, studying cultural influence, and promoting online marketing. Blogs are one of the social web channels that offer a way to discuss various topics. Finding the top bloggers has been a …
A New Framework For Securing, Extracting And Analyzing Big Forensic Data, Hitesh Sachdev, Hayden Wimmer, Lei Chen, Carl Rebman
Journal of Digital Forensics, Security and Law
Finding new methods to investigate criminal activities, behaviors, and responsibilities has always been a challenge for forensic research. Advances in big data and technology, together with the increased capabilities of smartphones, have contributed to the demand for modern techniques of examination. Smartphones are ubiquitous, transformative, and have become a goldmine for forensics research. Given the right tools and research methods, investigating agencies can help crack almost any illegal activity involving smartphones. This paper focuses on conducting forensic analysis to expose a terrorist or criminal network and introduces a new Big Forensic Data Framework model in which the different technologies of Hadoop and EnCase software are …
Resampling Methods And Visualization Tools For Computer Performance Comparisons In The Presence Of Performance Variation, Samuel Oridge Irving
LSU Master's Theses
Performance variability, stemming from non-deterministic hardware and software behaviors or deterministic behaviors such as measurement bias, is a well-known phenomenon of computer systems which increases the difficulty of comparing computer performance metrics, and it is slated to become even more of a concern as interest in Big Data analytics increases. Conventional methods use various measures (such as the geometric mean) to quantify the performance of different benchmarks and compare computers without considering this variability, which may lead to wrong conclusions. In this paper, we propose three resampling methods for performance evaluation and comparison: a randomization test for a general performance comparison between …
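A randomization (permutation) test of the kind proposed can be sketched directly: pool the two machines' benchmark timings, reshuffle the group labels many times, and see how often the shuffled mean difference is at least as extreme as the observed one. This is a generic sketch of the technique, not the thesis's exact procedure:

```python
import random
from statistics import mean

def randomization_test(times_a, times_b, trials=10000, seed=0):
    """Permutation test for the difference in mean runtime between two machines.
    Returns an estimated two-sided p-value for H0: the machine label does not
    matter (any observed difference is due to performance variability)."""
    rng = random.Random(seed)
    observed = abs(mean(times_a) - mean(times_b))
    pooled = list(times_a) + list(times_b)
    n_a = len(times_a)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / trials
```

A small p-value means the performance gap is unlikely to be an artifact of run-to-run variability alone.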
Graphmp: An Efficient Semi-External-Memory Big Graph Processing System On A Single Machine, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Xiaokui Xiao
Research Collection School Of Computing and Information Systems
Recent studies showed that single-machine graph processing systems can be highly competitive with cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, high disk I/O overhead can significantly reduce their performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge …
A Study Of Application-Awareness In Software-Defined Data Center Networks, Chui-Hui Chiu
LSU Doctoral Dissertations
A data center (DC) has been a fundamental infrastructure for academia and industry for many years. Applications in DCs have diverse communication requirements, creating huge demands on data center network (DCN) control frameworks (CFs) for coordinating communication traffic. Simultaneously satisfying all demands is difficult and inefficient with existing traditional network devices and protocols. Recently, agile Software-Defined Networking (SDN) has been introduced to DCNs to speed up the development of DCN CFs. Application-awareness preserves application semantics, including the collective goals of communications. Previous works have shown that application-aware DCN CFs can allocate network resources much more efficiently by explicitly …
Wide-Area Measurement-Driven Approaches For Power System Modeling And Analytics, Hesen Liu
Doctoral Dissertations
This dissertation presents wide-area measurement-driven approaches for power system modeling and analytics. Accurate power system dynamic models are the very basis of power system analysis, control, and operation. Meanwhile, phasor measurement data provide first-hand knowledge of power system dynamic behaviors. The idea of building out innovative applications with synchrophasor data is promising.
Taking advantage of the real-time wide-area measurements, one of phasor measurements’ novel applications is to develop a synchrophasor-based auto-regressive with exogenous inputs (ARX) model that can be updated online to estimate or predict system dynamic responses.
Furthermore, since auto-regressive models form a large family, the ARX model …
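The ARX structure itself is standard and easy to sketch: the next output is regressed on lagged outputs and lagged inputs, and the coefficients can be estimated in batch (or updated online) by least squares. A minimal batch-fit sketch on synthetic signals, not synchrophasor data:

```python
import numpy as np

def fit_arx(y, u, na=1, nb=1):
    """Least-squares fit of a simple ARX model
        y[t] = sum_i a_i * y[t-i] + sum_j b_j * u[t-j]
    from a measured output y and exogenous input u.
    Returns the coefficient vector [a_1..a_na, b_1..b_nb]."""
    start = max(na, nb)
    rows, targets = [], []
    for t in range(start, len(y)):
        row = [y[t - i] for i in range(1, na + 1)] + \
              [u[t - j] for j in range(1, nb + 1)]
        rows.append(row)
        targets.append(y[t])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta
```

On noise-free data generated by a known ARX system, the fit recovers the true coefficients exactly; with phasor measurements, the same regression would be refreshed online as new samples arrive.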
Data Masking, Encryption, And Their Effect On Classification Performance: Trade-Offs Between Data Security And Utility, Juan C. Asenjo
CCE Theses and Dissertations
As data mining increasingly shapes organizational decision-making, the quality of its results must be questioned to ensure trust in the technology. Inaccuracies can mislead decision-makers and cause costly mistakes. With more data collected for analytical purposes, privacy is also a major concern. Data security policies and regulations are increasingly put in place to manage risks, but these policies and regulations often employ technologies that substitute and/or suppress sensitive details contained in the data sets being mined. Data masking and substitution and/or data encryption and suppression of sensitive attributes from data sets can limit access to important details. It is believed …
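One common masking technique this trade-off applies to is substituting a sensitive attribute with a salted hash token: the raw value becomes unreadable, but equal values still map to equal tokens, so a classifier can still treat the column as a categorical feature. A minimal sketch (the field names and salt are illustrative, not from the dissertation):

```python
import hashlib

def mask_column(records, field, salt="demo-salt"):
    """Replace a sensitive attribute with a salted hash token. The value is no
    longer readable, but equal inputs yield equal tokens, preserving the
    column's usefulness as a categorical feature for classification."""
    masked = []
    for rec in records:
        rec = dict(rec)  # copy: leave the original records untouched
        token = hashlib.sha256((salt + str(rec[field])).encode()).hexdigest()[:12]
        rec[field] = token
        masked.append(rec)
    return masked
```

Encryption or outright suppression would instead destroy the equality structure, which is one source of the utility loss the study measures.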
Conditional Correlation Analysis, Sanjeev Bhatta
Browse all Theses and Dissertations
Correlation analysis is a frequently used statistical measure for examining the relationship among variables in practical applications. However, traditional correlation analysis uses an overly simplistic method to do so: it measures how two variables are related by examining only their relationship over the entire underlying data space. As a result, traditional correlation analysis may miss a strong correlation between variables, especially when that relationship exists only in a small subpopulation of the larger data space. This can discard a fair share of information in the era of Big Data, which …
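A small synthetic example shows how a global correlation can hide strong subpopulation correlations, which is exactly the gap conditional correlation analysis targets:

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

# Two subpopulations with opposite trends cancel out globally.
group_a = [(1, 1), (2, 2), (3, 3)]   # perfectly positive trend inside A
group_b = [(1, 3), (2, 2), (3, 1)]   # perfectly negative trend inside B
x_all, y_all = zip(*(group_a + group_b))

print(pearson(x_all, y_all))     # 0.0 -- the global view sees no relationship
print(pearson(*zip(*group_a)))   # 1.0 -- yet subgroup A is perfectly correlated
```

Conditioning on the subpopulation (here, group membership) recovers the structure that the whole-space correlation erases.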
Accelerating Big Data Applications Using Lightweight Virtualization Framework On Enterprise Cloud, Janki Bhimani, Zhengyu Yang, Miriam Leeser, Ningfang Mi
Zhengyu Yang
Metaflow: A Scalable Metadata Lookup Service For Distributed File Systems In Data Centers, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Haiyong Xie
Research Collection School Of Computing and Information Systems
In large-scale distributed file systems, efficient metadata operations are critical since most file operations have to interact with metadata servers first. In existing distributed hash table (DHT) based metadata management systems, the lookup service could be a performance bottleneck due to its significant CPU overhead. Our investigations showed that the lookup service could reduce system throughput by up to 70%, and increase system latency by a factor of up to 8 compared to ideal scenarios. In this paper, we present MetaFlow, a scalable metadata lookup service utilizing software-defined networking (SDN) techniques to distribute lookup workload over network components. MetaFlow tackles …
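A conventional software DHT lookup of the kind MetaFlow offloads can be sketched with consistent hashing: hash each metadata server onto a ring (with virtual nodes for balance) and route each file path to the next server clockwise. This shows the CPU-bound baseline lookup, not MetaFlow's SDN-based mechanism:

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Minimal DHT-style lookup: map a file path to the metadata server whose
    position on the hash ring follows the path's hash."""

    def __init__(self, servers, vnodes=32):
        # Each server gets `vnodes` positions on the ring for load balance.
        self.ring = sorted(
            (self._h(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.keys = [k for k, _ in self.ring]

    @staticmethod
    def _h(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, path):
        """Return the metadata server responsible for this file path."""
        idx = bisect_right(self.keys, self._h(path)) % len(self.ring)
        return self.ring[idx][1]
```

Every metadata operation pays the cost of such a lookup on some server's CPU, which is the overhead MetaFlow moves into the network fabric.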
Cepsim: Modelling And Simulation Of Complex Event Processing Systems In Cloud Environments, Wilson A. Higashino, Miriam Am Capretz, Luiz F. Bittencourt
Electrical and Computer Engineering Publications
The emergence of Big Data has had profound impacts on how data are stored and processed. As technologies created to process continuous streams of data with low latency, Complex Event Processing (CEP) and Stream Processing (SP) have often been related to the Big Data velocity dimension and used in this context. Many modern CEP and SP systems leverage cloud environments to provide the low latency and scalability required by Big Data applications, yet validating these systems at the required scale is a research problem per se. Cloud computing simulators have been used as a tool to facilitate reproducible and repeatable …
Data To Decisions For Cyberspace Operations, Steve Stone
Military Cyber Affairs
In 2011, the United States (U.S.) Department of Defense (DOD) named cyberspace a new operational domain. The U.S. Cyber Command and the Military Services are working to make the cyberspace environment a suitable place for achieving national objectives and enabling military command and control (C2). To effectively conduct cyberspace operations, DOD requires data and analysis of the Mission, Network, and Adversary. However, the DOD’s current data processing and analysis capabilities do not meet mission needs within critical operational timelines. This paper presents a summary of the data processing and analytics necessary to effectively conduct cyberspace operations.
Exploring The Role Of Sentiments In Identification Of Active And Influential Bloggers, Mohammad Alghobiri, Umer Ishfaq, Hikmat Ullah Khan, Tahir Afzal Malik
UBT International Conference
The social Web provides opportunities for the public to have social interactions and online discussions. The large number of online users of social web sites creates a high volume of data. This has led to the emergence of Big Data analysis, which focuses on computational analysis of data to reveal patterns and associations relating to human interactions. Such analyses have vast applications in various fields, such as understanding human behavior, studying cultural influence, and promoting online marketing. Blogs are one of the social web channels that offer a way to discuss various topics. Finding the top bloggers has been a …
The Importance Of Big Data Analytics, Eljona Proko
UBT International Conference
Identified as a major trend in IT, Big Data has gained global attention. Advances in data analytics are changing the way businesses compete, enabling them to make faster and better decisions based on real-time analysis. Big Data introduces a new set of challenges. Three characteristics define Big Data: volume, variety, and velocity. Big Data requires tools and methods that can be applied to analyze and extract patterns from large-scale data. Companies generate enormous volumes of poly-structured data from the Web, social network posts, sensors, mobile devices, emails, and many other sources. Companies need a cost-effective, massively scalable solution for capturing, storing, and analyzing …
Data Management In Cloud Environments: Nosql And Newsql Data Stores, Katarina Grolinger, Wilson A. Higashino, Abhinav Tiwari, Miriam Am Capretz
Wilson A Higashino
Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the …
Dcms: A Data Analytics And Management System For Molecular Simulation, Meryem Berrada
USF Tampa Graduate Theses and Dissertations
Although molecular simulation (MS) systems represent a major research tool in multiple scientific and engineering fields, there is still a lack of systems for effective data management and fast data retrieval and processing. This is mainly due to the nature of MS, which generates very large amounts of data: a system usually encompasses millions of data items, and one query usually runs over tens of thousands of time frames. For this purpose, we designed and developed a new application, DCMS (a Data Analytics and Management System for Molecular Simulation), that intends to speed up the process …
Browser Based Visualization For Parameter Spaces Of Big Data Using Client-Server Model, Kurtis M. Glendenning
Browse all Theses and Dissertations
Visualization is an important task in data analytics, as it allows researchers to view abstract patterns within the data instead of reading through extensive raw data. Allowing the ability to interact with the visualizations is an essential aspect since it provides the ability to intuitively explore data to find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art techniques, such as Node.js, to split the visualization rendering …
Improvements On Scientific System Analysis, Vladimir Grupchev
USF Tampa Graduate Theses and Dissertations
Thanks to the advancement of modern computer simulation systems, many scientific applications generate, and require manipulation of, large volumes of data. Scientific exploration relies substantially on effective and accurate data analysis. The sheer size of the generated data, however, imposes big challenges on the process of analyzing the system. In this dissertation we propose novel techniques, as well as novel uses of some known designs, to improve scientific data analysis.
We develop an efficient method to compute an analytical query called spatial distance histogram (SDH). Special heuristics are exploited to process SDH efficiently and accurately. …
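The SDH query itself is simple to state: bucket every pairwise distance between particles into fixed-width ranges. The brute-force O(n²) version below is the baseline that efficient SDH algorithms improve upon; it is a generic sketch, not the dissertation's proposed method:

```python
import math
from itertools import combinations

def sdh(points, bucket_width, num_buckets):
    """Brute-force spatial distance histogram (SDH): count each pairwise
    distance into fixed-width buckets [0, w), [w, 2w), ...; distances beyond
    the last bucket are clamped into it."""
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        b = min(int(math.dist(p, q) // bucket_width), num_buckets - 1)
        hist[b] += 1
    return hist

# Three collinear atoms with pairwise distances 1, 2, and 3:
print(sdh([(0, 0, 0), (1, 0, 0), (3, 0, 0)], bucket_width=1.5, num_buckets=2))  # [1, 2]
```

With millions of atoms per frame, the quadratic pair enumeration is exactly what spatial heuristics (e.g. tree-based density maps) are designed to avoid.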
Disaster Data Management In Cloud Environments, Katarina Grolinger
Katarina Grolinger
Facilitating decision-making in a vital discipline such as disaster management requires information gathering, sharing, and integration on a global scale and across governments, industries, communities, and academia. A large quantity of immensely heterogeneous disaster-related data is available; however, current data management solutions offer few or no integration capabilities and limited potential for collaboration. Moreover, recent advances in cloud computing, Big Data, and NoSQL have opened the door for new solutions in disaster data management. In this thesis, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM) with the objectives of 1) facilitating information gathering …