Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Computer Sciences

Big data

Articles 1 - 30 of 35

Full-Text Articles in Engineering

Development Opportunities And Application Prospects Of Aero-Engine Simulation Technology Under Digital Transformation, Jianguo Cao Jan 2023

Journal of System Simulation

Abstract: The development of China's social economy and the improvement of its national defense capability in the new era put forward higher requirements for the development of aero-engines. It is urgent to promote the digital transformation of aero-engines in order to achieve coordinated, agile and efficient aero-engine development. Based on the current research and development of aero-engine in China, this paper clarifies the new connotation of "speediness and efficiency, accurate mapping, comprehensive coverage, and dynamic prediction" given by the development of emerging cutting-edge technologies to aero-engine simulation technology, as well as the new technical features of "spatio-temporal ubiquity, data driven, …


Compilation Optimizations To Enhance Resilience Of Big Data Programs And Quantum Processors, Travis D. Lecompte Nov 2022

LSU Doctoral Dissertations

Modern computers can experience a variety of transient errors due to the surrounding environment, known as soft faults. Although the frequency of these faults is low enough to not be noticeable on personal computers, they become a considerable concern during large-scale distributed computations or systems in more vulnerable environments like satellites. These faults occur as a bit flip of some value in a register, operation, or memory during execution. They surface as either program crashes, hangs, or silent data corruption (SDC), each of which can waste time, money, and resources. Hardware methods, such as shielding or error correcting memory (ECM), …


Lightweight Distributed Computing Framework For Orchestrating High Performance Computing And Big Data, Muhammed Numan İnce, Meli̇h Günay, Joseph Ledet May 2022

Turkish Journal of Electrical Engineering and Computer Sciences

In recent years, the need for the ability to work remotely and subsequently the need for the availability of remote computer-based systems has increased substantially. This trend has seen a dramatic increase with the onset of the 2020 pandemic. Often local data is produced, stored, and processed in the cloud to remedy this flood of computation and storage needs. Historically, HPC (high performance computing) and the concept of big data have been utilized for the storage and processing of large data. However, both HPC and Hadoop can be utilized as solutions for analytical work, though the differences between these may …


Deapsecure Computational Training For Cybersecurity: Third-Year Improvements And Impacts, Bahador Dodge, Jacob Strother, Rosby Asiamah, Karina Arcaute, Wirawan Purwanto, Masha Sosonkina, Hongyi Wu Apr 2022

Modeling, Simulation and Visualization Student Capstone Conference

The Data-Enabled Advanced Training Program for Cybersecurity Research and Education (DeapSECURE) was introduced in 2018 as a non-degree training consisting of six modules covering a broad range of cyberinfrastructure techniques, including high performance computing, big data, machine learning and advanced cryptography, aimed at reducing the gap between current cybersecurity curricula and requirements needed for advanced research and industrial projects. By its third year, DeapSECURE, like many other educational endeavors, experienced abrupt changes brought by the COVID-19 pandemic. The training had to be retooled to adapt to fully online delivery. Hands-on activities were reformatted to accommodate self-paced learning. In this paper, …


Data-Driven Operational And Safety Analysis Of Emerging Shared Electric Scooter Systems, Qingyu Ma Dec 2021

Computational Modeling & Simulation Engineering Theses & Dissertations

The rapid rise of shared electric scooter (E-Scooter) systems offers many urban areas a new micro-mobility solution. The portable and flexible characteristics have made E-Scooters a competitive mode for short-distance trips. Compared to other modes such as bikes, E-Scooters allow riders to freely ride on different facilities such as streets, sidewalks, and bike lanes. However, sharing lanes with vehicles and other users tends to cause safety issues for riding E-Scooters. Conventional methods are often not applicable for analyzing such safety issues because well-archived historical crash records are not commonly available for emerging E-Scooters.

Perceiving the growth of such a micro-mobility …


Can Subjective Pain Be Inferred From Objective Physiological Data? Evidence From Patients With Sickle Cell Disease, Mark J. Panaggio, Daniel M. Abrams, Fan Yang, Tanvi Banerjee, Nirmish R. Shah Mar 2021

Computer Science and Engineering Faculty Publications

Patients with sickle cell disease (SCD) experience lifelong struggles with both chronic and acute pain, often requiring medical intervention. Pain can be managed with medications, but dosages must balance the goal of pain mitigation against the risks of tolerance, addiction and other adverse effects. Setting appropriate dosages requires knowledge of a patient's subjective pain, but collecting pain reports from patients can be difficult for clinicians and disruptive for patients, and is only possible when patients are awake and communicative. Here we investigate methods for estimating SCD patients' pain levels indirectly using vital signs that are routinely collected and documented in …


Administrative Law In The Automated State, Cary Coglianese Jan 2021

All Faculty Scholarship

In the future, administrative agencies will rely increasingly on digital automation powered by machine learning algorithms. Can U.S. administrative law accommodate such a future? Not only might a highly automated state readily meet longstanding administrative law principles, but the responsible use of machine learning algorithms might perform even better than the status quo in terms of fulfilling administrative law’s core values of expert decision-making and democratic accountability. Algorithmic governance clearly promises more accurate, data-driven decisions. Moreover, due to their mathematical properties, algorithms might well prove to be more faithful agents of democratic institutions. Yet even if an automated state were …


Performance Optimization Of Big Data Computing Workflows For Batch And Stream Data Processing In Multi-Clouds, Huiyan Cao Dec 2020

Dissertations

Workflow techniques have been widely used as a major computing solution in many science domains. With the rapid deployment of cloud infrastructures around the globe and the economic benefits of cloud-based computing and storage services, an increasing number of scientific workflows have migrated or are in active transition to clouds. As the scale of scientific applications continues to grow, it is now common to deploy various data- and network-intensive computing workflows such as serial computing workflows, MapReduce/Spark-based workflows, and Storm-based stream data processing workflows in multi-cloud environments, where inter-cloud data transfer oftentimes plays a significant role in both workflow performance …


Smt Bounded Constrained Non Centralized Automaton Web Service Model Checking, Wei Rong, Xibing Shen, Yang Yi Aug 2020

Journal of System Simulation

Abstract: In model checking for web service applications, composing traditional finite state machines cannot guarantee the correctness of a web service composition. A web service detection algorithm using a non-centralized automaton model was therefore put forward, based on satisfiability modulo theories (SMT). SMT was used to check the bounded model of timed automata: the timed automaton was directly converted into a logic formula recognizable by SMT and then solved. Using SMT timed automata theory, a composite web service for employee travel arrangements was modeled and verified. Through …


Qos-Aware Scheduling For Data Intensive Workflow, Wan Cong, Cuirong Wang, Wang Cong Jul 2020

Journal of System Simulation

Abstract: The development of technology enables people to access resources from different data centers. Resource management and scheduling for applications such as workflows deployed in cloud computing environments have become a research hot spot. A QoS-aware scheduling algorithm for data-intensive workflows in a multiple data center environment was proposed. Scheduling data-intensive workflows in such an environment has two characteristics: first, a large amount of data is distributed across different geographical locations, so data migration consumes a large amount of time and bandwidth; second, the data centers differ in price and resources. Data migration …


Approach To Process Smart Grid Time-Serial Big Data Based On Hbase, Wang Yuan, Tao Ye, Yuan Jun, He Wei Jul 2020

Journal of System Simulation

Abstract: With the development of critical theories and technologies in the Internet of Things (IoT), more and more attention has been focused on IoT applications. The smart grid is a typical IoT application in which a huge number of sensors have been deployed to gather and generate time-serial data that reveal the running states of key devices. How to apply these data to keep the smart grid running securely and stably is a hot research topic. Considering that the smart grid is characterized by a huge number of devices, a huge amount of data and …


Research On Technology Of Data Storage And Access In High-Throughput Simulation, Zishuo Wang, Yanlong Zhai, Wenjun Tao, Yang Hao, Zhang Han, Duzheng Qing Jun 2020

Journal of System Simulation

Abstract: Aiming at the massive data processing requirements of simulation application in the big data environment, the concept and reference structure of high-throughput simulation were proposed and defined in accordance with the architecture and technical characteristics of high-throughput big data computing. For the problem of data access bottleneck in high-throughput simulation, a high-throughput simulation data storage and access system was designed based on distributed memory file system, and the non-volatile memory was integrated to improve the throughput of data access. The experimental results of typical simulation applications show that the high-throughput simulation storage system with distributed memory file system and …


Parallel Method For Extracting Pulses From Multi-Source Massive Partial Discharge Signals, Liuwang Wang, Yongli Zhu, Yafei Jia Jun 2020

Journal of System Simulation

Abstract: Aiming at the issue of discharge pulse extraction for multi-source and massive PD signals, a novel parallel method based on Message Passing Interface was proposed. The proposed method applied a parallel mode called manager-worker-writer. In this method, a manager dynamically assigned task to several workers, and these workers executed tasks in parallel and a writer received results from workers in real time, so data management was separated from task execution. In addition, the manager identified sources of PD signals and sent them to workers as the keys for analyzing different data files and setting algorithm parameters, so multi-source and …


Optimization Of Real-Time Wireless Sensor Based Big Data With Deep Autoencoder Network: A Tourism Sector Application With Distributed Computing, Beki̇r Aksoy, Utku Kose Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Internet usage has increased rapidly with the development of information communication technologies. The increase in internet usage led to the growth of data volumes on the internet and the emergence of the big data concept. Therefore, it has become even more important to analyze the data and make it meaningful. In this study, 690 million queries and approximately 5.9 quadrillion data collected daily from different servers were recorded on the Redis servers by using real-time big data analysis method and load balance structure for a company operating in the tourism sector. Here, wireless networks were used as a triggering factor …


Disaster Damage Categorization Applying Satellite Images And Machine Learning Algorithm, Farinaz Sabz Ali Pour, Adrian Gheorghe Jan 2020

Engineering Management & Systems Engineering Faculty Publications

Special information has a significant role in disaster management. Land cover mapping can detect short- and long-term changes and monitor the vulnerable habitats. It is an effective evaluation to be included in the disaster management system to protect the conservation areas. The critical visual and statistical information presented to the decision-makers can help in mitigation or adaption before crossing a threshold. This paper aims to contribute in the academic and the practice aspects by offering a potential solution to enhance the disaster data source effectiveness. The key research question that the authors try to answer in this paper is how …


System Analysis Method Based On Simulation Big Data, Guangya Si, Wang Fei, Liu Yang Nov 2019

Journal of System Simulation

Abstract: Wargaming and exploratory simulation with large-scale simulation systems produce massive simulation data. These data contain many complexity patterns of war, and are significant samples for studying the mechanism of war. Based on the definition of simulation big data, an analysis framework based on simulation big data is proposed, which is divided into three levels: simulation environment and data planning, big data acquisition and storage, and analysis and mining. The simulation data planning and analysis and mining are briefly introduced.


Design Of Personnel Big Data Management System Based On Blockchain, Houbing Song, Jian Chen, Zhihan Lv Sep 2019

Houbing Song

With the continuous development of information technology, enterprises, universities and governments are constantly stepping up the construction of electronic personnel information management systems. The information of hundreds of thousands or even millions of people is collected and stored in these systems. So much information provides the cornerstone for the development of big data, but if such data is tampered with or leaked, it will cause serious, irreparable damage. In recent years, electronic archives have exposed a series of problems such as information leakage, information tampering, and information loss, which has made the reform of personnel information management more and …


Design Of Personnel Big Data Management System Based On Blockchain, Houbing Song, Jian Chen, Zhihan Lv Jul 2019

Publications

With the continuous development of information technology, enterprises, universities and governments are constantly stepping up the construction of electronic personnel information management systems. The information of hundreds of thousands or even millions of people is collected and stored in these systems. So much information provides the cornerstone for the development of big data, but if such data is tampered with or leaked, it will cause serious, irreparable damage. In recent years, electronic archives have exposed a series of problems such as information leakage, information tampering, and information loss, which has made the reform of personnel information management more and …


Using Big Data Analytics To Improve Hiv Medical Care Utilisation In South Carolina: A Study Protocol, Bankole Olatosi, Jiajia Zhang, Sharon Weissman, Jianjun Hu, Mohammad Rifat Haider, Xiaoming Li Jun 2019

Faculty Publications

Introduction Linkage and retention in HIV medical care remains problematic in the USA. Extensive health utilisation data collection through electronic health records (EHR) and claims data represent new opportunities for scientific discovery. Big data science (BDS) is a powerful tool for investigating HIV care utilisation patterns. The South Carolina (SC) office of Revenue and Fiscal Affairs (RFA) data warehouse captures individual-level longitudinal health utilisation data for persons living with HIV (PLWH). The data warehouse includes EHR, claims and data from private institutions, housing, prisons, mental health, Medicare, Medicaid, State Health Plan and the department of health and human services. The …


A Data-Driven Approach For Modeling Agents, Hamdi Kavak Apr 2019

Computational Modeling & Simulation Engineering Theses & Dissertations

Agents are commonly created on a set of simple rules driven by theories, hypotheses, and assumptions. Such a modeling premise makes limited use of real-world data and is challenged when modeling real-world systems due to the lack of empirical grounding. Simultaneously, the last decade has witnessed the production and availability of large-scale data from various sensors that carry behavioral signals. These data sources have the potential to change the way we create agent-based models, from models built on simple rules to models driven by data. Despite this opportunity, the literature has neglected to offer a modeling approach to generate granular agent behaviors from data, creating …


Parallel Pattern Recognition Of Leak Current Data Using Spark-Knn, Li Li, Yongli Zhu, Yaqi Song Jan 2019

Journal of System Simulation

Abstract: With the rapid development of the smart grid, the status monitoring data of power grid equipment are increasing exponentially and gradually forming big data. Traditional computing architectures can no longer meet the demand for computing performance. This paper explores how Spark and cloud computing can accelerate pattern recognition on massive insulator leak current data. A parallel KNN (k-Nearest Neighbor) algorithm is designed and implemented using Spark and the Aliyun E-MapReduce cloud computing platform. The experimental results show that the performance of Spark-KNN is 2.97 times that of MapReduce-KNN and that it gains an acceleration of 8.8 times. The experimental results confirm …


Association Rules Analysis Method Of Spatial Data Under Mapreduce Framework, Mingzhi Zhang, Li Yi Jan 2019

Journal of System Simulation

Abstract: Spatial data is characterized by extensiveness, timeliness, multidimensionality, large data volume, and complex relations. Non-conventional data screening tools for analysis and mining are required to find patterns, rules, and characteristic knowledge in spatial big data for battlefield situation awareness and battle space management. Because the existing Apriori algorithm scans the database too frequently, the Apriori algorithm is improved on the basis of the working principle of MapReduce. A fast analysis approach and technology framework for spatial data is proposed. An elementary validation prototype is built to experiment with the key technology. Experimental results …


Data Analysis Through Social Media According To The Classified Crime, Serkan Savaş, Nuretti̇n Topaloğlu Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

The amount and variety of data generated through social media sites has increased along with the widespread use of social media sites. In addition, the data production rate has increased in the same way. The inclusion of personal information within these data makes it important to process the data and reach meaningful information within it. This process can be called intelligence and this meaningful information may be for commercial, academic, or security purposes. An example application is developed in this study for intelligence on Twitter. Crimes in Turkey are classified according to Turkish Statistical Institute criminal data and keywords are …


Big Data Investment And Knowledge Integration In Academic Libraries, Saher Manaseer, Afnan R. Alawneh, Dua Asoudi Jan 2019

Copyright, Fair Use, Scholarly Communication, etc.

Recently, big data investment has become important for organizations, especially with the fast growth of data following the huge expansion in the usage of social media applications, and websites. Many organizations depend on extracting and reaching the needed reports and statistics. As the investments on big data and its storage have become major challenges for organizations, many technologies and methods have been developed to tackle those challenges.

One such technology is Hadoop, a framework that is used to divide big data into packages and distribute those packages across nodes to be processed, at lower cost than the traditional storage …


Web Personalization Issues In Big Data And Semantic Web: Challenges And Opportunities, Bujar Raufi, Florije Ismaili, Jaumin Ajdari, Xhemal Zenuni Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Web personalization is a process that utilizes a set of methods, techniques, and actions for adapting the linking structure of an information space or its content or both to user interaction preferences. The aim of personalization is to enhance the user experience by retrieving relevant resources and presenting them in a meaningful fashion. The advent of big data introduced new challenges that locate user modeling and personalization community in a new research setting. In this paper, we introduce the research challenges related to Web personalization analyzed in the context of big data and the Semantic Web. This paper also introduces …


Transparency And Algorithmic Governance, Cary Coglianese, David Lehr Jan 2019

All Faculty Scholarship

Machine-learning algorithms are improving and automating important functions in medicine, transportation, and business. Government officials have also started to take notice of the accuracy and speed that such algorithms provide, increasingly relying on them to aid with consequential public-sector functions, including tax administration, regulatory oversight, and benefits administration. Despite machine-learning algorithms’ superior predictive power over conventional analytic tools, algorithmic forecasts are difficult to understand and explain. Machine learning’s “black-box” nature has thus raised concern: Can algorithmic governance be squared with legal principles of governmental transparency? We analyze this question and conclude that machine-learning algorithms’ relative inscrutability does not pose a …


Aspie: A Framework For Active Sensing And Processing Of Complex Events In The Internet Of Manufacturing Things, Shaobo Li, Weixing Chen, Jie Hu, Jianjun Hu Mar 2018

Faculty Publications

Rapid perception and processing of critical monitoring events are essential to ensure healthy operation of Internet of Manufacturing Things (IoMT)-based manufacturing processes. In this paper, we proposed a framework (active sensing and processing architecture (ASPIE)) for active sensing and processing of critical events in IoMT-based manufacturing based on the characteristics of IoMT architecture as well as its perception model. A relation model of complex events in manufacturing processes, together with related operators and unified XML-based semantic definitions, are developed to effectively process the complex event big data. A template based processing method for complex events is further introduced to conduct …


Recommender Systems For Large-Scale Social Networks: A Review Of Challenges And Solutions, Magdalini Eirinaki, Jerry Gao, Iraklis Varlamis, Konstantinos Tserpes Jan 2018

Faculty Publications

Social networks have become very important for networking, communications, and content sharing. Social networking applications generate a huge amount of data on a daily basis and social networks constitute a growing field of research, because of the heterogeneity of data and structures formed in them, and their size and dynamics. When this wealth of data is leveraged by recommender systems, the resulting coupling can help address interesting problems related to social engagement, member recruitment, and friend recommendations. In this work we review the various facets of large-scale social recommender systems, summarizing the challenges and interesting problems and discussing some of the …


Offline And Online Density Estimation For Large High-Dimensional Data, Aref Majdara Jan 2018

Dissertations, Master's Theses and Master's Reports

Density estimation has wide applications in machine learning and data analysis techniques including clustering, classification, multimodality analysis, bump hunting and anomaly detection. In high-dimensional space, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods largely inefficient.

This work presents development of computationally efficient algorithms for high-dimensional density estimation, based on Bayesian sequential partitioning (BSP). Copula transform is used to separate the estimation of marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is …


Distributed Knowledge Discovery For Diverse Data, Hossein Hamooni Jul 2017

Computer Science ETDs

In the era of new technologies, computer scientists deal with massive data of size hundreds of terabytes. Smart cities, social networks, health care systems, large sensor networks, etc. are constantly generating new data. It is non-trivial to extract knowledge from big datasets because traditional data mining algorithms run impractically on such big datasets. However, distributed systems have come to aid this problem while introducing new challenges in designing scalable algorithms. The transition from traditional algorithms to the ones that can be run on a distributed platform should be done carefully. Researchers should design the modern distributed algorithms based on the …