Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

2019

Big data

Articles 1 - 26 of 26

Full-Text Articles in Physical Sciences and Mathematics

“Where’s The I-O?” Artificial Intelligence And Machine Learning In Talent Management Systems, Manuel F. Gonzalez, John F. Capman, Frederick L. Oswald, Evan R. Theys, David L. Tomczak Nov 2019

Personnel Assessment and Decisions

Artificial intelligence (AI) and machine learning (ML) have seen widespread adoption by organizations seeking to identify and hire high-quality job applicants. Yet the volume, variety, and velocity of professional involvement among I-O psychologists remains relatively limited when it comes to developing and evaluating AI/ML applications for talent assessment and selection. Furthermore, there is a paucity of empirical research that investigates the reliability, validity, and fairness of AI/ML tools in organizational contexts. To stimulate future involvement and research, we share our review and perspective on the current state of AI/ML in talent assessment as well as its benefits and potential pitfalls; …


System Analysis Method Based On Simulation Big Data, Guangya Si, Wang Fei, Liu Yang Nov 2019

Journal of System Simulation

Abstract: Wargaming and exploratory simulation with large-scale simulation systems produce massive simulation data. These data contain many complexity patterns of war, and are significant samples for studying the mechanism of war. Based on the definition of simulation big data, an analysis framework based on simulation big data is proposed, which is divided into three levels: simulation environment and data planning, big data acquisition and storage, and analysis and mining. The simulation data planning and analysis and mining are briefly introduced.


ML4IoT: A Framework To Orchestrate Machine Learning Workflows On Internet Of Things Data, Jose Miguel Alves, Leonardo Honorio, Miriam A M Capretz Oct 2019

Electrical and Computer Engineering Publications

Internet of Things (IoT) applications generate vast amounts of real-time data. Temporal analysis of these data series to discover behavioural patterns may lead to qualified knowledge affecting a broad range of industries. Hence, the use of machine learning (ML) algorithms over IoT data has the potential to improve safety, economy, and performance in critical processes. However, creating ML workflows at scale is a challenging task that depends upon both production and specialized skills. Such tasks require investigation, understanding, selection, and implementation of specific ML workflows, which often lead to bottlenecks, production issues, and code management complexity and even then may …


Design Of Personnel Big Data Management System Based On Blockchain, Houbing Song, Jian Chen, Zhihan Lv Sep 2019

Houbing Song

With the continuous development of information technology, enterprises, universities and governments are constantly stepping up the construction of electronic personnel information management systems. Information on hundreds of thousands or even millions of people is collected and stored in these systems. So much information provides the cornerstone for the development of big data; if such data is tampered with or leaked, it will cause serious, irreparable damage. However, in recent years, electronic archives have exposed a series of problems such as information leakage, information tampering, and information loss, which have made the reform of personnel information management more and …


Design Of Personnel Big Data Management System Based On Blockchain, Houbing Song, Jian Chen, Zhihan Lv Jul 2019

Publications

With the continuous development of information technology, enterprises, universities and governments are constantly stepping up the construction of electronic personnel information management systems. Information on hundreds of thousands or even millions of people is collected and stored in these systems. So much information provides the cornerstone for the development of big data; if such data is tampered with or leaked, it will cause serious, irreparable damage. However, in recent years, electronic archives have exposed a series of problems such as information leakage, information tampering, and information loss, which have made the reform of personnel information management more and …


Networkmetrics Unraveled: MBDA In Action, José Camacho, Rasmus Bro, David Kotz Jul 2019

Other Faculty Materials

We propose networkmetrics, a new data-driven approach for monitoring, troubleshooting and understanding communication networks using multivariate analysis. Networkmetric models are powerful machine-learning tools to interpret and interact with data collected from a network. In this paper, we illustrate the application of Multivariate Big Data Analysis (MBDA), a recently proposed networkmetric method with application to Big Data sets. We use MBDA for the detection and troubleshooting of network problems in a campus-wide Wi-Fi network. Data includes a seven-year trace (from 2012 to 2018) of the network’s most recent activity, with approximately 3,000 distinct access points, 40,000 authenticated users, and 600,000 distinct …


Data Mining And Machine Learning To Improve Northern Florida’s Foster Care System, Daniel Oldham, Nathan Foster, Mihhail Berezovski Jun 2019

Beyond: Undergraduate Research Journal

The purpose of this research project is to use statistical analysis, data mining, and machine learning techniques to determine identifiable factors in child welfare service records that could lead to a child entering the foster care system multiple times. This would allow us to accurately predict a case’s outcome based on these factors. We were provided with eight years of data in the form of multiple spreadsheets from Partnership for Strong Families (PSF), a child welfare services organization based in Gainesville, Florida, which is contracted by the Florida Department for Children and Families (DCF). This data contained a …


Spatio-Temporal Multimedia Big Data Analytics Using Deep Neural Networks, Samira Pouyanfar Jun 2019

FIU Electronic Theses and Dissertations

With the proliferation of online services and mobile technologies, the world has stepped into a multimedia big data era, where new opportunities and challenges appear with the high diversity multimedia data together with the huge amount of social data. Nowadays, multimedia data consisting of audio, text, image, and video has grown tremendously. With such an increase in the amount of multimedia data, the main question raised is how one can analyze this high volume and variety of data in an efficient and effective way. A vast amount of research work has been done in the multimedia area, targeting different aspects …


High-Performance Computing Frameworks For Large-Scale Genome Assembly, Sayan Goswami Jun 2019

LSU Doctoral Dissertations

Genome sequencing technology has witnessed tremendous progress in terms of throughput and cost per base pair, resulting in an explosion in the size of data. Typical de Bruijn graph-based assembly tools demand a lot of processing power and memory and cannot assemble big datasets unless running on a scaled-up server with terabytes of RAM or a scaled-out cluster with several dozen nodes. In the first part of this work, we present a distributed next-generation sequence (NGS) assembler called Lazer that achieves both scalability and memory efficiency by using partitioned de Bruijn graphs. By enhancing the memory-to-disk swapping and reducing the …


Using Big Data Analytics To Improve HIV Medical Care Utilisation In South Carolina: A Study Protocol, Bankole Olatosi, Jiajia Zhang, Sharon Weissman, Jianjun Hu, Mohammad Rifat Haider, Xiaoming Li Jun 2019

Faculty Publications

Introduction: Linkage and retention in HIV medical care remain problematic in the USA. Extensive health utilisation data collection through electronic health records (EHR) and claims data represent new opportunities for scientific discovery. Big data science (BDS) is a powerful tool for investigating HIV care utilisation patterns. The South Carolina (SC) office of Revenue and Fiscal Affairs (RFA) data warehouse captures individual-level longitudinal health utilisation data for persons living with HIV (PLWH). The data warehouse includes EHR, claims and data from private institutions, housing, prisons, mental health, Medicare, Medicaid, State Health Plan and the department of health and human services. The …


Statistical Machine Learning Methods For Mining Spatial And Temporal Data, Fei Tan May 2019

Dissertations

Spatial and temporal dependencies are ubiquitous properties of data in numerous domains. The popularity of spatial and temporal data mining has thus grown with the increasing prevalence of massive data. The presence of spatial and temporal attributes not only provides complementary useful perspectives, but also poses new challenges to the representation and integration into the learning procedure. In this dissertation, the involved spatial and temporal dependencies are explored with three genres: sample-wise, feature-wise, and target-wise. A family of novel methodologies is developed accordingly for the dependency representation in respective scenarios.

First, dependencies among discrete, continuous and repeated observations are studied …


What To Do When Privacy Is Gone, James Brusseau May 2019

Computer Ethics - Philosophical Enquiry (CEPE) Proceedings

Today’s ethics of privacy is largely dedicated to defending personal information from big data technologies. This essay goes in the other direction. It considers the struggle to be lost, and explores two strategies for living after privacy is gone. First, total exposure embraces privacy’s decline, and then contributes to the process with transparency. All personal information is shared without reservation. The resulting ethics is explored through a big data version of Robert Nozick’s Experience Machine thought experiment. Second, transient existence responds to privacy’s loss by ceaselessly generating new personal identities, which translates into constantly producing temporarily unviolated private information. The …


How To Derive Causal Insights For Digital Commerce In China? A Research Commentary On Computational Social Science Methods, David C.W. Phang, Kanliang Wang, Qiu-Hong Wang, Robert John Kauffman, Maurizio Naldi May 2019

Research Collection School Of Computing and Information Systems

The transformation of empirical research due to the arrival of big data analytics and data science, as well as the new availability of methods that emphasize causal inference, are moving forward at full speed. In this Research Commentary, we examine the extent to which this has the potential to influence how e-commerce research is conducted. China offers the ultimate in data-at-scale settings, and the construction of real-world natural experiments. Chinese e-commerce includes some of the largest firms involved in e-commerce, mobile commerce, social media and social networks. This article was written to encourage young faculty and doctoral students to engage …


The Security Of Big Data In Fog-Enabled IoT Applications Including Blockchain: A Survey, Noshina Tariq, Muhammad Asim, Feras Al-Obeidat, Muhammad Zubair Farooqi, Thar Baker, Mohammad Hammoudeh, Ibrahim Ghafir Apr 2019

All Works

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. The proliferation of inter-connected devices in critical industries, such as healthcare and power grid, is changing the perception of what constitutes critical infrastructure. The rising interconnectedness of new critical industries is driven by the growing demand for seamless access to information as the world becomes more mobile and connected and as the Internet of Things (IoT) grows. Critical industries are essential to the foundation of today’s society, and interruption of service in any of these sectors can reverberate through other sectors and even around the globe. In today’s hyper-connected world, the …


Google Trends Data As A Proxy For Interest In Leadership, Finley W. Walker Apr 2019

Doctor of Education (Ed.D)

The purpose of this quantitative study was to investigate the observable patterns of online search behavior in the topic of leadership using Google Trends data. Institutions have had a historically difficult time predicting good leadership candidates. Better predictions can be made by using the big data offered by groups such as Google to learn who, where, and when people are interested in leadership. The study utilized descriptive, comparative, and correlative methodologies to study Google users’ interest in leadership from 2004 to 2017. Society has placed great value into leadership throughout history, and though overall interest remains strong, it appears that …


A Data-Driven Approach For Modeling Agents, Hamdi Kavak Apr 2019

Computational Modeling & Simulation Engineering Theses & Dissertations

Agents are commonly created on a set of simple rules driven by theories, hypotheses, and assumptions. Such modeling premise has limited use of real-world data and is challenged when modeling real-world systems due to the lack of empirical grounding. Simultaneously, the last decade has witnessed the production and availability of large-scale data from various sensors that carry behavioral signals. These data sources have the potential to change the way we create agent-based models; from simple rules to driven by data. Despite this opportunity, the literature has neglected to offer a modeling approach to generate granular agent behaviors from data, creating …


Wireless Sensor Networks For Big Data Systems, Beom Su Kim, Ki Il Kim, Babar Shah, Francis Chow, Kyong Hoon Kim Apr 2019

All Works

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. Before discovering meaningful knowledge from big data systems, it is first necessary to build a data-gathering infrastructure. Among many feasible data sources, wireless sensor networks (WSNs) are rich big data sources: a large amount of data is generated by various sensor nodes in large-scale networks. However, unlike typical wireless networks, WSNs have serious deficiencies in terms of data reliability and communication owing to the limited capabilities of the nodes. Moreover, a considerable amount of sensed data are of no interest, meaningless, and redundant when a large number of sensor nodes is …


Parallel Pattern Recognition Of Leak Current Data Using Spark-KNN, Li Li, Yongli Zhu, Yaqi Song Jan 2019

Journal of System Simulation

Abstract: With the rapid development of the smart grid, the status monitoring data of power grid equipment increase exponentially and gradually form big data. Traditional computing architectures can no longer meet the demand for computing performance. This paper explores how Spark and cloud computing can accelerate pattern recognition on massive insulator leak current data. A parallel KNN (k-Nearest Neighbor) algorithm is designed and implemented using Spark and the Aliyun E-MapReduce cloud computing platform. The results from experiments show that the performance of Spark-KNN is 2.97 times that of MapReduce-KNN and gains acceleration of 8.8 times. The experimental results confirm …


Association Rules Analysis Method Of Spatial Data Under MapReduce Framework, Mingzhi Zhang, Li Yi Jan 2019

Journal of System Simulation

Abstract: Spatial data is characterized by extensity, timeliness, multidimensionality, large volume, and complex relations. A non-conventional data screening tool for analysis and mining is required to find patterns, rules, and characteristic knowledge in spatial big data for battlefield situation awareness and battlespace management. Because the existing Apriori algorithm scans the database too frequently, it is improved on the basis of the working principle of MapReduce. A fast analysis approach and technology framework for spatial data are proposed. An elementary validation prototype is built for key technology experimentation. Experimental results …


Data Analysis Through Social Media According To The Classified Crime, Serkan Savaş, Nuretti̇n Topaloğlu Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

The amount and variety of data generated through social media sites has increased along with the widespread use of social media sites. In addition, the data production rate has increased in the same way. The inclusion of personal information within these data makes it important to process the data and reach meaningful information within it. This process can be called intelligence and this meaningful information may be for commercial, academic, or security purposes. An example application is developed in this study for intelligence on Twitter. Crimes in Turkey are classified according to Turkish Statistical Institute criminal data and keywords are …


Big Data Investment And Knowledge Integration In Academic Libraries, Saher Manaseer, Afnan R. Alawneh, Dua Asoudi Jan 2019

Copyright, Fair Use, Scholarly Communication, etc.

Recently, big data investment has become important for organizations, especially with the fast growth of data following the huge expansion in the usage of social media applications and websites. Many organizations depend on extracting and reaching the needed reports and statistics. As investments in big data and its storage have become major challenges for organizations, many technologies and methods have been developed to tackle those challenges.

One such technology is Hadoop, a framework that is used to divide big data into packages and distribute those packages through nodes to be processed, consuming less cost than the traditional storage …


Research On The Law Of Garlic Price Based On Big Data, Feng Guo, Pingzeng Liu, Chao Zhang, Weijie Chen, Wei Han, Wanming Ren, Yong Zheng, Jianrui Ding Jan 2019

Computer Science Student Research

In view of the frequent fluctuation of garlic price under the market economy and the current situation of garlic price, the fluctuation of garlic price in the circulation link of garlic industry chain is analyzed, and the application mode of multidisciplinary in the agricultural industry is discussed. On the basis of the big data platform of garlic industry chain, this paper constructs a GARCH model to analyze the fluctuation law of garlic price in the circulation link and provides the garlic industry service from the angle of price fluctuation combined with the economic analysis. The research shows that the average …


Transparency And Algorithmic Governance, Cary Coglianese, David Lehr Jan 2019

All Faculty Scholarship

Machine-learning algorithms are improving and automating important functions in medicine, transportation, and business. Government officials have also started to take notice of the accuracy and speed that such algorithms provide, increasingly relying on them to aid with consequential public-sector functions, including tax administration, regulatory oversight, and benefits administration. Despite machine-learning algorithms’ superior predictive power over conventional analytic tools, algorithmic forecasts are difficult to understand and explain. Machine learning’s “black-box” nature has thus raised concern: Can algorithmic governance be squared with legal principles of governmental transparency? We analyze this question and conclude that machine-learning algorithms’ relative inscrutability does not pose a …


Web Personalization Issues In Big Data And Semantic Web: Challenges And Opportunities, Bujar Raufi, Florije Ismaili, Jaumin Ajdari, Xhemal Zenuni Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Web personalization is a process that utilizes a set of methods, techniques, and actions for adapting the linking structure of an information space or its content or both to user interaction preferences. The aim of personalization is to enhance the user experience by retrieving relevant resources and presenting them in a meaningful fashion. The advent of big data introduced new challenges that locate user modeling and personalization community in a new research setting. In this paper, we introduce the research challenges related to Web personalization analyzed in the context of big data and the Semantic Web. This paper also introduces …


Hierarchical Cluster Analysis: A New Type Of Ranking Criteria Based On ARWU Ranking Data, Zhengshuo Li Jan 2019

Dissertations

The advent of big data leads to many applications of Machine Learning techniques. University rankings is one of the applicable domains, which is currently playing a crucial role in the assessment of the universities' performance. Currently, the rankings are usually carried out by some authoritative ranking institutions by means of weighting techniques and the results are conveyed in numerical rankings. Three of the most famous university ranking institutions have been introduced from a technical perspective. However, these institutions have been proven to be subjective in relation to their data selection and weighting method.


Privacy Preservation In Social Media Environments Using Big Data, Katrina Ward Jan 2019

Doctoral Dissertations

"With the pervasive use of mobile devices, social media, home assistants, and smart devices, the idea of individual privacy is fading. More than ever, the public is giving up personal information in order to take advantage of what is now considered every day conveniences and ignoring the consequences. Even seemingly harmless information is making headlines for its unauthorized use (18). Among this data is user trajectory data which can be described as a user's location information over a time period (6). This data is generated whenever users access their devices to record their location, query the location of a point …