Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons


Computer Sciences


Big data


Articles 31 - 35 of 35

Full-Text Articles in Engineering

Semantic Inference On Clinical Documents: Combining Machine Learning Algorithms With An Inference Engine For Effective Clinical Diagnosis And Treatment, Shuo Yang, Ran Wei, Jingzhi Guo, Lida Xu Jan 2017

Information Technology & Decision Sciences Faculty Publications

Clinical practice calls for reliable diagnosis and optimized treatment, yet human error in health care remains a severe issue even in industrialized countries. Clinical decision support systems (CDSS) offer a way to address this problem. Despite the great improvement in CDSS over the past several years, challenges to their wide-scale application remain, including: 1) CDSS decision making is complicated by the complexity of data on human physiology and pathology, and loading the large volumes of data associated with patients can make the whole process time-consuming; and 2) information incompatibility among different health information systems (HIS) …
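As a rough illustration of the kind of architecture the title describes (not the authors' actual system), the sketch below shows a statistical classifier proposing candidate diagnoses from patient features, with a small rule-based inference step then filtering them against explicit clinical rules. All feature names, rules, and thresholds are illustrative assumptions.

# Hedged sketch: ML candidates filtered by a rule-based inference step.
# Every name, rule, and threshold here is hypothetical, for illustration only.

def ml_candidate_diagnoses(patient):
    """Stand-in for a trained classifier: returns (diagnosis, score) pairs."""
    scores = {
        "type_2_diabetes": 0.72 if patient["hba1c"] >= 6.5 else 0.10,
        "anemia": 0.65 if patient["hemoglobin"] < 12.0 else 0.05,
    }
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Each rule maps a diagnosis to a predicate that must also hold for acceptance.
CLINICAL_RULES = {
    "type_2_diabetes": lambda p: p["fasting_glucose"] >= 126,
    "anemia": lambda p: p["hemoglobin"] < 12.0,
}

def infer(patient, threshold=0.5):
    """Keep ML candidates that both exceed the score threshold and satisfy the rules."""
    accepted = []
    for diagnosis, score in ml_candidate_diagnoses(patient):
        rule = CLINICAL_RULES.get(diagnosis, lambda p: True)
        if score >= threshold and rule(patient):
            accepted.append((diagnosis, score))
    return accepted

if __name__ == "__main__":
    patient = {"hba1c": 7.1, "fasting_glucose": 140, "hemoglobin": 13.5}
    print(infer(patient))  # [('type_2_diabetes', 0.72)]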


Hadoop Framework Implementation And Performance Analysis On A Cloud, Göksu Zekiye Özen, Mehmet Tekerek, Rayimbek Sultanov Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

The Hadoop framework uses the MapReduce programming paradigm to process big data by distributing data across a cluster and aggregating the results. MapReduce is one of the methods used to process big data hosted on large clusters. In this method, jobs are processed by dividing them into small pieces and distributing them over nodes. Parameters such as the distribution method over nodes, the number of jobs run in parallel, and the number of nodes in the cluster affect the execution time of jobs. The aim of this paper is to determine how the numbers of nodes, maps, and reduces affect the performance of …
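For readers unfamiliar with the model the abstract describes, here is a minimal word-count sketch (not the paper's benchmark jobs) expressed as a map phase and a reduce phase. With Hadoop Streaming the two functions would run as separate mapper and reducer scripts; here they are chained locally to show the flow.

#!/usr/bin/env python3
# Minimal MapReduce sketch: word count as a mapper plus a reducer.
import sys
from itertools import groupby
from operator import itemgetter

def mapper(lines):
    """Map phase: emit (word, 1) for every word in the input split."""
    for line in lines:
        for word in line.split():
            yield word, 1

def reducer(pairs):
    """Reduce phase: sum the counts per key (pairs must arrive sorted by key)."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Locally we simulate the shuffle/sort step Hadoop performs between phases.
    mapped = sorted(mapper(sys.stdin), key=itemgetter(0))
    for word, total in reducer(mapped):
        print(f"{word}\t{total}")

On a cluster, the number of reduce tasks (one of the parameters the paper varies) could be set, for example, through the mapreduce.job.reduces property when submitting the job, while the number of map tasks is driven largely by how the input is split across nodes.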


An Automated Approach For Digital Forensic Analysis Of Heterogeneous Big Data, Hussam Mohammed, Nathan Clarke, Fudong Li Jan 2016

Journal of Digital Forensics, Security and Law

The major challenges with big data examination and analysis are volume, complex interdependence across content, and heterogeneity. The examination and analysis phases are considered essential to a digital forensics process. However, traditional forensic investigation techniques use one or more forensic tools to examine and analyse each resource separately. In addition, when multiple resources are included in one case, there is no ability to cross-correlate findings, which often leads to inefficiencies in processing and identifying evidence. Furthermore, most current forensics tools cannot cope with large volumes of data. This paper develops a novel framework for digital forensic analysis of heterogeneous …


Understanding The Paradigm Shift To Computational Social Science In The Presence Of Big Data, Ray M. Chang, Robert J. Kauffman, Young Ok Kwon Jul 2014

Research Collection School Of Computing and Information Systems

The era of big data has created new opportunities for researchers to achieve high relevance and impact amid changes and transformations in how we study social science phenomena. With the emergence of new data collection technologies and advanced data mining and analytics support, fundamental changes seem to be occurring in the research questions we can ask and the research methods we can apply. The contexts include social networks and blogs, political discourse, corporate announcements, digital journalism, mobile telephony, home entertainment, online gaming, financial services, online shopping, social advertising, and social commerce. The changing costs of data collection and …


High Dimensional Data Set Analysis Using A Large-Scale Manifold Learning Approach, Loc Tran Jul 2014

Electrical & Computer Engineering Theses & Dissertations

Because of technological advances, data sets are trending toward larger size and higher dimensionality. Processing these large-scale data sets is challenging for conventional computers due to computational limitations. A framework for nonlinear dimensionality reduction on large databases is presented that alleviates the issue of large data sets through sampling, graph construction, manifold learning, and embedding. Neighborhood selection is a key step in this framework and a potential area of improvement. The standard approach to neighborhood selection is setting a fixed neighborhood: either a fixed number of neighbors or a fixed neighborhood size. Each of these has …
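To make the neighborhood-selection step concrete, the sketch below (not the dissertation's implementation) builds the neighborhood graph used before manifold learning in the two standard ways the abstract mentions: a fixed number of neighbors versus a fixed neighborhood size. It assumes scikit-learn is available; the sample size, k, and radius are illustrative.

# Hedged sketch: graph construction for manifold learning with fixed-k vs fixed-radius neighborhoods.
import numpy as np
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))  # stand-in for a sampled high-dimensional data set

# Fixed number of neighbors: each point is connected to its k nearest neighbors.
knn_graph = kneighbors_graph(X, n_neighbors=10, mode="distance")

# Fixed neighborhood size: each point is connected to all points within radius r.
radius_graph = radius_neighbors_graph(X, radius=6.0, mode="distance")

print(knn_graph.shape, knn_graph.nnz)       # each row links to its 10 nearest neighbors
print(radius_graph.shape, radius_graph.nnz)  # row counts vary with local density

The trade-off the abstract alludes to is visible here: the fixed-k graph guarantees every point is connected but can bridge across sparse regions, while the fixed-radius graph respects local scale but can leave isolated points in low-density areas.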