Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

1,483 Full-Text Articles 2,962 Authors 435,013 Downloads 189 Institutions

All Articles in Data Science

Crow Search Algorithm With Time Varying Flight Length Strategies For Feature Selection, Mohammed Abdullahi, Abdulhameed Adamu, Ibrahim Hayatu Hassan 2023 Ahmadu Bello University, Zaria, Nigeria

Future Computing and Informatics Journal

Feature Selection (FS) is an efficient technique used to remove irrelevant, redundant, and noisy attributes from high-dimensional datasets while increasing the efficacy of machine learning classification. The Crow Search Algorithm (CSA) is a simple and efficient metaheuristic algorithm that has been used to address several FS problems. The flight length (fl) parameter in CSA governs the crows' search ability. In CSA, fl is set to a fixed value. As a result, the CSA is prone to becoming trapped in local minima. This article proposes a remedy to this issue by introducing five new concepts of time-dependent fl …
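
To illustrate the role of fl described above, here is a minimal sketch of a CSA position update with a time-varying flight length. The abstract is truncated before it names the five strategies, so the linear decay in `linear_flight_length` is only a hypothetical stand-in, and the parameter values (`fl_max`, `fl_min`, `ap`, the search bounds) are assumptions. The sketch minimizes a continuous function; applying it to feature selection would additionally need a binarization step that is not shown here.

```python
import random


def linear_flight_length(t, max_iter, fl_max=2.5, fl_min=0.5):
    """Hypothetical schedule: fl decays linearly from fl_max to fl_min."""
    frac = t / max(1, max_iter - 1)
    return fl_max - (fl_max - fl_min) * frac


def csa_step(positions, memories, fl, ap, bounds, rng):
    """One CSA iteration: each crow follows a random crow's memory."""
    lo, hi = bounds
    n = len(positions)
    new_positions = []
    for i in range(n):
        j = rng.randrange(n)  # crow i picks a crow j to follow
        if rng.random() >= ap:
            # Crow j is unaware: move toward j's remembered best position.
            r = rng.random()
            cand = [x + r * fl * (m - x)
                    for x, m in zip(positions[i], memories[j])]
        else:
            # Crow j is aware: crow i is sent to a random position.
            cand = [rng.uniform(lo, hi) for _ in positions[i]]
        # Keep the old position if the candidate leaves the search box.
        if all(lo <= c <= hi for c in cand):
            new_positions.append(cand)
        else:
            new_positions.append(positions[i])
    return new_positions


def minimize(f, dim, n_crows=20, max_iter=200, ap=0.1,
             bounds=(-5.0, 5.0), seed=0):
    """Run CSA with the assumed time-varying fl; return the best memory."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_crows)]
    memories = [p[:] for p in positions]
    for t in range(max_iter):
        fl = linear_flight_length(t, max_iter)
        positions = csa_step(positions, memories, fl, ap, bounds, rng)
        for i, p in enumerate(positions):
            if f(p) < f(memories[i]):  # update memory only on improvement
                memories[i] = p[:]
    return min(memories, key=f)
```

A larger fl early on lets crows overshoot the followed memory (exploration), while a smaller fl later keeps moves local (exploitation); that trade-off is the motivation for replacing the fixed value.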


Fake News Detection Using Natural Language Processing, Fabiolla Mayrink Costa, Rael Guimaraes 2023 CCT College Dublin

ICT

Nowadays, with the advance of technology, we have vast access to any sort of information. We are able to use our phones and computers to access news from any part of the world. This keeps us informed about everything that is happening around the world. It is also a powerful tool used by companies when making strategic business decisions. The biggest issue is that technology can be, and is being, used to manipulate people and companies by propagating fake news. Fake news can distort people's perceptions while they form opinions on a given subject. It can also have a big impact on …


The Impact Of Big Data Utilization On Quality Improvement In Inpatient Facilities, Lakyn Hare 2023 Marshall University

Theses, Dissertations and Capstones

Introduction: Poor quality in healthcare has resulted in avoidable patient complications, including readmissions. Big data in healthcare can be analyzed and, with machine learning, built into tools that help reduce readmission rates and improve overall patient outcomes.

Purpose of the Study: The intention of this study was to evaluate the ways that big data can be analyzed to improve healthcare, specifically by reducing readmissions, improving patient outcomes, and showing cost savings. This study examined different ways that big data could be used in conjunction with machine learning, including predictive analysis, to make these improvements.

Methodology: The hypothesis was the …


The Shortfalls Of Vulnerability Indexes For Public Health Decision-Making In The Face Of Emergent Crises: The Case Of COVID-19 Vaccine Uptake In Virginia, Lydia Cleveland Sa, Erika Frydenlund 2023 Old Dominion University

VMASC Publications

Equitable and effective vaccine uptake is a key issue in addressing COVID-19. To achieve this, we must comprehensively characterize the context-specific socio-behavioral and structural determinants of vaccine uptake. However, to quickly focus public health interventions, state agencies and planners often rely on already existing indexes of "vulnerability." Many such "vulnerability indexes" exist and become benchmarks for targeting interventions in wide ranging scenarios, but they vary considerably in the factors and themes that they cover. Some are even uncritical of the use of the word "vulnerable," which should take on different meanings in different contexts. The objective of this study is …


Prevention Research Center For Healthy Neighborhoods, Madeline Panus 2023 John Carroll University

Celebration of Scholarship 2023

No abstract provided.


The Daily Patterns Of Emergency Medical Events, Mary E. Helander, Margaret K. Formica, Dessa K. Bergen-Cico 2023 Syracuse University

Social Science - All Scholarship

This study examines population-level daily patterns of time-stamped emergency medical service (EMS) dispatches to establish their situational predictability. Using visualization, sinusoidal regression, and statistical tests to compare empirical cumulative distributions, we analyzed 311,848,450 emergency medical call records from the U.S. National Emergency Medical Services Information System (NEMSIS) for years 2010 through 2022. The analysis revealed a robust daily pattern in the hourly distribution of distress calls across 33 major categories of medical emergency dispatch types. Sinusoidal regression coefficients for all types were statistically significant, mostly at the p < 0.0001 level. The coefficient of determination ($R^2$) ranged from 0.84 to 0.99 for all models, with most falling in the 0.94 to 0.99 range. The common sinusoidal pattern, peaking in mid-afternoon, demonstrates that all major categories of medical emergency dispatch types appear to be influenced by an underlying daily rhythm that is aligned with daylight hours and common sleep/wake cycles. A comparison of results with previous landmark studies revealed new and contrasting EMS patterns for several long-established peak occurrence hours, specifically for chest pain, heart problems, stroke, convulsions and seizures, and sudden cardiac arrest/death. Upon closer examination, we also found that heart attacks, diagnosed by paramedics in the field via 12-lead cardiac monitoring, followed the identified common daily pattern of a mid-afternoon peak, departing from prior generally accepted morning tendencies. Extended analysis revealed that the normative pattern prevailed across the NEMSIS data when reorganized to consider monthly, seasonal, daylight-saving vs. civil time, and pre-/post-COVID-19 periods. The predictable daily EMS patterns provide impetus for more research that links daily variation with causal risk and protective factors. Our methods are straightforward and presented in detail to provide accessible and replicable implementation for researchers and practitioners.
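
As a rough illustration of the sinusoidal regression mentioned above, the following sketch fits a single 24-hour sinusoid to hourly counts and reports the coefficient of determination and the peak hour. It is not the authors' code: the model form y = a + b·sin(ωt) + c·cos(ωt) with ω = 2π/24, the function name `fit_daily_sinusoid`, and the requirement of exactly 24 evenly spaced hourly values are all assumptions made for this example. With a full, evenly sampled period the three regressors are orthogonal, so the least-squares coefficients have closed forms.

```python
import math


def fit_daily_sinusoid(hourly_counts):
    """Least-squares fit of y = a + b*sin(w*t) + c*cos(w*t), w = 2*pi/24.

    Assumes one value per hour for t = 0..23, so the regressors are
    orthogonal and no matrix solve is needed.
    """
    n = len(hourly_counts)
    if n != 24:
        raise ValueError("expected exactly 24 hourly values")
    w = 2 * math.pi / 24
    a = sum(hourly_counts) / n
    b = 2 / n * sum(y * math.sin(w * t) for t, y in enumerate(hourly_counts))
    c = 2 / n * sum(y * math.cos(w * t) for t, y in enumerate(hourly_counts))

    fitted = [a + b * math.sin(w * t) + c * math.cos(w * t) for t in range(n)]
    ss_res = sum((y - f) ** 2 for y, f in zip(hourly_counts, fitted))
    ss_tot = sum((y - a) ** 2 for y in hourly_counts)
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # b*sin(wt) + c*cos(wt) = A*cos(wt - phi) with phi = atan2(b, c),
    # so the fitted curve peaks where w*t = phi.
    peak_hour = (math.atan2(b, c) / w) % 24
    return {"a": a, "b": b, "c": c, "r2": r2, "peak_hour": peak_hour}
```

For real dispatch data the counts would first be aggregated by hour of day; unevenly sampled or multi-day series would need a general least-squares solver instead of these closed forms.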


Predicting The Effects Of Climate Change On Irish Agriculture, Rodrigo Matsumoto, Sarah Kuprian Carrinho 2023 CCT College Dublin

ICT

The impact of climate change on agriculture is a growing concern worldwide, and Ireland is no exception. The purpose of this project is to use machine learning techniques to predict the effects of climate change on Irish agriculture and identify strategies for adaptation and mitigation. The project uses the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology to guide the data analysis process, MoSCoW prioritization to identify the most critical needs, and SWOT analysis to evaluate the strengths, weaknesses, opportunities, and threats our project may encounter. Historical temperature data for Ireland and Dublin will be used as our data sources. …


Variable Selection And Regression Analysis, Emil Agbemade 2023 University of Central Florida

Data Science and Data Mining

One of the most valuable crop species, maize, has been the subject of genetic study and experimentation for more than a century. However, species that share similarities and differences across a wide spectrum have developed astonishing adaptations as a result of small changes over time. Because it is usual practice to determine the genotypes of thousands of single nucleotide polymorphism (SNP) markers for thousands of patients, the data set we are dealing with has a small n, large p problem. As a result, there are noticeably more predictor variables than observations. The original data …


Predicting Heart Disease Using Tree-Based Model, Emil Agbemade 2023 University of Central Florida

Data Science and Data Mining

The paper presents a study on the use of machine learning algorithms for the prediction of heart disease, which is the leading cause of death worldwide. The study focuses on the use of decision tree algorithms, which have the advantage of considering a large number of risk factors. The heart disease data set was obtained from the UCI Machine Learning Repository and was analyzed using a decision tree classifier. The data set had 6 missing data points, which were deleted, leaving 279 instances for analysis. One-hot encoding was performed on categorical variables with more than two levels. The decision tree classifier …


Classification Of Adult Income Using Decision Tree, Roland Fiagbe 2023 University of Central Florida

Data Science and Data Mining

The decision tree is a commonly used data mining methodology for performing classification tasks. It is a tree-based supervised machine learning algorithm that classifies or makes predictions by following a path determined by how previous questions were answered. Generally, the decision tree algorithm partitions data into branch-like segments that develop into a tree containing a root, nodes, and leaves. This project seeks to explore the decision tree methodology and apply it to the Adult Income dataset from the UCI Machine Learning Repository, to determine whether a person makes over 50K per year and determine the necessary factors that improve …
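
The branch-like partitioning described above can be sketched as a search for the single best threshold on one numeric feature. This is only an illustration: the abstract does not say which split criterion the project uses, so Gini impurity is an assumption here, and the helper names `gini` and `best_threshold` are made up for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form
    value <= threshold, or (None, parent_gini) if no split helps."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best = (None, gini(labels))
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between equal feature values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        # Impurity of the two children, weighted by their sizes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

A full tree repeats this search over every feature at each node and recurses on the two resulting subsets until a stopping rule is met; categorical features would need their own splitting rule.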


The Basil Technique: Bias Adaptive Statistical Inference Learning Agents For Learning From Human Feedback, Jonathan Indigo Watson 2023 University of Kentucky

Theses and Dissertations--Computer Science

We introduce a novel approach for learning behaviors using human-provided feedback that is subject to systematic bias. Our method, known as BASIL, models the feedback signal as a combination of a heuristic evaluation of an action's utility and a probabilistically-drawn bias value, characterized by unknown parameters. We present both the general framework for our technique and specific algorithms for biases drawn from a normal distribution. We evaluate our approach across various environments and tasks, comparing it to interactive and non-interactive machine learning methods, including deep learning techniques, using human trainers and a synthetic oracle with feedback distorted to varying degrees. …


An Investigation Of Methods For Improving Spatial Invariance Of Convolutional Neural Networks For Image Classification, David Noel 2023 Nova Southeastern University

CCE Theses and Dissertations

Convolutional Neural Networks (CNNs) have achieved impressive results on complex visual tasks such as image recognition. They are commonly assumed to be spatially invariant to small transformations of their input images. Spatial invariance is a fundamental property that characterizes how a model reacts to input transformations, i.e., its generalizability - and deep networks that can robustly classify objects placed in different orientations or lighting conditions have the property of invariance. However, several authors have recently shown that this is not the case, and that slight rotations, translations, or rescaling of their input images significantly reduce the network’s predictive accuracy. Furthermore, …


A Multistage Framework For Detection Of Very Small Objects, Duleep Rathgamage Don, Ramazan Aygun, Mahmut Karakaya 2023 Kennesaw State University

Published and Grey Literature from PhD Candidates

Small object detection is one of the most challenging problems in computer vision. Algorithms based on state-of-the-art object detection methods such as R-CNN, SSD, FPN, and YOLO fail to detect objects of very small sizes. In this study, we propose a novel method to detect very small objects, smaller than 8×8 pixels, that appear in a complex background. The proposed method is a multistage framework consisting of an unsupervised algorithm and three separately trained supervised algorithms. The unsupervised algorithm extracts ROIs from a high-resolution image. Then the ROIs are upsampled using SRGAN, and the enhanced ROIs are detected by our …


Face Anti-Spoofing And Deep Learning Based Unsupervised Image Recognition Systems, Enoch Solomon 2023 Virginia Commonwealth University

Theses and Dissertations

One of the main problems of a supervised deep learning approach is that it requires large amounts of labeled training data, which are not always easily available. This PhD dissertation addresses this problem with a novel unsupervised deep learning face verification system called UFace, which does not require labeled training data because it automatically generates training data, in an unsupervised way, from even a relatively small amount of data. The method starts by selecting, in an unsupervised way, the k most similar and k most dissimilar images for a given face image. Moreover, this PhD dissertation proposes a new loss function to …


Transfer Learning Using Infrared And Optical Full Motion Video Data For Gender Classification, Alexander M. Glandon, Joe Zalameda, Khan M. Iftekharuddin, Gabor F. Fulop (Ed.), David Z. Ting (Ed.), Lucy L. Zheng (Ed.) 2023 Old Dominion University

Electrical & Computer Engineering Faculty Publications

This work is a review and extension of our ongoing research in human recognition analysis using multimodality motion sensor data. We review our work on hand-crafted feature engineering for motion capture skeleton (MoCap) data from the Air Force Research Lab for human gender, followed by depth-scan-based skeleton extraction using LIDAR data from the Army Night Vision Lab for person identification. We then build on these works to demonstrate a transfer learning sensor fusion approach that uses the larger MoCap and smaller LIDAR data for gender classification.


Multimodal Neuron Classification Based On Morphology And Electrophysiology, Aqib Ahmad 2023 West Virginia University

Graduate Theses, Dissertations, and Problem Reports

Categorizing neurons into different types to understand neural circuits and ultimately brain function is a major challenge in neuroscience. While electrical properties are critical in defining a neuron, its morphology is equally important. Advancements in single-cell analysis methods have allowed neuroscientists to simultaneously capture multiple data modalities from a neuron. We propose a method to classify neurons using both morphological structure and electrophysiology. Current approaches are based on a limited analysis of morphological features. We propose to use a new graph neural network to learn representations that more comprehensively account for the complexity of the shape of neuronal structures. In …


Feature Extraction Of Footwear Impression Images For Quality Assessment, Alexandra Hill 2023 West Virginia University

Graduate Theses, Dissertations, and Problem Reports

Forensic footwear impression analysis is a valuable tool in criminal investigations. Extracting useful features from images of footwear impressions is a critical step in this process. However, the quality of these images can vary widely, making feature extraction challenging. In order to give a quality assessment rating to a footwear impression image, the image should first be analyzed to extract features from the impression. In this paper, we present a method to extract features from a 2D grayscale footwear impression image. A Hierarchical Grid Model implementation has been adapted from use on a 3D dataset to assist in finding features, …


Machine Learning Prediction Of DoD Personal Property Shipment Costs, Tiffany Tucker [*], Torrey J. Wagner, Paul Auclair, Brent T. Langhals 2023 Air Force Institute of Technology

Faculty Publications

U.S. Department of Defense (DoD) personal property moves account for 15% of all domestic and international moves; accurate prediction of their cost could draw attention to outlier shipments and improve budget planning. In this work, 136,140 shipments between 13 personal property shipment hubs from April 2022 through March 2023, with a total cost of $1.6B, were analyzed. Shipment cost was predicted using recursive feature elimination on linear regression and XGBoost algorithms, as well as through neural network hyperparameter sweeps. Modeling was repeated after removing 28 features related to shipment hub location and branch of service to examine their influence …


A Linear Regression Model To Predict The Critical Temperature Of A Superconductor, Amir Alipour Yengejeh 2023 University of Central Florida

Data Science and Data Mining

Since superconductivity was first introduced, almost all studies in this area have striven to predict the critical temperature ($T_{c}$) from features extracted from the superconductor's chemical formula. In this study, we therefore explore the linear association between $T_{c}$ and the related features.


Biodiversity Of Philippine Marine Fishes: A DNA Barcode Reference Library Based On Voucher Specimens, Katherine E. Bemis, Matthew G. Girard, Mudjekeewis D. Santos, Kent E. Carpenter, Jonathan R. Deeds, Diane E. Pitassy, Nicko Amor L. Flores, Elizabeth S. Hunter, Amy C. Driskell, Kenneth S. Macdonald III, Lee A. Weigt, Jeffrey T. Williams 2023 National Museum of Natural History, Smithsonian

Biological Sciences Faculty Publications

Accurate identification of fishes is essential for understanding their biology and for ensuring food safety for consumers. DNA barcoding is an important tool because it can verify identifications of both whole and processed fishes that have had key morphological characters removed (e.g., filets, fish meal); however, DNA reference libraries are incomplete, and public repositories for sequence data contain incorrectly identified sequences. During a nine-year sampling program in the Philippines, a global biodiversity hotspot for marine fishes, we developed a verified reference library of cytochrome c oxidase subunit I (COI) sequences for 2,525 specimens representing 984 species. Specimens were primarily purchased …

