Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

1,010 Full-Text Articles 2,169 Authors 177,939 Downloads 153 Institutions

All Articles in Data Science

1,010 full-text articles. Page 1 of 51.

Named Entity Recognition From Biomedical Text, Maged Guirguis 2023 American University in Cairo

Theses and Dissertations

As vast amounts of unstructured data become available digitally, computer-based methods to extract relevant and meaningful information are needed. Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories. Despite the existence of numerous well-established NER methods, the biomedical domain remains under-studied. The objective of this research is to identify an efficient technique for NER tasks on biomedical data. This is achieved by investigating deep learning technologies, namely the pre-trained BERT [1] model and its variants SciBERT [2] and BioBERT [3]. Preprocessing the data before passing …
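To make the span-identification task concrete, the following is a minimal sketch of how per-token labels can be turned into entity spans. It assumes the common BIO tagging scheme, which the abstract does not confirm the thesis uses; the function name `bio_to_spans` and the example labels are hypothetical and are not taken from the thesis.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from BIO tags.

    Assumption: tags look like "B-Type", "I-Type", or "O".
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any span that is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open span.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush a span that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans
```

In a pipeline of the kind the abstract describes, the tags would come from a token-classification model; here they are supplied by hand only to illustrate the decoding step.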


2D Respiratory Sound Analysis To Detect Lung Abnormalities, Rafia Sharmin Alice, KC Santosh 2023 University of South Dakota

SDSU Data Science Symposium

Abstract. In this paper, we analyze deep visual features from 2D data representation(s) of the respiratory sound to detect evidence of lung abnormalities. The primary motivation behind this is that visual cues are more important in decision-making than raw data (lung sound). Early detection and prompt treatment are essential for any possible future respiratory disorders, and respiratory sound is proven to be one of the biomarkers. In contrast to state-of-the-art approaches, we aim at understanding and analyzing visual features using our tailored Convolutional Neural Network (CNN) deep learning models, where we consider all possible 2D data such as Spectrogram, Mel-frequency Cepstral Coefficients …


A Bidirectional Deep LSTM Machine Learning Method For Flight Delay Modelling And Analysis, Desmond B. Bisandu, Irene Moulitsas 2023 Cranfield University

National Training Aircraft Symposium (NTAS)

Flight delays can be prevented by providing a reference point from an accurate prediction model because predicting flight delays is a problem with a specific space. Only a few algorithms consider the mutual correlation of predicted classes during flight delay classification or prediction modelling tasks. None of these existing methods works for all scenarios. Therefore, the need to investigate the performance of more models in solving the problem of flight delay is vast and rapidly increasing. This paper presents the development and evaluation of LSTM and BiLSTM models by comparing them for flight delay prediction. The LSTM does the feature extraction …


Integrated Organizational Machine Learning For Aviation Flight Data, Michael J. Pritchard, Paul Thomas, Eric Webb, Jon Martin, Austin Walden 2023 Kansas State University

National Training Aircraft Symposium (NTAS)

An increased availability of data and computing power has allowed organizations to apply machine learning techniques to various fleet monitoring activities. Additionally, our ability to acquire aircraft data has increased due to the miniaturization of small form factor computing machines. Aircraft data collection processes contain many data features in the form of multivariate time-series (continuous, discrete, categorical, etc.) which can be used to train machine learning models. Yet, three major challenges still face many flight organizations: 1) integration and automation of data collection frameworks, 2) data cleanup and preparation, and 3) embedded machine learning frameworks. Data cleanup and preparation has …


Visual Analytics And Modeling Of Materials Property Data, Diwas Bhattarai 2023 Louisiana State University and Agricultural and Mechanical College

LSU Doctoral Dissertations

Due to significant advancements in experimental and computational techniques, materials data are abundant. Facilitating data-driven research calls for a system that manages and shares data and supports a set of tools for effective data analysis and modeling. Generally, a given material property M can be treated as a multivariate data problem. The dimensions of M are the values of the property itself, the conditions (pressure P, temperature T, and multi-component composition X) that control the property concerned, and relevant metadata I (source, date).

Here we present a comprehensive database considering both experimental and computational sources …


Crow Search Algorithm With Time Varying Flight Length Strategies For Feature Selection, Mohammed Abdullahi, Abdulhameed Adamu, Ibrahim Hayatu Hassan 2023 Ahmadu Bello University, Zaria, Nigeria

Future Computing and Informatics Journal

Feature Selection (FS) is an efficient technique used to get rid of irrelevant, redundant, and noisy attributes in high-dimensional datasets while increasing the efficacy of machine learning classification. The Crow Search Algorithm (CSA) is a simple and efficient metaheuristic algorithm that has been used to overcome several FS issues. The flight length (fl) parameter in CSA governs crows' search ability. In CSA, fl is set to a fixed value. As a result, the CSA is prone to becoming trapped in local minima. This article suggests a remedy to this issue by introducing five new concepts of time-dependent fl …
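As an illustration of how a time-dependent fl could enter the CSA position update, here is a minimal sketch. The abstract is truncated before it names the five strategies, so the linear schedule below is an assumption and not one of the article's proposals; `linear_flight_length`, `crow_step`, and all default values are hypothetical. The sketch also works on continuous positions, whereas feature selection would additionally need a binarization step that is not shown.

```python
import random
from typing import List


def linear_flight_length(t: int, max_iter: int,
                         fl_max: float = 2.5, fl_min: float = 0.5) -> float:
    """Hypothetical schedule: fl shrinks linearly from fl_max to fl_min."""
    return fl_max - (fl_max - fl_min) * t / max_iter


def crow_step(x: List[float], memory_j: List[float], fl: float,
              awareness_prob: float, lower: float, upper: float,
              rng: random.Random) -> List[float]:
    """One standard CSA move for crow i, which follows a chosen crow j.

    If crow j does not notice it is being followed, crow i moves toward
    crow j's remembered best position, scaled by fl; otherwise crow i is
    sent to a random position inside the bounds.
    """
    if rng.random() >= awareness_prob:
        r = rng.random()
        new = [xi + r * fl * (mj - xi) for xi, mj in zip(x, memory_j)]
    else:
        new = [rng.uniform(lower, upper) for _ in x]
    # Keep the crow inside the search bounds.
    return [min(max(v, lower), upper) for v in new]
```

A large fl early in the run lets a crow overshoot the followed position (global search), while a small fl late in the run keeps it nearby (local search); that trade-off is the motivation for varying fl over time instead of fixing it.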


Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu 2023 Claremont Colleges

CMC Senior Theses

This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, the results of this paper find that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …


Defining The "Quadruple-A" Player: What Makes A Baseball Player Succeed In The Minor Leagues And Fail In The Major Leagues?, Sam Bogen 2023 Claremont Colleges

CMC Senior Theses

The "Quadruple-A" player is defined as one who is too good to play in Triple-A (the league one step down from Major League Baseball) but not good enough to play consistently in Major League Baseball. This thesis paper attempts to explain the phenomenon of the "Quadruple-A" player. Using Triple-A data from 2013-2022 and Major League data from the "Statcast Era" (2015-2022), I build logistic and linear regression models to predict Major League success based on Triple-A performance data as well as Major League Statcast data, discovering that statistics related to how a player hits the ball such as the speed …


Integrated Machine Learning And Optimization Approaches, Dogacan Yilmaz 2022 New Jersey Institute of Technology

Dissertations

This dissertation focuses on the integration of machine learning and optimization. Specifically, novel machine learning-based frameworks are proposed to help solve a broad range of well-known operations research problems to reduce the solution times. The first study presents a bidirectional Long Short-Term Memory framework to learn optimal solutions to sequential decision-making problems. Computational results show that the framework significantly reduces the solution time of benchmark capacitated lot-sizing problems without much loss in feasibility and optimality. Also, models trained using shorter planning horizons can successfully predict the optimal solution of the instances with longer planning horizons. For the hardest data set, …


Digital Technology Enables Construction Of National Governance Modernization, Yue HAO, Kaihua CHEN, Jin KANG, Xiaoguang YANG, Chao ZHANG, Xiaolong ZHENG 2022 Xidian University, Xi'an 710126, China

Bulletin of Chinese Academy of Sciences (Chinese Version)

As digital technologies continue to be integrated into the whole process of economic and social development, promoting the modernization of digital technology-enabled national governance systems and capabilities has become an important way to seize the strategic initiative in the future world competitive landscape, and has attracted the attention of countries around the world. The rapid development of digital technologies such as big data collection, storage, processing, and analysis is constantly optimizing the organizational system structure of national governance, upgrading and perfecting the quality and methods of national governance personnel, and accelerating the process of making national governance efficient, scientific, intelligent …


Big Data Technology Enabling Legal Supervision, Qingjie LIU, Shuo LIU, Yirong WU, Yueqiang WENG, Yihao WEN, Ming LI 2022 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

Bulletin of Chinese Academy of Sciences (Chinese Version)

Legal supervision plays an important role in the national governance system and capacity. In the era of digital revolution, the rapid development of digital procuratorial work, with big data legal supervision at its core, promotes the reshaping of the legal supervision and governance system. In this study, the inherent need of legal supervision for active prosecution in the new era, and the innovative role of new public interest litigation in comprehensive social governance, are first analyzed. Then, the core meaning and reshaping role of big-data-enabling-legal-supervision and supervision-promoting-national-governance of digital prosecution are discussed. After summarizing the practical experiences and challenges of big …


Deepening Digital Technologies To Enable Modernization Of China’s Governance Of Health, Tara Qia SUN, Xia FENG, Yuntao LONG, Zongben XU 2022 School of Public Policy and Management, University of Chinese Academy of Sciences, Beijing 100049, China

Bulletin of Chinese Academy of Sciences (Chinese Version)

One significant goal of science and technology innovation is to set our sights on the health and safety of the people. The rapid development of digital technologies offers multiple possibilities and paths for achieving the modernization of China's health governance. First, we discuss the role of digital technologies in enabling multiple stakeholders (i.e., hospitals, doctors, government, and social groups) to improve the supply capacity, inclusiveness, fairness, friendliness, and convenience of health services. Second, we explore the four key issues of using digital technologies to enable the governance of health: construction of digital health infrastructures, the factors affecting the adoption of digital technologies, …


Strengthen Fundamental Role Of Data Element Governance In National Governance Modernization, Kaihua CHEN, Zhuo FENG, Rui GUO, Yue HAO, Jin KANG, Xiaoguang YANG, Chao ZHANG, Binbin ZHAO 2022 School of Public Policy and Management, University of Chinese Academy of Sciences, Beijing 100049, China; Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190, China

Bulletin of Chinese Academy of Sciences (Chinese Version)

Data element governance is a key factor to promote the modernization of national governance in the digital era. By strengthening the deep integration of data factors and national governance, a new model of data-driven national governance can be formed, and the national governance can be made more scientific, refined, intelligent, and efficient. The US and European countries have continuously strengthened the top-level system design, technological innovation application, collaborative governance mechanism, and global governance cooperation of data element governance, which has effectively improved the level of data element governance and provided experience for China. Nevertheless, due to the virtuality of data …


Strategic Perspective Of Leveraging New Generation Information Technology To Enable Modernization Of Emergency Management, Haibo ZHANG, Xinyu DAI, Depei QIAN, Jian LYU 2022 School of Government, Nanjing University, Nanjing 210023, China

Bulletin of Chinese Academy of Sciences (Chinese Version)

The application and development of new-generation information technology is a vital support for realizing the modernization of emergency management. At present, new-generation information technologies such as big data and artificial intelligence have been widely used in natural disasters, safe production, and other fields. They have improved the monitoring and early warning, regulation and law enforcement, command and decision support, rescue, and social mobilization capabilities of governments, raised the level of intrinsic safety of enterprises, provided important support for the precise prevention and control of COVID-19, and increased the efficiency of China’s emergency management and sense …


Digital Technology Enables Modernization Of National Statistics, Zongben XU, Yanyun ZHAO, Liping ZHU, Guang CHEN, Hongyun ZHANG 2022 School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, China

Bulletin of Chinese Academy of Sciences (Chinese Version)

The modernization of national statistics is part of the modernization of national governance. Digital technology has provided power for the transformation of statistical production mode, the improvement of statistical productivity, and the reconstruction of statistical production relations. Digital technology has become an important prerequisite for the promotion of statistical modernization reform. This study summarizes the international experience of digital technology enabling government statistics, the top-level design of national statistical legal system, and the importance of digital technology in promoting the modernization of statistics. This study also analyzes the main challenges existing in the current national statistics and data work. Finally, …


Fairness And Privacy In Machine Learning Algorithms, Neha Bhargava 2022 Kennesaw State University

Master of Science in Computer Science Theses

Roughly 2.5 quintillion bytes of data are generated daily in this digital era. Manual processing of such huge amounts of data to extract useful information is nearly impossible, but machine learning algorithms, with their ability to process enormous amounts of data in a fast, cost-effective, and scalable way, have proven to be a preferred choice for gleaning useful insights and solving business problems in many domains. With this widespread use of machine learning algorithms, there have always been concerns about the ethical issues that may arise from the use of this modern technology. While achieving high accuracies, …


Discourse, Power Dynamics, And Risk Amplification In Disaster Risk Management In Canada, Martins Oluwole Olu-Omotayo 2022 The University of Western Ontario

Electronic Thesis and Dissertation Repository

The domain of disaster risk management is rife with discursive contentions, whereby dominant discourses amplify the powers of risk actors to precipitate and reinforce political, economic, and environmental inequalities that predispose different sections of the population to unequal disaster risk vulnerabilities. This thesis identified important actors (government, risk experts, media, and NGOs) that shape the power dynamics in disaster risk management in Canada and explained their roles, influences, and the dimensions in which their powers negotiate each other through risk discourses. The patterns of these power dynamics in the three aspects of power (communication, assessment, and social trust) were also …


Spatial Validation Of Agent-Based Models, Kristoffer Wikstrom, Hal T. Nelson 2022 Claremont Graduate University

Public Administration Faculty Publications and Presentations

This paper adapts an existing techno–social agent-based model (ABM) in order to develop a new framework for spatially validating ABMs. The ABM simulates citizen opposition to locally unwanted land uses, using historical data from an energy infrastructure siting process in Southern California. Spatial theory, as well as the model’s design, suggests that adequate validation requires multiple tests rather than reliance on a single test statistic. A pattern-oriented modeling approach was employed that first mapped real and simulated citizen comments across US Census tracts. The suite of spatial tests included Global Moran’s I, complemented with bivariate correlations, as well as …
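For readers unfamiliar with the first statistic named above, the following is a minimal sketch of the textbook Global Moran's I, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The function name `global_morans_i`, the plain nested-list weight matrix, and the toy values are assumptions made for illustration; the paper's own weighting scheme and software are not described in the abstract.

```python
from typing import List, Sequence


def global_morans_i(values: Sequence[float],
                    weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for one variable observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (assumed here to be 1 for neighbors and 0 otherwise, with a zero
    diagonal). The values must not all be equal, since the variance
    term in the denominator would then be zero.
    """
    n = len(values)
    mean = sum(values) / n
    dev: List[float] = [v - mean for v in values]

    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of locations.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_term = sum(d * d for d in dev)
    return (n / w_sum) * (cross / variance_term)
```

Values near +1 indicate that similar values cluster in space, values near -1 indicate a checkerboard-like alternation, and values near the expectation of -1/(n-1) suggest no spatial autocorrelation. Comparing the statistic for real and simulated comment counts is one way such a test could be used, though the abstract does not spell out the procedure.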


Investigating Applications Of Deep Learning For Diagnosis Of Post Traumatic Elbow Disease, Hugh James 2022 Washington University in St. Louis

McKelvey School of Engineering Theses & Dissertations

Traumatic events such as dislocation, breaks, and arthritis of musculoskeletal joints can cause the development of post-traumatic joint contracture (PTJC). Clinically, noninvasive techniques such as Magnetic Resonance Imaging (MRI) scans are used to analyze the disease. Such procedures require a patient to remain still for long periods of time and can be expensive as well. Additionally, years of practice and experience are required for clinicians to accurately recognize the diseased anterior capsule region and make an accurate diagnosis. Manual tracing of the anterior capsule is done to help with diagnosis but is subjective and time-consuming. As a result, there is …


Examining The Relationship Between Stomiiform Fish Morphology And Their Ecological Traits, Mikayla L. Twiss 2022 Nova Southeastern University

All HCAS Student Capstones, Theses, and Dissertations

Trait-based ecology characterizes individuals’ functional attributes to better understand and predict their interactions with other species and their environments. Utilizing morphological traits to describe functional groups has helped group species with similar ecological niches that are not necessarily taxonomically related. Within the deep-pelagic fishes, the Order Stomiiformes exhibits high morphological and species diversity, and many species undertake diel vertical migration (DVM). While the morphology and behavior of stomiiform fishes have been extensively studied and described through taxonomic assessments, the connection between their form and function regarding their DVM types, morphotypes, and daytime depth distributions is not well known. Here, three …

