Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons


1,010 Full-Text Articles, 2,169 Authors, 177,939 Downloads, 153 Institutions

All Articles in Data Science



Named Entity Recognition From Biomedical Text, Maged Guirguis 2023 American University in Cairo


Theses and Dissertations

As vast amounts of unstructured data become available digitally, computer-based methods to extract relevant and meaningful information are needed. Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories. Despite the existence of numerous well-established NER methods, the biomedical domain remains under-studied. The objective of this research is to identify an efficient technique for NER tasks on biomedical data. This is achieved by investigating deep learning technologies, namely the pre-trained BERT [1] model and its variants SciBERT [2] and BioBERT [3]. Preprocessing the data before passing …
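
To make the task definition in the abstract above concrete, the sketch below turns per-token tags into entity spans with categories, which is one common way the output of a token classifier such as a BERT variant is post-processed. The BIO tagging scheme, the function name `bio_to_spans`, and the example labels are illustrative assumptions, not details taken from the thesis.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, category) pairs.

    Assumed tag format: "B-<label>" opens an entity, "I-<label>" continues
    an entity with the same label, and "O" marks a token outside any entity.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the entity being built, if there is one.
        nonlocal current_tokens, current_label
        if current_tokens and current_label is not None:
            spans.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags, and "I-" tags that do not continue the open entity,
            # end the current span and are themselves treated as outside.
            flush()
    flush()
    return spans


# Hypothetical example sentence and labels.
example_tokens = ["Aspirin", "reduces", "heart", "attack", "risk"]
example_tags = ["B-Chemical", "O", "B-Disease", "I-Disease", "O"]
example_spans = bio_to_spans(example_tokens, example_tags)
```

Dropping stray `I-` tags is a design choice made here for simplicity; other decoders promote them to the start of a new entity.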


Crow Search Algorithm With Time Varying Flight Length Strategies For Feature Selection, Mohammed Abdullahi, Abdulhameed Adamu, Ibrahim Hayatu Hassan 2023 Ahmadu Bello University, Zaria, Nigeria


Future Computing and Informatics Journal

Feature selection (FS) is an efficient technique used to remove irrelevant, redundant, and noisy attributes from high-dimensional datasets while increasing the efficacy of machine learning classification. The Crow Search Algorithm (CSA) is a simple and efficient metaheuristic algorithm that has been used to address several FS problems. The flight length (fl) parameter in CSA governs the crows' search ability. In CSA, fl is set to a fixed value. As a result, the CSA is prone to becoming trapped in local minima. This article proposes a remedy to this issue by introducing five new concepts of time dependent fl …
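
The idea of replacing a fixed flight length with a time-dependent one can be sketched as follows. This is a minimal continuous-valued crow search on a toy objective, not the article's feature-selection variant. The linear decay in `linear_flight_length`, its bounds, and all parameter defaults are assumptions chosen for illustration and are not necessarily among the five strategies the article proposes.

```python
import random
from typing import Callable, List, Tuple


def linear_flight_length(t: int, max_iter: int,
                         fl_max: float = 2.5, fl_min: float = 0.5) -> float:
    """Assumed schedule: fl shrinks linearly from fl_max to fl_min."""
    return fl_max - (fl_max - fl_min) * t / max(1, max_iter - 1)


def crow_search(objective: Callable[[List[float]], float], dim: int,
                n_crows: int = 20, max_iter: int = 100,
                awareness_prob: float = 0.1,
                lower: float = -5.0, upper: float = 5.0,
                seed: int = 0) -> Tuple[List[float], float]:
    """Minimise `objective` with a crow search using a time-varying fl."""
    rng = random.Random(seed)
    positions = [[rng.uniform(lower, upper) for _ in range(dim)]
                 for _ in range(n_crows)]
    memory = [p[:] for p in positions]            # best position each crow remembers
    memory_fit = [objective(p) for p in positions]

    for t in range(max_iter):
        fl = linear_flight_length(t, max_iter)    # large steps early, small steps late
        for i in range(n_crows):
            j = rng.randrange(n_crows)            # crow i follows a randomly chosen crow j
            if rng.random() >= awareness_prob:
                # Crow j is unaware: move towards its remembered position.
                r = rng.random()
                candidate = [x + r * fl * (m - x)
                             for x, m in zip(positions[i], memory[j])]
            else:
                # Crow j is aware: crow i is sent to a random position.
                candidate = [rng.uniform(lower, upper) for _ in range(dim)]
            # Only feasible moves are accepted.
            if all(lower <= c <= upper for c in candidate):
                positions[i] = candidate
                fit = objective(candidate)
                if fit < memory_fit[i]:
                    memory[i], memory_fit[i] = candidate[:], fit

    best = min(range(n_crows), key=memory_fit.__getitem__)
    return memory[best], memory_fit[best]


def sphere(x: List[float]) -> float:
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_position, best_value = crow_search(sphere, dim=2, max_iter=50)
```

A larger fl early on favours exploration, while a smaller fl later favours local refinement around remembered positions, which is the intuition behind letting fl vary with time.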


Digital Technology Enables Construction Of National Governance Modernization, Yue HAO, Kaihua CHEN, Jin KANG, Xiaoguang YANG, Chao ZHANG, Xiaolong ZHENG 2022 Xidian University, Xi'an 710126, China


Bulletin of Chinese Academy of Sciences (Chinese Version)

As digital technologies continue to be integrated into the whole process of economic and social development, promoting the modernization of digital technology-enabled national governance systems and capabilities has become an important way to seize the strategic initiative in the future world competitive landscape, and has attracted the attention of countries around the world. The rapid development of digital technologies such as big data collection, storage, processing, and analysis is constantly optimizing the organizational system structure of national governance, upgrading and perfecting the quality and methods of national governance personnel, and accelerating the process of making national governance efficient, scientific, intelligent …


Big Data Technology Enabling Legal Supervision, Qingjie LIU, Shuo LIU, Yirong WU, Yueqiang WENG, Yihao WEN, Ming LI 2022 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China


Bulletin of Chinese Academy of Sciences (Chinese Version)

Legal supervision plays an important role in the national governance system and capacity. In the era of digital revolution, the rapid development of digital procuratorial work, with big data legal supervision at its core, is reshaping the legal supervision and governance system. In this study, the inherent need of legal supervision for active prosecution in the new era, and the innovative role of new public interest litigation in comprehensive social governance, are first analyzed. Then, the core meaning and reshaping role of big-data-enabling-legal-supervision and supervision-promoting-national-governance of digital prosecution are discussed. After summarizing the practical experiences and challenges of big …


Deepening Digital Technologies To Enable Modernization Of China’s Governance Of Health, Tara Qia SUN, Xia FENG, Yuntao LONG, Zongben XU 2022 School of Public Policy and Management, University of Chinese Academy of Sciences, Beijing 100049, China


Bulletin of Chinese Academy of Sciences (Chinese Version)

One significant goal of science and technology innovation is to set our sights on the health and safety of the people. The rapid development of digital technologies provides multiple possibilities and paths to achieve the modernization of China's health governance. First, we examine the role of digital technologies in enabling multiple stakeholders (i.e., hospitals, doctors, government, and social groups) to improve the supply capacity, inclusiveness, fairness, friendliness, and convenience of health services. Second, we explore the four key issues of using digital technologies to enable the governance of health: construction of digital health infrastructures, the factors affecting the adoption of digital technologies, …


Strengthen Fundamental Role Of Data Element Governance In National Governance Modernization, Kaihua CHEN, Zhuo FENG, Rui GUO, Yue HAO, Jin KANG, Xiaoguang YANG, Chao ZHANG, Binbin ZHAO 2022 School of Public Policy and Management, University of Chinese Academy of Sciences, Beijing 100049, China; Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190, China


Bulletin of Chinese Academy of Sciences (Chinese Version)

Data element governance is a key factor to promote the modernization of national governance in the digital era. By strengthening the deep integration of data factors and national governance, a new model of data-driven national governance can be formed, and the national governance can be made more scientific, refined, intelligent, and efficient. The US and European countries have continuously strengthened the top-level system design, technological innovation application, collaborative governance mechanism, and global governance cooperation of data element governance, which has effectively improved the level of data element governance and provided experience for China. Nevertheless, due to the virtuality of data …


Strategic Perspective Of Leveraging New Generation Information Technology To Enable Modernization Of Emergency Management, Haibo ZHANG, Xinyu DAI, Depei QIAN, Jian LYU 2022 School of Government, Nanjing University, Nanjing 210023, China


Bulletin of Chinese Academy of Sciences (Chinese Version)

The application and development of new-generation information technology is vital to realizing the modernization of emergency management. At present, new-generation information technologies such as big data and artificial intelligence have been widely used in natural disasters, safe production, and other fields. They have improved the monitoring and early warning, regulation and law enforcement, command and decision support, rescue, and social mobilization capabilities of governments, promoted the level of intrinsic safety of enterprises, provided important support for the precise prevention and control of COVID-19, and increased the efficiency of China’s emergency management and sense …


Digital Technology Enables Modernization Of National Statistics, Zongben XU, Yanyun ZHAO, Liping ZHU, Guang CHEN, Hongyun ZHANG 2022 School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, China


Bulletin of Chinese Academy of Sciences (Chinese Version)

The modernization of national statistics is part of the modernization of national governance. Digital technology has provided power for the transformation of statistical production mode, the improvement of statistical productivity, and the reconstruction of statistical production relations. Digital technology has become an important prerequisite for the promotion of statistical modernization reform. This study summarizes the international experience of digital technology enabling government statistics, the top-level design of national statistical legal system, and the importance of digital technology in promoting the modernization of statistics. This study also analyzes the main challenges existing in the current national statistics and data work. Finally, …


Fairness And Privacy In Machine Learning Algorithms, Neha Bhargava 2022 Kennesaw State University


Master of Science in Computer Science Theses

Roughly 2.5 quintillion bytes of data are generated daily in this digital era. Manual processing of such huge amounts of data to extract useful information is nearly impossible, but machine learning algorithms, with their ability to process enormous amounts of data in a fast, cost-effective, and scalable way, have proven to be a preferred choice to glean useful insights and solve business problems in many domains. With this widespread use of machine learning algorithms, there have always been concerns about the ethical issues that may arise from the use of this modern technology. While achieving high accuracies, …


Discourse, Power Dynamics, And Risk Amplification In Disaster Risk Management In Canada, Martins Oluwole Olu-Omotayo 2022 The University of Western Ontario


Electronic Thesis and Dissertation Repository

The domain of disaster risk management is rife with discursive contentions, whereby dominant discourses amplify the powers of risk actors to precipitate and reinforce political, economic, and environmental inequalities that predispose different sections of the population to unequal disaster risk vulnerabilities. This thesis identified important actors (government, risk experts, media, and NGOs) that shape the power dynamics in disaster risk management in Canada and explained their roles, influences, and the dimensions in which their powers negotiate each other through risk discourses. The patterns of these power dynamics in the three aspects of power (communication, assessment, and social trust) were also …


Spatial Validation Of Agent-Based Models, Kristoffer Wikstrom, Hal T. Nelson 2022 Claremont Graduate University


Public Administration Faculty Publications and Presentations

This paper adapts an existing techno–social agent-based model (ABM) in order to develop a new framework for spatially validating ABMs. The ABM simulates citizen opposition to locally unwanted land uses, using historical data from an energy infrastructure siting process in Southern California. Spatial theory, as well as the model’s design, suggests that adequate validation requires multiple tests rather than relying solely on a single test statistic. A pattern-oriented modeling approach was employed that first mapped real and simulated citizen comments across US Census tracts. The suite of spatial tests included Global Moran’s I, complemented with bivariate correlations, as well as …
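
Global Moran’s I, one of the tests named in the abstract above, measures whether similar values cluster in space. The sketch below computes the standard statistic from a list of values (for example, comment counts per tract) and a spatial weights matrix. The function name `morans_i`, the toy values, and the four-unit adjacency matrix are illustrative assumptions and do not come from the paper.

```python
from typing import List, Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for `values` under spatial weights `weights`.

    weights[i][j] is the spatial weight between units i and j
    (for example, 1 if they are neighbours and 0 otherwise).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    weight_total = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    numerator = sum(weights[i][j] * dev[i] * dev[j]
                    for i in range(n) for j in range(n))
    # Total variation of the values around their mean.
    denominator = sum(d * d for d in dev)
    return (n / weight_total) * (numerator / denominator)


# Hypothetical example: four units along a line, each adjacent to its neighbours.
line_weights: List[List[float]] = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_gradient = morans_i([1, 2, 3, 4], line_weights)      # neighbours are similar
alternating = morans_i([1, -1, 1, -1], line_weights)        # neighbours are dissimilar
```

Positive values indicate that neighbouring units tend to have similar values, and negative values indicate the opposite. A significance test (for example, by permutation) would be needed before drawing conclusions, which this sketch omits.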


Investigating Applications Of Deep Learning For Diagnosis Of Post Traumatic Elbow Disease, Hugh James 2022 Washington University in St. Louis


McKelvey School of Engineering Theses & Dissertations

Traumatic events such as dislocation, breaks, and arthritis of musculoskeletal joints can cause the development of post-traumatic joint contracture (PTJC). Clinically, noninvasive techniques such as Magnetic Resonance Imaging (MRI) scans are used to analyze the disease. Such procedures require a patient to remain still for long periods of time and can be expensive as well. Additionally, years of practice and experience are required for clinicians to accurately recognize the diseased anterior capsule region and make an accurate diagnosis. Manual tracing of the anterior capsule is done to help with diagnosis but is subjective and time-consuming. As a result, there is …


Examining The Relationship Between Stomiiform Fish Morphology And Their Ecological Traits, Mikayla L. Twiss 2022 Nova Southeastern University


All HCAS Student Capstones, Theses, and Dissertations

Trait-based ecology characterizes individuals’ functional attributes to better understand and predict their interactions with other species and their environments. Utilizing morphological traits to describe functional groups has helped group species with similar ecological niches that are not necessarily taxonomically related. Within the deep-pelagic fishes, the Order Stomiiformes exhibits high morphological and species diversity, and many species undertake diel vertical migration (DVM). While the morphology and behavior of stomiiform fishes have been extensively studied and described through taxonomic assessments, the connection between their form and function regarding their DVM types, morphotypes, and daytime depth distributions is not well known. Here, three …


A Maturity Model Of Data Modeling In Self-Service Business Intelligence Software, Anna Kurenkov 2022 Kennesaw State University


Master of Science in Information Technology Theses

Although Self-Service Business Intelligence (SSBI) is continually being adopted in various industries, there is a lack of research focused on data modeling in SSBI. This research aims to fill that research gap and propose a maturity model for SSBI data modeling which is generalizable across different software and applicable for users of all technical backgrounds. Through extensive literature review, a five-tier maturity model was proposed, explained, and instantiated in PowerBI and Tableau. The testing of the model was found to be simple and intuitive, and the research concludes that the model is applicable to enterprise SSBI environments. This research is …


Payload-Byte: A Tool For Extracting And Labeling Packet Capture Files Of Modern Network Intrusion Detection Datasets, Yasir Farrukh, Irfan Khan, Syed Wali, David A. Bierbrauer, John Pavlik, Nathaniel D. Bastian 2022 Army Cyber Institute, United States Military Academy


ACI Journal Articles

Adapting modern approaches for network intrusion detection is becoming critical, given the rapid technological advancement and adversarial attack rates. Therefore, packet-based methods utilizing payload data are gaining much popularity due to their effectiveness in detecting certain attacks. However, packet-based approaches suffer from a lack of standardization, resulting in incomparability and reproducibility issues. Unlike flow-based datasets, no standard labeled dataset exists, forcing researchers to follow bespoke labeling pipelines for individual approaches. Without a standardized baseline, proposed approaches cannot be compared and evaluated against each other. One cannot gauge whether the proposed approach is a methodological advancement or is just being benefited …


The Interaction Of Normalisation And Clustering In Sub-Domain Definition For Multi-Source Transfer Learning Based Time Series Anomaly Detection, Matthew Nicholson, Rahul Agrahari, Clare Conran, Haythem Assem, John D. Kelleher 2022 ADAPT Centre, Trinity College Dublin


Articles

This paper examines how data normalisation and clustering interact in the definition of sub-domains within multi-source transfer learning systems for time series anomaly detection. The paper introduces a distinction between (i) clustering as a primary/direct method for anomaly detection, and (ii) clustering as a method for identifying sub-domains within the source or target datasets. Reporting the results of three sets of experiments, we find that normalisation after feature extraction and before clustering results in the best performance for anomaly detection. Interestingly, we find that in the multi-source transfer learning scenario clustering on the target dataset and identifying subdomains in the …


Artificial Intelligence In The Medical Field: Medical Review Sentiment Analysis, Nicholas Podlesak 2022 Northern Illinois University


Honors Capstones

In this research project, natural language processing techniques’ ability to accurately classify medical text was measured to reinforce the relevance of artificial intelligence in the medical field. Sentiment analyses (analyses to determine whether the text was positive or negative) were performed on the prescription drug reviews in an open-source dataset using four different models: lexical, a neural network, a support vector machine, and a logistic regression model. Each model’s effectiveness was gauged by its ability to correctly classify unlabeled drug reviews (i.e., a percentage representing accuracy). The machine learning models were able to accurately classify the text, while the lexical …
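
The evaluation described in the abstract above (classify each review as positive or negative, then report accuracy as a percentage) can be sketched for the simplest of the four model types, a lexical classifier. The tiny word lists, the tie-breaking rule, and the sample reviews below are made up for illustration and are not the project's lexicon or dataset.

```python
import re
from typing import List

# Hypothetical mini-lexicon; a real lexical model would use a much larger one.
POSITIVE_WORDS = {"effective", "relief", "improved", "great", "helped"}
NEGATIVE_WORDS = {"nausea", "worse", "pain", "terrible", "useless"}


def lexical_sentiment(review: str) -> str:
    """Label a review by counting positive and negative lexicon words."""
    words = re.findall(r"[a-z']+", review.lower())
    score = (sum(w in POSITIVE_WORDS for w in words)
             - sum(w in NEGATIVE_WORDS for w in words))
    # Assumed tie-breaking rule: a score of zero counts as positive.
    return "positive" if score >= 0 else "negative"


def accuracy(predictions: List[str], labels: List[str]) -> float:
    """Percentage of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Made-up reviews with made-up reference labels.
sample_reviews = [
    "This drug helped and gave great relief",
    "Terrible nausea, felt worse",
]
sample_labels = ["positive", "negative"]
sample_predictions = [lexical_sentiment(r) for r in sample_reviews]
sample_accuracy = accuracy(sample_predictions, sample_labels)
```

The machine learning models mentioned in the abstract would replace `lexical_sentiment` with a trained classifier, while the `accuracy` computation would stay the same.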


Context-Aware Collaborative Neuro-Symbolic Inference In Internet Of Battlefield Things, Tarek Abdelzaher, Nathaniel D. Bastian, Susmit Jha, Lance Kaplan, Mani Srivastava, Venugopal Veeravalli 2022 Army Cyber Institute, U.S. Military Academy


ACI Journal Articles

IoBTs must feature collaborative, context-aware, multi-modal fusion for real-time, robust decision-making in adversarial environments. The integration of machine learning (ML) models into IoBTs has been successful at solving these problems at a small scale (e.g., AiTR), but state-of-the-art ML models grow exponentially with increasing temporal and spatial scale of modeled phenomena, and can thus become brittle, untrustworthy, and vulnerable when interpreting large-scale tactical edge data. To address this challenge, we need to develop principles and methodologies for uncertainty-quantified neuro-symbolic ML, where learning and inference exploit symbolic knowledge and reasoning in addition to multi-modal and multi-vantage sensor data. The approach features …


Safe Sharing For Sensitive Data, Kristi Thompson 2022 Western University


Western Libraries Presentations

This workshop focused on the question of when and how human subjects' data can be safely shared. It introduced the basics of data anonymization and discussed how to tell if a dataset has been de-identified. Case studies of successful anonymization and some spectacular failures were shared.


Denoising And Deconvolving Sperm Whale Data In The Northern Gulf Of Mexico Using Fourier And Wavelet Techniques, Kendal McCain Leftwich 2022 University of New Orleans, New Orleans


University of New Orleans Theses and Dissertations

The use of underwater acoustics can be an important component in obtaining information from the oceans of the world. It is desirable (but difficult) to compile an acoustic catalog of sounds emitted by various underwater objects to complement optical catalogs. For example, the current visual catalog for whale tail flukes of large marine mammals (whales) can identify even individual whales from their individual fluke characteristics. However, since sperm whales, Physeter macrocephalus, do not fluke up when they dive, they cannot be identified in this manner. A corresponding acoustic catalog for sperm whale clicks could be compiled to identify individual …

