Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 22 of 22

Full-Text Articles in Physical Sciences and Mathematics

Selecting And Evaluating Key Mds-Updrs Activities Using Wearable Devices For Parkinson's Disease Self-Assessment, Yuting Zhao, Xulong Wang, Xiyang Peng, Ziheng Li, Fengtao Nan, Menghui Zhuo, Jun Qi, Yun Yang, Zhong Zhao, Lida Xu, Po Yang Jan 2024

Information Technology & Decision Sciences Faculty Publications

Parkinson's disease (PD) is a complex neurodegenerative disease in the elderly. This disease has no cure, but assessing its motor symptoms can help slow its progression. Inertial sensing-based wearable devices (ISWDs) such as mobile phones and smartwatches have been widely employed to analyse the condition of PD patients. However, most studies have focused purely on a single activity or symptom, which may ignore the correlation between activities and their complementary characteristics. In this paper, a novel technical pipeline is proposed for fine-grained classification of PD severity grades, which identifies the most representative activities. We also propose a multi-activities combination scheme based …


Malware Detection With Artificial Intelligence: A Systematic Literature Review, Matthew G. Gaber, Mohiuddin Ahmed, Helge Janicke Jan 2024

Research outputs 2022 to 2026

In this survey, we review the key developments in the field of malware detection using AI and analyze core challenges. We systematically survey state-of-the-art methods across five critical aspects of building an accurate and robust AI-powered malware-detection model: malware sophistication, analysis techniques, malware repositories, feature selection, and machine learning vs. deep learning. The effectiveness of an AI model depends on the quality of the features it is trained with. In turn, the quality and authenticity of these features depend on the quality of the dataset and the suitability of the analysis tool. Static analysis is fast but is …


Learning Mortality Risk For Covid-19 Using Machine Learning And Statistical Methods, Shaoshi Zhang Dec 2023

Electronic Thesis and Dissertation Repository

This research investigates the mortality risk of COVID-19 patients across different variant waves, using data from the Centers for Disease Control and Prevention (CDC) websites. By analyzing the available data, including patient medical records, vaccination rates, and hospital capacities, we aim to discern patterns and factors associated with COVID-19-related deaths.

To explore features linked to COVID-19 mortality, we employ different techniques such as Filter, Wrapper, and Embedded methods for feature selection. Furthermore, we apply various machine learning methods, including support vector machines, decision trees, random forests, logistic regression, K-nearest neighbours, naïve Bayes methods, and artificial neural networks, to uncover underlying …
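As an illustration of the filter family of feature selection methods named above, the following minimal sketch ranks features by their absolute Pearson correlation with the outcome and keeps the top k. The scoring choice, the function names, and the toy data are assumptions made for this example; the thesis abstract does not state which filter criterion it uses.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)

def filter_select(columns, target, k):
    """Rank features by |correlation with target| and keep the top k names."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: three candidate features and a binary outcome.
columns = {
    "age": [30, 45, 60, 75, 80, 85],
    "noise": [5, 1, 4, 2, 5, 1],
    "constant": [1, 1, 1, 1, 1, 1],
}
died = [0, 0, 0, 1, 1, 1]
selected = filter_select(columns, died, k=1)
```

A filter method like this scores each feature independently of any classifier, which is what distinguishes it from wrapper methods (which score subsets by training a model) and embedded methods (which select during training).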


A Study On Feature Selection Using Multi-Domain Feature Extraction For Automated K-Complex Detection, Yabing Li, Xinglong Dong, Kun Song, Xiangyun Bai, Hongye Li, Fakhreddine Karray Sep 2023

Machine Learning Faculty Publications

Background: K-complex detection plays a significant role in the field of sleep research. However, manual annotation of electroencephalography (EEG) recordings through visual inspection by experts is time-consuming and subjective. Therefore, automatic detection methods based on classical machine learning algorithms are needed. However, due to the complexity of EEG signals, current feature extraction methods often produce features with low relevance to K-complex detection, which leads to a substantial loss in detection performance. Hence, finding compact yet effective integrated feature vectors becomes a core task in K-complex detection. Method: In this paper, we first extract multi-domain features based …


Feature Selection From Clinical Surveys Using Semantic Textual Similarity, Benjamin Warner May 2023

McKelvey School of Engineering Theses & Dissertations

Survey data collected from human subjects can contain a high number of features while having a comparatively low quantity of examples. Machine learning models that attempt to predict outcomes from survey data under these conditions can overfit and result in poor generalizability. One remedy to this issue is feature selection, which attempts to select an optimal subset of features to learn upon. A relatively unexplored source of information in the feature selection process is the usage of textual names of features, which may be semantically indicative of which features are relevant to a target outcome. The relationships between feature names …


An Explainable Artificial Intelligence Framework For The Predictive Analysis Of Hypo And Hyper Thyroidism Using Machine Learning Algorithms, Md. Bipul Hossain, Anika Shama, Apurba Adhikary, Avi Deb Raha, K. M. Aslam Uddin, Mohammad Amzad Hossain, Imtia Islam, Saydul Akbar Murad, Md. Shirajum Munir, Anupam Kumur Bairagi Jan 2023

Electrical & Computer Engineering Faculty Publications

The thyroid gland is a crucial organ in the human body, secreting two hormones that help to regulate the body's metabolism. Thyroid disease is a severe medical condition that can develop from high Thyroid Stimulating Hormone (TSH) levels or an infection in the thyroid tissues. Hypothyroidism and hyperthyroidism are two critical conditions caused by insufficient thyroid hormone production and excessive thyroid hormone production, respectively. Machine learning models can be used to precisely process the data generated from different medical sectors and to build a model to predict several diseases. In this paper, we use different machine-learning algorithms to …


Wrapper And Hybrid Feature Selection Methods Using Metaheuristic Algorithms For English Text Classification: A Systematic Review, Osamah Mohammed Alyasiri, Yu N. Cheah, Ammar Kamal Abasi, Omar Mustafa Al-Janabi Apr 2022

Machine Learning Faculty Publications

Feature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that helps machine learning classifiers reduce error rates, computation time, and overfitting while improving classification accuracy. It has demonstrated its efficacy in a myriad of domains, ranging from text classification (TC) and text mining to image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. …


Local Feature Selection For Multiple Instance Learning With Applications., Aliasghar Shahrjooihaghighi Dec 2021

Electronic Theses and Dissertations

Feature selection is a data processing approach that has been successfully and effectively used in developing machine learning algorithms for various applications. It has been proven to effectively reduce the dimensionality of the data and increase the accuracy and interpretability of machine learning algorithms. Conventional feature selection algorithms assume that there is an optimal global subset of features for the whole sample space. Thus, only one global subset of relevant features is learned. An alternative approach is based on the concept of Local Feature Selection (LFS), where each training sample can have its own subset of relevant features. Multiple Instance …


Decomposition Furnace Outlet Temperature Prediction Based On Elasticnet And Lstm, Guangyu Yu, Xueping Dong, Xiangmin Wang, Gan Min Jun 2021

Journal of System Simulation

Abstract: The outlet temperature of the decomposition furnace is a key indicator in the cement production process. To address the problem that traditional prediction methods consider only the influence of wind, coal, and materials, a temperature prediction model combining ElasticNet with a Long Short-Term Memory (LSTM) neural network is proposed. The ElasticNet-LSTM outlet temperature prediction model is constructed by using the ElasticNet method to estimate the parameters of different variables, fully considering the influencing factors and performing variable screening, and by analyzing the influence of the number of hidden layers and nodes on the accuracy of the neural network. Simulation …


Binary Black Widow Optimization Algorithm For Feature Selection Problems, Ahmed Al-Saedi Jan 2021

Theses and Dissertations (Comprehensive)

This thesis addresses feature selection (FS) problems; FS is a primary stage in data mining. It is a significant pre-processing stage that improves computation cost and accuracy and offers a better comprehension of stored data by removing unnecessary and irrelevant features from the basic dataset. However, because of the size of the problem, FS is known to be very challenging and has been classified as an NP-hard problem. Traditional methods can only be used to solve small problems. Therefore, metaheuristic algorithms (MAs) are becoming powerful methods for addressing FS problems. …


Sar Object Recognition Based On Multi-Band And Multi-Polarization Simulation Image, Gu Yu, Zhang Qin, Xu Ying Jun 2020

Journal of System Simulation

Abstract: The object model was built with Creator, and object texture-material mapping was performed with the Vega TMM tool. The multi-band and multi-polarization SAR image database was built using visual simulation technology. A hybrid intelligent optimization algorithm, combining a genetic algorithm and binary particle optimization, was designed to optimize the combination of band and polarization. Zernike moment features, Gabor wavelet coefficients, etc. were extracted from the original and rectified images to form the feature candidates, and feature selection experiments were carried out using multi-band and multi-polarization SAR images. Simulation results demonstrate that building SAR image database through simulation …


Sparsity And Weak Supervision In Quantum Machine Learning, Seyran Saeedi Jan 2020

Theses and Dissertations

Quantum computing is an interdisciplinary field at the intersection of computer science, mathematics, and physics that studies information processing tasks on a quantum computer. A quantum computer is a device whose operations are governed by the laws of quantum mechanics. As building quantum computers is nearing the era of commercialization and quantum supremacy, it is essential to think of potential applications that we might benefit from. Among many applications of quantum computation, one of the emerging fields is quantum machine learning. We focus on predictive models for binary classification and variants of Support Vector Machines that we expect to be …


Image Features For Tuberculosis Classification In Digital Chest Radiographs, Brian Hooper Jan 2020

All Master's Theses

Tuberculosis (TB) is a respiratory disease which affects millions of people each year, accounting for the tenth leading cause of death worldwide, and is especially prevalent in underdeveloped regions where access to adequate medical care may be limited. Analysis of digital chest radiographs (CXRs) is a common and inexpensive method for the diagnosis of TB; however, a trained radiologist is required to interpret the results, and the interpretation is subject to human error. Computer-Aided Detection (CAD) systems are a promising machine-learning-based solution to automate the diagnosis of TB from CXR images. As the dimensionality of a high-resolution CXR image is very …


Noise Clipping Algorithm Based On Relative Contribution Rate, Shuoyu Liu, Yueming Dai Dec 2019

Journal of System Simulation

Abstract: This paper presents a class noise cutting (CNC) algorithm based on relative contribution rate. The algorithm calculates the relative contribution rate of features to the theme. The most valuable feature set is selected by using the feature distinguishing rating. The corresponding candidate categories for each feature are selected to reduce the candidate category set, improve the classification accuracy, and speed up the response of the classifier. Compared with another noise cutting algorithm, ECN (Eliminating the class whose), CNC has higher accuracy and because of its simpler feature dimension dictionary and better candidate category set, the response …


Sensor - Based Human Activity Recognition Using Smartphones, Mustafa Badshah May 2019

Master's Projects

It is a significant technical and computational task to provide precise information regarding the activity performed by a human and to find patterns in their behavior. Countless applications can be built and various problems in the domains of virtual reality, health and medicine, entertainment, and security can be solved with advancements in human activity recognition (HAR) systems. HAR has been an active field of research for more than a decade, but certain aspects need to be addressed to improve the system and revolutionize the way humans interact with smartphones. This research provides a holistic view of human activity recognition system architecture and discusses …


Distributed Multi-Label Learning On Apache Spark, Jorge Gonzalez Lopez Jan 2019

Theses and Dissertations

This thesis proposes a series of multi-label learning algorithms for classification and feature selection implemented on the Apache Spark distributed computing model. Five approaches for determining the optimal architecture to speed up multi-label learning methods are presented. These approaches range from local parallelization using threads to distributed computing using independent or shared memory spaces. It is shown that the optimal approach performs hundreds of times faster than the baseline method. Three distributed multi-label k nearest neighbors methods built on top of the Spark architecture are proposed: an exact iterative method that computes pair-wise distances, an approximate tree-based method that indexes …


Feature Set Selection For Improved Classification Of Static Analysis Alerts, Kathleen Goeschel Jan 2019

CCE Theses and Dissertations

With the extreme growth in third party cloud applications, increased exposure of applications to the internet, and the impact of successful breaches, improving the security of software being produced is imperative. Static analysis tools can alert to quality and security vulnerabilities of an application; however, they present developers and analysts with a high rate of false positives and unactionable alerts. This problem may lead to the loss of confidence in the scanning tools, possibly resulting in the tools not being used. The discontinued use of these tools may increase the likelihood of insecure software being released into production. Insecure software …


Data Patterns Discovery Using Unsupervised Learning, Rachel A. Lewis Jan 2019

Electronic Theses and Dissertations

Self-care activities classification poses significant challenges in identifying children’s unique functional abilities and needs within the exceptional children healthcare system. The accuracy of diagnosing a child's self-care problem, such as toileting or dressing, is highly influenced by an occupational therapist's experience and time constraints. Thus, there is a need for objective means to detect and predict in advance the self-care problems of children with physical and motor disabilities. We use clustering to discover interesting information from self-care problems, perform automatic classification of binary data, and discover outliers. The advantages are twofold: the advancement of knowledge on identifying self-care problems in …


The Impact Of Cost On Feature Selection For Classifiers, Richard Clyde Mccrae Jan 2018

CCE Theses and Dissertations

Supervised machine learning models are increasingly being used for medical diagnosis. The diagnostic problem is formulated as a binary classification task in which trained classifiers make predictions based on a set of input features. In diagnosis, these features are typically procedures or tests with associated costs. The cost of applying a trained classifier for diagnosis may be estimated as the total cost of obtaining values for the features that serve as inputs for the classifier. Obtaining classifiers based on a low-cost set of input features with acceptable classification accuracy is of interest to practitioners and researchers. What makes this …
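The cost estimate described above, the total cost of obtaining values for a classifier's input features, can be written as a one-line sum. The sketch below is only an illustration; the feature names and prices are hypothetical and do not come from the dissertation.

```python
def feature_set_cost(selected, cost_per_feature):
    """Total cost of acquiring every feature a classifier takes as input."""
    return sum(cost_per_feature[name] for name in selected)

# Hypothetical per-test costs, in arbitrary currency units.
costs = {"blood_panel": 40.0, "ecg": 120.0, "mri": 900.0}

cheap = feature_set_cost(["blood_panel", "ecg"], costs)
full = feature_set_cost(costs, costs)  # iterating a dict yields its keys
```

With a cost function of this shape, two candidate feature sets of similar accuracy can be compared directly on what it would cost to apply each classifier.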


Using Machine Learning To Predict Chemotherapy Response In Cell Lines And Patients Based On Genetic Expression, Dimo Angelov Mar 2017

Electronic Thesis and Dissertation Repository

The goal of this thesis was to examine different machine learning techniques for predicting chemotherapy response in cell lines and patients based on genetic expression. After trying regression, multi-class classification techniques, and binary classification, it was concluded that binary classification was the best method for training models due to the limited size of available cell line data. We found support vector machine classifiers trained on cell line data were easier to use and produced better results compared to neural networks. Sequential backward feature selection was able to select genes for the models that produced good results; however, the greedy algorithm …
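Sequential backward feature selection, mentioned above, can be sketched as a greedy loop that starts from all features and repeatedly drops the one whose removal scores best. The stopping rule, the `score` callback, and the toy gene names below are assumptions for illustration; the thesis's actual scoring (for example, cross-validated classifier accuracy) is not given in the abstract.

```python
def backward_select(features, score, min_features=1):
    """Greedy sequential backward selection.

    Repeatedly removes the single feature whose removal yields the best
    score, and stops once every possible removal would lower the score.
    """
    current = list(features)
    best = score(current)
    while len(current) > min_features:
        # One candidate subset per feature that could be dropped.
        candidates = [[f for f in current if f != drop] for drop in current]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score < best:
            break  # no single removal helps; a dropped feature is never revisited
        current, best = top, top_score
    return current, best

# Hypothetical scoring: reward informative genes, penalize subset size.
INFORMATIVE = {"g1", "g3"}

def toy_score(subset):
    return len(INFORMATIVE & set(subset)) - 0.1 * len(subset)

genes, final_score = backward_select(["g1", "g2", "g3", "g4"], toy_score)
```

Because each step commits to the locally best removal, the procedure evaluates only a small fraction of the possible subsets, which is the usual trade-off of a greedy search.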


Presenting A Labelled Dataset For Real-Time Detection Of Abusive User Posts, Hao Chen, Susan Mckeever, Sarah Jane Delany Jan 2017

Conference papers

Social media sites facilitate users in posting their own personal comments online. Most support free format user posting, with close to real-time publishing speeds. However, online posts generated by a public user audience carry the risk of containing inappropriate, potentially abusive content. To detect such content, the straightforward approach is to filter against blacklists of profane terms. However, this lexicon filtering approach is prone to problems around word variations and lack of context. Although recent methods inspired by machine learning have boosted detection accuracies, the lack of gold standard labelled datasets limits the development of this approach. In this work, …
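The lexicon filtering approach described above, and the two weaknesses the abstract names (word variations and lack of context), can be seen in a few lines. This is a minimal sketch under stated assumptions: the blacklist is a hypothetical, deliberately mild stand-in, and the tokenizer is a simple regular expression rather than anything used by the authors.

```python
import re

# Hypothetical, deliberately mild lexicon for illustration only.
BLACKLIST = {"idiot", "moron"}

def is_flagged(post, blacklist=BLACKLIST):
    """Flag a post if any lowercase token appears in the blacklist."""
    tokens = re.findall(r"[a-z0-9']+", post.lower())
    return any(token in blacklist for token in tokens)

plain = is_flagged("You are an idiot")    # exact term: caught
variant = is_flagged("You are an id1ot")  # spelling variation: missed
quoted = is_flagged('He called me an "idiot" and I reported it')  # benign context: still flagged
```

The second call shows a false negative from a trivial character substitution, and the third shows a false positive because the filter cannot tell reporting abuse from committing it.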


K-Means+Id3 And Dependence Tree Methods For Supervised Anomaly Detection, Kiran S. Balagani Apr 2008

Doctoral Dissertations

In this dissertation, we present two novel methods for supervised anomaly detection. The first method, "K-Means+ID3", performs supervised anomaly detection by partitioning the training data instances into k clusters using Euclidean distance similarity. Then, on each cluster representing a density region of normal or anomaly instances, an ID3 decision tree is built. The ID3 decision tree on each cluster refines the decision boundaries by learning the subgroups within a cluster. To obtain a final decision on detection, the k-Means and ID3 decision trees are combined using two rules: (1) the nearest neighbor rule; and (2) the nearest consensus rule. The …
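The overall shape of the cluster-then-classify idea described above can be sketched as follows. This is not the dissertation's method: the per-cluster ID3 tree is replaced here by a plain majority-label stand-in, the nearest neighbor and nearest consensus rules are not reproduced (the abstract does not define them), and prediction simply uses the model of the closest centroid. All names and data are hypothetical.

```python
from collections import Counter
from math import dist

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assign each point (by index) to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(idx)
        # Move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(len(points[0]))
                )
    return centroids, clusters

def fit(points, labels, k):
    """Cluster the training data, then fit one model per cluster."""
    centroids, clusters = kmeans(points, k)
    # Stand-in for the per-cluster ID3 tree: the cluster's majority label.
    models = [
        Counter(labels[i] for i in members).most_common(1)[0][0] if members else None
        for members in clusters
    ]
    return centroids, models

def predict(point, centroids, models):
    """Use the model of the cluster whose centroid is closest."""
    nearest = min(range(len(centroids)), key=lambda c: dist(point, centroids[c]))
    return models[nearest]

# Hypothetical 2-D training data with two well-separated regions.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
labels = ["normal", "anomaly", "normal", "normal", "anomaly", "anomaly"]
centroids, models = fit(points, labels, k=2)
near_origin = predict((0.5, 0.5), centroids, models)
far_away = predict((9, 9), centroids, models)
```

Swapping the majority-label stand-in for a real decision tree per cluster would let each cluster learn subgroups within its region, which is the refinement the abstract attributes to ID3.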