Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Institution
- China Simulation Federation (3)
- MBZUAI (2)
- Nova Southeastern University (2)
- Old Dominion University (2)
- Virginia Commonwealth University (2)
- Western University (2)
- Central Washington University (1)
- Edith Cowan University (1)
- Georgia Southern University (1)
- Louisiana Tech University (1)
- San Jose State University (1)
- Technological University Dublin (1)
- University of Louisville (1)
- Washington University in St. Louis (1)
- Wilfrid Laurier University (1)
- Publication
- Journal of System Simulation (3)
- CCE Theses and Dissertations (2)
- Electronic Theses and Dissertations (2)
- Electronic Thesis and Dissertation Repository (2)
- Machine Learning Faculty Publications (2)
- Theses and Dissertations (2)
- All Master's Theses (1)
- Conference papers (1)
- Doctoral Dissertations (1)
- Electrical & Computer Engineering Faculty Publications (1)
- Information Technology & Decision Sciences Faculty Publications (1)
- Master's Projects (1)
- McKelvey School of Engineering Theses & Dissertations (1)
- Research outputs 2022 to 2026 (1)
- Theses and Dissertations (Comprehensive) (1)
Articles 1 - 22 of 22
Full-Text Articles in Physical Sciences and Mathematics
Selecting And Evaluating Key MDS-UPDRS Activities Using Wearable Devices For Parkinson's Disease Self-Assessment, Yuting Zhao, Xulong Wang, Xiyang Peng, Ziheng Li, Fengtao Nan, Menghui Zhuo, Jun Qi, Yun Yang, Zhong Zhao, Lida Xu, Po Yang
Information Technology & Decision Sciences Faculty Publications
Parkinson's disease (PD) is a complex neurodegenerative disease in the elderly. This disease has no cure, but assessing its motor symptoms will help slow down its progression. Inertial sensing-based wearable devices (ISWDs) such as mobile phones and smartwatches have been widely employed to analyse the condition of PD patients. However, most studies purely focused on a single activity or symptom, which may ignore the correlation between activities and complementary characteristics. In this paper, a novel technical pipeline is proposed for fine-grained classification of PD severity grades, which identifies the most representative activities. We also propose a multi-activities combination scheme based …
Malware Detection With Artificial Intelligence: A Systematic Literature Review, Matthew G. Gaber, Mohiuddin Ahmed, Helge Janicke
Research outputs 2022 to 2026
In this survey, we review the key developments in the field of malware detection using AI and analyze core challenges. We systematically survey state-of-the-art methods across five critical aspects of building an accurate and robust AI-powered malware-detection model: malware sophistication, analysis techniques, malware repositories, feature selection, and machine learning vs. deep learning. The effectiveness of an AI model is dependent on the quality of the features it is trained with. In turn, the quality and authenticity of these features is dependent on the quality of the dataset and the suitability of the analysis tool. Static analysis is fast but is …
Learning Mortality Risk For COVID-19 Using Machine Learning And Statistical Methods, Shaoshi Zhang
Electronic Thesis and Dissertation Repository
This research investigates the mortality risk of COVID-19 patients across different variant waves, using the data from Centers for Disease Control and Prevention (CDC) websites. By analyzing the available data, including patient medical records, vaccination rates, and hospital capacities, we aim to discern patterns and factors associated with COVID-19-related deaths.
To explore features linked to COVID-19 mortality, we employ different techniques such as Filter, Wrapper, and Embedded methods for feature selection. Furthermore, we apply various machine learning methods, including support vector machines, decision trees, random forests, logistic regression, K-nearest neighbours, naïve Bayes methods, and artificial neural networks, to uncover underlying …
A Study On Feature Selection Using Multi-Domain Feature Extraction For Automated K-Complex Detection, Yabing Li, Xinglong Dong, Kun Song, Xiangyun Bai, Hongye Li, Fakhreddine Karray
Machine Learning Faculty Publications
Background: K-complex detection plays a significant role in the field of sleep research. However, manual annotation of electroencephalography (EEG) recordings by visual inspection from experts is time-consuming and subjective. Therefore, there is a need to implement automatic detection methods based on classical machine learning algorithms. However, due to the complexity of EEG signals, current feature extraction methods often produce features with low relevance to k-complex detection, which leads to a great performance loss for the detection. Hence, finding compact yet effective integrated feature vectors becomes a crucial task in k-complex detection. Method: In this paper, we first extract multi-domain features based …
Feature Selection From Clinical Surveys Using Semantic Textual Similarity, Benjamin Warner
McKelvey School of Engineering Theses & Dissertations
Survey data collected from human subjects can contain a high number of features while having a comparatively low quantity of examples. Machine learning models that attempt to predict outcomes from survey data under these conditions can overfit and result in poor generalizability. One remedy to this issue is feature selection, which attempts to select an optimal subset of features to learn upon. A relatively unexplored source of information in the feature selection process is the usage of textual names of features, which may be semantically indicative of which features are relevant to a target outcome. The relationships between feature names …
An Explainable Artificial Intelligence Framework For The Predictive Analysis Of Hypo And Hyper Thyroidism Using Machine Learning Algorithms, Md. Bipul Hossain, Anika Shama, Apurba Adhikary, Avi Deb Raha, K. M. Aslam Uddin, Mohammad Amzad Hossain, Imtia Islam, Saydul Akbar Murad, Md. Shirajum Munir, Anupam Kumur Bairagi
Electrical & Computer Engineering Faculty Publications
The thyroid gland is a crucial organ in the human body, secreting two hormones that help to regulate the human body's metabolism. Thyroid disease is a severe medical complaint that can be caused by high Thyroid Stimulating Hormone (TSH) levels or an infection in the thyroid tissues. Hypothyroidism and hyperthyroidism are two critical conditions caused by insufficient thyroid hormone production and excessive thyroid hormone production, respectively. Machine learning models can be used to precisely process the data generated from different medical sectors and to build a model to predict several diseases. In this paper, we use different machine-learning algorithms to …
Wrapper And Hybrid Feature Selection Methods Using Metaheuristic Algorithms For English Text Classification: A Systematic Review, Osamah Mohammed Alyasiri, Yu N. Cheah, Ammar Kamal Abasi, Omar Mustafa Al-Janabi
Machine Learning Faculty Publications
Feature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that aids machine learning classifiers in reducing error rates, computation time, and overfitting, and in improving classification accuracy. It has demonstrated its efficacy in a myriad of domains, ranging from text classification (TC) and text mining to image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. …
Local Feature Selection For Multiple Instance Learning With Applications., Aliasghar Shahrjooihaghighi
Electronic Theses and Dissertations
Feature selection is a data processing approach that has been successfully and effectively used in developing machine learning algorithms for various applications. It has been proven to effectively reduce the dimensionality of the data and increase the accuracy and interpretability of machine learning algorithms. Conventional feature selection algorithms assume that there is an optimal global subset of features for the whole sample space. Thus, only one global subset of relevant features is learned. An alternative approach is based on the concept of Local Feature Selection (LFS), where each training sample can have its own subset of relevant features. Multiple Instance …
Decomposition Furnace Outlet Temperature Prediction Based On ElasticNet And LSTM, Guangyu Yu, Xueping Dong, Xiangmin Wang, Gan Min
Journal of System Simulation
Abstract: The outlet temperature of the decomposition furnace is a key indicator in the cement production process. Aiming at the problem that traditional prediction methods only consider the influence of wind, coal, and materials, a temperature prediction model combining ElasticNet with a Long Short-Term Memory (LSTM) neural network is proposed. The ElasticNet-LSTM outlet temperature prediction model is constructed by using the ElasticNet method to estimate the parameters of different variables, fully considering the influencing factors and realizing the variable screening, and analyzing the influence of the number of hidden layers and nodes on the accuracy of the neural network. Simulation …
Binary Black Widow Optimization Algorithm For Feature Selection Problems, Ahmed Al-Saedi
Theses and Dissertations (Comprehensive)
This thesis addresses feature selection (FS), which is a primary stage in data mining. FS is a significant pre-processing stage that enhances the performance of the process with regard to computation cost and accuracy, and offers a better comprehension of stored data by removing the unnecessary and irrelevant features from the basic dataset. However, because of the size of the problem, FS is known to be very challenging and has been classified as an NP-hard problem. Traditional methods can only be used to solve small problems. Therefore, metaheuristic algorithms (MAs) are becoming powerful methods for addressing the FS problems. …
SAR Object Recognition Based On Multi-Band And Multi-Polarization Simulation Image, Gu Yu, Zhang Qin, Xu Ying
Journal of System Simulation
Abstract: The object model was built based on Creator, and object texture-material mapping was performed by the Vega TMM tool. The multi-band and multi-polarization SAR image database was built by visual simulation technology. A hybrid intelligent optimization algorithm was designed to optimize the combination of band and polarization by genetic algorithm and binary particle optimization. Zernike moment features, Gabor wavelet coefficients, etc. were extracted from the original image and the rectified image to make up the feature candidates, and the feature selection experiments were carried out by using multi-band and multi-polarization SAR images. Simulation results demonstrate that building SAR image database through simulation …
Sparsity And Weak Supervision In Quantum Machine Learning, Seyran Saeedi
Theses and Dissertations
Quantum computing is an interdisciplinary field at the intersection of computer science, mathematics, and physics that studies information processing tasks on a quantum computer. A quantum computer is a device whose operations are governed by the laws of quantum mechanics. As building quantum computers is nearing the era of commercialization and quantum supremacy, it is essential to think of potential applications that we might benefit from. Among many applications of quantum computation, one of the emerging fields is quantum machine learning. We focus on predictive models for binary classification and variants of Support Vector Machines that we expect to be …
Image Features For Tuberculosis Classification In Digital Chest Radiographs, Brian Hooper
All Master's Theses
Tuberculosis (TB) is a respiratory disease which affects millions of people each year, accounting for the tenth leading cause of death worldwide, and is especially prevalent in underdeveloped regions where access to adequate medical care may be limited. Analysis of digital chest radiographs (CXRs) is a common and inexpensive method for the diagnosis of TB; however, a trained radiologist is required to interpret the results, and the interpretation is subject to human error. Computer-Aided Detection (CAD) systems are a promising machine-learning based solution to automate the diagnosis of TB from CXR images. As the dimensionality of a high-resolution CXR image is very …
Noise Clipping Algorithm Based On Relative Contribution Rate, Shuoyu Liu, Yueming Dai
Journal of System Simulation
Abstract: This paper presents a class noise cutting (CNC) algorithm based on relative contribution rate. The algorithm calculates the relative contribution rate of features to the theme. The most valuable feature set is selected by using features distinguish rating. The corresponding candidate categories for each feature are selected to reduce the candidate category set, improve the classification accuracy, and speed up the response of the classifier. Compared with another ECN noise cutting algorithm (Eliminating the class whose), CNC has higher accuracy and because of its simpler feature dimension dictionary and better candidate category set, the response …
Sensor-Based Human Activity Recognition Using Smartphones, Mustafa Badshah
Master's Projects
It is a significant technical and computational task to provide precise information regarding the activity performed by a human and find patterns of their behavior. Countless applications can be built and various problems in the domains of virtual reality, health and medicine, entertainment, and security can be solved with advancements in human activity recognition (HAR) systems. HAR has been an active field of research for more than a decade, but certain aspects need to be addressed to improve the system and revolutionize the way humans interact with smartphones. This research provides a holistic view of human activity recognition system architecture and discusses …
Distributed Multi-Label Learning On Apache Spark, Jorge Gonzalez Lopez
Theses and Dissertations
This thesis proposes a series of multi-label learning algorithms for classification and feature selection implemented on the Apache Spark distributed computing model. Five approaches for determining the optimal architecture to speed up multi-label learning methods are presented. These approaches range from local parallelization using threads to distributed computing using independent or shared memory spaces. It is shown that the optimal approach performs hundreds of times faster than the baseline method. Three distributed multi-label k nearest neighbors methods built on top of the Spark architecture are proposed: an exact iterative method that computes pair-wise distances, an approximate tree-based method that indexes …
Feature Set Selection For Improved Classification Of Static Analysis Alerts, Kathleen Goeschel
CCE Theses and Dissertations
With the extreme growth in third party cloud applications, increased exposure of applications to the internet, and the impact of successful breaches, improving the security of software being produced is imperative. Static analysis tools can alert to quality and security vulnerabilities of an application; however, they present developers and analysts with a high rate of false positives and unactionable alerts. This problem may lead to the loss of confidence in the scanning tools, possibly resulting in the tools not being used. The discontinued use of these tools may increase the likelihood of insecure software being released into production. Insecure software …
Data Patterns Discovery Using Unsupervised Learning, Rachel A. Lewis
Electronic Theses and Dissertations
Self-care activities classification poses significant challenges in identifying children’s unique functional abilities and needs within the exceptional children healthcare system. The accuracy of diagnosing a child's self-care problem, such as toileting or dressing, is highly influenced by an occupational therapist’s experience and time constraints. Thus, there is a need for objective means to detect and predict in advance the self-care problems of children with physical and motor disabilities. We use clustering to discover interesting information from self-care problems, perform automatic classification of binary data, and discover outliers. The advantages are twofold: the advancement of knowledge on identifying self-care problems in …
The Impact Of Cost On Feature Selection For Classifiers, Richard Clyde Mccrae
CCE Theses and Dissertations
Supervised machine learning models are increasingly being used for medical diagnosis. The diagnostic problem is formulated as a binary classification task in which trained classifiers make predictions based on a set of input features. In diagnosis, these features are typically procedures or tests with associated costs. The cost of applying a trained classifier for diagnosis may be estimated as the total cost of obtaining values for the features that serve as inputs for the classifier. Obtaining classifiers based on a low cost set of input features with acceptable classification accuracy is of interest to practitioners and researchers. What makes this …
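The cost estimate described in this abstract (the total cost of obtaining values for a classifier's input features) can be illustrated with a minimal sketch. The function name, feature names, and cost figures below are hypothetical placeholders for illustration and are not taken from the dissertation.

```python
from typing import Iterable, Mapping


def classifier_cost(input_features: Iterable[str],
                    feature_costs: Mapping[str, float]) -> float:
    """Sum the acquisition cost of each distinct input feature."""
    return sum(feature_costs[name] for name in set(input_features))


# Hypothetical per-test costs (arbitrary units), for illustration only.
FEATURE_COSTS = {"blood_test": 20.0, "ecg": 50.0, "mri": 400.0}

cheap_cost = classifier_cost(["blood_test", "ecg"], FEATURE_COSTS)
full_cost = classifier_cost(["blood_test", "ecg", "mri"], FEATURE_COSTS)
print(cheap_cost, full_cost)  # 70.0 470.0
```

Under this view, a classifier built on the cheaper subset is preferable whenever its accuracy remains acceptable.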
Using Machine Learning To Predict Chemotherapy Response In Cell Lines And Patients Based On Genetic Expression, Dimo Angelov
Electronic Thesis and Dissertation Repository
The goal of this thesis was to examine different machine learning techniques for predicting chemotherapy response in cell lines and patients based on genetic expression. After trying regression, multi-class classification techniques, and binary classification, it was concluded that binary classification was the best method for training models due to the limited size of available cell line data. We found support vector machine classifiers trained on cell line data were easier to use and produced better results compared to neural networks. Sequential backward feature selection was able to select genes for the models that produced good results; however, the greedy algorithm …
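For readers unfamiliar with the sequential backward feature selection mentioned in this abstract, the generic greedy procedure can be sketched as follows. This is a textbook form of the technique, not the thesis's implementation; the gene names, weights, and `toy_score` function are hypothetical stand-ins for a real model-evaluation step such as cross-validated accuracy.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_backward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    n_keep: int,
) -> List[str]:
    """Greedily drop one feature at a time until n_keep features remain."""
    if not 1 <= n_keep <= len(features):
        raise ValueError("n_keep must be between 1 and len(features)")
    selected = list(features)
    while len(selected) > n_keep:
        # Evaluate every subset obtained by removing a single feature
        # and keep the one with the highest score.
        best_subset = max(
            (frozenset(selected) - {name} for name in selected),
            key=score,
        )
        selected = [name for name in selected if name in best_subset]
    return selected


# Hypothetical weights standing in for each gene's contribution to accuracy.
TOY_WEIGHTS = {"gene_a": 0.5, "gene_b": 0.25, "gene_c": 0.125, "gene_d": 0.0}


def toy_score(subset: FrozenSet[str]) -> float:
    """Placeholder score: sum of the toy weights of the chosen genes."""
    return sum(TOY_WEIGHTS[name] for name in subset)


kept = sequential_backward_selection(list(TOY_WEIGHTS), toy_score, n_keep=2)
print(kept)  # ['gene_a', 'gene_b']
```

Because each step commits to the single best removal, the search is greedy and can miss subsets that only look good after several removals.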
Presenting A Labelled Dataset For Real-Time Detection Of Abusive User Posts, Hao Chen, Susan Mckeever, Sarah Jane Delany
Conference papers
Social media sites facilitate users in posting their own personal comments online. Most support free format user posting, with close to real-time publishing speeds. However, online posts generated by a public user audience carry the risk of containing inappropriate, potentially abusive content. To detect such content, the straightforward approach is to filter against blacklists of profane terms. However, this lexicon filtering approach is prone to problems around word variations and lack of context. Although recent methods inspired by machine learning have boosted detection accuracies, the lack of gold standard labelled datasets limits the development of this approach. In this work, …
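The blacklist filtering approach and its two weaknesses named in this abstract (word variations and lack of context) can be shown with a small sketch. The function name, tokenisation rule, and one-word blacklist are assumptions made for illustration; they do not come from the paper.

```python
import re
from typing import Iterable


def contains_blacklisted_term(post: str, blacklist: Iterable[str]) -> bool:
    """Return True if any whole token of the post appears in the blacklist."""
    terms = {term.lower() for term in blacklist}
    tokens = re.findall(r"[a-z0-9']+", post.lower())
    return any(token in terms for token in tokens)


# Hypothetical one-word blacklist, for illustration only.
BLACKLIST = ["idiot"]

exact_match = contains_blacklisted_term("You are an idiot", BLACKLIST)
obfuscated = contains_blacklisted_term("You are an id1ot", BLACKLIST)
benign_use = contains_blacklisted_term("Dostoevsky wrote The Idiot", BLACKLIST)
print(exact_match, obfuscated, benign_use)  # True False True
```

The second call shows a simple spelling variation slipping past the filter, and the third shows a harmless book title being flagged because the filter ignores context.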
K-Means+ID3 And Dependence Tree Methods For Supervised Anomaly Detection, Kiran S. Balagani
Doctoral Dissertations
In this dissertation, we present two novel methods for supervised anomaly detection. The first method "K-Means+ID3" performs supervised anomaly detection by partitioning the training data instances into k clusters using Euclidean distance similarity. Then, on each cluster representing a density region of normal or anomaly instances, an ID3 decision tree is built. The ID3 decision tree on each cluster refines the decision boundaries by learning the subgroups within a cluster. To obtain a final decision on detection, the k-Means and ID3 decision trees are combined using two rules: (1) the nearest neighbor rule; and (2) the nearest consensus rule. The …