Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Computer Engineering

Technological University Dublin

Machine Learning

Articles 1 - 18 of 18

Full-Text Articles in Engineering

A Comparison Of Feature Selection Methodologies And Learning Algorithms In The Development Of A DNA Methylation-Based Telomere Length Estimator, Trevor Doherty, Emma Dempster, Eilis Hannon, Jonathan Mill, Richie Poulton, David Corcoran, Karen Sugden, Ben Williams, Avshalom Caspi, Terrie E. Moffitt, Sarah Jane Delany, Therese Murphy Dr Jan 2023

Articles

The field of epigenomics holds great promise in understanding and treating disease with advances in machine learning (ML) and artificial intelligence being vitally important in this pursuit. Increasingly, research now utilises DNA methylation measures at cytosine–guanine dinucleotides (CpG) to detect disease and estimate biological traits such as aging. Given the challenge of high dimensionality of DNA methylation data, feature-selection techniques are commonly employed to reduce dimensionality and identify the most important subset of features. In this study, our aim was to test and compare a range of feature-selection methods and ML algorithms in the development of a novel DNA methylation-based …


Experimenting An Edge-Cloud Computing Model On The GPULab Fed4Fire Testbed, Vikas Tomer, Sachin Sharma Jul 2022

Conference papers

There are various open testbeds available for testing algorithms and prototypes, including the Fed4Fire testbeds. This demo paper illustrates how the GPULAB Fed4Fire testbed can be used to test an edge-cloud model that employs an ensemble machine learning algorithm for detecting attacks on the Internet of Things (IoT). We compare experimentation times and other performance metrics of our model based on different characteristics of the testbed, such as GPU model, CPU speed, and memory. Our goal is to demonstrate how an edge-computing model can be run on the GPULab testbed. Results indicate that this use case can be deployed seamlessly …


Hybridization Of Biologically Inspired Algorithms For Discrete Optimisation Problems, Elihu Essian-Thompson Jan 2022

Dissertations

In the field of Optimization Algorithms, despite the popularity of hybrid designs, not enough consideration has been given to hybridization strategies. This paper aims to raise awareness of the benefits that such a study can bring. It does this by conducting a systematic review of popular algorithms used for optimization, within the context of Combinatorial Optimization Problems. Then, a comparative analysis is performed between Hybrid and Base versions of the algorithms to demonstrate an increase in optimization performance when hybridization is employed.


Exploration Of Approaches To Arabic Named Entity Recognition, Husamelddin Balla, Sarah Jane Delany Jan 2020

Conference papers

The Named Entity Recognition (NER) task has attracted significant attention in Natural Language Processing (NLP) as it can enhance the performance of many NLP applications. In this paper, we compare English NER with Arabic NER in an experimental way to investigate the impact of using different classifiers and sets of features including language-independent and language-specific features. We explore the features and classifiers on five different datasets. We compare deep neural network architectures for NER with more traditional machine learning approaches to NER. We discover that most of the techniques and features used for English NER perform well on Arabic …


Applications Of Artificial Intelligence To Cryptography, Jonathan Blackledge, Napo Mosola Jan 2020

Articles

This paper considers some recent advances in the field of Cryptography using Artificial Intelligence (AI). It specifically considers the applications of Machine Learning (ML) and Evolutionary Computing (EC) to analyze and encrypt data. A short overview is given on Artificial Neural Networks (ANNs) and the principles of Deep Learning using Deep ANNs. In this context, the paper considers: (i) the implementation of EC and ANNs for generating unique and unclonable ciphers; (ii) ML strategies for detecting the genuine randomness (or otherwise) of finite binary strings for applications in Cryptanalysis. The aim of the paper is to provide an overview on …


An Examination Of The SMOTE And Other SMOTE-Based Techniques That Use Synthetic Data To Oversample The Minority Class In The Context Of Credit-Card Fraud Classification, Eduardo Parkinson De Castro Jan 2020

Dissertations

This research project investigates several sampling techniques that generate and use synthetic data to oversample the minority class as a means of handling the imbalanced distribution between non-fraudulent (majority class) and fraudulent (minority class) classes in a credit-card fraud dataset. The purpose of the research project is to assess the effectiveness of these techniques in the context of fraud detection, a highly imbalanced and cost-sensitive problem. Learning from highly imbalanced datasets is difficult, since many of the traditional learning algorithms are not designed to cope …
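As an illustration of the synthetic oversampling idea named in this abstract, the following is a minimal sketch of the basic SMOTE interpolation step in plain Python. It is not the dissertation's implementation: the function name `smote`, its parameters, and the toy data are assumptions made for this example, and a real study would more likely rely on an established library.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (basic SMOTE idea).

    `minority` is a list of equal-length numeric tuples. All names here
    are hypothetical and chosen for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base, excluding base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    # Toy minority class: four points on the unit square (made-up data).
    fraud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote(fraud, n_synthetic=3, k=2):
        print(point)
```

Because every synthetic point lies on a line segment between two existing minority samples, the new data stays inside the region already occupied by the minority class. That is the property the SMOTE-based variants modify in different ways.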


Machine Learning Assisted Gait Analysis For The Determination Of Handedness In Able-Bodied People, Hugh Gallagher Jan 2020

Dissertations

This study has investigated the potential application of machine learning for video analysis, with a view to creating a system which can determine a person’s hand laterality (handedness) from the way that they walk (their gait). To this end, the convolutional neural network model VGG16 underwent transfer learning in order to classify videos under two ‘activities’: “walking left-handed” and “walking right-handed”. This saw varying degrees of success across five transfer learning trained models: Everything – the entire dataset; FiftyFifty – the dataset with enough right-handed samples removed to produce a set with parity between activities; Female – only the female …


Optimization Of Home Mortgage Mover Predictive Model Applying Geo-Spatial Analysis And Machine Learning Techniques, Natalia Riscovaia Jan 2020

Dissertations

In the last decade, digital innovations and online banking services have significantly changed customers' banking preferences and behaviour. The banking industry is undergoing changes in the provision of banking services that affect the structure and organization of the bank network. However, private home loans, referred to hereinafter as Home Mortgages, remain among the products that customers prefer to discuss in person with professional advisors before deciding to apply for a loan with a financial institution.


Using Machine Learning Classification Methods To Detect The Presence Of Heart Disease, Nestor Pereira Dec 2019

Dissertations

Cardiovascular disease (CVD) is the most common cause of death in Ireland and probably worldwide. According to the Health Service Executive (HSE), cardiovascular disease accounts for 36% of all deaths and, importantly, for 22% of premature deaths (under age 65).

Using data from the Heart Disease UCI Data Set (UCI Machine Learning), we use machine learning techniques to detect the presence or absence of heart disease in the patient according to the 14 features provided in this dataset. The different results are compared based on accuracy performance, confusion matrix and area under the Receiver Operating Characteristics (ROC) …


Factor Analysis Of Mixed Data (FAMD) And Multiple Linear Regression In R, Nestor Pereira Dec 2019

Dissertations

Previous projects statistically analysed the factors that affect scores in the subjects of Mathematics and Portuguese for several groups of secondary-school students in Portugal.

This project aims to find a model, hypothetically a multiple linear regression, to predict the student's final score (the dependent variable G3) from features divided into two groups. The first group comprises the predictors of the final score that relate more closely to student performance, that is, variables such as study time or past failures. The second …


An Investigation Of Three Subjective Rating Scales Of Mental Workload In Third Level Education, Nha Vu Thanh Nguyen Jan 2019

Dissertations

Mental Workload assessment in educational settings is still recognized as an open research problem. Although its application is useful for instructional design, it is still unclear how it can be formally shaped and which factors compose it. This paper is aimed at investigating a set of features believed to shape the construct of mental workload and aggregating them together in models trained with supervised machine learning techniques. In detail, multiple linear regression and decision trees have been chosen for training models with features extracted respectively from the NASA Task Load Index and the Workload Profile, well-known self-reporting instruments for assessing …


Predicting Customer Retention Of An App-Based Business Using Supervised Machine Learning, Jeswin Jose Jan 2019

Dissertations

Identifying retainable customers is essential for the functioning and growth of any business. Effective identification of retainable customers can help a business to identify the reasons for retention and plan its marketing strategies accordingly. This research is aimed at developing a machine learning model that can precisely predict the retainable customers from the total customer data of an e-learning business. Building predictive models that can efficiently classify imbalanced data is a major challenge in data mining and machine learning. Most machine learning algorithms deliver suboptimal performance when introduced to an imbalanced dataset. A variety …


Predicting Violent Crime Reports From Geospatial And Temporal Attributes Of US 911 Emergency Call Data, Vincent Corcoran Jan 2019

Dissertations

The aim of this study is to create a model to predict which 911 calls will result in crime reports of a violent nature. Such a prediction model could be used by the police to prioritise calls which are most likely to lead to violent crime reports. The model will use geospatial and temporal attributes of the call to predict whether a crime report will be generated. To create this model, a dataset of characteristics relating to the neighbourhood where the 911 call originated will be created and combined with characteristics related to the time of the 911 call. Geospatial …


Performance Comparison Of Hybrid CNN-SVM And CNN-XGBoost Models In Concrete Crack Detection, Sahana Thiyagarajan Jan 2019

Dissertations

Crack detection is an essential step in visual inspection in construction engineering, as concrete is the commonly used building material and cracks in it are an early sign of degradation. Cracks are hard to find by visual checks on massive structures, so the development of crack-detection systems has been a critical issue. The use of contextual image processing in crack detection is constrained, because image data taken under real-world conditions vary widely and because it involves complex modelling of cracks and the extraction of handcrafted features. Therefore the …


Evaluating Load Adjusted Learning Strategies For Client Service Levels Prediction From Cloud-Hosted Video Servers, Ruairí De Fréin, Obinna Izima, Mark Davis Dec 2018

Conference papers

Network managers that succeed in improving the accuracy of client video service level predictions, where the video is deployed in a cloud infrastructure, will have the ability to deliver responsive, SLA-compliant service to their customers. Meeting up-time guarantees, achieving rapid first-call resolution, and minimizing time-to-recovery after video service outages will maintain customer loyalty.

To date, regression-based models have been applied to generate these predictions for client machines using the kernel metrics of a server cluster. The effect of time-varying loads on cloud-hosted video servers, which arise due to dynamic user requests, has not been leveraged to improve prediction …


Can Machine Learning Beat Physics At Modeling Car Crashes?, Gavin Byrne Jan 2018

Dissertations

This study aimed to look at a traditional method used for measuring the severity and principal direction of force of a car crash and see if it could be improved on using machine learning models. The data used was publicly available from the NHTSA database and included descriptions of the vehicle, test and sensors as well as the accelerometer data over the period of the crashes. The models built were SVM classifiers and multinomial regression models. Although the SVM and Regression models were built successfully and gave higher levels of accuracy than the momentum models in terms of the severity, …


Investigating The Impact Of Unsupervised Feature-Extraction From Multi-Wavelength Image Data For Photometric Classification Of Stars, Galaxies And QSOs, Annika Lindh Dec 2016

Conference papers

Accurate classification of astronomical objects currently relies on spectroscopic data. Acquiring this data is time-consuming and expensive compared to photometric data. Hence, improving the accuracy of photometric classification could lead to far better coverage and faster classification pipelines. This paper investigates the benefit of using unsupervised feature-extraction from multi-wavelength image data for photometric classification of stars, galaxies and QSOs. An unsupervised Deep Belief Network is used, giving the model a higher level of interpretability thanks to its generative nature and layer-wise training. A Random Forest classifier is used to measure the contribution of the novel features compared to a set …


Activist: A New Framework For Dataset Labelling, Jack O'Neill, Sarah Jane Delany, Brian Mac Namee Sep 2016

Conference papers

Acquiring labels for large datasets can be a costly and time-consuming process. This has motivated the development of the semi-supervised learning problem domain, which makes use of unlabelled data, in conjunction with a small amount of labelled data, to infer the correct labels of a partially labelled dataset. Active Learning is one of the most successful approaches to semi-supervised learning, and has been shown to reduce the cost and time taken to produce a fully labelled dataset. In this paper we present Activist, a free, online, state-of-the-art platform which leverages active learning techniques to improve the efficiency of …
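To make the active learning idea in this abstract concrete, here is a minimal sketch of one common selection strategy, least-confidence sampling, in which the items the current model is least sure about are sent to a human labeller first. The abstract does not say which strategy Activist uses, so this is an assumption for illustration; the function name `least_confident` and the example probabilities are hypothetical.

```python
def least_confident(probabilities, batch_size=1):
    """Return indices of the unlabelled items to label next.

    `probabilities` holds one list of predicted class probabilities per
    unlabelled item. Items whose highest class probability is lowest are
    the ones the model is least confident about, so they are ranked first.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: max(probabilities[i]),
    )
    return ranked[:batch_size]


if __name__ == "__main__":
    # Made-up model outputs for three unlabelled items, two classes each.
    pool = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40]]
    print(least_confident(pool, batch_size=2))
```

In a full loop, the selected items would be labelled, added to the training set, and the model retrained before the next batch is chosen.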