Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

A Method For Generating A Non-Manual Feature Model For Sign Language Processing, Robert G. Smith Dr, Markus Hofmann Dr Aug 2023

Articles

While recent approaches to sign language processing have shifted to the domain of Machine Learning (ML), the treatment of Non-Manual Features (NMFs) remains an open question. The principal challenge facing this method is the comparatively small sign language corpora available for training machine learning models. This study produces a statistical model which may be used in future ML, rules-based, and hybrid-learning approaches for sign language processing tasks. In doing so, this research explores the emerging patterns of non-manual articulation concerning grammatical classes in Irish Sign Language (ISL). The experimental method applied here is a novel implementation of an association rules …


Exploiting Association Rules Mining To Inform The Use Of Non-Manual Features In Sign Language Processing, Robert G. Smith Jun 2023

Other Resources

In recent years, the use of virtual assistants and voice user interfaces has become a latent part of modern living. Unseen to the user are the various artificial intelligence and natural language processing technologies, the vast datasets, and the linguistic insights that underpin such tools. The technologies supporting them have chiefly targeted widely used spoken languages, leaving sign language users at a disadvantage. One important reason why sign languages are unsupported by such tools is a requirement of the underpinning technologies for a comprehensive description of the language. Sign language processing technologies endeavour to bridge this technology inequality.

Recent approaches …


Determining The Proportionality Of Ischemic Stroke Risk Factors To Age, Elizabeth Hunter, John D. Kelleher Jan 2023

Articles

While age is an important risk factor, there are some disadvantages to including it in a stroke risk model: age can dominate the risk score and lead to over- or under-predictions in some age groups. There is evidence to suggest that some of these disadvantages are due to the non-proportionality of other risk factors with age, e.g., risk factors contribute differently to stroke risk based on an individual’s age. In this paper, we present a framework to test if risk factors are proportional with age. We then apply the framework to a set of risk factors using Framingham Heart Study data …


The Interaction Of Normalisation And Clustering In Sub-Domain Definition For Multi-Source Transfer Learning Based Time Series Anomaly Detection, Matthew Nicholson, Rahul Agrahari, Clare Conran, Haythem Assem, John D. Kelleher Dec 2022

Articles

This paper examines how data normalisation and clustering interact in the definition of sub-domains within multi-source transfer learning systems for time series anomaly detection. The paper introduces a distinction between (i) clustering as a primary/direct method for anomaly detection, and (ii) clustering as a method for identifying sub-domains within the source or target datasets. Reporting the results of three sets of experiments, we find that normalisation after feature extraction and before clustering results in the best performance for anomaly detection. Interestingly, we find that in the multi-source transfer learning scenario clustering on the target dataset and identifying subdomains in the …


Identity Term Sampling For Measuring Gender Bias In Training Data, Nasim Sobhani, Sarah Jane Delany Dec 2022

Conference Papers

Predictions from machine learning models can reflect biases in the data on which they are trained. Gender bias has been identified in natural language processing systems such as those used for recruitment. The development of approaches to mitigate gender bias in training data typically needs to be able to isolate the effect of gender on the output to see the impact of gender. While it is possible to isolate and identify gender for some types of training data, e.g. CVs in recruitment, for most textual corpora there is no obvious gender label. This paper proposes a general approach to measure …


An Investigation Of The Reconstruction Capacity Of Stacked Convolutional Autoencoders For Log-Mel-Spectrograms, Anastasia Natsiou, Luca Longo, Seán O'Leary Oct 2022

Conference Papers

In audio processing applications, the generation of expressive sounds based on high-level representations is in high demand. These representations can be used to manipulate the timbre and influence the synthesis of creative instrumental notes. Modern algorithms, such as neural networks, have inspired the development of expressive synthesizers based on musical instrument timbre compression. Unsupervised deep learning methods can achieve audio compression by training the network to learn a mapping from waveforms or spectrograms to low-dimensional representations. This study investigates the use of stacked convolutional autoencoders for the compression of time-frequency audio representations for a variety of instruments for a single …


“Be A Pattern For The World”: The Development Of A Dark Patterns Detection Tool To Prevent Online User Loss, Jordan Donnelly, Alan Downley, Yunpeng Liu, Yufei Su, Quanwei Sun, Lan Zeng, Andrea Curley, Damian Gordon, Paul Kelly, Dympna O'Sullivan, Anna Becevel Sep 2022

Articles

Dark Patterns are designed to trick users into sharing more information or spending more money than they had intended, by configuring online interactions to confuse or pressure users. They are highly varied in form and are therefore difficult to classify and detect. This research develops a framework for the automated detection of potential instances of web-based dark patterns, and from there a defensive software tool that helps detect and highlight these patterns.


Self-Supervised Learning For Invariant Representations From Multi-Spectral And SAR Images, Pallavi Jain, Bianca Schoen Phelan, Robert J. Ross Sep 2022

Articles

Self-Supervised learning (SSL) has become the new state of the art in several domain classification and segmentation tasks. One popular category of SSL is distillation networks such as Bootstrap Your Own Latent (BYOL). This work proposes RS-BYOL, which builds on BYOL in the remote sensing (RS) domain where data are non-trivially different from natural RGB images. Since multi-spectral (MS) and synthetic aperture radar (SAR) sensors provide varied spectral and spatial resolution information, we utilise them as an implicit augmentation to learn invariant feature embeddings. In order to learn RS based invariant features with SSL, we trained RS-BYOL in two ways, …


Assessing Feature Representations For Instance-Based Cross-Domain Anomaly Detection In Cloud Services Univariate Time Series Data, Rahul Agrahari, Matthew Nicholson, Clare Conran, Haythem Assem, John D. Kelleher Jan 2022

Articles

In this paper, we compare and assess the efficacy of a number of time-series instance feature representations for anomaly detection. To assess whether there are statistically significant differences between different feature representations for anomaly detection in a time series, we calculate and compare confidence intervals on the average performance of different feature sets across a number of different model types and cross-domain time-series datasets. Our results indicate that the catch22 time-series feature set augmented with features based on rolling mean and variance performs best on average, and that the difference in performance between this feature set and the next best …
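The rolling mean and variance features mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the catch22 feature set or the paper's pipeline; the function name `rolling_features`, the window size, and the use of population variance are assumptions made for illustration only.

```python
from statistics import mean, pvariance


def rolling_features(series, window=3):
    """Return a (mean, variance) pair for each full sliding window.

    `window` and the choice of population variance are illustrative
    assumptions, not settings taken from the paper.
    """
    features = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        features.append((mean(chunk), pvariance(chunk)))
    return features


# Toy series: the jump to 10 raises both the rolling mean and variance.
toy_series = [1, 2, 3, 10]
toy_features = rolling_features(toy_series, window=3)
```

A sudden level shift or spike shows up as a jump in both features, which is one reason such windowed statistics are often paired with a fixed feature set for anomaly detection.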


Towards Exchanging Wearable-PGHD With EHRs: Developing A Standardized Information Model For Wearable-Based Patient Generated Health Data, Abdullahi Abubakar Kawu, Dympna O'Sullivan, Lucy Hederman Jan 2022

Articles

Wearables have become commonplace for tracking and making sense of patient lifestyle, wellbeing and health data. Most of this tracking is done by individuals outside of clinical settings; however, some data from wearables may be useful in a clinical context. As such, wearables may be considered a prominent source of Patient Generated Health Data (PGHD). Studies have attempted to maximize the use of the data from wearables including integrating with Electronic Health Records (EHRs). However, usually a limited number of wearables are considered for integration and, in many cases, only one brand is investigated. In addition, we find limited studies …


Provenance: An Intermediary-Free Solution For Digital Content Verification, Bilal Yousuf, M. Atif Qureshi, Brendan Spillane, Gary Munnelly, Oisin Carroll, Matthew Runswick, Kirsty Park, Eileen Culloty, Owen Conlan, Jane Suiter Nov 2021

Articles

The threat posed by misinformation and disinformation is one of the defining challenges of the 21st century. Provenance is designed to help combat this threat by warning users when the content they are looking at may be misinformation or disinformation. It is also designed to improve media literacy among its users and ultimately reduce susceptibility to the threat among vulnerable groups within society. The Provenance browser plugin checks the content that users see on the Internet and social media and provides warnings in their browser or social media feed. Unlike similar plugins, which require human experts to provide evaluations and …


Feature Engineering Vs Feature Selection Vs Hyperparameter Optimization In The Spotify Song Popularity Dataset, Alan Cueva Mora, Brendan Tierney Oct 2021

Conference Papers

Research in Feature Engineering has been part of the data pre-processing phase of machine learning projects for many years. It can be challenging for people new to machine learning to understand its importance along with the various approaches to finding an optimized model. This work uses the Spotify Song Popularity dataset to compare and evaluate Feature Engineering, Feature Selection and Hyperparameter Optimization. The result of this work will demonstrate that Feature Engineering has a greater effect on model efficiency when compared to the alternative approaches.


Multi-Modal Self-Supervised Representation Learning For Earth Observation, Pallavi Jain, Bianca Schoen Phelan, Robert J. Ross Jul 2021

Conference papers

Self-Supervised learning (SSL) has reduced the performance gap between supervised and unsupervised learning, due to its ability to learn invariant representations. This is a boon to domains like Earth Observation (EO), where labelled data is scarce but unlabelled data is freely available. While Transfer Learning from generic RGB pre-trained models is still commonplace in EO, we argue that it is essential to have a good EO domain-specific pre-trained model to use with downstream tasks that have limited labelled data. Hence, we explored the applicability of SSL with multi-modal satellite imagery for downstream tasks. For this we utilised …


Interrupting The Propaganda Supply Chain, Kyle Hamilton, Bojan Bozic, Luca Longo Apr 2021

Conference papers

In this early-stage research, a multidisciplinary approach is presented for the detection of propaganda in the media, and for modeling the spread of propaganda and disinformation using semantic web and graph theory. An ontology will be designed with theoretical underpinnings from multiple disciplines, including the social sciences and epidemiology. An additional objective of this work is to automate triple extraction from unstructured text in a way that surpasses state-of-the-art performance.


An Analysis Of The Interpretability Of Neural Networks Trained On Magnetic Resonance Imaging For Stroke Outcome Prediction, Esra Zihni, John D. Kelleher, Bryony Mcgarry Apr 2021

Conference papers

Applying deep learning models to MRI scans of acute stroke patients to extract features that are indicative of short-term outcome could assist a clinician’s treatment decisions. Deep learning models are usually accurate but are not easily interpretable. Here, we trained a convolutional neural network on ADC maps from hyperacute ischaemic stroke patients for prediction of short-term functional outcome and used an interpretability technique to highlight regions in the ADC maps that were most important in the prediction of a bad outcome. Although highly accurate, the model’s predictions were not based on aspects of the ADC maps related to stroke pathophysiology.


Virtual Network Function Embedding Under Nodal Outage Using Deep Q-Learning, Swarna Bindu Chetty, Hamed Ahmadi, Sachin Sharma, Avishek Nag Mar 2021

Articles

With the emergence of various types of applications such as delay-sensitive applications, future communication networks are expected to be increasingly complex and dynamic. Network Function Virtualization (NFV) provides the necessary support towards efficient management of such complex networks, by virtualizing network functions and placing them on shared commodity servers. However, one of the critical issues in NFV is the resource allocation for the highly complex services; moreover, this problem is classified as an NP-Hard problem. To solve this problem, our work investigates the potential of Deep Reinforcement Learning (DRL) as a swift yet accurate approach (as compared to integer linear …


K-Nearest Neighbour Classifiers - A Tutorial, Padraig Cunningham, Sarah Jane Delany Jan 2021

Conference papers

Perhaps the most straightforward classifier in the arsenal of Machine Learning techniques is the Nearest Neighbour Classifier – classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because issues of poor run-time performance are not such a problem these days with the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification focusing on: mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours and mechanisms for reducing the dimension of …
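The nearest-neighbour idea described in the abstract can be sketched in a few lines. This is a minimal illustration rather than code from the tutorial: the function name `knn_classify`, the Euclidean distance, the majority vote, and the toy data are all assumptions chosen for brevity.

```python
from collections import Counter
import math


def knn_classify(query, examples, k=3):
    """Label `query` by majority vote among its k nearest examples.

    `examples` is a list of (feature_vector, label) pairs. Euclidean
    distance is an illustrative choice; other similarity measures
    could be substituted.
    """
    # Keep the k labelled examples closest to the query.
    neighbours = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Invented toy data: two well-separated groups of 2-D points.
training = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.5, 4.5), "b"),
]
```

The brute-force sort over all examples makes the run-time cost visible: every query is compared against every stored example, which is the computational issue the abstract refers to.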


Data: The Good, The Bad And The Ethical, John D. Kelleher, Filipe Cabral Pinto, Luis M. Cortesao Dec 2020

Articles

It is often the case with new technologies that it is very hard to predict their long-term impacts and as a result, although new technology may be beneficial in the short term, it can still cause problems in the longer term. This is what happened with oil by-products in different areas: the use of plastic as a disposable material did not take into account the hundreds of years necessary for its decomposition and its related long-term environmental damage. Data is said to be the new oil. The message to be conveyed is associated with its intrinsic value. But as in …


Detecting Hacker Threats: Performance Of Word And Sentence Embedding Models In Identifying Hacker Communications, Susan Mckeever, Brian Keegan, Andrei Quieroz Dec 2020

Conference papers

Cyber security is striving to find new forms of protection against hacker attacks. An emerging approach nowadays is the investigation of security-related messages exchanged on deep/dark web and even surface web channels. This approach can be supported by the use of supervised machine learning models and text mining techniques. In our work, we compare a variety of machine learning algorithms, text representations and dimension reduction approaches for the detection accuracies of software-vulnerability-related communications. Given the imbalanced nature of the three public datasets used, we investigate appropriate sampling approaches to boost detection accuracies of our models. In addition, we examine how …


Comparing Variable Importance In Prediction Of Silence Behaviours Between Random Forest And Conditional Inference Forest Models., Stephen Barrett Dr, Geraldine Gray Dr, Colm Mcguinness Dr, Michael Knoll Dr. Oct 2020

Articles

This paper explores variable importance metrics of Conditional Inference Trees (CIT) and classical Classification And Regression Trees (CART) based Random Forests. The paper compares both algorithms' variable importance rankings and highlights why CIT should be used when dealing with data with different levels of aggregation. The models analysed explored the role of cultural factors at individual and societal level when predicting Organisational Silence behaviours.


An Application Of Machine Learning To Explore Relationships Between Factors Of Organisational Silence And Culture, With Specific Focus On Predicting Silence Behaviours, Stephen Barrett Dr May 2020

Articles

Research indicates that there are many individual reasons why people do not speak up when confronted with situations that may concern them within their working environment. One of the areas that requires more focused research is the role culture plays in why a person may remain silent when such situations arise. The purpose of this study is to use data science techniques to explore the patterns in a data set that would lead a person to engage in organisational silence. The main research question the thesis asks is: Is Machine Learning a tool that Social Scientists can use with respect …


Finding Common Ground For Citizen Empowerment In The Smart City, John D. Kelleher, Aphra Kerr Jan 2020

Articles

Corporate smart city initiatives are just one example of the contemporary culture of surveillance. They rely on extensive information gathering systems and Big Data analysis to predict citizen behaviour and optimise city services. In this paper we argue that many smart city and social media technologies result in a paradox whereby digital inclusion for the purposes of service provision also results in marginalisation and disempowerment of citizens. Drawing upon insights garnered from a digital inclusion workshop conducted in the Galapagos Islands, we propose that critically and creatively unpacking the computational techniques embedded in data services is needed as a first …


Multimodal Fusion Strategies For Outcome Prediction In Stroke, Esra Zihni, John D. Kelleher, Vince I. Madai, Ahmed Khalil, Ivana Galinovic, Jochen Fiebach, Michelle Livne, Dietmar Frey Jan 2020

Conference papers

Data driven methods are increasingly being adopted in the medical domain for clinical predictive modeling. Prediction of stroke outcome using machine learning could provide a decision support system for physicians to assist them in patient-oriented diagnosis and treatment. While patient-specific clinical parameters play an important role in outcome prediction, a multimodal fusion approach that integrates neuroimaging with clinical data has the potential to improve accuracy. This paper addresses two research questions: (a) does multimodal fusion aid in the prediction of stroke outcome, and (b) what fusion strategy is more suitable for the task at hand. The baselines for our experimental …


Modelling Interleaved Activities Using Language Models, Eoin Rogers, Robert J. Ross, John D. Kelleher Jan 2020

Conference papers

We propose a new approach to activity discovery, based on the neural language modelling of streaming sensor events. Our approach proceeds in multiple stages: we build binary links between activities using probability distributions generated by a neural language model trained on the dataset, and combine the binary links to produce complex activities. We then use the activities as sensor events, allowing us to build complex hierarchies of activities. We put an emphasis on dealing with interleaving, which represents a major challenge for many existing activity discovery systems. The system is tested on a realistic dataset, demonstrating it as a promising …


Mutual Information Decay Curves And Hyper-Parameter Grid Search Design For Recurrent Neural Architectures, Abhijit Mahalunkar, John Kelleher Jan 2020

Conference papers

We present an approach to designing the grid searches for hyper-parameter optimization of recurrent neural architectures. The basis for this approach is the use of mutual information to analyze long distance dependencies (LDDs) within a dataset. We also report a set of experiments that demonstrate how, using this approach, we obtain state-of-the-art results for DilatedRNNs across a range of benchmark datasets.
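As a rough illustration of how mutual information can expose dependencies at a given distance in a sequence, the sketch below computes a simple plug-in estimate between symbols that are `d` positions apart. The estimator, the function name `mutual_information_at_lag`, and the toy sequence are assumptions for illustration; the abstract does not specify the estimator the authors use.

```python
from collections import Counter
import math


def mutual_information_at_lag(seq, d):
    """Plug-in estimate (in bits) of the mutual information between
    symbols d positions apart in `seq`.

    This is one common empirical estimator, assumed here for
    illustration only.
    """
    pairs = list(zip(seq, seq[d:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((left[x] / n) * (right[y] / n)))
    return mi


# Decay curve for a toy sequence: one estimate per lag.
toy_sequence = "abababab"
decay_curve = [mutual_information_at_lag(toy_sequence, d) for d in range(1, 4)]
```

Plotting such estimates against `d` gives a decay curve; how that curve is turned into a grid of hyper-parameter values is specific to the paper and is not reproduced here.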


Eavesdropping Hackers: Detecting Software Vulnerability Communication On Social Media Using Text Mining, Susan Mckeever, Brian Keegan, Andrei Quieroz Sep 2019

Conference papers

Cyber security is striving to find new forms of protection against hacker attacks. An emerging approach nowadays is the investigation of security-related messages exchanged on Deep/Dark Web and even Surface Web channels. This approach can be supported by the use of supervised machine learning models and text mining techniques. In our work, we compare a variety of machine learning algorithms, text representations and dimension reduction approaches for the detection accuracies of software-vulnerability-related communications. Given the imbalanced nature of the three public datasets used, we investigate appropriate sampling approaches to boost detection accuracies of our models. In addition, we examine how …


Explorobot: Rapid Exploration With Chart Automation, Tamara Matthews, Rohan Goel, John Mcauley Jan 2019

Conference papers

General-purpose visualization tools are used by people with varying degrees of data literacy. Often the user is not a professional analyst or data scientist and uses the tool infrequently, to support an aspect of their job. This can present difficulties as the user’s unfamiliarity with visualization practice and infrequent use of the tool can result in long processing time, inaccurate data representations or inappropriate visual encodings. To address this problem, we developed a visual analytics application called exploroBOT. The exploroBOT automatically generates visualizations and the exploration guidance path (an associated network of decision points, mapping nodes where visualizations change). These …


On The Inability Of Markov Models To Capture Criticality In Human Mobility, Vaibhav Klukarni, Abhijit Mahalunkar, Benoit Garbinato, John Kelleher Jan 2019

Conference papers

We examine the non-Markovian nature of human mobility by exposing the inability of Markov models to capture criticality in human mobility. In particular, the assumed Markovian nature of mobility was used to establish an upper bound on the predictability of human mobility, based on the temporal entropy. Since its inception, this bound has been widely used for validating the performance of mobility prediction models. We show that the variants of recurrent neural network architectures can achieve significantly higher prediction accuracy surpassing this upper bound. The central objective of our work is to show that human-mobility dynamics exhibit criticality characteristics which …


On The Exactitude Of Big Data: La Bêtise And Artificial Intelligence, Noel Fitzpatrick, John D. Kelleher Dec 2018

Articles

This article revisits the question of ‘la bêtise’ or stupidity in the era of Artificial Intelligence driven by Big Data, extending the questions posed by Gilles Deleuze and more recently by Bernard Stiegler. However, the framework for revisiting the question of la bêtise will be through the lens of contemporary computer science, in particular the development of data science as a mode of analysis, sometimes misinterpreted as a mode of intelligence. In particular, this article will argue that with the advent of forms of hype (sometimes referred to as the hype cycle) in relation to big data and …


Non-Manual Articulators In Irish Sign Language Verbs: An Analysis With Data Mining Association Rules, Robert G. Smith, Markus Hofmann Nov 2018

Conference Papers

The Signs of Ireland (SOI) corpus (Leeson et al., 2006) deploys a complex multi-tiered temporal data structure. The process of manually analyzing such data is laborious and cannot eliminate bias, and important patterns can often go completely unnoticed. In addition, as a result of the complex nature of the grammatical structures contained in the corpus, identifying complex linguistic associations or patterns across tiers is simply too intricate a task for a human to carry out in an acceptable timeframe. This work explores the application of data mining techniques on a set of multi-tiered temporal data from the SOI corpus. Building …
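The association rules named in the paper's title are usually scored by support and confidence, which the sketch below computes for a single candidate rule. It is a generic illustration, not the authors' method: the function name `rule_metrics` and the toy annotation labels are invented and do not come from the SOI corpus.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    Each transaction is a set of co-occurring items, e.g. the labels
    that overlap in time across annotation tiers (a hypothetical
    encoding chosen for this sketch).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(
        1 for t in transactions if (antecedent | consequent) <= set(t)
    )
    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Invented toy annotations, not taken from the SOI corpus.
toy_transactions = [
    {"verb", "brow_raise"},
    {"verb", "brow_raise", "head_nod"},
    {"verb"},
    {"noun", "head_nod"},
]
support, confidence = rule_metrics(toy_transactions, {"verb"}, {"brow_raise"})
```

Rules whose support and confidence exceed chosen thresholds are the candidate patterns that a mining algorithm would surface for closer linguistic inspection.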