Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Series

2022

Machine Learning

Articles 1 - 21 of 21

Full-Text Articles in Physical Sciences and Mathematics

Realizing Molecular Machine Learning Through Communications For Biological AI: Future Directions And Challenges, Sasitharan Balasubramaniam, Samitha Somathilaka, Sehee Sun, Adrian Ratwatte, Massimiliano Pierobon Dec 2022

School of Computing: Faculty Publications

Artificial Intelligence (AI) and Machine Learning (ML) are weaving their way into the fabric of society, where they play a crucial role in numerous facets of our lives. As we witness the increased deployment of AI and ML in various types of devices, we benefit from their use in energy-efficient algorithms for low-powered devices. In this paper, we investigate a scale and medium that is far smaller than conventional devices as we move towards molecular systems that can be utilized to perform machine learning functions, i.e., Molecular Machine Learning (MML). Fundamental to the operation of MML is the …


Investigation, Detection And Prevention Of Online Child Sexual Abuse Material: A Comprehensive Survey, Vuong Ngo, Christina Thorpe, Cach N. Dang, Susan Mckeever Dec 2022

Conference papers

Child sexual abuse inflicts lifelong devastating consequences for victims and is a growing social concern. In most countries, child sexual abuse material (CSAM) distribution is illegal. As a result, there are many research papers in the literature that propose technologies to detect and investigate CSAM. In this survey, a comprehensive search of the peer-reviewed journal and conference paper databases (including preprints) is conducted to identify high-quality literature. We use the PRISMA methodology to refine our search space to 2,761 papers published by Springer, Elsevier, IEEE and ACM. After iterative reviews of title, abstract and full text for relevance to …


On The Use Of Machine Learning For Causal Inference In Extreme Weather Events, Yuzhe Wang Dec 2022

Discovery Undergraduate Interdisciplinary Research Internship

Machine learning has become a helpful tool for analyzing data, and causal inference is a powerful method in machine learning that can be used to determine the causal relationship in data. In atmospheric and climate science, this technology can also be applied to predicting extreme weather events. One of the causal inference models is Granger causality, which is used in this project. Granger causality is a statistical test for identifying whether one time series is helpful in forecasting another time series. In Granger causality, if a variable X Granger-causes Y, it means that by using all information without …
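As a rough illustration of the Granger test described in this abstract, the sketch below compares two least-squares forecasts of Y: one from Y's own past, and one that also uses X's past. It is not the project's code; the function names (`solve`, `rss`, `granger_f`), the single lag, and the synthetic series are all assumptions made for the example.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis that x Granger-causes y."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus y's own past values.
    restricted = [[1.0] + [y[t - i] for i in range(1, lag + 1)] for t in times]
    # Full model: the same terms plus x's past values.
    full = [
        row + [x[t - i] for i in range(1, lag + 1)]
        for row, t in zip(restricted, times)
    ]
    rss_r, rss_f = rss(restricted, targets), rss(full, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_f) / lag) / (rss_f / dof)


# Synthetic data (an assumption): y follows x with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]

f_xy = granger_f(x, y)  # large: x's past improves the forecast of y
f_yx = granger_f(y, x)  # small: y's past adds little for x
```

A large F statistic means that adding X's history reduces the forecast error for Y by more than chance would suggest. In practice the statistic is compared against an F distribution, and the lag length is chosen from the data.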


Data From: Machine Learning Predictions Of Electricity Capacity, Marcus Harris, Elizabeth Kirby, Ameeta Agrawal, Rhitabrat Pokharel, Francis Puyleart, Martin Zwick Dec 2022

Systems Science Faculty Datasets

This research applies machine learning methods to build predictive models of Net Load Imbalance for the Resource Sufficiency Flexible Ramping Requirement in the Western Energy Imbalance Market. Several methods are used in this research, including Reconstructability Analysis, developed in the systems community, and more well-known methods such as Bayesian Networks, Support Vector Regression, and Neural Networks. The aims of the research are to identify predictive variables and obtain a new stand-alone model that improves prediction accuracy and reduces the INC (ability to increase generation) and DEC (ability to decrease generation) Resource Sufficiency Requirements for Western Energy Imbalance Market participants. This …


Identity Term Sampling For Measuring Gender Bias In Training Data, Nasim Sobhani, Sarah Jane Delany Dec 2022

Conference Papers

Predictions from machine learning models can reflect biases in the data on which they are trained. Gender bias has been identified in natural language processing systems such as those used for recruitment. The development of approaches to mitigate gender bias in training data typically needs to be able to isolate the effect of gender on the output to see the impact of gender. While it is possible to isolate and identify gender for some types of training data, e.g. CVs in recruitment, for most textual corpora there is no obvious gender label. This paper proposes a general approach to measure …


LearnFCA: A Fuzzy FCA And Probability Based Approach For Learning And Classification, Suraj Ketan Samal Dec 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, semantic web, software development, chemistry, biology, medicine, data analytics and ontology engineering.

This thesis reviews the state of the art of the theory of Formal Concept Analysis (FCA) and its various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide …


A New Kind Of Data Science: The Need For Ethical Analytics, Jonathan Boardman Nov 2022

Published and Grey Literature from PhD Candidates

Ethics can no longer be regarded as an add-on in data science and analytics. This paper argues for the necessity of formalizing a new, practically-oriented sub-discipline of AI ethics by outlining the needs, highlighting shortcomings in current approaches, and providing a framework for ethical analytics, which is concerned with the study of the ethical issues surrounding the development, deployment, and/or dissemination of ML/AI systems and data science research, as well as the development of tools and procedures to mitigate ethical harms. While data science and machine learning are primarily concerned with data from start to finish, ethical analytics is concerned …


Adaptive Fairness Improvement Based Causality Analysis, Mengdi Zhang, Jun Sun Nov 2022

Research Collection School Of Computing and Information Systems

Given a discriminating neural network, the problem of fairness improvement is to systematically reduce discrimination without significantly sacrificing its performance (i.e., accuracy). Multiple categories of fairness improving methods have been proposed for neural networks, including pre-processing, in-processing and post-processing. Our empirical study however shows that these methods are not always effective (e.g., they may improve fairness at the price of a huge accuracy drop) or even not helpful (e.g., they may even worsen both fairness and accuracy). In this work, we propose an approach which adaptively chooses the fairness improving method based on causality analysis. That is, we choose the …


Overview Of The CLPsych 2022 Shared Task: Capturing Moments Of Change In Longitudinal User Posts, Adam Tsakalidis, Jenny Chim, Iman Munire Bilal, Ayah Zirikly, Dana Atzil-Slonim, Federico Nanni, Philip Resnik, Manas Gaur, Kaushik Roy, Becky Inkster, Jeff Leintz, Maria Liakata Oct 2022

Publications

We provide an overview of the CLPsych 2022 Shared Task, which focusses on the automatic identification of Moments of Change in longitudinal posts by individuals on social media and its connection with information regarding mental health. This year's task introduced the notion of longitudinal modelling of the text generated by an individual online over time, along with appropriate temporally sensitive evaluation metrics. The Shared Task consisted of two subtasks: (a) the main task of capturing changes in an individual's mood (drastic changes, 'Switches', and gradual changes, 'Escalations') on the basis of textual content shared online; and subsequently (b) the sub-task …


A GPU-Based Machine Learning Approach For Detection Of Botnet Attacks, Michal Motylinski, Áine Macdermott, Farkhund Iqbal, Babar Shah Sep 2022

All Works

Rapid development and adaptation of the Internet of Things (IoT) has created new problems for securing these interconnected devices and networks. There are hundreds of thousands of IoT devices with underlying security vulnerabilities, such as insufficient device authentication/authorisation, making them vulnerable to malware infection. IoT botnets are designed to grow and compete with one another over unsecure devices and networks. Once infected, the device will monitor a Command-and-Control (C&C) server indicating the target of an attack via Distributed Denial of Service (DDoS) attack. These security issues, coupled with the continued growth of IoT, present a much larger attack surface for …


An Evolutionary Optimization Algorithm For Automated Classical Machine Learning, Leila Zahedi Jun 2022

FIU Electronic Theses and Dissertations

Machine learning is an evolving branch of computational algorithms that allow computers to learn from experiences, make predictions, and solve different problems without being explicitly programmed. However, building a useful machine learning model is a challenging process, requiring human expertise to perform various proper tasks and ensure that machine learning's primary objective (determining the best and most predictive model) is achieved. These tasks include pre-processing, feature selection, and model selection. Many machine learning models developed by experts are designed manually and by trial and error. In other words, even experts need the time and resources to create good predictive …


Machine Learning With Kay, Lasith Niroshan, James Carswell Jun 2022

Conference Papers

Computational power is very important when training Deep Learning (DL) models with large amounts of data (Wooldridge, 2021). Hence, High-Performance Computing (HPC) can be leveraged to reduce computational cost, and the Irish Centre for High-End Computing (ICHEC) provides significant infrastructure and services for research and development to both academia and industry. A portion of ICHEC's HPC system has been allocated for institutional access, and this paper presents a case study of how to use Kay (Ireland's national supercomputer) in the remote sensing domain. Specifically, this study uses clusters of Kay Graphics Processing Units (GPUs) for training DL models to extract …


Meteorological Characteristics Of Fog Events In Korean Smart Cities And Machine Learning Based Visibility Estimation, Jaemin Kim, Seung Hee Kim, Hyun Woo Seo, Yi Victor Wang, Yun Gon Lee May 2022

Institute for ECHO Articles and Research

To address various urban issues such as fine dust, traffic congestion, and water shortage caused by rapid urbanization, a national pilot Smart City is planned in two Korean cities, Sejong and Busan. As weather data is crucial for improving the environment and operating future transportation while constructing a smart city, preparing for future weather disasters by analyzing the characteristics of various meteorological phenomena in the planned development area is necessary. This study analyzed the fog generation characteristics for the period of 2016–2020 at the automatic weather system sites of the Korea Meteorological Administration in Sejong and Busan, and the characteristics …


A Machine Learning And Deep Learning Framework For Binary, Ternary, And Multiclass Emotion Classification Of Covid-19 Vaccine-Related Tweets, Aditya Dubey May 2022

Honors Scholar Theses

My research mines public emotion toward the Covid-19 vaccine based on Twitter data collected over the past 6-12 months. This project is centered around building and developing machine learning and deep learning models to perform natural language processing of short-form text, which in our case is tweets. These tweets are all vaccine-related, and the goal of the classification task is for our models to accurately classify a tweet into one of four emotion groups: Apprehension/Anticipation, Sadness/Anger/Frustration, Joy/Humor/Sarcasm, and Gratitude/Relief. Given this data and the goal of the paper, we aim to answer the following questions: (1) Can a framework be …


Crowd-Machine Partnership On Road Infrastructure Quality Recognition And Resilience, Eric J. Thompson May 2022

Discovery Undergraduate Interdisciplinary Research Internship

Public roads are a vital component of modern-day society, as they are necessary for the transportation of people and capital; consequently, it is important that they are regularly and effectively maintained. Unfortunately, this maintenance is difficult to manage due to the sheer area that roads span. It is an arduous task to locate every instance of road damage, as well as to determine the urgency that each bit of damage necessitates. Repairing road damage has high costs in labor, time, and money. To provide a more efficient way to monitor road conditions, we are designing a mobile application that collects …


The Executive’s Guide To Getting AI Wrong, Jerrold Soh May 2022

Asian Management Insights

It’s all math. Really.


Detecting The Emotions Of Animate Beings In Narrative, Samira Zad Mar 2022

FIU Electronic Theses and Dissertations

Identifying emotions as expressed in text (a.k.a. text emotion recognition) has received a lot of attention over the past decade. Narratives often involve a great deal of emotional expression, and so emotion recognition on narrative text is of great interest to computational approaches to narrative understanding. The meaning and impact of narratives are strongly bound up with the emotions expressed therein. Emotions may be experienced by characters in a story (which may include the narrator), by a story-external narrator, or by the reader. There have so far been two separate streams of work relevant to this observation: (1) emotion detection, …


Prediction Of Soil Water Content And Electrical Conductivity Using Random Forest Methods With UAV Multispectral And Ground-Coupled Geophysical Data, Yunyi Guan, Katherine R. Grote, Joel Schott, Kelsi Leverett Feb 2022

Geosciences and Geological and Petroleum Engineering Faculty Research & Creative Works

The volumetric water content (VWC) of soil is a critical parameter in agriculture, as VWC strongly influences crop yield, provides nutrients to plants, and maintains the microbes that are needed for the biological health of the soil. Measuring VWC is difficult, as it is spatially and temporally heterogeneous, and most agricultural producers use point measurements that cannot fully capture this parameter. Electrical conductivity (EC) is another soil parameter that is useful in agriculture, since it can be used to indicate soil salinity, soil texture, and plant nutrient availability. Soil EC is also very heterogeneous; measuring EC using conventional soil sampling …


Netsec: Real-Time And Scalable Malware Traffic Detection Within IoT Networks, Ethan Weitkamp, Yusuke Satani, Peilong Li, Jingwen Wang Jan 2022

Summer Scholarship, Creative Arts and Research Projects (SCARP)

Detecting malicious network traffic in real time has become a crucial requirement in smart communities for elderly care and medical facilities with the prevalence of Internet-of-things (IoT) devices. Existing machine learning-based solutions for network traffic malware detection often fail to scale with the exponential increase of IoT devices at the facility and to detect malicious traffic with desirably low latency. In this paper we seek to fill the gap by designing a scalable end-to-end network traffic analyzing system that permits real-time malware detection. By leveraging distributed systems such as Apache Kafka and Apache Spark, the system has demonstrated scalable …


A Low-Cost Machine Learning Based Network Intrusion Detection System With Data Privacy Preservation, Jyoti Fakirah, Lauhim Mahfuz Zishan, Roshni Mooruth, Michael L. Johnstone, Wencheng Yang Jan 2022

Research outputs 2022 to 2026

Network intrusion is a well-studied area of cyber security. Current machine learning-based network intrusion detection systems (NIDSs) monitor network data and the patterns within those data but at the cost of presenting significant issues in terms of privacy violations which may threaten end-user privacy. Therefore, to mitigate risk and preserve a balance between security and privacy, it is imperative to protect user privacy with respect to intrusion data. Moreover, cost is a driver of a machine learning-based NIDS because such systems are increasingly being deployed on resource-limited edge devices. To solve these issues, in this paper we propose a NIDS …


Integrated Gradients Is A Nonlinear Generalization Of The Industry Standard Approach To Variable Attribution For Credit Risk Models, Jonathan Boardman, Md Shafiul Alam, Xiao Huang, Ying Xie Jan 2022

Published and Grey Literature from PhD Candidates

In modern society, epistemic uncertainty limits trust in financial relationships, necessitating transparency and accountability mechanisms for both consumers and lenders. One upshot is that credit risk assessments must be explainable to the consumer. In the United States regulatory milieu, this entails both the identification of key factors in a decision and the provision of consistent actions that would improve standing. The traditionally accepted approach to explainable credit risk modeling involves generating scores with Generalized Linear Models (GLMs), usually logistic regression, calculating the contribution of each predictor to the total points lost from the theoretical maximum, and generating reason codes …
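The points-lost calculation mentioned in this abstract can be sketched for a linear score as follows. This is an illustrative reading, not the authors' implementation: the predictor names, coefficients, "best attainable" values, and the rule of reporting the two largest losses as reason codes are all assumptions.

```python
def linear_score(coefs, values):
    """Linear predictor of a GLM: sum of coefficient times predictor value."""
    return sum(coefs[name] * values[name] for name in coefs)


def points_lost(coefs, best, applicant):
    """Per-predictor shortfall relative to the theoretical maximum score."""
    return {name: coefs[name] * (best[name] - applicant[name]) for name in coefs}


def reason_codes(lost, top_k=2):
    """Predictors ranked by points lost, largest first (hypothetical rule)."""
    return sorted(lost, key=lost.get, reverse=True)[:top_k]


# Hypothetical model and applicant, chosen only for illustration.
coefs = {"payment_history": 2.0, "utilization": -1.5, "account_age": 0.5}
best = {"payment_history": 1.0, "utilization": 0.0, "account_age": 1.0}
applicant = {"payment_history": 0.9, "utilization": 0.6, "account_age": 0.2}

lost = points_lost(coefs, best, applicant)
codes = reason_codes(lost)
```

Because the score is linear, the per-predictor losses add up exactly to the gap between the theoretical maximum and the applicant's score, which is what makes each loss usable as an attribution.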