Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Computer Sciences

Theses/Dissertations

Machine learning

Articles 31 - 60 of 665

Full-Text Articles in Entire DC Network

Autonomous Shipwreck Detection & Mapping, William Ard Aug 2023

LSU Master's Theses

This thesis presents the development and testing of Bruce, a low-cost hybrid Remotely Operated Vehicle (ROV) / Autonomous Underwater Vehicle (AUV) system for the optical survey of marine archaeological sites, as well as a novel sonar image augmentation strategy for semantic segmentation of shipwrecks. This approach takes side-scan sonar and bathymetry data collected using an EdgeTech 2205 AUV sensor integrated with a Harris Iver3, and generates augmented image data to be used for the semantic segmentation of shipwrecks. It is shown that, due to the feature enhancement capabilities of the proposed shipwreck detection strategy, correctly identified areas have a 15% …


Increasing The Efficiency And Accuracy Of Collective Intelligence Methods For Image Classification, Md Mahmudulla Hassan Aug 2023

Open Access Theses & Dissertations

Collective intelligence has emerged as a powerful methodology for annotating and classifying challenging data that pose difficulties for automated classifiers. It works by leveraging the concept of "wisdom of the crowds" which approximates a ground truth after aggregating experts' feedback and filtering out noise. However, challenges arise when certain applications, such as medical image classification, security threat detection, and financial fraud detection, demand accurate and reliable data annotation. The unreliability of experts due to inconsistent expertise and competencies, coupled with the associated cost and time-consuming judgment extraction, presents additional challenges.

Input aggregation is the process of consolidating and combining multiple …


Evaluating ChatGPT For Recommendation: How Does The Ability To Converse Impact Recommendation?, Kyle Spurlock Aug 2023

Electronic Theses and Dissertations

Recommendation algorithms have become an absolute necessity in the modern world to avoid information overload. However, the interaction between the human and the system is largely superficial and without any real contact. If you are given poor recommendations, you have no choice but to sift through mountains of content on your own until the model learns to accommodate your tastes more. This is bad for business as well as the consumer. Recently, large language models like ChatGPT have seen a significant rise in popularity due to their ease of use and wide range of knowledge. It has now become nearly …


Cyber Attack Surface Mapping For Offensive Security Testing, Douglas Everson Aug 2023

All Dissertations

Security testing consists of automated processes, like Dynamic Application Security Testing (DAST) and Static Application Security Testing (SAST), as well as manual offensive security testing, like Penetration Testing and Red Teaming. This nonautomated testing is frequently time-constrained and difficult to scale. Previous literature suggests that most research is spent in support of improving fully automated processes or in finding specific vulnerabilities, with little time spent improving the interpretation of the scanned attack surface critical to nonautomated testing. In this work, agglomerative hierarchical clustering is used to compress the Internet-facing hosts of 13 representative companies as collected by the Shodan search …


On Phishing: Proposing A Host-Based Multi-Layer Passive/Active Anti-Phishing Approach Combating Counterfeit Websites, Wesam Harbi Fadheel Aug 2023

Dissertations

Phishing is the starting point of most cyberattacks, mainly categorized as Email, Websites, Social Networks, Phone calls (Vishing), and SMS messaging (Smishing). Phishing refers to an attempt to collect sensitive data, typically in the form of usernames, passwords, credit card numbers, bank account information, etc., or other crucial facts, intending to use or sell the information obtained. Similar to how a fisherman uses bait to catch a fish, an attacker will pose as a trustworthy source to attract and deceive the victim.

This study explores the efficacy of host-side APT (Anti-Phishing Techniques) based on website features like Lexical, Host-Based, or Content-Based …


System-Characterized Artificial Intelligence Approaches For Cardiac Cellular Systems And Molecular Signature Analysis, Ziqian Wu Jun 2023

Dartmouth College Ph.D Dissertations

The dissertation presents a significant advancement in the field of cardiac cellular systems and molecular signature systems by employing machine learning and generative artificial intelligence techniques. These methodologies are systematically characterized and applied to address critical challenges in these domains. A novel computational model is developed, which combines machine learning tools and multi-physics models. The main objective of this model is to accurately predict complex cellular dynamics, taking into account the intricate interactions within the cardiac cellular system. Furthermore, a comprehensive framework based on generative adversarial networks (GANs) is proposed. This framework is designed to generate synthetic data that faithfully …


Sarcasm Detection In English And Arabic Tweets Using Transformer Models, Rishik Lad Jun 2023

Computer Science Senior Theses

This thesis describes our approach toward the detection of sarcasm and its various types in English and Arabic Tweets through methods in deep learning. There are five problems we attempted: (1) detection of sarcasm in English Tweets, (2) detection of sarcasm in Arabic Tweets, (3) determining the type of sarcastic speech subcategory for English Tweets, (4) determining which of two semantically equivalent English Tweets is sarcastic, and (5) determining which of two semantically equivalent Arabic Tweets is sarcastic. All tasks were framed as classification problems, and our contributions are threefold: (a) we developed an English binary classifier system with RoBERTa, …


An Investigation Into Machine Learning Techniques For Designing Dynamic Difficulty Agents In Real-Time Games, Ryan Adare Dunagan Jun 2023

Electronic Theses and Dissertations

Video games are an incredibly popular pastime enjoyed by people of all ages worldwide. Many different kinds of games exist, but most feature some element of the player overcoming a challenge, usually through gameplay. These challenges are insurmountable for some people and may turn them off video games as a pastime. Games can be made more accessible to players with little skill or experience through Dynamic Difficulty Adjustment (DDA) systems that adjust the difficulty of the game in response to the player’s performance. This research seeks to establish the effectiveness of machine learning techniques …


Data-Optimized Spatial Field Predictions For Robotic Adaptive Sampling: A Gaussian Process Approach, Zachary Nathan May 2023

Computer Science Senior Theses

We introduce a framework that combines Gaussian Process models, robotic sensor measurements, and sampling data to predict spatial fields. In this context, a spatial field refers to the distribution of a variable throughout a specific area, such as temperature or pH variations over the surface of a lake. Whereas existing methods tend to analyze only the particular field(s) of interest, our approach optimizes predictions through the effective use of all available data. We validated our framework on several datasets, showing that errors can decline by up to two-thirds through the inclusion of additional colocated measurements. In support of adaptive sampling, …


Deep Learning For Skin Photoaging, Gokul Srinivasan May 2023

Computer Science Senior Theses

Skin photoaging is the premature aging of skin that results from ultraviolet light exposure. It is a major risk factor for the development of skin cancer, among other malignant skin pathologies. Accordingly, understanding its etiology is important for both preventative and reparative clinical action. In this study, skin samples obtained from patients with a range of solar elastosis grades (a proxy for skin photoaging) were sequenced using next-generation sequencing techniques to further understand the genomic, epigenomic, and histological signs and signals of skin photoaging. The results of this study suggest that tissues with severe photoaging exhibit increases in the frequency …


Connecting Linguistic Expressions And Pain Relief Through Transformer Model Construction And Analysis, Sarah M. Chacko May 2023

Computer Science Senior Theses

Chronic pain is a widespread problem that significantly impacts quality of life. Overprescription and abuse of pain medication continue to be a major public health issue and can further burden patients due to a fragmented health care system. Previous research has suggested a possible psychological basis to pain and the potential for safer, non-pharmacological alternatives for pain relief. This project leverages language models to study chronic pain development and relief through psychological treatments, which will be assessed through responses to post-treatment interviews. A transformer-based natural language processing model is employed to identify connections between language expressions and pain on a …


Investigating English-Language Dialect-Adjusted Models, Samiha Datta May 2023

Computer Science Senior Theses

This thesis describes several approaches to better understand how large language models interpret different dialects of the English language. Our goal is to consider multiple contexts of textual data and to analyze how English-language dialects are realized in them, as well as how a variety of machine learning techniques handle these differences. We focus on two genres of text data: news and social media. In the news context, we establish a dataset covering news articles from five countries and four US states and consider language modeling analysis, topic and sentiment distributions, and manual analysis before performing nine experiments and evaluating …


Using Deep Neural Networks To Classify Astronomical Images, Andrew D. Macpherson May 2023

Honors Projects

As the quantity of astronomical data available continues to exceed the resources available for analysis, recent advances in artificial intelligence encourage the development of automated classification tools. This paper lays out a framework for constructing a deep neural network capable of classifying individual astronomical images by describing techniques to extract and label these objects from large images.


Identifying Key Activity Indicators In Rats' Neuronal Data Using Lasso Regularized Logistic Regression, Avery Woods May 2023

Honors Theses

This thesis aims to identify timestamps of rats’ neuronal activity that best determine behavior using a machine learning model. Neuronal data is a complex and high-dimensional dataset, and identifying the most informative features is crucial for understanding the underlying neuronal processes. The Lasso regularization technique is employed to select the most relevant features of the data to the model’s prediction. The results of this study provide insights into the key activity indicators that are associated with specific behaviors or cognitive processes in rats, as well as the effect that stress can have on neuronal activity and behavior. Ultimately, it was …
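The feature-selection effect of Lasso regularization mentioned in this abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the solver (proximal gradient descent with soft thresholding), the toy data, and the names `soft_threshold` and `fit_lasso_logistic` are all assumptions chosen for illustration, and the model omits an intercept for brevity.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes exactly 0.0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(rows, labels, lam=0.3, lr=0.5, steps=500):
    """L1-penalized logistic regression (no intercept) fitted by proximal gradient descent."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(rows, labels):
            z = sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))      # predicted probability of class 1
            for j in range(d):
                grad[j] += (p - y) * x[j] / n   # gradient of the mean log-loss
        # Gradient step on the log-loss, then the L1 proximal (shrinkage) step.
        w = [soft_threshold(w[j] - lr * grad[j], lr * lam) for j in range(d)]
    return w

# Made-up toy data: feature 0 always agrees with the label,
# feature 1 agrees with it only three times out of four.
rows = [(1, 1), (1, 1), (1, 1), (1, -1), (-1, -1), (-1, -1), (-1, -1), (-1, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights = fit_lasso_logistic(rows, labels)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

With this penalty strength the weaker feature's weight is held at exactly zero, so `selected` contains only feature 0. That exact zeroing is what makes the L1 penalty usable as a feature selector, in contrast to an L2 penalty, which only shrinks weights.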


Tornado Outbreak False Alarm Probabilistic Forecasts With Machine Learning, Kirsten Reed Snodgrass May 2023

Theses and Dissertations

Tornadic outbreaks occur annually, causing fatalities and millions of dollars in damage. Improved forecasts can better equip the public to act prior to an event. False alarms (FAs) can hinder the public’s ability (or willingness) to act. As such, a probabilistic FA forecasting scheme would help improve public response to outbreaks.

Here, a machine learning approach is employed to predict FA likelihood from Storm Prediction Center (SPC) tornado outbreak forecasts. A database of hit and FA outbreak forecasts spanning 2010 – 2020 was developed using historical SPC convective outlooks and the SPC Storm Reports database. Weather …


Wearable Sensor Gait Analysis For Fall Detection Using Deep Learning Methods, Haben Girmay Yhdego May 2023

Electrical & Computer Engineering Theses & Dissertations

World Health Organization (WHO) data show that around 684,000 people die from falls yearly, making falls the second leading cause of unintentional injury deaths after road traffic accidents [1]. Early detection of falls, followed by pneumatic protection, is one of the most effective means of ensuring the safety of the elderly. In light of the recent widespread adoption of wearable sensors, it has become increasingly critical to develop fall detection models that can effectively process large, sequential sensor signal data. Several researchers have recently developed fall detection algorithms based on wearable sensor data. However, real-time fall detection remains challenging because of the wide …


Context-Aware Gaze-Based Interface For Smart Wheelchair, Tien Pham May 2023

Computer Science and Engineering Theses

Human-Computer Interfaces (HCI) are an essential aspect of modern technology and have revolutionized the way we interact with machines. With the revolution of computers and smart devices and the advent of autonomous vehicles and other machines, there has been significant advancement in this area, allowing users to interact with technology intuitively and efficiently. However, the importance of HCI goes beyond the convenience of everyday technology. It has become crucial in the development of assistive technologies that empower people with disabilities to live more independently. Persons with disabilities, who lack control of one or more parts of …


Toward Digital Phenotyping: Human Activity Representation For Embodied Cognition Assessment, Mohammad Zakizadehghariehali May 2023

Computer Science and Engineering Dissertations

Cognition is the mental process of acquiring knowledge and understanding through thought, experience and senses. Based on Embodied Cognition theory, physical activities are an important manifestation of cognitive functions. As a result, they can be employed to both assess and train cognitive skills. In order to assess various cognitive measures, the ATEC system has been proposed. It consists of physical exercises with different variations and difficulty levels, designed to provide assessment of executive and motor functions. This thesis focuses on obtaining human activity representation from recorded videos of ATEC tasks in order to automatically assess embodied cognition performance. Representation learning …


Self-Supervised Representation Learning For Motion Time Series: A Case Study In Activity Recognition, Luis Carlos Garza Perez May 2023

Theses and Dissertations

In this thesis we will learn what contrastive learning and time series are and understand the differences between supervised and self-supervised frameworks in machine learning. In addition, we will describe how SimCLR works; it is the newest and most efficient self-supervised learning framework for visual representations to date and was originally developed to obtain useful vector representations from static images. We will also explain what TS2Vec is, and how a combination of both approaches can be applied to the concept of a time series, and still be able to extract a vector representation of the subject described by the …


Information-Theoretic Model Diagnostics (Infomod), Armin Esmaeilzadeh May 2023

UNLV Theses, Dissertations, Professional Papers, and Capstones

Model validation is a critical step in the development, deployment, and governance of machine learning models. During the validation process, the predictive power of a model is measured on unseen datasets with a variety of metrics such as Accuracy and F1-Scores for classification tasks. Although the most used metrics are easy to implement and understand, they are aggregate measures over all the segments of heterogeneous datasets, and therefore, they do not identify the performance variation of a model among different data segments. The lack of insight into how the model performs over segments of unseen datasets has raised significant challenges …
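The point about aggregate metrics hiding per-segment variation can be shown with a short sketch. It is not the InfoMod method itself, only a plain per-segment accuracy breakdown; the segment names, the records, and the helper functions `accuracy` and `accuracy_by_segment` are hypothetical.

```python
from collections import defaultdict

def accuracy(pairs):
    """Fraction of (true label, predicted label) pairs that match."""
    pairs = list(pairs)
    return sum(1 for y_true, y_pred in pairs if y_true == y_pred) / len(pairs)

def accuracy_by_segment(records):
    """Group (segment, true label, predicted label) records and score each segment separately."""
    groups = defaultdict(list)
    for segment, y_true, y_pred in records:
        groups[segment].append((y_true, y_pred))
    return {segment: accuracy(pairs) for segment, pairs in groups.items()}

# Made-up validation records: a large segment the model handles well
# and a small segment where it is right only half of the time.
records = (
    [("segment_a", 1, 1)] * 8
    + [("segment_b", 0, 0)]
    + [("segment_b", 1, 0)]
)

overall = accuracy((y_true, y_pred) for _, y_true, y_pred in records)
per_segment = accuracy_by_segment(records)
```

Here the overall accuracy is 0.9, while the breakdown gives 1.0 for `segment_a` and 0.5 for `segment_b`, so the aggregate figure alone would not reveal the weak segment.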


Data-Driven Predictive Maintenance: HVAC Health Prognostics Using Power Consumption And Weather Data, Ruiqi Tian Apr 2023

Electronic Thesis and Dissertation Repository

Data-driven predictive maintenance for heating, ventilation, and air conditioning (HVAC) systems has gained much popularity over recent years due to the increasing availability of integrated internet of things (IoT) sensors capable of reporting HVAC internal operational data. Most existing predictive maintenance methods are designed to analyse these internal operational data for maintenance decision making. However, these methods are not applicable to HVAC systems that are not equipped with internal IoT sensors. Consequently, we propose an AutoEncoder and Artificial Neural Network based HVAC Health Prognostics framework (AE-ANN-HP) that classifies the health condition of HVAC systems using only daily power consumption and …


Beyond News Values On Twitter: Predicting Factors That Drive User Engagement In News, Zhiyan Zhong Apr 2023

Dartmouth College Master’s Theses

When deciding on what news stories to cover, traditional journalism determines news values by following several elements of newsworthiness, such as impact, timeliness, and prominence. However, these guidelines do not always seem to correspond with the success of content on social media. As people are increasingly turning to social media for news, our research aims to understand and predict factors that drive user engagement for news on social media. In this study, we analyze news content published on Twitter, and examine a diverse set of characteristics like metrics retrieved from the Twitter API and semantics by natural language processing, including …


Defining Safe Training Datasets For Machine Learning Models Using Ontologies, Lynn C. Vonder Haar Apr 2023

Doctoral Dissertations and Master's Theses

Machine Learning (ML) models have been gaining popularity in recent years in a wide variety of domains, including safety-critical domains. While ML models have shown high accuracy in their predictions, they are still considered black boxes, meaning that developers and users do not know how the models make their decisions. While this is simply a nuisance in some domains, in safety-critical domains, this makes ML models difficult to trust. To fully utilize ML models in safety-critical domains, there needs to be a method to improve trust in their safety and accuracy without human experts checking each decision. This research proposes …


Socially Aware Natural Language Processing With Commonsense Reasoning And Fairness In Intelligent Systems, Sirwe Saeedi Apr 2023

Dissertations

Although Artificial Intelligence (AI) promises to deliver ever more user-friendly consumer applications, recent mishaps involving fake information and biased treatment serve as vivid reminders of the pitfalls of AI. AI can harbor latent biases and flaws that can cause harm in diverse and unexpected ways. It is crucial to understand the reasons for, mechanisms behind, and circumstances under which AI can fail. For instance, a lack of commonsense reasoning can lead to biased or unfair decisions made by Machine Learning (ML) systems. For example, if an ML system is trained on data that is biased or unrepresentative of the real …


Solving FJSSP With A Genetic Algorithm, Michael John Srouji Mar 2023

Computer Science and Software Engineering

The Flexible Job Shop Scheduling Problem (FJSSP) is an NP-hard combinatorial problem. This paper aims to find a solution to this problem using genetic algorithms and to discuss the effectiveness of that approach. Initially, I did exploratory work on whether neural networks would be effective, and found many trade-offs between using neural networks and chromosome sequencing. In the end, I decided to use chromosome sequencing over neural networks, because my problem is small in scale rather than large.

Therefore, the genetic algorithm was implemented using chromosome sequencing. My chromosomes were …


Design, Determination, And Evaluation Of Gender-Based Bias Mitigation Techniques For Music Recommender Systems, Sunny Shrestha Mar 2023

Electronic Theses and Dissertations

The majority of smartphone users engage with a recommender system on a daily basis. Many rely on these recommendations to make their next purchase, download the next game, listen to new music, or find their next healthcare provider. Although there is plenty of evidence-backed research that demonstrates the presence of gender bias in Machine Learning (ML) models like recommender systems, the issue is viewed as a frivolous cause that doesn’t merit much action. However, gender bias stands to affect more than half of the population, as by default ML systems are designed to cater to a cisgender man. This …


Characterizing Location-Based Electromagnetic Leakage Of Computing Devices Using Convolutional Neural Networks To Increase The Effectiveness Of Side-Channel Analysis Attacks, Ian C. Heffron Mar 2023

Theses and Dissertations

Side-channel analysis (SCA) attacks aim to recover secret information, often in the form of a cipher key, from a target device. Some of these attacks focus on power-based leakage, others on electromagnetic (EM) leakage. Neural networks have recently gained popularity as tools in SCA attacks. Near-field EM probes with high spatial resolution enable attackers to isolate physical locations above a processor. This enables attackers to exploit the spatial dependencies of algorithms running on said processor. These spatial dependencies result in different physical locations above a chip emanating different signal strengths. The strengths of different locations can be mapped using the …


Machine Learning Methods For Computational Phenotyping Using Patient Healthcare Data With Noisy Labels, Praveen Kumar Feb 2023

Computer Science ETDs

Positive and Unlabeled (PU) learning problems abound in many real-world applications. In healthcare informatics, diagnosed patients are considered labeled positive for a specific disease, but being undiagnosed does not mean they can be labeled negative. PU learning can improve classification performance and estimate the positive fraction, α, among unlabeled samples. However, algorithms based on the Selected Completely At Random (SCAR) assumption are inadequate when the SCAR assumption fails (e.g., severe cases overrepresented), and when class imbalance is substantial. This dissertation presents and evaluates new algorithms to overcome these limitations. The proposed methods outperform the state of the art for α-estimation, enhance classification performance, …


Visual Analytics And Modeling Of Materials Property Data, Diwas Bhattarai Jan 2023

LSU Doctoral Dissertations

Due to significant advancements in experimental and computational techniques, materials data are abundant. To facilitate data-driven research, a system is needed for managing and sharing data and for supporting a set of tools for effective data analysis and modeling. Generally, a given material property M can be treated as a multivariate data problem. The dimensions of M are the values of the property itself, the conditions (pressure P, temperature T, and multi-component composition X) that control the concerned property, and relevant metadata I (source, date).

Here we present a comprehensive database considering both experimental and computational sources …


Machine Learning Models Interpretability For Malware Detection Using Model Agnostic Language For Exploration And Explanation, Ikuromor Mabel Ogiriki Jan 2023

Theses and Dissertations

The adoption of the internet as a global platform has led to a significant rise in cyber-attacks of various forms, including Trojans, worms, spyware, ransomware, botnet malware, and rootkits. To tackle all these forms of malware, there is a need to understand and detect them. There are various methods of detecting malware, including signature-based, behavioral, and machine learning methods. Machine learning methods have proven to be the most efficient of all for malware detection. In this thesis, a system that utilizes both the signature and dynamic behavior-based detection techniques, with the added layer of the …