Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Data Science

Theses/Dissertations

2023

Articles 1 - 30 of 161

Full-Text Articles in Physical Sciences and Mathematics

Roadside LiDAR Data Processing For Intelligent Transportation System, Md Parvez Mollah Dec 2023

Computer Science ETDs

Roadside LiDAR (Light Detection and Ranging) sensors have recently been explored for Intelligent Transportation Systems, with the aim of safer and faster traffic management and vehicular operations. However, massive data volume, occlusion, and limited viewing angles are significant obstacles to the widespread use of roadside LiDARs. In this dissertation, we address three major challenges to enable applications of Intelligent Transportation Systems through roadside LiDAR data: (i) real-time transmission of the massive point-cloud data from the roadside LiDAR devices to the cloud using a 5G network, (ii) mitigating the sensor occlusion problem to increase coverage and detect events that occur in occluded regions of a sensor, …


Interpretable Word-Level Sentiment Analysis With Attention-Based Multiple Instance Classification Models, Chenyu Yang Dec 2023

Statistical Science Theses and Dissertations

In this study, our main objective is to tackle the black-box nature of popular machine learning models in sentiment analysis and enhance model interpretability. We aim to gain more insight into the decision-making process of sentiment analysis models, which is often obscure in those complex models. To achieve this goal, we introduce two word-level sentiment analysis models.

The first model is called the attention-based multiple instance classification (AMIC) model. It combines the transparent model structure of multiple instance classification and the self-attention mechanism in deep learning to incorporate the contextual information from documents. As demonstrated by a wine review dataset …


Utilizing Multitask Transfer Learning For Sonographic Rheumatoid Arthritis Synovitis Grading, Jordan Marie Claire Sanders Dec 2023

Doctoral Dissertations and Master's Theses

Classifying the four sonographic Rheumatoid Arthritis (RA) synovitis grades (Grade 0, Grade 1, Grade 2, and Grade 3) is a difficult problem due to the complexity of the relevant markers. Therefore, the current research proposes a Multitask Transfer Learning (MTL) framework for sonographic RA synovitis grading of Ultrasound (US) images in Brightness mode (B-Mode) and Power Doppler mode.

In the medical community, the lack of reliability of scoring these images has been an issue and reason for concern for doctors and other medical practitioners. The human/machine variability across the acquisition procedure of these US images creates an additional challenge that …


Learning Mortality Risk For COVID-19 Using Machine Learning And Statistical Methods, Shaoshi Zhang Dec 2023

Electronic Thesis and Dissertation Repository

This research investigates the mortality risk of COVID-19 patients across different variant waves, using the data from Centers for Disease Control and Prevention (CDC) websites. By analyzing the available data, including patient medical records, vaccination rates, and hospital capacities, we aim to discern patterns and factors associated with COVID-19-related deaths.

To explore features linked to COVID-19 mortality, we employ different techniques such as Filter, Wrapper, and Embedded methods for feature selection. Furthermore, we apply various machine learning methods, including support vector machines, decision trees, random forests, logistic regression, K-nearest neighbours, naïve Bayes methods, and artificial neural networks, to uncover underlying …


CM-II Meditation As An Intervention To Reduce Stress And Improve Attention: A Study Of ML Detection, Spectral Analysis, And HRV Metrics, Sreekanth Gopi Dec 2023

Master of Science in Computer Science Theses

Students frequently face heightened stress due to academic and social pressures, particularly in demanding fields like computer science and engineering. These challenges are often associated with serious mental health issues, including ADHD (Attention Deficit Hyperactivity Disorder), depression, and an increased risk of suicide. The average student attention span has notably decreased from 2½ minutes to just 47 seconds, and now it typically takes about 25 minutes to switch attention to a new task (Mark, 2023). Research findings suggest that over 95% of individuals who die by suicide have been diagnosed with depression (Shahtahmasebi, 2013), and almost 20% of students …


An Investigation Into Applications Of Canonical Polyadic Decomposition & Ensemble Learning In Forecasting Thermal Data Streams In Direct Laser Deposition Processes, Jonathan Storey Dec 2023

Theses and Dissertations

Additive manufacturing (AM) is a process of creating objects from 3D model data by adding layers of material. AM technologies present several advantages compared to traditional manufacturing technologies, such as producing less material waste and being capable of producing parts with greater geometric complexity. However, deficiencies in the printing process due to high process uncertainty can affect the microstructural properties of a fabricated part leading to defects. In metal AM, previous studies have linked defects in parts with melt pool temperature fluctuations, with the size of the melt pool and the scan pattern being key factors associated with part defects. …


Study Of Augmentations On Historical Manuscripts Using TrOCR, Erez Meoded Dec 2023

Theses and Dissertations

Historical manuscripts are an essential source of original content. For many reasons, it is hard to recognize these manuscripts as text. This thesis used a state-of-the-art Handwritten Text Recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode them back to text. We showed that carefully preprocessed images and designed augmentations can improve the performance of TrOCR. We suggest an ensemble of augmented models to achieve an even better performance.


High-Performance Computing In Covariant Loop Quantum Gravity, Pietropaolo Frisoni Dec 2023

Electronic Thesis and Dissertation Repository

This Ph.D. thesis presents a compilation of the scientific papers I published over the last three years during my Ph.D. in loop quantum gravity (LQG). First, we comprehensively introduce spinfoam calculations with a practical pedagogical paper. We highlight LQG's unique features and mathematical formalism and emphasize the computational complexities associated with its calculations. The subsequent articles delve into specific aspects of employing high-performance computing (HPC) in LQG research. We discuss the results obtained by applying numerical methods to studying spinfoams' infrared divergences, or “bubbles”. This research direction is crucial to define the continuum limit of LQG properly. We investigate the …


Development Of An App For The Kalamazoo Nature Center, Ernest Au Dec 2023

Honors Theses

Kalamazoo Nature Center (KNC), which has been recognized by its peers as one of the top nature centers in the country, is home to over 14 miles of hiking trails winding through woods, wetlands, and prairies. There are numerous places/plots in KNC that have an interesting and impressive history besides being home to a variety of animals and hundreds of wildflowers and other plant life. To improve the visitor’s experience at KNC, we will design a software app via the senior capstone project at the department of Computer Science at WMU. As the first step towards establishing a reference model …


Generalized Differentiable Neural Architecture Search With Performance And Stability Improvements, Emily J. Herron Dec 2023

Doctoral Dissertations

This work introduces improvements to the stability and generalizability of Cyclic DARTS (CDARTS). CDARTS is a Differentiable Architecture Search (DARTS)-based approach to neural architecture search (NAS) that uses a cyclic feedback mechanism to train search and evaluation networks concurrently, thereby optimizing the search process by enforcing that the networks produce similar outputs. However, the dissimilarity between the loss functions used by the evaluation networks during the search and retraining phases makes the search-phase evaluation network a sub-optimal proxy for the final evaluation network utilized during retraining. ICDARTS, a revised algorithm that reformulates the search phase loss functions to ensure …


Exploration And Statistical Modeling Of Profit, Caleb Gibson Dec 2023

Undergraduate Honors Theses

For any company involved in sales, maximization of profit is the driving force that guides all decision-making. Many factors can influence how profitable a company can be, including external factors like changes in inflation or consumer demand or internal factors like pricing and product cost. By understanding specific trends in its own internal data, a company can readily identify problem areas or potential growth opportunities to help increase profitability.

In this discussion, we use an extensive data set to examine how a company might analyze their own data to identify potential changes the company might investigate to drive better performance. Based …


Exact Models, Heuristics, And Supervised Learning Approaches For Vehicle Routing Problems, Zefeng Lyu Dec 2023

Doctoral Dissertations

This dissertation presents contributions to the field of vehicle routing problems by utilizing exact methods, heuristic approaches, and the integration of machine learning with traditional algorithms. The research is organized into three main chapters, each dedicated to a specific routing problem and a unique methodology. The first chapter addresses the Pickup and Delivery Problem with Transshipments and Time Windows, a variant that permits product transfers between vehicles to enhance logistics flexibility and reduce costs. To solve this problem, we propose an efficient mixed-integer linear programming model that has been shown to outperform existing ones. The second chapter discusses a practical …


Random Variable Spaces: Mathematical Properties And An Extension To Programming Computable Functions, Mohammed Kurd-Misto Dec 2023

Computational and Data Sciences (PhD) Dissertations

This dissertation aims to extend the boundaries of Programming Computable Functions (PCF) by introducing a novel collection of categories referred to as Random Variable Spaces. Originating as a generalization of Quasi-Borel Spaces, Random Variable Spaces are rigorously defined as categories where objects are sets paired with a collection of random variables from an underlying measurable space. These spaces offer a theoretical foundation for extending PCF to natively handle stochastic elements.

The dissertation is structured into seven chapters that provide a multi-disciplinary background, from PCF and Measure Theory to Category Theory with special attention to Monads and the Giry Monad. The …


A Bridge Between Graph Neural Networks And Transformers: Positional Encodings As Node Embeddings, Bright Kwaku Manu Dec 2023

Electronic Theses and Dissertations

Graph Neural Networks (GNNs) and Transformers are very powerful frameworks for machine learning tasks. While they evolved separately in diverse fields, current research has revealed some similarities and links between them. This work focuses on bridging the gap between GNNs and Transformers by offering a uniform framework that highlights their similarities and distinctions. We perform positional encodings and identify key properties that make the positional encodings node embeddings. We found that the properties of expressiveness, efficiency and interpretability were achieved in the process. We saw that it is possible to use positional encodings as node embeddings, which can be …


Convolution And Autoencoders Applied To Nonlinear Differential Equations, Noah Borquaye Dec 2023

Electronic Theses and Dissertations

Autoencoders, a type of artificial neural network, have gained recognition among researchers in various fields, especially machine learning, due to their vast applications in data representations from inputs. Recently, researchers have explored the possibility of extending the application of autoencoders to solve nonlinear differential equations. Algorithms and methods employed in an autoencoder framework include sparse identification of nonlinear dynamics (SINDy), dynamic mode decomposition (DMD), Koopman operator theory and singular value decomposition (SVD). These approaches use matrix multiplication to represent linear transformations. However, machine learning algorithms often use convolution to represent linear transformations. In our work, we modify these approaches to …


Parameter Estimation For Patient Enrollment In Clinical Trials, Junyan Liu Dec 2023

Undergraduate Honors Theses

In this paper, we study the Poisson-gamma model for recruitment time in clinical trials. We proved several properties of this model that match our intuitions from a reliability perspective, did simulations on this model, and used different optimization methods to estimate the parameters. Although the behaviors of the optimization methods were unfavorable and unstable, we identified certain conditions and provided potential explanations for this phenomenon and further insights into the Poisson-gamma model.
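
As a rough illustration of the kind of simulation the abstract mentions, the following is a minimal sketch of one common formulation of a Poisson-gamma recruitment model; it is not taken from the thesis. It assumes each centre's recruitment rate is drawn from a gamma distribution with shape `alpha` and rate `beta`, and that, given those rates, centres recruit as independent Poisson processes. The function names, parameter values, and the choice to simulate the time of the n-th enrolment are all illustrative assumptions.

```python
import random


def simulate_recruitment_time(n_patients, n_centers, alpha, beta, rng):
    """Draw one time-to-full-enrolment under an assumed Poisson-gamma model."""
    # Assumption: centre rates are i.i.d. Gamma(shape=alpha, rate=beta).
    # random.gammavariate takes (shape, scale), so scale = 1 / beta.
    rates = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(n_centers)]
    total_rate = sum(rates)
    # Given the rates, the pooled arrivals form a Poisson process with rate
    # total_rate, so the n-th arrival time is Gamma(n_patients, scale=1/total_rate).
    return rng.gammavariate(n_patients, 1.0 / total_rate)


def mean_recruitment_time(n_patients, n_centers, alpha, beta, n_sims=2000, seed=0):
    """Monte Carlo estimate of the expected recruitment time."""
    rng = random.Random(seed)
    draws = [
        simulate_recruitment_time(n_patients, n_centers, alpha, beta, rng)
        for _ in range(n_sims)
    ]
    return sum(draws) / n_sims


if __name__ == "__main__":
    # Hypothetical values: 100 patients, 20 centres, alpha = 2, beta = 1.
    estimate = mean_recruitment_time(100, 20, alpha=2.0, beta=1.0)
    # Under these assumptions the exact mean is n * beta / (N * alpha - 1).
    exact = 100 * 1.0 / (20 * 2.0 - 1)
    print(f"simulated mean: {estimate:.3f}, closed form: {exact:.3f}")
```

Comparing the simulated mean against the closed-form value is a quick sanity check on the sampler before attempting any parameter estimation on top of it.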


Implementation Of Hierarchical And K-Means Clustering Techniques On The Trend And Seasonality Components Of Temperature Profile Data, Emmanuel Ogedegbe Dec 2023

Electronic Theses and Dissertations

In this study, time series decomposition techniques are applied to climate data in conjunction with K-means clustering and hierarchical clustering, two well-known clustering algorithms. Their implementation and comparisons are then examined. The main objective is to identify similar climate trends and group geographical areas with similar environmental conditions. Climate data from specific places are collected and analyzed as part of the project. The time series is then split into trend, seasonality, and residual components. In order to categorize growing regions according to their climatic inclinations, the decomposed time series are then submitted to K-means clustering and hierarchical clustering with dynamic …


Wavelet Compression As An Observational Operator In Data Assimilation Systems For Sea Surface Temperature, Bradley J. Sciacca Dec 2023

University of New Orleans Theses and Dissertations

The ocean remains severely under-observed, in part due to its sheer size. Containing nearly billion of water with most of the subsurface being invisible because water is extremely difficult to penetrate using electromagnetic radiation, as is typically used by satellite measuring instruments. For this reason, most observations of the ocean have very low spatial-temporal coverage to get a broad capture of the ocean’s features. However, recent “dense but patchy” data have increased the availability of high-resolution – low spatial coverage observations. These novel data sets have motivated research into multi-scale data assimilation methods. Here, we demonstrate a new assimilation approach …


Making Data Meaningful: Stakeholder Perceptions On Data Visualization And Data Management Practices Within A Multi-Tiered System Of Supports (MTSS), Domenick Saia Dec 2023

Dissertations

Data-driven decision-making and collaboration are core pillars of a multi-tiered system of supports (MTSS); however, timely and accessible data use, as well as data literacy and visualization literacy skills, are challenges school leaders and educators face related to implementing such frameworks. I hypothesized efficient data management systems and data visualization tools enable school teams to predict student learning outcomes, readily communicate, and better understand student data. The purpose of this study design was to highlight a need for more efficient data structures that allow school stakeholders to balance their roles within an MTSS framework more effectively. The context of this …


Review Classification Using Natural Language Processing And Deep Learning, Brian Nazareth Dec 2023

Electronic Theses, Projects, and Dissertations

Sentiment Analysis is an ongoing area of research in the field of Natural Language Processing (NLP). In this project, I will evaluate my testing against an Amazon Reviews Dataset, which contains more than 100 thousand reviews from customers. This project classifies the reviews using three methods: a sentiment score computed by matching every positive and negative word in the review text against the Opinion Lexicon dataset, the text’s varying sentiment polarity scores from a Python library called TextBlob, and neural network training. I have created a neural …


General Population Projection Model With Census Population Data, Takenori Tsuruga Dec 2023

Electronic Theses, Projects, and Dissertations

The US Census Bureau offers a wide range of data, and within this array, the American Community Survey 5-Year Estimate (ACS5) serves as a valuable resource for understanding the US population. This project embarks on an exploration of Machine Learning and the Software Development process with the goal of generating effective population projections from ACS5 data. The project aims to provide methods to make predictions for every city and town in the US, encompassing their total population and population divided into 5-year age groups. It's worth noting that while the generation of these projections is grounded in the generalized statistical …


Predictive Model For CFPB Consumer Complaints, Vyshnavi Nalluri Dec 2023

Electronic Theses, Projects, and Dissertations

Within the dynamic and highly competitive financial industry, the timely and efficient resolution of customer complaints stands as a central challenge, particularly in the intricate domain of mortgage services. The traditional processes for handling these complaints have long been recognized as laborious and resource-intensive, a situation that financial institutions, including the esteemed Wells Fargo, are keen to improve.

Currently, the industry largely relies on basic data analytics for identifying trends in customer complaints. However, this approach has its limitations, especially when dealing with complaints within the mortgage services domain. In response to this challenge, this research advocates the adoption of …


Generative Adversarial Game With Tailored Quantum Feature Maps For Enhanced Classification, Anais Sandra Nguemto Guiawa Dec 2023

Doctoral Dissertations

In the burgeoning field of quantum machine learning, the fusion of quantum computing and machine learning methodologies has sparked immense interest, particularly with the emergence of noisy intermediate-scale quantum (NISQ) devices. These devices hold the promise of achieving quantum advantage, but they grapple with limitations like constrained qubit counts, limited connectivity, operational noise, and a restricted set of operations. These challenges necessitate a strategic and deliberate approach to crafting effective quantum machine learning algorithms.

This dissertation revolves around an exploration of these challenges, presenting innovative strategies that tailor quantum algorithms and processes to seamlessly integrate with commercial quantum platforms. A …


Migration In Edge Computing, Arshin Rezazadeh Nov 2023

Electronic Thesis and Dissertation Repository

Mobile IoT applications often require low response time and high bandwidth. These applications include virtual reality, augmented reality, and online gaming. Currently, most data processing is done in the cloud. However, for latency-sensitive applications, the latency may need to be reduced. Edge and fog computing can be used to place application services close to mobile devices to reduce latency. However, as mobile devices move, latency increases, which can be decreased by moving the service to a closer edge/fog server. This can be addressed by migrating services so that the mobile device can receive services from the new server. These services …


AI Assisted Workflows For Computational Electromagnetics And Antenna Design, Oameed Noakoasteen Nov 2023

Electrical and Computer Engineering ETDs

These days large volumes of data can be recorded and manipulated with relative ease. If valuable information can be extracted from them, these vast amounts of data can be a rich resource not just for the digital economy but also for scientific discovery and development of technology. When it comes to deriving valuable information from data, Machine Learning (ML) emerges as the key solution. To unlock the potential benefits of ML to science and technology, extensive research is needed to explore what algorithms are suitable and how they can be applied.

To shine light on various ways that ML can …


Domain Specific Feature Representation Learning For Diverse Temporal Data, Farhan Asif Chowdhury Nov 2023

Computer Science ETDs

Humans can leverage domain context to recognize novel patterns and categories based on limited known examples. In contrast, computational learning methods are not adept at exploiting context and require sufficient labeled examples to achieve similar accuracy. Many temporal data domains, for example seismic signals and oil mining sensor data, require domain expert annotation, which is both costly and time-consuming. The dependency on training data limits the applicability of machine learning algorithms for domains with limited labeled data. This dissertation aims to address this gap by developing temporal mining algorithms that exploit domain context to learn discriminative feature representation from limited …


UAVs And Deep Neural Networks: An Alternative Approach To Monitoring Waterfowl At The Site Level, Zachary J. Loken Nov 2023

LSU Master's Theses

Understanding how waterfowl respond to habitat restoration and management activities is crucial for evaluating and refining conservation delivery programs. However, site-specific waterfowl monitoring is challenging, especially in heavily forested systems such as the Mississippi Alluvial Valley (MAV)—a primary wintering region for ducks in North America. I hypothesized that using uncrewed aerial vehicles (UAVs) coupled with deep learning-based methods for object detection would provide an efficient and effective means for surveying non-breeding waterfowl on difficult-to-access restored wetland sites. Accordingly, during the winters of 2021 and 2022, I surveyed wetland restoration easements in the MAV using a UAV equipped with a dual …


Local Model Agnostic XAI Methodologies Applied To Breast Cancer Malignancy Predictions, Heather Hartley Oct 2023

Electronic Thesis and Dissertation Repository

This thesis examines current state-of-the-art Explainable Artificial Intelligence (XAI) methodologies applicable to breast cancer diagnostics, as well as local model-agnostic XAI methodologies more broadly. It is well known that AI is underutilized in healthcare due to the fact that black box AI methods are largely uninterpretable. The potential for AI to positively affect health care outcomes is massive, and AI adoption by medical practitioners and the community at large will translate to more desirable patient outcomes. The development of XAI is crucial to furthering the integration of AI within healthcare, as it will allow medical practitioners and regulatory bodies to …


Spoken Language Processing And Modeling For Aviation Communications, Aaron Van De Brook Oct 2023

Doctoral Dissertations and Master's Theses

With recent advances in machine learning and deep learning technologies and the creation of larger aviation-specific corpora, applying natural language processing technologies, especially those based on transformer neural networks, to aviation communications is becoming increasingly feasible. Previous work has focused on machine learning applications to natural language processing, such as N-grams and word lattices. This thesis experiments with a process for pretraining transformer-based language models on aviation English corpora and compares the effectiveness and performance of language models transfer learned from pretrained checkpoints and those trained from their base weight initializations (trained from scratch). The results suggest that transformer language …


Machine Learning And Causality For Interpretable And Automated Decision Making, Maria Lentini Sep 2023

Theses and Dissertations

This abstract explores two key areas in decision science: automated and interpretable decision making. In the first part, we address challenges related to sparse user interaction data and high item turnover rates in recommender systems. We introduce a novel algorithm called Multi-View Interactive Collaborative Filtering (MV-ICTR) that integrates user-item ratings and contextual information, improving performance, particularly for cold-start scenarios. In the second part, we focus on Student Prescription Trees (SPTs), which are interpretable decision trees. These trees use a black box "teacher" model to predict counterfactuals based on observed covariates. We experiment with a Bayesian hierarchical binomial regression model as …