Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

2022

Machine learning

Articles 31 - 60 of 118

Full-Text Articles in Physical Sciences and Mathematics

Tempering The Adversary: An Exploration Into The Applications Of Game Theoretic Feature Selection And Regression, Stephen Mcgee Aug 2022

All Dissertations

Most modern machine learning algorithms tend to focus on an "average-case" approach, where every data point contributes the same amount of influence towards calculating the fit of a model. This "per-data point" error (or loss) is averaged together into an overall loss and typically minimized with an objective function. However, this can be insensitive to valuable outliers. Inspired by game theory, the goal of this work is to explore the utility of incorporating an optimally-playing adversary into feature selection and regression frameworks. The adversary assigns weights to the data elements so as to degrade the modeler's performance in an optimal …


Towards Making Transformer-Based Language Models Learn How Children Learn, Yousra Mahdy Aug 2022

Boise State University Theses and Dissertations

Transformer-based Language Models (LMs) learn contextual meanings for words using a huge amount of unlabeled text data. These models show outstanding performance on various Natural Language Processing (NLP) tasks. However, what the LMs learn is far from what meaning is for humans, partly because humans can differentiate between concrete and abstract words, whereas language models make no distinction. Concrete words are words that have a physical representation in the world, such as “chair”, while abstract words are ideas, such as “democracy”. The process of learning word meanings starts from early childhood when children acquire their …


Deep Active Genetic Learning With Evidential Uncertainty For Agriculture Crops And Lake Water Quality Assessment, Oguz M. Aranay Aug 2022

Legacy Theses & Dissertations (2009 - 2024)

Despite significant advancements in the field of machine learning, there are two issues that still require further exploration. First, how to learn from a small dataset; and second, how to select appropriate features from the data. Although there exist many techniques to address these issues, choosing a combination of the techniques from these two groups is challenging, and worth investigating. To address these concerns, this thesis presents a learning framework that is based on a deep learning model utilizing active learning (with evidential uncertainty as a basis for acquisition function) for the first issue and a genetic algorithm for the …


Stability And Differential Privacy Of Stochastic Gradient Methods, Zhenhuan Yang Aug 2022

Legacy Theses & Dissertations (2009 - 2024)

Recently, a considerable amount of work has been devoted to the study of algorithmic stability and differential privacy (DP) for stochastic gradient methods (SGM). However, most existing work focuses on empirical risk minimization (ERM) and population risk minimization problems. In this paper, we study two types of optimization problems that enjoy wide applications in modern machine learning, namely the minimax problem and the pairwise learning problem.


Data Collection And Machine Learning Methods For Automated Pedestrian Facility Detection And Mensuration, Joseph Bailey Luttrell IV Aug 2022

Dissertations

Large-scale collection of pedestrian facility (crosswalks, sidewalks, etc.) presence data is vital to the success of efforts to improve pedestrian facility management, safety analysis, and road network planning. However, this kind of data is typically not available on a large scale because manual data collection is costly in labor and time. Therefore, researchers are currently exploring methods for automating this process using techniques such as machine learning. In our work, we mainly focus on machine learning methods for the detection of crosswalks and sidewalks from both aerial and street-view …


Using Machine Learning To Classify Volleyball Jumps, Miki Jauhiainen Aug 2022

Theses and Dissertations

In this study, inertial measurement units (IMUs) were used to train a random forest classifier to correctly classify different jump types in volleyball. Athlete motion data were collected in a controlled setting using three IMUs, one on the waist and one on each ankle. There were 11 participants who at the time played volleyball at the collegiate level in the United States, seven male and four female. Each performed the same number of jumps across the eight jump types--five BASIC jumps and three each of the other seven--resulting in 26 jumps per subject for a total of 286. The data …
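The jump counts quoted above can be checked with a few lines of Python. The variable names are illustrative only and do not come from the study.

```python
# Figures as stated in the abstract; names are hypothetical.
participants = 11
basic_jumps = 5            # BASIC jumps per participant
other_jump_types = 7       # the remaining jump types
jumps_per_other_type = 3   # jumps performed for each remaining type

jumps_per_subject = basic_jumps + other_jump_types * jumps_per_other_type
total_jumps = participants * jumps_per_subject

print(jumps_per_subject, total_jumps)
```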


Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen Aug 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.

The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …


Emotion Detection Using An Ensemble Model Trained With Physiological Signals And Inferred Arousal-Valence States, Matthew Nathanael Gray Aug 2022

Electrical & Computer Engineering Theses & Dissertations

Affective computing is an exciting and transformative field that is gaining in popularity among psychologists, statisticians, and computer scientists. The ability of a machine to infer human emotion and mood, i.e. affective states, has the potential to greatly improve human-machine interaction in our increasingly digital world. In this work, an ensemble model methodology for detecting human emotions across multiple subjects is outlined. The Continuously Annotated Signals of Emotion (CASE) dataset, which is a dataset of physiological signals labeled with discrete emotions from video stimuli as well as subject-reported continuous emotions, arousal and valence, from the circumplex model, is used for …


Developing Artificial Intelligence And Machine Learning To Support Primary Care Research And Practice, Jacqueline K. Kueper Jul 2022

Electronic Thesis and Dissertation Repository

This thesis was motivated by the potential to use "everyday data", especially that collected in electronic health records (EHRs) as part of healthcare delivery, to improve primary care for clients facing complex clinical and/or social situations. Artificial intelligence (AI) techniques can identify patterns or make predictions with these data, producing information to learn about and inform care delivery. Our first objective was to understand and critique the body of literature on AI and primary care. This was achieved through a scoping review wherein we found the field was at an early stage of maturity, primarily focused on clinical decision support …


Reconstructing Historical Earthquake-Induced Tsunamis: Case Study Of 1820 Event Near South Sulawesi, Indonesia, Taylor Jole Paskett Jul 2022

Theses and Dissertations

We build on the method introduced by Ringer et al., applying it to an 1820 event that happened near South Sulawesi, Indonesia. We utilize other statistical models to aid our Metropolis-Hastings sampler, including a Gaussian process that informs the prior. We apply the method to multiple possible fault zones to determine which fault is the most likely source of the earthquake and tsunami. After collecting nearly 80,000 samples, we find that between the two most likely fault zones, the Walanae fault zone matches the anecdotal accounts much better than Flores. However, to support the anecdotal data, both samplers tend toward …
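For readers unfamiliar with the sampler named above, here is a minimal random-walk Metropolis-Hastings sketch in Python. It targets a toy standard normal density rather than the tsunami posterior, and the Gaussian proposal, step size, seed, and function names are assumptions made for illustration, not the authors' setup.

```python
import math
import random


def metropolis_hastings(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)
        log_p_candidate = log_target(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_candidate - log_p))
        if rng.random() < accept_prob:
            x, log_p = candidate, log_p_candidate
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


def log_standard_normal(x):
    # Unnormalized log-density of N(0, 1); a toy target (assumption).
    return -0.5 * x * x


samples = metropolis_hastings(log_standard_normal, start=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
print(f"sample mean: {mean:.3f}")
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of target densities, which is why only the unnormalized log-density is needed.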


Learning From Machines: Insights In Forest Transpiration Using Machine Learning Methods, Morgan Tholl Jul 2022

Dissertations and Theses

Machine learning has been used as a tool to model transpiration for individual sites, but few models are capable of generalizing to new locations without calibration to site data. Using the global SAPFLUXNET database, 95 tree sap flow data sites were grouped using three clustering strategies: by biome, by tree functional type, and through use of a k-means unsupervised clustering algorithm. Two supervised machine learning algorithms, a random forest algorithm and a neural network algorithm, were used to build machine learning models that predicted transpiration for each cluster. The performance and feature importance in each model were analyzed and compared …


Unpaired Style Transfer Conditional Generative Adversarial Network For Scanned Document Generation, David Jonathan Hawbaker Jul 2022

Dissertations and Theses

Neural networks are a powerful machine learning tool, especially when trained on a large dataset of relevant high-quality data. Generative adversarial networks, image super resolution and most other image manipulation neural networks require a dataset of images and matching target images for training. Collecting and compiling that data can be time consuming and expensive. This work explores an approach for building a dataset of paired document images with a matching scanned version of each document without physical printers or scanners. A dataset of these document image pairs could be used to train a generative adversarial network or image super resolution …


Image-Based Crack Detection By Extracting Depth Of The Crack Using Machine Learning, Nishat Tabassum Jul 2022

Theses and Dissertations

Concrete structures have been a major part of social infrastructure since ancient Roman times. Concrete is used for the durability and support it provides to buildings and bridges. Assessing the state of these structures is important for preserving their longevity and the safety of the public. Detecting cracks at an early stage allows repairs to be made without replacing the whole structure, which reduces cost. Traditional methods are slowly falling behind as technology advances and demand increases for a practical method of …


Patterns Of Dissolved Methane In Groundwater And Its Contribution To Emissions Inventories, Amanda E. Campbell Jul 2022

Dissertations - ALL

The Marcellus Shale is the largest shale gas play in the U.S. Production of natural gas using high-volume hydraulic fracturing (HVHF) is prevalent throughout the play except in New York (NY), where it is currently banned. High concentrations of methane, the main component of natural gas, in groundwater, as well as its presence in the atmosphere, can have negative consequences. In this dissertation, three aspects of this issue are explored: 1) how and why naturally occurring methane concentrations vary through time; 2) how elevated naturally occurring methane concentrations in domestic water wells can be predicted from commonly observed well characteristics; …


Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi Jun 2022

Mathematics & Statistics ETDs

The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. Piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of its wide range of device applicability. In this research, using rigorous DFT calculations, high-throughput ab-initio simulations for 23 AlN alloys are generated.

This research is the first to report strong enhancements of piezoelectric properties …


Models And Machine Learning Techniques For Improving The Planning And Operation Of Electricity Systems In Developing Regions, Santiago Correa Cardona Jun 2022

Doctoral Dissertations

The enormous innovation in computational intelligence has disrupted the traditional ways we solve the main problems of our society and allowed us to make more data-informed decisions. Energy systems and the ways we deliver electricity are not exceptions to this trend: cheap and pervasive sensing systems and new communication technologies have enabled the collection of large amounts of data that are being used to monitor and predict in real-time the behavior of this infrastructure. Bringing intelligence to the power grid creates many opportunities to integrate new renewable energy sources more efficiently, facilitate grid planning and expansion, improve reliability, optimize electricity …


Leveraging Context Patterns For Medical Entity Classification, Garrett Johnston Jun 2022

Computer Science Senior Theses

The ability of patients to understand health-related text is important for optimal health outcomes. A system that can automatically annotate medical entities could help patients better understand health-related text. Such a system would also accelerate manual data annotation for this low-resource domain as well as assist in downstream medical NLP tasks such as finding textual similarity, identifying conflicting medical advice, and aspect-based sentiment analysis. In this work, we investigate a state-of-the-art entity set expansion model, BootstrapNet, for the task of medical entity classification on a new dataset of medical advice text. We also propose EP SBERT, a simple model …


Fine-Tuning A 𝑘-Nearest Neighbors Machine Learning Model For The Detection Of Insurance Fraud, Alliyah Stout Jun 2022

Honors Theses

Insurance companies lose billions of dollars to fraud. Large monetary losses force insurance companies to increase premium costs and/or restrict policies, which negatively affects a company’s loyal customers. Although this is a prevalent problem, companies are not urgently working toward bettering their machine learning algorithms. Underskilled workers paired with inefficient computer algorithms make it difficult to accurately and reliably detect fraud.

The goal of this study is to understand the idea of 𝑘-Nearest Neighbors (𝑘-NN) and to use this classification technique to accurately detect fraudulent auto insurance claims. Using 𝑘-NN requires choosing a 𝑘 value and a …
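As a rough illustration of the classification idea described above, the sketch below implements a bare-bones k-NN majority vote in Python. The Euclidean distance, the two made-up features, the toy data, and the function name `knn_predict` are all assumptions for illustration; none of them comes from the thesis.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Pair each training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy, made-up claims described by two hypothetical numeric features.
train_points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),   # labeled legitimate
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),   # labeled fraud
]
train_labels = ["legitimate"] * 3 + ["fraud"] * 3

prediction_a = knn_predict(train_points, train_labels, (5.1, 5.0), k=3)
prediction_b = knn_predict(train_points, train_labels, (1.0, 0.9), k=3)
print(prediction_a, prediction_b)
```

An odd k avoids tied votes in a two-class problem; in practice k would be chosen by validation rather than fixed in advance.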


A Comparison Of Machine Learning Techniques For Validating Students’ Proficiency In Mathematics, Alexander Avdeev Jun 2022

Dissertations, Theses, and Capstone Projects

A principal goal of this project was to compare several machine learning (ML) algorithms to explore and validate math proficiency classifications based on standardized test scores. The data used in these analyses came from the 6th-grade students’ mathematics assessment records of the New York State Education Department’s Testing Program (NYSTP). Our approach was to test a number of competing ML algorithms for classifying students as proficient based on their test scores and other demographic information. Our samples were drawn from the 2016 test-taking cohort of 6th-grade students (N=156,800). Five classifiers including multinomial logistic regression (MLR), XGBoost, Tree-As, Lagrangian …


Exploring The Effectiveness Of Multiple-Exemplar Training For Visual Analysis Of AB-Design Graphs, Verena S. Bethke Jun 2022

Dissertations, Theses, and Capstone Projects

In behavior analysis, data are usually analyzed using visual analysis of the graphed data. There is a wide range of methods used to visually analyze data, from a basic ‘textbook’ style approach to the use of visual aids, decision rubrics, and computer-based approaches. In the literature, there have been some comparisons of the efficacy of different approaches. Visual analysis as a behavior can be taught using a variety of methods, independent of how the skill itself is to be performed. Teaching methods include lecture, online instruction, and equivalence-based instruction. There is not much research on the teaching of visual analysis specifically, …


Local Learning Algorithms For Stochastic Spiking Neural Networks, Bleema Rosenfeld May 2022

Dissertations

This dissertation focuses on the development of machine learning algorithms for spiking neural networks, with an emphasis on local three-factor learning rules that are in keeping with the constraints imposed by current neuromorphic hardware. Spiking neural networks (SNNs) are an alternative to artificial neural networks (ANNs) that follow a similar graphical structure but use a processing paradigm more closely modeled after the biological brain in an effort to harness its low-power processing capability. SNNs use an event-based processing scheme, which leads to significant power savings when implemented in dedicated neuromorphic hardware such as Intel’s Loihi chip.

This work …


Adversarially Robust And Accurate Machine Learning For Image Classification, Yanan Yang May 2022

Dissertations

Machine learning techniques in medical imaging systems are accurate, but minor perturbations in the data known as adversarial attacks can fool them. These attacks make the systems vulnerable to fraud and deception, and thus a significant challenge has been posed in practice. This dissertation presents the gradient-free trained sign activation networks to detect and deter adversarial attacks on medical imaging AI (Artificial Intelligence) systems. Experimental results show a higher distortion value is required to attack the proposed model than other state-of-the-art models on brain MRI (Magnetic resonance imaging), Chest X-ray, and histopathology image datasets. Moreover, the proposed models outperform the …


Un-Fair Trojan: Targeted Backdoor Attacks Against Model Fairness, Nicholas Furth May 2022

Theses

Machine learning models have been shown to be vulnerable to various backdoor and data poisoning attacks that adversely affect model behavior. Additionally, these attacks have been shown to cause models to make unfair predictions with respect to certain protected features. In federated learning, where multiple local models contribute to a single global model and communicate using only local gradients, the issue of attacks becomes more prevalent and complex. Previously published works revolve around solving these issues both individually and jointly. However, there has been little study on the effects of attacks against model fairness. Demonstrated in this work, a flexible attack, which we call Un-Fair …


Language Learning Using Models Of Intentionality In Repeated Games With Cheap Talk, Jonathan Berry Skaggs May 2022

Theses and Dissertations

Language is critical to establishing long-term cooperative relationships among intelligent agents (including people), particularly when the agents' preferences are in conflict. In such scenarios, an agent uses speech to coordinate and negotiate behavior with its partner(s). While recent work has shown that neural language modeling can produce effective speech agents, such algorithms typically only accept previous text as input. However, in relationships among intelligent agents, not all relevant context is expressed in conversation. Thus, in this paper, we propose and analyze an algorithm, called Llumi, that incorporates other forms of context to learn to speak in long-term relationships modeled as …


An Interdisciplinary Approach To Understanding Volcanoes And Their Processes, Katherine Cosburn May 2022

Physics & Astronomy ETDs

Better understanding volcanoes and their processes is important both from a fundamental science perspective and for hazard monitoring purposes. The complexity and limitations we face in pursuing such a science are numerous, and this dissertation explores how an interdisciplinary approach combining physics, computer science, and volcanology can address this complexity in a straightforward and meaningful way. This is achieved through various modelling techniques across three studies: (1) a first-order analytic modelling of stratovolcano topographic shape, (2) the use of a Bayesian joint inversion on gravity and novel cosmic-ray muon measurements for imaging flat-lying subsurface density anomalies, and (3) the …


The Contribution Of Ethical Governance Of Artificial Intelligence & Machine Learning In Healthcare, Tina Nguyen May 2022

Electronic Theses and Dissertations

With the Internet Age and technology progressively advancing every year, the usage of Artificial Intelligence (AI) along with Machine Learning (ML) algorithms has only increased since their introduction to society. Specifically, in the healthcare field, AI/ML has proven to its end-users how beneficial its assistance has been. However, despite its effectiveness and efficiencies, AI/ML has also been under scrutiny due to its unethical outcomes. As a result, two polarizing views are typically debated when discussing AI/ML. One side believes that AI/ML usage should continue regardless of its uncertainty, while the other side argues that this technology is too …


Generating A Dataset For Comparing Linear Vs. Non-Linear Prediction Methods In Education Research, Jack Mauro, Elena Martinez, Anna Bargagliotti May 2022

Honors Thesis

Machine learning is often used to build predictive models by extracting patterns from large data sets. Such techniques are increasingly being utilized to predict outcomes in the social sciences. One such application is predicting student success. Machine learning can be applied to predicting student acceptance and success in academia. Using these tools for education-related data analysis may enable the evaluation of programs, resources, and curriculum. Currently, research is needed to examine application, admissions, and retention data in order to address equity in college computer science programs. However, most student-level data sets contain sensitive data that cannot be made public. To …


Computational Approaches To Understanding Subduction Zone Geodynamics, Surface Heat Flow, And The Metamorphic Rock Record, Buchanan C. Kerswell May 2022

Boise State University Theses and Dissertations

Pressure-temperature (PT) estimates from exhumed high-pressure (HP) metamorphic rocks and global surface heat flow observations evidently encode information about subduction zone thermal structure and the nature of mechanical and chemical processing of subducted materials along the interface between converging plates. Previous work demonstrates the possibility of decoding such geodynamic information by comparing numerical geodynamic models with empirical observations of surface heat flow and the metamorphic rock record. However, ambiguous interpretations can arise from this line of inquiry with respect to thermal gradients, plate coupling, and detachment and recovery of subducted materials. This dissertation applies a variety of computational techniques to …


Development Of A Machine Learning-Based Financial Risk Control System, Zhigang Hu May 2022

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

With the gradual end of the COVID-19 outbreak and the gradual recovery of the economy, more and more individuals and businesses are in need of loans. This demand brings business opportunities to various financial institutions, but also brings new risks. The traditional loan application review is mostly manual and relies on the business experience of the auditor, which makes it inefficient and unable to process large volumes of applications. Since the traditional audit processing method is no longer suitable, some other method of reducing the rate of non-performing loans and detecting fraud in applications is urgently needed …


Machine Learning Methods For Statistical Analysis And Representation Learning On Neuroimaging Data, Fan Yang May 2022

Computer Science and Engineering Dissertations

With the recent advances in and widespread adoption of imaging technologies, clinical practitioners and scientists can easily acquire and store large amounts of data from various neuroimaging modalities, such as Diffusion Tensor Imaging (DTI), Magnetic Resonance Imaging (MRI), resting-state functional MRI (rs-fMRI), and Positron Emission Tomography (PET). These novel imaging data sources cover a rich set of factors that influence patients' cognitive health, offer an objective view of patients at unprecedented multi-resolution for the understanding of brain structure and function, and have significant potential to improve healthcare by aiding better decision-making in diagnosing, monitoring, and treating diseases. Machine Learning …