Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Machine Learning

Data Science

Articles 1 - 30 of 91

Full-Text Articles in Physical Sciences and Mathematics

Code For Care: Hypertension Prediction In Women Aged 18-39 Years, Kruti Sheth May 2024

Electronic Theses, Projects, and Dissertations

The longstanding prevalence of hypertension, often undiagnosed, poses significant risks of severe chronic and cardiovascular complications if left untreated. This study investigated the causes and underlying risks of hypertension in females aged 18-39 years. The research questions were: (Q1.) What factors affect the occurrence of hypertension in females aged 18-39 years? (Q2.) What machine learning algorithms are suited for effectively predicting hypertension? (Q3.) How can SHAP values be leveraged to analyze the factors from model outputs? The findings are: (Q1.) Feature selection using a binary-classification logistic regression algorithm reveals an array of the 30 most influential factors at an …


Towards Algorithmic Justice: Human Centered Approaches To Artificial Intelligence Design To Support Fairness And Mitigate Bias In The Financial Services Sector, Jihyun Kim Jan 2024

CMC Senior Theses

Artificial Intelligence (AI) has positively transformed the financial services sector but has also introduced AI biases against protected groups, amplifying existing prejudices against marginalized communities. The financial decisions made by biased algorithms could have life-changing ramifications in applications such as lending and credit scoring. Human Centered AI (HCAI) is an emerging concept where AI systems seek to augment, not replace, human abilities while preserving human control to ensure transparency, equity, and privacy. The evolving field of HCAI shares common ground with, and can be enhanced by, the Human Centered Design principles in that they both put humans, the user, at …


Data Driven And Machine Learning Based Modeling And Predictive Control Of Combustion At Reactivity Controlled Compression Ignition Engines, Behrouz Khoshbakht Irdmousa Jan 2024

Dissertations, Master's Theses and Master's Reports

Reactivity Controlled Compression Ignition (RCCI) engines have the capacity to provide higher thermal efficiency, lower particulate matter (PM), and lower oxides of nitrogen (NOx) emissions compared to conventional diesel combustion (CDC) operation. Achieving these benefits is difficult since real-time optimal control of RCCI engines is challenging during transient operation. To overcome these challenges, data-driven machine learning based control-oriented models are developed in this study. These models are developed based on the Linear Parameter-Varying (LPV) modeling approach and the input-output based Kernelized Canonical Correlation Analysis (KCCA) approach. The developed dynamic models are used to predict combustion timing (CA50), indicated mean effective pressure (IMEP), …


Study Of Augmentations On Historical Manuscripts Using TrOCR, Erez Meoded Dec 2023

Theses and Dissertations

Historical manuscripts are an essential source of original content. For many reasons, it is hard to recognize these manuscripts as text. This thesis used a state-of-the-art Handwritten Text Recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode them back to text. We showed that carefully preprocessed images and designed augmentations can improve the performance of TrOCR. We suggest an ensemble of augmented models to achieve an even better performance.


Generalized Differentiable Neural Architecture Search With Performance And Stability Improvements, Emily J. Herron Dec 2023

Doctoral Dissertations

This work introduces improvements to the stability and generalizability of Cyclic DARTS (CDARTS). CDARTS is a Differentiable Architecture Search (DARTS)-based approach to neural architecture search (NAS) that uses a cyclic feedback mechanism to train search and evaluation networks concurrently, thereby optimizing the search process by enforcing that the networks produce similar outputs. However, the dissimilarity between the loss functions used by the evaluation networks during the search and retraining phases results in a search-phase evaluation network that is a sub-optimal proxy for the final evaluation network utilized during retraining. ICDARTS, a revised algorithm that reformulates the search phase loss functions to ensure …


Generative Adversarial Game With Tailored Quantum Feature Maps For Enhanced Classification, Anais Sandra Nguemto Guiawa Dec 2023

Doctoral Dissertations

In the burgeoning field of quantum machine learning, the fusion of quantum computing and machine learning methodologies has sparked immense interest, particularly with the emergence of noisy intermediate-scale quantum (NISQ) devices. These devices hold the promise of achieving quantum advantage, but they grapple with limitations like constrained qubit counts, limited connectivity, operational noise, and a restricted set of operations. These challenges necessitate a strategic and deliberate approach to crafting effective quantum machine learning algorithms.

This dissertation revolves around an exploration of these challenges, presenting innovative strategies that tailor quantum algorithms and processes to seamlessly integrate with commercial quantum platforms. A …


General Population Projection Model With Census Population Data, Takenori Tsuruga Dec 2023

Electronic Theses, Projects, and Dissertations

The US Census Bureau offers a wide range of data, and within this array, the American Community Survey 5-Year Estimate (ACS5) serves as a valuable resource for understanding the US population. This project embarks on an exploration of Machine Learning and the Software Development process with the goal of generating effective population projections from ACS5 data. The project aims to provide methods to make predictions for every city and town in the US, encompassing their total population and population divided into 5-year age groups. It's worth noting that while the generation of these projections is grounded in the generalized statistical …


AI Assisted Workflows For Computational Electromagnetics And Antenna Design, Oameed Noakoasteen Nov 2023

Electrical and Computer Engineering ETDs

These days large volumes of data can be recorded and manipulated with relative ease. If valuable information can be extracted from them, these vast amounts of data can be a rich resource not just for the digital economy but also for scientific discovery and development of technology. When it comes to deriving valuable information from data, Machine Learning (ML) emerges as the key solution. To unlock the potential benefits of ML to science and technology, extensive research is needed to explore what algorithms are suitable and how they can be applied.

To shine light on various ways that ML can …


Bayesian Structural Causal Inference With Probabilistic Programming, Sam A. Witty Nov 2023

Doctoral Dissertations

Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in …


Machine Learning Modeling Of Polymer Coating Formulations: Benchmark Of Feature Representation Schemes, Nelson I. Evbarunegbe Nov 2023

Masters Theses

Polymer coatings offer a wide range of benefits across various industries, playing a crucial role in product protection and extension of shelf life. However, formulating them can be a non-trivial task given the multitude of variables and factors involved in the production process, rendering it a complex, high-dimensional problem. To tackle this problem, machine learning (ML) has emerged as a promising tool, showing considerable potential in enhancing various polymer and chemistry-based applications, particularly those dealing with high dimensional complexities.

Our research aims to develop a physics-guided ML approach to facilitate the formulations of polymer coatings. As the first step, this …


Data-Driven 2D Materials Discovery For Next-Generation Electronics, Zeyu Zhang Aug 2023

Dissertations

The development of material discovery and design has spanned centuries of human history. Since the concepts of modern chemistry and material science were established, the strategy of material discovery has relied on experiments. Such a strategy becomes expensive and time-consuming with the increasing number of materials nowadays. Therefore, a novel strategy that is faster and more comprehensive is urgently needed. In this dissertation, an experiment-guided material discovery strategy is developed and explained using metal-organic frameworks (MOFs) as examples. The advent of π-stacked layered MOFs, which offer electrical conductivity on top of permanent porosity and high surface area, opened up new …


Genetic Programming To Optimize Performance Of Machine Learning Algorithms On Unbalanced Data Set, Asitha Thumpati Aug 2023

Electronic Theses, Projects, and Dissertations

Data collected from the real world is often imbalanced, meaning that the distribution of data across known classes is biased or skewed. When using machine learning classification models on such imbalanced data, predictive performance tends to be lower because these models are designed with the assumption of balanced classes or a relatively equal number of instances for each class. To address this issue, we employ data preprocessing techniques such as SMOTE (Synthetic Minority Oversampling Technique) for oversampling data and random undersampling for undersampling data on unbalanced datasets. Once the dataset is balanced, genetic programming is utilized for feature selection to …
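
The two resampling steps named in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names `smote` and `random_undersample`, the parameter `k`, and the toy data are all hypothetical, and the code assumes numeric feature vectors stored as tuples. SMOTE is shown in its basic form, where each synthetic point is interpolated between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (basic SMOTE interpolation).

    Hypothetical helper for illustration; `minority` is a list of
    equal-length numeric tuples.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the chosen sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of n_keep majority samples, without replacement."""
    rng = random.Random(seed)
    return rng.sample(majority, n_keep)


# Toy data (made up): 4 minority points, 10 majority points.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
majority = [(float(i), 5.0) for i in range(10)]

new_points = smote(minority, n_new=6, k=2)
kept = random_undersample(majority, n_keep=4)
print(len(minority) + len(new_points), len(kept))
```

Because every synthetic point lies on a segment between two existing minority samples, oversampling of this kind never leaves the region already spanned by the minority class. In practice a maintained library implementation would normally be preferred over a hand-written one.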


AI Approaches To Understand Human Deceptions, Perceptions, And Perspectives In Social Media, Chih-Yuan Li May 2023

Dissertations

Social media platforms have created a virtual space for sharing user-generated information and for connecting and interacting among users. However, there are research and societal challenges: 1) users are generating and sharing disinformation; 2) it is difficult to understand citizens' perceptions or opinions expressed on a wide variety of topics; and 3) there are information overload and echo chamber problems, without an overall understanding of the different perspectives taken by different people or groups.

This dissertation addresses these three research challenges with advanced AI and Machine Learning approaches. To address fake news, as deception about facts, this dissertation presents Machine …


Tempers Rising: The Effect Of Heat On Spite, Jake C. Cosgrove May 2023

Master's Theses

The relationship between heat and harmful outcomes is well documented, with research connecting various adverse economic outcomes to the climate. In the presence of increasing global warming and climate change, understanding why the climate leads to negative economic outcomes is essential for forming peaceful institutions of the future. We study how behavioral economic outcomes change in the presence of heat through a lab experiment involving 1,110 observations conducted in five different countries. This paper specifically focuses on the social preference outcome of spite. We find that increased time exposure to the treatment effect of heat is required to elicit an …


A Study Of Various Data Sizes Using Machine Learning, Sochaeta Koeum May 2023

Electronic Theses, Projects, and Dissertations

Social media is a great domain for news consumption; however, it is referred to as a double-edged sword. While it is user-friendly and low-cost, social media is the reason why fake news can spread rapidly, which is detrimental to society, businesses, and many consumers. Therefore, fake news detection is an emerging field. However, some challenges have restricted other researchers from developing a universal machine learning model that is fast, efficient, and reliable to stop the proliferation because of the lack of resources available, such as large-sized datasets. The goal of this culminating experience project is to explore how varying datasets …


Distance Correlation Based Feature Selection In Random Forest, Jose Munoz-Lopez May 2023

Electronic Theses, Projects, and Dissertations

The Pearson correlation coefficient is a commonly used measure of correlation, but it has limitations as it only measures the linear relationship between two numerical variables. In 2007, Szekely et al. introduced the distance correlation, which measures all types of dependencies between random vectors X and Y in arbitrary dimensions, not just the linear ones. In this thesis, we propose a filter method that utilizes distance correlation as a criterion for feature selection in Random Forest regression. We conduct extensive simulation studies to evaluate its performance compared to existing methods under various data settings, in terms of the prediction mean …
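
The contrast this abstract draws between Pearson correlation and distance correlation can be made concrete with a small sketch. It is not the thesis's code: the function names are hypothetical, and it uses the plain sample (V-statistic) form of distance correlation, built from double-centered pairwise Euclidean distance matrices. The toy data are made up, with y = x² chosen because it is a purely nonlinear dependence.

```python
import math


def _as_points(values):
    """Wrap scalars as 1-tuples so scalars and vectors are handled alike."""
    return [tuple(v) if isinstance(v, (list, tuple)) else (float(v),) for v in values]


def _centered_distances(points):
    """Pairwise Euclidean distance matrix with row, column and grand means removed."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so the column means equal the row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long samples."""
    px, py = _as_points(x), _as_points(y)
    if len(px) != len(py):
        raise ValueError("x and y must have the same number of observations")
    n = len(px)
    a, b = _centered_distances(px), _centered_distances(py)

    def mean_product(u, v):
        return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)                        # squared distance covariance
    dvar2 = mean_product(a, a) * mean_product(b, b)   # product of squared distance variances
    if dvar2 <= 0.0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar2))


def pearson(x, y):
    """Ordinary Pearson correlation coefficient for two scalar samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (made up): y depends on x, but not linearly.
xs = [(i - 20) / 20 for i in range(41)]
ys = [v * v for v in xs]
print("Pearson:", pearson(xs, ys))
print("Distance correlation:", distance_correlation(xs, ys))
```

On this symmetric sample the Pearson coefficient is essentially zero while the distance correlation is clearly positive, which is the property that motivates using it as a filter criterion for feature selection. The quadratic-time loops are fine for a demonstration but would need a vectorized implementation for real datasets.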


Using Machine Learning To Measure Political Polarization On Social Media, Veronica Cagle Apr 2023

Student Research Submissions

Polarization in the political sphere, seen through combative communication and stalemate, may impose negative social impacts on the population. Attempting to measure political polarization in the masses through self-reported surveys and interviews can present response biases of social desirability. The classification of thought freely written online allows political polarization to be measured in an impartial manner. Reddit is one application that enables users to share opinions and create discussions anonymously; this text can be used to measure the political climate at any given time. Disagreement has grown over the perceived level of polarization in our society. The purpose of my …


A Hybrid Continual Machine Learning Model For Efficient Hierarchical Classification Of Domain-Specific Text In The Presence Of Class Overlap (Case Study: IT Support Tickets), Yasmen M. Wahba Mar 2023

Electronic Thesis and Dissertation Repository

In today’s world, support ticketing systems are employed by a wide range of businesses. The ticketing system facilitates the interaction between customers and the support teams when the customer faces an issue with a product or a service. For large-scale IT companies with a large number of clients and a great volume of communications, the task of automating the classification of incoming tickets is key to guaranteeing long-term clients and ensuring business growth.

Although the problem of text classification has been widely studied in the literature, the majority of the proposed approaches revolve around state-of-the-art deep learning models. This thesis …


Revealing The Three-Dimensional Magnetic Texture With Machine Learning Models, Shihua Zhao Feb 2023

Dissertations, Theses, and Capstone Projects

Revealing three-dimensional (3D) magnetic textures with vector field electron tomography (VFET) is essential in studying novel magnetic materials with topologically protected spin textures potentially being used in the next-generation semiconductor industry. In this dissertation, we use machine learning (ML) models to reconstruct 3D magnetic textures from electron holography (EH) data.

We can feed the EH data, a series of two-dimensional (2D) phase maps, into a neural network (NN) architecture directly or feed the EH data into a conventional VFET and then feed the reconstructed results into a NN. Thus, perceptive NN, either a simple convolutional neural network (CNN) or U-Net architecture, …


Data Poisoning: A New Threat To Artificial Intelligence, Nary Simms Jan 2023

Mathematics and Computer Science Capstones

Artificial Intelligence (AI) is rapidly being deployed in a number of fields, from banking and finance to healthcare, robotics, transportation, military, e-commerce and social networks. Grand View Research estimates that the global AI market was worth USD 93.5 billion in 2021 and that it will increase at a compound annual growth rate (CAGR) of 38.1% from 2022 to 2030. According to a 2020 MIT Sloan Management survey, 87% of multinational corporations believe that AI technology will provide a competitive edge. Artificial Intelligence relies heavily on datasets to train its models. The more data, the better it learns and predicts. However, …


Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu Jan 2023

CMC Senior Theses

This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, the results of this paper find that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …


Utilizing Machine Learning In Healthcare In An Ethical Fashion, Nishka Ayyar Jan 2023

CMC Senior Theses

This thesis paper explores the ethical considerations surrounding the use of machine learning (ML) solutions in healthcare. The background section discusses the basics of machine learning techniques and algorithms, and the increasing interest in their utilization in the healthcare sector. The paper then reviews and critically analyzes four studies that highlight concerns related to using ML in healthcare, including issues of bias, privacy, accountability, and transparency. Based on the analysis of these studies, the paper presents several recommendations for addressing these concerns. The paper concludes with a discussion on the potential benefits of using machine learning technology in healthcare. Ultimately, …


Predicting Housing Prices Using AI, Eric Sconyers Jan 2023

Williams Honors College, Honors Research Projects

I have created an AI model that can predict housing prices in Ames, Iowa, with 70 percent accuracy. I used data from Kaggle.com, a website that provides datasets to the public so they can create AI models with the data. I found the dataset pertaining to housing prices in Ames, Iowa. With this data, I was able to create an AI model that can predict the price of these homes. The technology I used in this project was Python as the programming language, and I used the scikit-learn library which has …


Development Of Machine Learning Based Approach To Predict Fuel Consumption And Maintenance Cost Of Heavy-Duty Vehicles Using Diesel And Alternative Fuels, Sasanka Katreddi Jan 2023

Graduate Theses, Dissertations, and Problem Reports

One of the major contributors of human-made greenhouse gases (GHG), namely carbon dioxide (CO2), methane (CH4), and nitrous oxide (NOX), is the transportation sector, with heavy-duty vehicles (HDV) contributing about 27% of the overall fraction. In addition to the rapid increase in global temperature, airborne pollutants from diesel vehicles also present a risk to human health. Even a small improvement that could potentially drive energy savings to the century-old mature diesel technology could yield a significant impact on minimizing greenhouse gas emissions. With the increasing focus on reducing emissions and operating costs, there is a need for efficient and …


Classification Of Darknet Traffic By Application Type, Shruti Sharma Jan 2023

Master's Projects

The darknet is frequently exploited for illegal purposes and activities, which makes darknet traffic detection an important security topic. Previous research has focused on various classification techniques for darknet traffic using machine learning and deep learning. We extend previous work by considering the effectiveness of a wide range of machine learning and deep learning techniques for the classification of darknet traffic by application type. We consider the CICDarknet2020 dataset, which has been used in many previous studies, thus enabling a direct comparison of our results to previous work. We find that XGBoost performs the best among the classifiers that we …


Breast Density Classification Using Deep Learning, Conrad Thomas Testagrose Jan 2023

UNF Graduate Theses and Dissertations

Breast density screenings are an accepted means to determine a patient's predisposed risk of breast cancer development. Although the direct correlation is not fully understood, breast cancer risk increases with higher levels of mammographic breast density. Radiologists visually assess a patient's breast density using mammogram images and assign a density score based on four breast density categories outlined by the Breast Imaging Reporting and Data System (BI-RADS). There have been efforts to develop automated tools that assist radiologists with increasing workloads and to help reduce the intra- and inter-rater variability between radiologists. In this thesis, I explored two deep-learning-based approaches …


Integrated Machine Learning And Optimization Approaches, Dogacan Yilmaz Dec 2022

Dissertations

This dissertation focuses on the integration of machine learning and optimization. Specifically, novel machine learning-based frameworks are proposed to help solve a broad range of well-known operations research problems to reduce the solution times. The first study presents a bidirectional Long Short-Term Memory framework to learn optimal solutions to sequential decision-making problems. Computational results show that the framework significantly reduces the solution time of benchmark capacitated lot-sizing problems without much loss in feasibility and optimality. Also, models trained using shorter planning horizons can successfully predict the optimal solution of the instances with longer planning horizons. For the hardest data set, …


Application Of Distributed Fiber-Optic Sensing For Pressure Predictions And Multiphase Flow Characterization, Gerald Kelechi Ekechukwu Dec 2022

LSU Doctoral Dissertations

In the oil and gas industry, distributed fiber optics sensing (DFOS) has the potential to revolutionize well and reservoir surveillance applications. The use of fiber optic sensors is becoming increasingly common because of their chemically passive and non-magnetic interference properties, the possibility of flexible installations that could be behind the casing, on the tubing, or run on wireline, as well as the potential for densely distributed measurements along the entire length of the fiber. The main objectives of my research are to develop and demonstrate novel signal processing and machine learning computational techniques and workflows on DFOS data for a variety of …


Investigating Applications Of Deep Learning For Diagnosis Of Post Traumatic Elbow Disease, Hugh James Dec 2022

McKelvey School of Engineering Theses & Dissertations

Traumatic events such as dislocation, breaks, and arthritis of musculoskeletal joints can cause the development of post-traumatic joint contracture (PTJC). Clinically, noninvasive techniques such as Magnetic Resonance Imaging (MRI) scans are used to analyze the disease. Such procedures require a patient to remain still for long periods of time and can be expensive as well. Additionally, years of practice and experience are required for clinicians to accurately recognize the diseased anterior capsule region and make an accurate diagnosis. Manual tracing of the anterior capsule is done to help with diagnosis but is subjective and time-consuming. As a result, there is …


Artificial Intelligence In The Medical Field: Medical Review Sentiment Analysis, Nicholas Podlesak Dec 2022

Honors Capstones

In this research project, natural language processing techniques’ ability to accurately classify medical text was measured to reinforce the relevance of artificial intelligence in the medical field. Sentiment analyses (analyses to determine whether the text was positive or negative) were performed on the prescription drug reviews in an open-source dataset using four different models: lexical, a neural network, a support vector machine, and a logistic regression model. Each model’s effectiveness was gauged by its ability to correctly classify unlabeled drug reviews (i.e., a percentage representing accuracy). The machine learning models were able to accurately classify the text, while the lexical …