Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Data Science

Machine Learning

Articles 1 - 30 of 134

Full-Text Articles in Physical Sciences and Mathematics

Evaluating Neuroimaging Modalities In The A/T/N Framework: Single And Combined FDG-PET And T1-Weighted MRI For Alzheimer’s Diagnosis, Peiwang Liu May 2024

McKelvey School of Engineering Theses & Dissertations

With the escalating prevalence of dementia, particularly Alzheimer's Disease (AD), the need for early and precise diagnostic techniques is rising. This study delves into the comparative efficacy of Fluorodeoxyglucose Positron Emission Tomography (FDG-PET) and T1-weighted Magnetic Resonance Imaging (MRI) in diagnosing AD, where the integration of multimodal models is becoming a trend. Leveraging data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), we employed linear Support Vector Machines (SVM) to assess the diagnostic potential of these modalities, both individually and in combination, within the AD continuum. Our analysis, under the A/T/N framework's 'N' category, reveals that FDG-PET consistently outperforms T1w-MRI across …


Code For Care: Hypertension Prediction In Women Aged 18-39 Years, Kruti Sheth May 2024

Electronic Theses, Projects, and Dissertations

The longstanding prevalence of hypertension, often undiagnosed, poses significant risks of severe chronic and cardiovascular complications if left untreated. This study investigated the causes and underlying risks of hypertension in females aged 18-39 years. The research questions were: (Q1) What factors affect the occurrence of hypertension in females aged 18-39 years? (Q2) What machine learning algorithms are suited for effectively predicting hypertension? (Q3) How can SHAP values be leveraged to analyze the factors from model outputs? The findings are: (Q1) Performing feature selection using a binary-classification logistic regression algorithm reveals an array of the 30 most influential factors at an …


Gender Detection In Facial Images: A Comprehensive CNN Analysis, Jose N T Ambrosio, Anas Hourani, Magdalene Moy Apr 2024

SACAD: John Heinrichs Scholarly and Creative Activity Days

This research investigates the construction of a robust gender detection system using facial features and Convolutional Neural Networks (CNNs), exploring the impact of different layer configurations on accuracy and computational efficiency. With a validation accuracy of 91%, findings illuminate the nuanced relationship between precision and computational resources, enriching discussions on facial recognition technologies.


Assessing Gait Metrics For Early Parkinson's Disease Prediction: A Preliminary Analysis Of Underfit Models, Daniel Salinas, Gerardo Medellin, Katherine Bolado, Tomas Gomez, Kelsey Potter-Baker, Nawaz Khan Abdul Hack, Ramu Vadukapuram Mar 2024

Research Symposium

Background: Parkinson's Disease (PD) is characterized by both motor and non-motor symptoms, and its diagnosis primarily relies on clinical presentation. There is a growing need for diagnostic tools to identify the early signs of PD, particularly the initial motor impairments often manifested as gait abnormalities. Here we seek to present preliminary findings to address this need. Our study focuses on using Machine Learning techniques (ML) to predict the PD clinical stage most efficiently and accurately. Specifically, we have sought to evaluate how spatiotemporal characteristics and other locomotor performance variables obtained on a walkway system can be utilized to identify the …


Data Driven And Machine Learning Based Modeling And Predictive Control Of Combustion At Reactivity Controlled Compression Ignition Engines, Behrouz Khoshbakht Irdmousa Jan 2024

Dissertations, Master's Theses and Master's Reports

Reactivity Controlled Compression Ignition (RCCI) engines have the capacity to provide higher thermal efficiency, lower particulate matter (PM), and lower oxides of nitrogen (NOx) emissions compared to conventional diesel combustion (CDC) operation. Achieving these benefits is difficult since real-time optimal control of RCCI engines is challenging during transient operation. To overcome these challenges, data-driven machine learning based control-oriented models are developed in this study. These models are developed based on the Linear Parameter-Varying (LPV) modeling approach and the input-output based Kernelized Canonical Correlation Analysis (KCCA) approach. The developed dynamic models are used to predict combustion timing (CA50), indicated mean effective pressure (IMEP), …


Towards Algorithmic Justice: Human Centered Approaches To Artificial Intelligence Design To Support Fairness And Mitigate Bias In The Financial Services Sector, Jihyun Kim Jan 2024

CMC Senior Theses

Artificial Intelligence (AI) has positively transformed the financial services sector but also introduced AI biases against protected groups, amplifying existing prejudices against marginalized communities. The financial decisions made by biased algorithms could cause life-changing ramifications in applications such as lending and credit scoring. Human Centered AI (HCAI) is an emerging concept where AI systems seek to augment, not replace, human abilities while preserving human control to ensure transparency, equity and privacy. The evolving field of HCAI shares a common ground with and can be enhanced by the Human Centered Design principles in that they both put humans, the user, at …


Investigation Into A Practical Application Of Reinforcement Learning For The Stock Market, Philip Traxler, Sadik Aman, Will Rogers, Allyn Okun Dec 2023

SMU Data Science Review

A major problem for the financial industry is adapting trading strategies at the same rate the market evolves. This paper proposes a solution using existing Reinforcement Learning libraries to help find new strategies at a practical scale. Using a wide domain of ticker symbols, an algorithm is trained in an environment that better represents reality. The supplied decision-making algorithm is tested using recorded data from the U.S. stock market from 2000 through 2022. The results of this research show that existing techniques are statistically better than making decisions at random. With this result, this research shows …


A Prompt Engineering Approach To Creating Automated Commentary For Microsoft Self-Help Documentation Metric Reports Using ChatGPT, Ryan Herrin, Luke Stodgel, Brian Raffety Dec 2023

SMU Data Science Review

Microsoft collects an immense amount of data from the users of their product self-help documentation. Employees use this data to identify these self-help articles' performance trends and measure their impact on business Key Performance Indicators (KPIs). Microsoft uses various tools like Power BI and Python to analyze this data. The problem is that their analysis and findings are summarized manually. Therefore, this research will improve upon their current analysis methods by applying the latest prompt engineering practices and the power of ChatGPT's large language models (LLMs). Using VBA code, Microsoft Excel, and the ChatGPT API as an Excel add-in, this research …


Study Of Augmentations On Historical Manuscripts Using TrOCR, Erez Meoded Dec 2023

Theses and Dissertations

Historical manuscripts are an essential source of original content. For many reasons, it is hard to recognize these manuscripts as text. This thesis used a state-of-the-art Handwritten Text Recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode them back to text. We showed that carefully preprocessed images and designed augmentations can improve the performance of TrOCR. We suggest an ensemble of augmented models to achieve an even better performance.


Generalized Differentiable Neural Architecture Search With Performance And Stability Improvements, Emily J. Herron Dec 2023

Doctoral Dissertations

This work introduces improvements to the stability and generalizability of Cyclic DARTS (CDARTS). CDARTS is a Differentiable Architecture Search (DARTS)-based approach to neural architecture search (NAS) that uses a cyclic feedback mechanism to train search and evaluation networks concurrently, thereby optimizing the search process by enforcing that the networks produce similar outputs. However, the dissimilarity between the loss functions used by the evaluation networks during the search and retraining phases results in a search-phase evaluation network that is a sub-optimal proxy for the final evaluation network utilized during retraining. ICDARTS, a revised algorithm that reformulates the search phase loss functions to ensure …


Generative Adversarial Game With Tailored Quantum Feature Maps For Enhanced Classification, Anais Sandra Nguemto Guiawa Dec 2023

Doctoral Dissertations

In the burgeoning field of quantum machine learning, the fusion of quantum computing and machine learning methodologies has sparked immense interest, particularly with the emergence of noisy intermediate-scale quantum (NISQ) devices. These devices hold the promise of achieving quantum advantage, but they grapple with limitations like constrained qubit counts, limited connectivity, operational noise, and a restricted set of operations. These challenges necessitate a strategic and deliberate approach to crafting effective quantum machine learning algorithms.

This dissertation revolves around an exploration of these challenges, presenting innovative strategies that tailor quantum algorithms and processes to seamlessly integrate with commercial quantum platforms. A …


General Population Projection Model With Census Population Data, Takenori Tsuruga Dec 2023

Electronic Theses, Projects, and Dissertations

The US Census Bureau offers a wide range of data, and within this array, the American Community Survey 5-Year Estimate (ACS5) serves as a valuable resource for understanding the US population. This project embarks on an exploration of Machine Learning and the Software Development process with the goal of generating effective population projections from ACS5 data. The project aims to provide methods to make predictions for every city and town in the US, encompassing their total population and population divided into 5-year age groups. It's worth noting that while the generation of these projections is grounded in the generalized statistical …


AI Assisted Workflows For Computational Electromagnetics And Antenna Design, Oameed Noakoasteen Nov 2023

Electrical and Computer Engineering ETDs

These days large volumes of data can be recorded and manipulated with relative ease. If valuable information can be extracted from them, these vast amounts of data can be a rich resource not just for the digital economy but also for scientific discovery and development of technology. When it comes to deriving valuable information from data, Machine Learning (ML) emerges as the key solution. To unlock the potential benefits of ML to science and technology, extensive research is needed to explore what algorithms are suitable and how they can be applied.

To shine light on various ways that ML can …


Bayesian Structural Causal Inference With Probabilistic Programming, Sam A. Witty Nov 2023

Doctoral Dissertations

Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in …


Machine Learning Modeling Of Polymer Coating Formulations: Benchmark Of Feature Representation Schemes, Nelson I. Evbarunegbe Nov 2023

Masters Theses

Polymer coatings offer a wide range of benefits across various industries, playing a crucial role in product protection and extension of shelf life. However, formulating them can be a non-trivial task given the multitude of variables and factors involved in the production process, rendering it a complex, high-dimensional problem. To tackle this problem, machine learning (ML) has emerged as a promising tool, showing considerable potential in enhancing various polymer and chemistry-based applications, particularly those dealing with high dimensional complexities.

Our research aims to develop a physics-guided ML approach to facilitate the formulations of polymer coatings. As the first step, this …


Machine Learning Prediction Of HEA Properties, Nicholas J. Beaver, Nathaniel Melisso, Travis Murphy Oct 2023

College of Engineering Summer Undergraduate Research Program

High-entropy alloys (HEAs) are a very new development in the field of metallurgical materials. Unlike traditional alloys, they are made up of multiple principal atoms, which contributes to their high configurational entropy. The microstructure and properties of HEAs are not well predicted with the models developed for more common engineering alloys, and there is not enough data available on HEAs to fully represent the complex behavior of these alloys. To that end, we explore how machine learning models can be used to model the complex, high dimensional behavior in the HEA composition space. Based on our …


Ethics And Social Justice For AI In Data Science, Arya Ramchander, Kylene Nicole Landenberger Oct 2023

College of Engineering Summer Undergraduate Research Program

The advances of AI raise several critical questions about human values and ethics, highlighting the need for researchers and developers to consider the ethical implications and the risks of neglecting them. In the past few years, student researchers have developed an AI model that allows users to test their surveys for possible breaches of subject confidentiality. This allows the users to gauge the ethicality of their proposal. This summer, we have expanded on this research and launched an interactive model for students and researchers to assess their current work for ethical and social justice implications. Using Langchain and Figma, we …


Data-Driven 2D Materials Discovery For Next-Generation Electronics, Zeyu Zhang Aug 2023

Dissertations

The development of material discovery and design has lasted centuries in human history. Since the concepts of modern chemistry and material science were established, the strategy of material discovery has relied on experiments. Such a strategy becomes expensive and time-consuming with the increasing number of materials nowadays. Therefore, a novel strategy that is faster and more comprehensive is urgently needed. In this dissertation, an experiment-guided material discovery strategy is developed and explained using metal-organic frameworks (MOFs) as instances. The advent of π-stacked layered MOFs, which offer electrical conductivity on top of permanent porosity and high surface area, opened up new …


Genetic Programming To Optimize Performance Of Machine Learning Algorithms On Unbalanced Data Set, Asitha Thumpati Aug 2023

Electronic Theses, Projects, and Dissertations

Data collected from the real world is often imbalanced, meaning that the distribution of data across known classes is biased or skewed. When using machine learning classification models on such imbalanced data, predictive performance tends to be lower because these models are designed with the assumption of balanced classes or a relatively equal number of instances for each class. To address this issue, we employ data preprocessing techniques such as SMOTE (Synthetic Minority Oversampling Technique) for oversampling data and random undersampling for undersampling data on unbalanced datasets. Once the dataset is balanced, genetic programming is utilized for feature selection to …
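
The two resampling ideas named in this abstract can be sketched in a few lines. This is a minimal illustration only, not the thesis's code and not the reference SMOTE implementation: the function names `random_undersample` and `smote_like_oversample` are hypothetical, points are plain tuples of floats, and neighbours are found by brute force.

```python
import random


def random_undersample(majority, n_keep, rng):
    """Keep a random subset of n_keep majority-class points (no replacement)."""
    return rng.sample(majority, n_keep)


def smote_like_oversample(minority, n_new, k, rng):
    """Create n_new synthetic minority points, SMOTE-style.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Brute-force k nearest neighbours by squared Euclidean distance.
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic
```

Oversampling the minority class until it matches an undersampled majority class gives the balanced dataset the abstract refers to; how the thesis combines the two steps is not stated in the truncated text.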


AI Approaches To Understand Human Deceptions, Perceptions, And Perspectives In Social Media, Chih-Yuan Li May 2023

Dissertations

Social media platforms have created virtual space for sharing user generated information, connecting, and interacting among users. However, there are research and societal challenges: 1) users are generating and sharing disinformation; 2) it is difficult to understand citizens' perceptions or opinions expressed on a wide variety of topics; and 3) there are information overload and echo chamber problems without an overall understanding of the different perspectives taken by different people or groups.

This dissertation addresses these three research challenges with advanced AI and Machine Learning approaches. To address fake news, as deception about the facts, this dissertation presents Machine …


Tempers Rising: The Effect Of Heat On Spite, Jake C. Cosgrove May 2023

Master's Theses

The relationship between heat and harmful outcomes is well documented, with research connecting various adverse economic outcomes to the climate. In the presence of increasing global warming and climate change, understanding why the climate leads to negative economic outcomes is essential for forming peaceful institutions of the future. We study how behavioral economic outcomes change in the presence of heat through a lab experiment involving 1,110 observations conducted in five different countries. This paper specifically focuses on the social preference outcome of spite. We find that increased time exposure to the treatment effect of heat is required to elicit an …


A Study Of Various Data Sizes Using Machine Learning, Sochaeta Koeum May 2023

Electronic Theses, Projects, and Dissertations

Social media is a great domain for news consumption; however, it is referred to as a double-edged sword. While it is user-friendly and low-cost, social media is the reason why fake news can spread rapidly, which is detrimental to society, businesses, and many consumers. Therefore, fake news detection is an emerging field. However, some challenges have restricted other researchers from developing a universal machine learning model that is fast, efficient, and reliable to stop the proliferation because of the lack of resources available, such as large-sized datasets. The goal of this culminating experience project is to explore how varying datasets …


Distance Correlation Based Feature Selection In Random Forest, Jose Munoz-Lopez May 2023

Electronic Theses, Projects, and Dissertations

The Pearson correlation coefficient is a commonly used measure of correlation, but it has limitations as it only measures the linear relationship between two numerical variables. In 2007, Szekely et al. introduced the distance correlation, which measures all types of dependencies between random vectors X and Y in arbitrary dimensions, not just the linear ones. In this thesis, we propose a filter method that utilizes distance correlation as a criterion for feature selection in Random Forest regression. We conduct extensive simulation studies to evaluate its performance compared to existing methods under various data settings, in terms of the prediction mean …
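
To make the contrast with Pearson correlation concrete, the sample distance correlation for two one-dimensional variables can be sketched as below. This is a minimal illustration assuming scalar observations and the standard double-centering (V-statistic) estimator; the function names are hypothetical, and it is not the filter method proposed in the thesis.

```python
import math


def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # equal to column means (symmetric)
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Mean of the element-wise product of two n x n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = max(_mean_product(a, b), 0.0)  # guard against rounding below zero
    dvar2 = _mean_product(a, a) * _mean_product(b, b)
    if dvar2 == 0.0:
        return 0.0  # one variable is constant
    return math.sqrt(dcov2 / math.sqrt(dvar2))
```

For x = [-2, -1, 0, 1, 2] and y = x squared, the Pearson correlation is exactly zero, while `distance_correlation(x, y)` is clearly positive, which is the kind of non-linear dependence the abstract says the measure captures.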


Automated Classification Of Pectinodon Bakkeri Teeth Images Using Machine Learning, Jacob A. Bahn Apr 2023

MS in Computer Science Project Reports

Microfossil dinosaur teeth are studied by paleontologists in order to better understand dinosaurs. Currently, tooth classification is a long, manual, error-ridden process. Deep learning offers a solution that allows for an automated way of classifying images of these microfossil teeth. In this thesis, we aimed to use deep learning in order to develop an automated approach for classifying images of Pectinodon bakkeri teeth. The proposed model was trained using a custom topology and it classified the images based on clusters created via K-Means. The model had an accuracy of 71%, a precision of 71%, a recall of 70.5%, and …


Using Machine Learning To Measure Political Polarization On Social Media, Veronica Cagle Apr 2023

Student Research Submissions

Polarization in the political sphere, seen through combative communication and stalemate, may impose negative social impacts on the population. Attempting to measure political polarization in the masses through self-reported surveys and interviews can present response biases of social desirability. The classification of thought freely written online allows political polarization to be measured in an impartial manner. Reddit is one application that enables users to share opinions and create discussions anonymously; this text can be used to measure the political climate at any given time. Disagreement has grown over the perceived level of polarization in our society. The purpose of my …


The Role Of Machine Learning In Improved Functionality Of Lower Limb Prostheses, Joaquin Dominguez, Richard Kim, Robert Slater Apr 2023

SMU Data Science Review

Lower-limb amputations can cause a plethora of obstacles that lead to a lower quality of life. By implementing machine learning techniques, advanced prosthetics can help make life easier for those who live with lower-limb amputations. Using the publicly available HuGaDB data set, the current study investigates several classification models (random forest, neural network, and Vowpal Wabbit) to predict the locomotive intentions of individuals using lower-limb prostheses. The results of this study show that the neural network model yielded the highest accuracy, with precision and recall scores comparable to the other models. However, the Vowpal Wabbit model's advantage in speed may …


A Hybrid Continual Machine Learning Model For Efficient Hierarchical Classification Of Domain-Specific Text In The Presence Of Class Overlap (Case Study: IT Support Tickets), Yasmen M. Wahba Mar 2023

Electronic Thesis and Dissertation Repository

In today’s world, support ticketing systems are employed by a wide range of businesses. The ticketing system facilitates the interaction between customers and the support teams when the customer faces an issue with a product or a service. For large-scale IT companies with a large number of clients and a great volume of communications, the task of automating the classification of incoming tickets is key to guaranteeing long-term clients and ensuring business growth.

Although the problem of text classification has been widely studied in the literature, the majority of the proposed approaches revolve around state-of-the-art deep learning models. This thesis …


ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Abstract

Purpose: The purpose of this research paper is to explore ChatGPT’s potential as an innovative designer tool for the future development of artificial intelligence. Specifically, this conceptual investigation aims to analyze ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). With the help of this paper, the researchers also analyze the strengths and weaknesses of ChatGPT as a tool and identify possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Revealing The Three-Dimensional Magnetic Texture With Machine Learning Models, Shihua Zhao Feb 2023

Dissertations, Theses, and Capstone Projects

Revealing three-dimensional (3D) magnetic textures with vector field electron tomography (VFET) is essential in studying novel magnetic materials with topologically protected spin textures potentially being used in the next-generation semiconductor industry. In this dissertation, we use machine learning (ML) models to reconstruct 3D magnetic textures from electron holography (EH) data.

We can feed the EH data, a series of two-dimensional (2D) phasemaps, into a neural network (NN) architecture directly or feed the EH data into a conventional VFET and then feed the reconstructed results into a NN. Thus, perceptive NN, either a simple convolutional neural network (CNN) or Unet architecture, …


Data Poisoning: A New Threat To Artificial Intelligence, Nary Simms Jan 2023

Mathematics and Computer Science Capstones

Artificial Intelligence (AI) is rapidly being deployed in a number of fields, from banking and finance to healthcare, robotics, transportation, military, e-commerce and social networks. Grand View Research estimates that the global AI market was worth $93.5 billion in 2021 and that it will increase at a compound annual growth rate (CAGR) of 38.1% from 2022 to 2030. According to a 2020 MIT Sloan Management survey, 87% of multinational corporations believe that AI technology will provide a competitive edge. Artificial Intelligence relies heavily on datasets to train its models. The more data, the better it learns and predicts. However, …