Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Numerical Analysis and Scientific Computing

Articles 1 - 30 of 39

Full-Text Articles in Physical Sciences and Mathematics

Code For Care: Hypertension Prediction In Women Aged 18-39 Years, Kruti Sheth May 2024

Electronic Theses, Projects, and Dissertations

The longstanding prevalence of hypertension, often undiagnosed, poses significant risks of severe chronic and cardiovascular complications if left untreated. This study investigated the causes and underlying risks of hypertension in females aged 18-39 years. The research questions were: (Q1.) What factors affect the occurrence of hypertension in females aged 18-39 years? (Q2.) What machine learning algorithms are suited for effectively predicting hypertension? (Q3.) How can SHAP values be leveraged to analyze the factors from model outputs? The findings are: (Q1.) Performing feature selection with a binary-classification logistic regression algorithm reveals an array of the 30 most influential factors at an …


Bayesian Structural Causal Inference With Probabilistic Programming, Sam A. Witty Nov 2023

Doctoral Dissertations

Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in …


Your Cursor Reveals: On Analyzing Workers’ Browsing Behavior And Annotation Quality In Crowdsourcing Tasks, Pei-Chi Lo, Ee-Peng Lim Oct 2023

Research Collection School Of Computing and Information Systems

In this work, we investigate the connection between browsing behavior and task quality of crowdsourcing workers performing annotation tasks that require information judgements. Such information judgements are often required to derive ground truth answers to information retrieval queries. We explore the use of workers’ browsing behavior to directly determine their annotation result quality. We hypothesize user attention to be the main factor contributing to a worker’s annotation quality. To predict annotation quality at the task level, we model two aspects of task-specific user attention, also known as general and semantic user attentions. Both aspects of user attention can be …


TestSGD: Interpretable Testing Of Neural Networks Against Subtle Group Discrimination, Mengdi Zhang, Jun Sun, Jingyi Wang, Bing Sun Sep 2023

Research Collection School Of Computing and Information Systems

Discrimination has been shown in many machine learning applications, which calls for sufficient fairness testing before their deployment in ethic-relevant domains. One widely concerning type of discrimination, testing against group discrimination, mostly hidden, is much less studied, compared with identifying individual discrimination. In this work, we propose TestSGD, an interpretable testing approach which systematically identifies and measures hidden (which we call ‘subtle’) group discrimination of a neural network characterized by conditions over combinations of the sensitive attributes. Specifically, given a neural network, TestSGD first automatically generates an interpretable rule set which categorizes the input space into two groups. Alongside, TestSGD …


Machine Learning And Network Embedding Methods For Gene Co-Expression Networks, Niloofar Aghaieabiane May 2023

Dissertations

High-throughput technologies such as DNA microarrays and RNA-seq are used to measure the expression levels of large numbers of genes simultaneously. To support the extraction of biological knowledge, individual gene expression levels are transformed into Gene Co-expression Networks (GCNs). GCNs are analyzed to discover gene modules. GCN construction and analysis has been a well-studied topic for nearly two decades. While new types of sequencing and the corresponding data are now available, the software package WGCNA and its most recent variants are still widely used, contributing to biological discovery.

The discovery of biologically significant modules of genes from raw expression data is …


ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Purpose: This research paper explores ChatGPT’s potential as an innovative design tool for the future development of artificial intelligence. Specifically, this conceptual investigation analyzes ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). The paper also analyzes the strengths and weaknesses of ChatGPT as a tool and identifies possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Machine Learning To Predict Warhead Fragmentation In-Flight Behavior From Static Data, Katharine Larsen Oct 2022

Doctoral Dissertations and Master's Theses

Accurate characterization of fragment fly-out properties from high-speed warhead detonations is essential for estimation of collateral damage and lethality for a given weapon. Real warhead dynamic detonation tests are rare, costly, and often unrealizable with current technology, leaving fragmentation experiments limited to static arena tests and numerical simulations. Stereoscopic imaging techniques can now provide static arena tests with time-dependent tracks of individual fragments, each with characteristics such as fragment IDs and their respective position vector. Simulation methods can account for the dynamic case but can exclude relevant dynamics experienced in real-life warhead detonations. This research leverages machine learning methodologies to …


Mathematical Models Yield Insights Into CNNs: Applications In Natural Image Restoration And Population Genetics, Ryan Cecil Aug 2022

Electronic Theses and Dissertations

Due to a rise in computational power, machine learning (ML) methods have become the state-of-the-art in a variety of fields. Known to be black-box approaches, however, these methods are oftentimes not well understood. In this work, we utilize our understanding of model-based approaches to derive insights into Convolutional Neural Networks (CNNs). In the field of Natural Image Restoration, we focus on the image denoising problem. Recent work has demonstrated the potential of mathematically motivated CNN architectures that learn both ‘geometric’ and nonlinear higher order features and corresponding regularizers. We extend this work by showing that not only can geometric features …


Computational Models To Detect Radiation In Urban Environments: An Application Of Signal Processing Techniques And Neural Networks To Radiation Data Analysis, Jose Nicolas Gachancipa Jul 2022

Beyond: Undergraduate Research Journal

Radioactive sources, such as uranium-235, are nuclides that emit ionizing radiation, and which can be used to build nuclear weapons. In public areas, the presence of a radioactive nuclide can present a risk to the population, and therefore, it is imperative that threats are identified by radiological search and response teams in a timely and effective manner. In urban environments, such as densely populated cities, radioactive sources may be more difficult to detect, since background radiation produced by surrounding objects and structures (e.g., buildings, cars) can hinder the effective detection of unnatural radioactive material. This article presents a computational model …


Coded Distributed Function Computation, Pedro J. Soto Jun 2022

Dissertations, Theses, and Capstone Projects

A ubiquitous problem in computer science research is the optimization of computation on large data sets. Such computations are usually too large to be performed on one machine and therefore the task needs to be distributed amongst a network of machines. However, a common problem within distributed computing is the mitigation of delays caused by faulty machines. This can be performed by the use of coding theory to optimize the amount of redundancy needed to handle such faults. This problem differs from classical coding theory since it is concerned with the dynamic coded computation on data rather than just statically …


Dementia Classification Through Textual Analysis With Machine Learning Algorithms, Joseph Hurowitz May 2022

Undergraduate Theses and Capstone Projects

The goal of this work is to build a classifier that can identify whether a patient is suffering from Alzheimer’s Disease of the Dementia Type (AD). A corpus of 2751 texts was used from the DementiaBank database, where each conversation is transcribed and marked using the CHAT format. Each text was analyzed by frequency of disfluencies, use of aphasic language, and lexical features. All parsed data was used to train a Random Forest, Naïve Bayes, and Support Vector Machine algorithm. These classification algorithms will be tested on the combination of all features, as well as each set of features individually.


Building An Artificial Intelligence Framework For Hypertension Diagnosis: A Use Case Of The Problem List Curation, Ketemwabi Yves Shamavu May 2022

Theses & Dissertations

Hypertension is the world's leading factor in cardiovascular disease. Forty-seven percent or close to one in two Americans aged 18 and older are affected. It predicts approximately a thousand deaths per day. Based on recent statistics from the Centers for Disease Control and Prevention, one in three patients with hypertension does not know they are hypertensive. Seventy-five percent of hypertensive patients have uncontrolled hypertension - meaning that they are not treated to target. While there is extensive literature on hypertension diagnosis and management, there is an apparent gap in understanding and acknowledging that a person is hypertensive. Moreover, blood pressure …


Characterizing Students’ Engineering Design Strategies Using Energy3D, Jasmine Singh, Viranga Perera, Alejandra Magana, Brittany Newell Apr 2021

Discovery Undergraduate Interdisciplinary Research Internship

The goals of this study are to characterize design actions that students performed when solving a design challenge, and to create a machine learning model to help future students make better engineering design choices. We analyze data from an introductory engineering course where students used Energy3D, an open source computer-aided design software, to design a zero-energy home (i.e. a home that consumes no net energy over a period of a year). Student design actions within the software were recorded into text files. Using a sample of over 300 students, we first identify patterns in the data to assess how students …


Exploring Media Portrayals Of People With Mental Disorders Using NLP, Swapna Gottipati, Mark Chong, Andrew Wei Kiat Lim, Benny Haryanto Kawidiredjo Feb 2021

Research Collection School Of Computing and Information Systems

Media plays an important role in creating an impact in society. Several studies show that news media and entertainment channels at times create overwhelming images of mental illness that emphasize criminality and dangerousness. Such negative portrayals may foster stigma among the audience and, on the other hand, impair the self-esteem and help-seeking behavior of people with mental disorders. This is the first study to examine the Singapore media’s portrayal of persons with mental disorders (MDs) using text analytics and natural language processing. To date, most studies on media portrayal of people with MDs …


Knot Flow Classification And Its Applications In Vehicular Ad-Hoc Networks (VANET), David Schmidt May 2020

Electronic Theses and Dissertations

Intrusion detection systems (IDSs) play a crucial role in the identification and mitigation of attacks on host systems. Of these systems, vehicular ad hoc networks (VANETs) are difficult to protect due to the dynamic nature of their clients and their necessity for constant interaction with their respective cyber-physical systems. Currently, there is a need for a VANET-specific IDS that meets this criterion. To this end, a spline-based intrusion detection system has been pioneered as a solution. By combining clustering with spline-based general linear model classification, this knot flow classification method (KFC) allows for robust intrusion detection to occur. Due to its …


Early Warning Solar Storm Prediction, Ian D. Lumsden, Marvin Joshi, Matthew Smalley, Aiden Rutter, Ben Klein May 2020

Chancellor’s Honors Program Projects

No abstract provided.


Orthogonal Recurrent Neural Networks And Batch Normalization In Deep Neural Networks, Kyle Eric Helfrich Jan 2020

Theses and Dissertations--Mathematics

Despite the recent success of various machine learning techniques, there are still numerous obstacles that must be overcome. One obstacle is known as the vanishing/exploding gradient problem. This problem refers to gradients that either become zero or unbounded. This is a well known problem that commonly occurs in Recurrent Neural Networks (RNNs). In this work we describe how this problem can be mitigated, establish three different architectures that are designed to avoid this issue, and derive update schemes for each architecture. Another portion of this work focuses on the often used technique of batch normalization. Although found to be successful …


Process Based Analysis Of Fluvial Stratigraphic Record: Middle Pennsylvanian Allegheny Formation, North-Central WV, Oluwasegun O. Abatan Jan 2020

Graduate Theses, Dissertations, and Problem Reports

Fluvial deposits represent some of the best hydrocarbon reservoirs, but the quality of fluvial reservoirs varies depending on the reservoir architecture, which is controlled by allogenic and autogenic processes. Allogenic controls, including paleoclimate, tectonics, and glacio-eustasy, have long been debated as dominant controls in the deposition of fluvial strata. However, recent research has questioned the validity of this cyclicity and may indicate major influence from autogenic controls. To further investigate allogenic controls on stratal order, I analyzed the facies architecture, geomorphology, paleohydrology, and the stratigraphic framework of the Middle Pennsylvanian Allegheny Formation (MPAF), a fluvial depositional system in the Appalachian …


Ordinal Hyperplane Loss, Bob Vanderheyden Dec 2019

Doctor of Data Science and Analytics Dissertations

This research presents the development of a new framework for analyzing ordered class data, commonly called “ordinal class” data. The focus of the work is the development of classifiers (predictive models) that predict classes from available data. Ratings scales, medical classification scales, socio-economic scales, meaningful groupings of continuous data, facial emotional intensity and facial age estimation are examples of ordinal data for which data scientists may be asked to develop predictive classifiers. It is possible to treat ordinal classification like any other classification problem that has more than two classes. Specifying a model with this strategy does not fully utilize …


Multimodal Data Analytics And Fusion For Data Science, Haiman Tian Jun 2019

FIU Electronic Theses and Dissertations

Advances in technologies have rapidly accumulated a zettabyte of “new” data every two years. The huge amount of data has a powerful impact on various areas in science and engineering and generates enormous research opportunities, which calls for the design and development of advanced approaches in data analytics. Given such demands, data science has become an emerging hot topic in both industry and academia, ranging from basic business solutions, technological innovations, and multidisciplinary research to political decisions, urban planning, and policymaking. Within the scope of this dissertation, a multimodal data analytics and fusion framework is proposed for data-driven knowledge discovery …


Classifying Challenging Behaviors In Autism Spectrum Disorder With Neural Document Embeddings, Abigail Atchison May 2019

Computational and Data Sciences (MS) Theses

The understanding and treatment of challenging behaviors in individuals with Autism Spectrum Disorder is paramount to enabling the success of behavioral therapy; an essential step in this process being the labeling of challenging behaviors demonstrated in therapy sessions. These manifestations differ across individuals and within individuals over time and thus, the appropriate classification of a challenging behavior when considering purely qualitative factors can be unclear. In this thesis we seek to add quantitative depth to this otherwise qualitative task of challenging behavior classification. We do so through the application of natural language processing techniques to behavioral descriptions extracted from the …


Visualization And Machine Learning Techniques For NASA’s EM-1 Big Data Problem, Antonio P. Garza III, Jose Quinonez, Misael Santana, Nibhrat Lohia May 2019

SMU Data Science Review

In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data to determine available launch opportunities (about 90TBs of data). We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years, and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory …


An Evaluation Of Training Size Impact On Validation Accuracy For Optimized Convolutional Neural Networks, Jostein Barry-Straume, Adam Tschannen, Daniel W. Engels, Edward Fine Jan 2019

SMU Data Science Review

In this paper, we present an evaluation of training size impact on validation accuracy for an optimized Convolutional Neural Network (CNN). CNNs are currently the state-of-the-art architecture for object classification tasks. We used Amazon’s machine learning ecosystem to train and test 648 models to find the optimal hyperparameters with which to apply a CNN towards the Fashion-MNIST (Mixed National Institute of Standards and Technology) dataset. We were able to realize a validation accuracy of 90% by using only 40% of the original data. We found that hidden layers appear to have had zero impact on validation accuracy, whereas the neural …


Improving VIX Futures Forecasts Using Machine Learning Methods, James Hosker, Slobodan Djurdjevic, Hieu Nguyen, Robert Slater Jan 2019

SMU Data Science Review

The problem of forecasting market volatility is a difficult task for most fund managers. Volatility forecasts are used for risk management, alpha (risk) trading, and the reduction of trading friction. Improving the forecasts of future market volatility assists fund managers in adding or reducing risk in their portfolios as well as in increasing hedges to protect their portfolios in anticipation of a market sell-off event. Our analysis compares three existing financial models that forecast future market volatility using the Chicago Board Options Exchange Volatility Index (VIX) to six machine/deep learning supervised regression methods. This analysis determines which models provide best …


Randomized Algorithms For Preconditioner Selection With Applications To Kernel Regression, Conner Dipaolo Jan 2019

HMC Senior Theses

The task of choosing a preconditioner M to use when solving a linear system Ax = b with iterative methods is often tedious and most methods remain ad-hoc. This thesis presents a randomized algorithm to make this chore less painful through use of randomized algorithms for estimating traces. In particular, we show that the preconditioner stability ||I - M^{-1}A||_F, known to forecast preconditioner quality, can be computed in the time it takes to run a constant number of iterations of conjugate gradients through use of sketching methods. This is in spite of folklore which …


Classification Of Stars From Redshifted Stellar Spectra Utilizing Machine Learning, Michael J. Brice Jan 2019

All Master's Theses

The classification of stellar spectra is a fundamental task in stellar astrophysics. There have been many explorations into the automated classification of stellar spectra but few that involve the Sloan Digital Sky Survey (SDSS). Stellar spectra from the SDSS are applied to standard classification methods such as K-Nearest Neighbors, Random Forest, and Support Vector Machine to automatically classify the spectra. Stellar spectra are high dimensional data and the dimensionality is reduced using standard Feature Selection methods such as Chi-Squared and Fisher score and with domain-specific astronomical knowledge because classifiers work in low dimensional space. These methods are utilized to classify …


Automatic Identification Of Animals In The Wild: A Comparative Study Between C-Capsule Networks And Deep Convolutional Neural Networks., Joel Kamdem Teto, Ying Xie Nov 2018

Master of Science in Computer Science Theses

The evolution of machine learning and computer vision in technology has driven a lot of improvements and innovation into several domains. We see it being applied for credit decisions, insurance quotes, malware detection, fraud detection, email composition, and any other area having enough information to allow the machine to learn patterns. Over the years the number of sensors, cameras, and cognitive pieces of equipment placed in the wilderness has been growing exponentially. However, the resources (human) to leverage these data into something meaningful are not improving at the same rate. For instance, a team of scientist volunteers took 8.4 years, …


Game-Theoretic And Machine-Learning Techniques For Cyber-Physical Security And Resilience In Smart Grid, Longfei Wei Oct 2018

FIU Electronic Theses and Dissertations

The smart grid is the next-generation electrical infrastructure utilizing Information and Communication Technologies (ICTs), whose architecture is evolving from a utility-centric structure to a distributed Cyber-Physical System (CPS) integrated with large-scale renewable energy resources. However, meeting reliability objectives in the smart grid becomes increasingly challenging owing to the high penetration of renewable resources and changing weather conditions. Moreover, the cyber-physical attack targeted at the smart grid has become a major threat because millions of electronic devices interconnected via communication networks expose unprecedented vulnerabilities, thereby increasing the potential attack surface. This dissertation is aimed at developing novel game-theoretic and …


Predict The Failure Of Hydraulic Pumps By Different Machine Learning Algorithms, Yifei Zhou, Monika Ivantysynova, Nathan Keller Aug 2018

The Summer Undergraduate Research Fellowship (SURF) Symposium

Pump failure is a problem of general concern in the hydraulic field. When it occurs, it causes huge property loss and even loss of life. The common methods to prevent pump failure are preventative maintenance and breakdown maintenance; however, both have significant drawbacks. This research focuses on the axial piston pump and provides a new solution: prognosis of pump failure using machine learning classification. Different kinds of sensors (temperature, acceleration, etc.) were installed in a pump in good condition and in three different kinds of damaged pumps to measure 10 of …


Online Deep Learning: Learning Deep Neural Networks On The Fly, Doyen Sahoo, Hong Quang Pham, Jing Lu, Steven C. H. Hoi Jul 2018

Research Collection School Of Computing and Information Systems

Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch setting, requiring the entire training data to be made available prior to the learning task. This is not scalable for many real-world scenarios where new data arrives sequentially in a stream. We aim to address an open challenge of “Online Deep Learning” (ODL) for learning DNNs on the fly in an online setting. Unlike traditional online learning that often optimizes some convex objective function with respect to a shallow model (e.g., a linear/kernel-based hypothesis), ODL is more challenging as the optimization objective is non-convex, and regular DNN with …