Data Science | Open Access Articles | Digital Commons Network™

Sequential Optimization For Stressor-Informed Test Planning Through Integration Of Experimental And Simulated Data, Jacob Brecheisen May 2024

Data Science Undergraduate Honors Theses

This technical report details an innovative approach in reliability engineering aimed at maximizing system durability through a synergistic use of physical experimentation and computer-based modeling. Our methodology explores the efficient design and analysis of computer experiments and physical tests to facilitate accelerated reliability growth, while leveraging a sequential integration of data from these two distinct sources: costly physical experiments, characterized by random errors, and inexpensive computer simulations, marked by inherent systematic errors. The key innovation lies in the adoption of a closed-loop design and analysis method. This method begins by identifying a viable subset of important environmental stressors—such as temperature, …

UConn Baseball Reliever Lane Optimization Tool, Jason Bartholomew Apr 2024

Honors Scholar Theses

This thesis describes the building of a tool for UConn’s Division I baseball team that will generate a game plan for when different relievers should be used against different parts of the opponent’s lineup. The goal is to achieve the lowest total expected runs allowed for the remainder of the game, based on game situations and matchup probabilities. The tool will also identify situations that may be vital enough to the outcome of the game to bring in a better reliever who would normally be saved for later in the game.

Classification In Supervised Statistical Learning With The New Weighted Newton-Raphson Method, Toma Debnath Jan 2024

Electronic Theses and Dissertations

In this thesis, the Weighted Newton-Raphson Method (WNRM), an innovative optimization technique, is introduced in statistical supervised learning for categorization and applied to a diabetes predictive model to find maximum likelihood estimates. The iterative optimization method is a modification of the ordinary Newton-Raphson algorithm that solves nonlinear systems of equations with singular Jacobian matrices. The quadratic convergence of the WNRM, and its high efficiency in optimizing nonlinear likelihood functions whenever singularity in the Jacobians occurs, allow for easy inclusion in classical categorization and generalized linear models, such as the logistic regression model in supervised learning. The WNRM is thoroughly investigated …
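
For orientation, the ordinary Newton-Raphson iteration that the WNRM modifies can be sketched for logistic regression as follows. This is only the unweighted baseline: the abstract does not give the WNRM weighting, so none is attempted here. The function names and the tiny dataset are hypothetical, and the plain linear solve below would fail on exactly the singular case the thesis targets.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def newton_raphson_logistic(X, y, n_iter=25, tol=1e-10):
    """Ordinary (unweighted) Newton-Raphson for logistic regression MLE.

    X is an (n, d) design matrix (include a column of ones for an
    intercept) and y is an (n,) array of 0/1 labels.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        score = X.T @ (y - p)                 # gradient of the log-likelihood
        hessian = -(X.T * (p * (1.0 - p))) @ X  # second derivative matrix
        # Newton step: beta_new = beta - H^{-1} * score.
        # np.linalg.solve raises LinAlgError if the Hessian is singular.
        beta_new = beta - np.linalg.solve(hessian, score)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Hypothetical, non-separable toy data: intercept column plus one feature.
X_demo = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5],
                   [1.0, 3.5], [1.0, 4.5], [1.0, 5.5]])
y_demo = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
beta_hat = newton_raphson_logistic(X_demo, y_demo)
```

At the maximum likelihood estimate the score vector `X.T @ (y - p)` is zero, which gives a simple check on the result.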

An Unsupervised Machine Learning Algorithm For Clustering Low Dimensional Data Points In Euclidean Grid Space, Josef Lazar Jan 2024

Senior Projects Spring 2024

Clustering algorithms provide a useful method for classifying data. The majority of well-known clustering algorithms are designed to find globular clusters; however, this is not always desirable. In this senior project, I present a new clustering algorithm, GBCN (Grid Box Clustering with Noise), which applies a box grid to points in Euclidean space to identify areas of high point density. Points within the grid space that are in adjacent boxes are classified into the same cluster. Conversely, if a path from one point to another can only be completed by traversing an empty grid box, then they are classified …
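
The grid idea described above can be illustrated with a minimal sketch. It is not the author's GBCN implementation: the function name `grid_cluster`, the `box_size` parameter, and the choice to treat diagonally touching boxes as adjacent are all assumptions. The density threshold and noise handling are left out because the truncated abstract does not describe them.

```python
import math
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, box_size):
    """Label points so that points in connected occupied boxes share a label.

    points is a sequence of equal-length coordinate tuples. Two boxes are
    treated as adjacent when their indices differ by at most 1 on every
    axis (an assumption; the source does not define adjacency).
    """
    # Map each occupied grid box to the indices of the points inside it.
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        cell = tuple(math.floor(c / box_size) for c in point)
        cells[cell].append(idx)

    labels = [-1] * len(points)
    visited = set()
    next_label = 0
    for start in cells:
        if start in visited:
            continue
        # Breadth-first search over occupied boxes reachable from `start`
        # without crossing an empty box.
        visited.add(start)
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for idx in cells[cell]:
                labels[idx] = next_label
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbour = tuple(c + o for c, o in zip(cell, offset))
                if neighbour in cells and neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
        next_label += 1
    return labels


# Hypothetical example: two groups separated by empty boxes.
demo_points = [(0.1, 0.1), (0.9, 0.4), (1.2, 0.3), (5.0, 5.0), (5.5, 5.2)]
demo_labels = grid_cluster(demo_points, box_size=1.0)
```

Because clusters are connected components of occupied boxes, their shape is not restricted to globular forms, which matches the motivation stated in the abstract.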

Deep Hybrid Modeling Of Neuronal Dynamics Using Generative Adversarial Networks, Soheil Saghafi May 2023

Dissertations

Mechanistic modeling and machine learning methods are powerful techniques for approximating biological systems and making accurate predictions from data. However, when used in isolation these approaches suffer from distinct shortcomings: model and parameter uncertainty limit mechanistic modeling, whereas machine learning methods disregard the underlying biophysical mechanisms. This dissertation constructs Deep Hybrid Models that address these shortcomings by combining deep learning with mechanistic modeling. In particular, this dissertation uses Generative Adversarial Networks (GANs) to provide an inverse mapping of data to mechanistic models and identifies the distributions of mechanistic model parameters coherent to the data.

Chapter 1 provides background information on …

Crow Search Algorithm With Time Varying Flight Length Strategies For Feature Selection, Mohammed Abdullahi, Abdulhameed Adamu, Ibrahim Hayatu Hassan Jan 2023

Future Computing and Informatics Journal

Feature Selection (FS) is an efficient technique used to get rid of irrelevant, redundant, and noisy attributes in high-dimensional datasets while increasing the efficacy of machine learning classification. The Crow Search Algorithm (CSA) is a simple and efficient metaheuristic algorithm which has been used to overcome several FS issues. The flight length (fl) parameter in CSA governs crows' search ability. In CSA, fl is set to a fixed value. As a result, the CSA is prone to becoming trapped in local minima. This article suggests a remedy to this issue by introducing five new concepts of time-dependent fl …

Optimal Design And Operation Of Integrated Hydrogen Generation And Utilization Plants, Ijiwole Solomon Ijiyinka Jan 2023

Graduate Theses, Dissertations, and Problem Reports

There are considerable efforts worldwide to reduce the use of fossil fuels for energy production. While renewable energy sources are being increasingly used, fossil fuels still contribute about 80% of the energy used worldwide. As a result, the level of CO₂ in the atmosphere is still increasing fast and currently exceeds about 410 parts per million (ppm). To reduce CO₂ buildup in the atmosphere, various approaches are being investigated. For the electric power generation sector, two key approaches are post-combustion CO₂ capture and use of hydrogen as a fuel for power generation. These two solutions can also …

Debiasing Cyber Incidents – Correcting For Reporting Delays And Under-Reporting, Seema Sangari Aug 2022

Doctor of Data Science and Analytics Dissertations

This research addresses two key problems in the cyber insurance industry: reporting delays and under-reporting of cyber incidents. Both problems must be addressed to understand the true picture of cyber incident rates. Reporting delays arise from delays in the timely detection of incidents, whereas under-reporting arises because cyber incidents frequently go unreported due to brand damage, reputation risk, and eventual financial impacts.

The problem of reporting delays in cyber incidents is resolved by generating the distribution of reporting delays and fitting modeled parametric distributions on the given domain. The reporting delay distribution was found to …

Tempering The Adversary: An Exploration Into The Applications Of Game Theoretic Feature Selection And Regression, Stephen Mcgee Aug 2022

All Dissertations

Most modern machine learning algorithms tend to focus on an "average-case" approach, where every data point contributes the same amount of influence towards calculating the fit of a model. This "per-data point" error (or loss) is averaged together into an overall loss and typically minimized with an objective function. However, this can be insensitive to valuable outliers. Inspired by game theory, the goal of this work is to explore the utility of incorporating an optimally-playing adversary into feature selection and regression frameworks. The adversary assigns weights to the data elements so as to degrade the modeler's performance in an optimal …

Anomaly Detection In Sequential Data: A Deep Learning-Based Approach, Jayesh Soni Jun 2022

FIU Electronic Theses and Dissertations

Anomaly detection has been researched in various domains with several applications in intrusion detection, fraud detection, system health management, and bio-informatics. Conventional anomaly detection methods analyze each data instance independently (univariate or multivariate) and ignore the sequential characteristics of the data. Some anomalies can be detected only by grouping the individual data instances into sequential data, and hence the conventional way of analyzing independent data instances cannot detect them. Currently: (1) Deep learning-based algorithms are widely used for anomaly detection purposes. However, significant computational overhead time is incurred during the training process due to static constant batch size and learning …

Prediction Of Iraqi Stock Exchange Using Optimized Based-Neural Network, Ameer Al-Haq Al-Shamery, Prof. Dr. Eman Salih Al-Shamery Dec 2021

Karbala International Journal of Modern Science

Stock market prediction is an interesting financial topic that has attracted the attention of researchers in recent years. This paper aims at improving the prediction of the Iraq Stock Exchange (ISX) using a developed method of feedforward neural networks based on the Quasi-Newton optimization approach. The proposed method reduces the error factor depending on the Jacobian vector and Lagrange multiplier. This improvement has led to accelerated convergence during the learning process. A sample of companies listed on the ISX was selected, comprising twenty-six banks over the years 2010 to 2020. To evaluate the proposed model, the research findings are compared with …

Plant Species Identification In The Wild Based On Images Of Organs, Meghana Kovur Jan 2021

Graduate Theses, Dissertations, and Problem Reports

Image-based plant species identification in the wild is a difficult problem for several reasons. First, the input data is subject to a very high degree of variability because it is captured under fully unconstrained conditions. The same plant species may look very different in different images, while different species can often appear very similar, challenging even the recognition skills of human experts in the field. The large intra-class and small inter-class image variability makes this a fine-grained visual classification problem. One way to cope with this variability and to reduce image background noise is to predict species based on the …

Big Data, Spatial Optimization, And Planning, Kai Cao, Wenwen Li, Richard Church Jul 2020

Research Collection School Of Computing and Information Systems

Spatial optimization represents a set of powerful spatial analysis techniques that can be used to identify optimal solution(s) and even generate a large number of competitive alternatives. The formulation of such problems involves maximizing or minimizing one or more objectives while satisfying a number of constraints. Solution techniques range from exact models solved with such approaches as linear programming and integer programming to heuristic algorithms, e.g., Tabu Search, Simulated Annealing, and Genetic Algorithms. Spatial optimization techniques have been utilized in numerous planning applications, such as location-allocation modeling/site selection, land use planning, school districting, regionalization, routing, and urban design. These methods …

Knowledge Management Overview Of Feature Selection Problem In High-Dimensional Financial Data: Cooperative Co-Evolution And Map Reduce Perspectives, A. N. M. Bazlur Rashid, Tonmoy Choudhury Jan 2019

Research outputs 2014 to 2021

The term "big data" characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features, …
