Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 12 of 12

Full-Text Articles in Physical Sciences and Mathematics

Evaluation Of Evapotranspiration Estimates Using An Existing Hybrid Machine Learning Model In A Natural And A Managed Dryland Site, Katya Esquivel Herrera Dec 2023

Open Access Theses & Dissertations

Evapotranspiration (ET) is a critical component of the hydrologic cycle, encompassing both evaporative water loss from surfaces and transpiration through plant stomata. The environmental factors influencing ET include water and energy availability, atmospheric capacity for water uptake, and various meteorological variables. ET serves as a unique climate variable linking water, energy, and carbon cycles. In agroecosystems, accurate ET quantification is vital for optimizing water use efficiency, irrigation management, and crop yield. Traditional methods for ET estimation involve direct measurements and indirect models, with both presenting limitations.

Recent years have witnessed the integration of remote sensing and machine learning (ML) algorithms …


Increasing The Efficiency And Accuracy Of Collective Intelligence Methods For Image Classification, Md Mahmudulla Hassan Aug 2023

Open Access Theses & Dissertations

Collective intelligence has emerged as a powerful methodology for annotating and classifying challenging data that pose difficulties for automated classifiers. It works by leveraging the concept of "wisdom of the crowds" which approximates a ground truth after aggregating experts' feedback and filtering out noise. However, challenges arise when certain applications, such as medical image classification, security threat detection, and financial fraud detection, demand accurate and reliable data annotation. The unreliability of experts due to inconsistent expertise and competencies, coupled with the associated cost and time-consuming judgment extraction, presents additional challenges.

Input aggregation is the process of consolidating and combining multiple …
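As a rough illustration of input aggregation, the sketch below combines several annotators' labels per item by simple majority vote and reports the share of annotators who agreed. Majority voting is only one common baseline, assumed here for illustration; it is not necessarily the aggregation method studied in the thesis, and the function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(annotations):
    """Aggregate labels per item by majority vote.

    annotations: dict mapping an item id to a list of labels, one per annotator.
    Returns a dict mapping each item id to (winning_label, agreement_ratio).
    Ties are resolved in favor of the label that appears first in the list.
    """
    aggregated = {}
    for item, labels in annotations.items():
        label, count = Counter(labels).most_common(1)[0]
        aggregated[item] = (label, count / len(labels))
    return aggregated
```

A low agreement ratio flags items where the annotators disagree and where more feedback, or a weighting scheme that accounts for annotator reliability, may be needed.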


Generation Of Phase Transitions Boundaries Via Convolutional Neural Networks, Christopher Alexis Ibarra Dec 2022

Open Access Theses & Dissertations

Accurate mapping of phase transition boundaries is crucial for modeling the equation of state of materials. Phase transitions can be structural (solid-solid), driven by temperature or pressure, or a phase change such as melting, which defines the solid-liquid melt line. There exist many computational methods for evaluating the phase diagram at a particular point in temperature (T) and pressure (P). Most of these methods involve evaluation of a single (P,T) point at a time. The present work partially automates the search for phase boundary lines utilizing a machine learning method based on convolutional neural networks and an efficient search …


Gene Selection And Classification In High-Throughput Biological Data With Integrated Machine Learning Algorithms And Bioinformatics Approaches, Abhijeet R Patil May 2021

Open Access Theses & Dissertations

With the rise of high-throughput technologies in biomedical research, large volumes of expression profiling, methylation profiling, and RNA-sequencing data are being generated. These high-dimensional data have a large number of features and a small number of samples, a characteristic called the "curse of dimensionality." The selection of optimal features, which largely affects the performance of classification algorithms in machine learning models, has led to challenging problems in bioinformatics analyses of such high-dimensional datasets. In this work, I focus on the design of two-stage frameworks of feature selection and classification and their applications in multiple sets of colorectal cancer data. The first …


Comparing Predictive Performance Of Statistical Learning Models On Medical Data, Francis Biney Jan 2020

Open Access Theses & Dissertations

This work investigates the predictive performance of 10 machine learning models on three medical datasets: breast cancer, heart disease, and prostate cancer. Furthermore, we use the models to identify risk factors that contribute significantly to these diseases.

The models considered include logistic regression with L1 and L2 penalties, principal component logistic regression (PCR-LR), partial least squares logistic regression (PLS-LR), multivariate adaptive regression splines (MARS), support vector machine with radial basis kernel (SVM-RBK), random forest (RF), gradient boosting machines (GBM), elastic net (Enet), and feedforward neural network (FFNN). The models were grouped according to their similarities and learning style: i) Linear regularized models: LR-Lasso, LR-Ridge and …


Using Machine Learning On An Imbalanced Cancer Dataset, James Ekow Arthur Jan 2020

Open Access Theses & Dissertations

With an estimated 1.4 million cancer diagnoses worldwide and increasing deaths among cancer patients, it is prudent to investigate methods, approaches, and smarter ways of predicting and diagnosing cancer, so that holistic techniques can be used to reduce false predictions, increase correct predictions, and provide meticulous prognosis information.

Can a feasible technique be developed for the general problem of cancer prognosis and diagnosis?

We will show here that this problem of cancer prognosis and diagnosis can be efficiently tackled with the aid of machine learning techniques and the best, …


Forecasting Crashes, Credit Card Default, And Imputation Analysis On Missing Values By The Use Of Neural Networks, Jazmin Quezada Jan 2019

Open Access Theses & Dissertations

A neural network is a system of hardware and/or software patterned after the operation of neurons in the human brain. Neural networks, also called artificial neural networks, are a variety of deep learning technology, which in turn falls under the umbrella of artificial intelligence (AI). Recent studies show that artificial neural networks achieve the highest coefficient of determination (a measure of how well a model explains and predicts outcomes) in comparison to K-nearest neighbor classifiers, logistic regression, discriminant analysis, naive Bayesian classifiers, and classification trees. In this work, the theoretical description of the neural network methodology …


Estimating The Optimal Cutoff Point For Logistic Regression, Zheng Zhang Jan 2018

Open Access Theses & Dissertations

Binary classification is one of the main themes of supervised learning. This research is concerned with determining the optimal cutoff point for continuous-scaled outcomes (e.g., predicted probabilities) resulting from a classifier such as logistic regression. We note that the cutoff point obtained from various methods is a statistic, which can be unstable with substantial variation. Nevertheless, due partly to the complexity involved in estimating the cutpoint, there has been no formal study of the variance or standard error of the estimated cutoff point.

In this thesis, a bootstrap aggregation method is put forward to estimate the …
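To make the idea of a cutoff point as a statistic concrete, the following sketch picks a cutoff by maximizing Youden's J (sensitivity + specificity - 1) and then resamples the data to obtain a bootstrap-averaged cutoff and its standard error. The choice of Youden's J, the function names, and the defaults `n_boot=200` and `seed=0` are assumptions made for illustration; the thesis may use a different criterion and a different bootstrap scheme.

```python
import random


def youden_cutoff(scores, labels):
    """Return the threshold that maximizes sensitivity + specificity - 1.

    scores: predicted probabilities; labels: 0/1 outcomes (both classes present).
    An observation is classified positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


def bootstrap_cutoff(scores, labels, n_boot=200, seed=0):
    """Return (mean cutoff, standard error) over n_boot bootstrap resamples."""
    rng = random.Random(seed)
    n = len(scores)
    cuts = []
    while len(cuts) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            cuts.append(youden_cutoff([scores[i] for i in idx], ys))
    mean = sum(cuts) / n_boot
    variance = sum((c - mean) ** 2 for c in cuts) / (n_boot - 1)
    return mean, variance ** 0.5
```

The spread of the resampled cutoffs gives a direct view of how unstable a single estimated cutoff can be, especially on small samples.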


Deep Learning Method Vs. Hand-Crafted Features For Lung Cancer Diagnosis And Breast Cancer Risk Analysis, Wenqing Sun Jan 2017

Open Access Theses & Dissertations

Breast cancer and lung cancer are two leading causes of cancer deaths, and researchers have been developing computer-aided diagnosis (CAD) systems to diagnose them automatically for decades. In recent studies, we found that techniques used in CAD systems, such as feature design and machine learning, can also be applied to breast cancer risk analysis. We also noticed that, with the development of deep learning methods, the performance of CAD systems can be improved by using automatically generated features. To explore these possibilities, we conducted a series of studies: the first two studies focused on transferring the original CAD …


Forecasting Customer Electricity Load Demand In The Power Trading Agent Competition Using Machine Learning, Saiful Abu Jan 2016

Open Access Theses & Dissertations

Accurate electricity load demand forecasting is an important problem in managing the power grid for both economic and environmental reasons. The Power TAC simulation provides a platform to do research on smart grid energy generation and distribution systems. Brokers are the focus of the design task posed to developers by the system. The brokers work as self-interested entities that try to maximize profits by trading electricity across multiple markets. To be successful, a broker has to forecast the electricity demand for customers as accurately as possible so it can use this information to operate efficiently. My proposed forecasting method uses …


Assessing Data Quality In A Sensor Network For Environmental Monitoring, Gesuri Ramirez Jan 2011

Open Access Theses & Dissertations

Assessing the quality of sensor data in environmental monitoring applications is important, as erroneous readings produced by malfunctioning sensors, calibration drift, and problematic climatic conditions, such as icing or dust, are common. Traditional data quality checking and correction is a painstaking manual process, so the development of automatic systems for this task is highly desirable.

This study investigates machine learning methods to identify and clean incorrect data from a real-world environmental sensor network, the Jornada Experimental Range, located in Southern New Mexico. We evaluated several learning algorithms and data replacement schemes, and developed a method to identify the problematic sensor. The …


Algorithms For Training Large-Scale Linear Programming Support Vector Regression And Classification, Pablo Rivas Perea Jan 2011

Open Access Theses & Dissertations

The main contribution of this dissertation is the development of a method to train a Support Vector Regression (SVR) model for the large-scale case where the number of training samples exceeds the available computational resources. The proposed scheme consists of posing the SVR problem entirely as a Linear Programming (LP) problem and developing a sequential optimization method based on variable decomposition, constraint decomposition, and the use of primal-dual interior point methods. Experimental results demonstrate that the proposed approach has comparable performance with other SV-based classifiers. In particular, experiments demonstrate that as the problem size increases, the sparser the solution …