Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 19 of 19
Full-Text Articles in Physical Sciences and Mathematics
Predictors Of COVID-19 Vaccination Rate In USA: A Machine Learning Approach, Syed M. I. Osman, Ahmed Sabit
WCBT Faculty Publications
In this study, we examine state-level features and policies that are most important in achieving a threshold-level vaccination rate to curb the effects of the COVID-19 pandemic. We employ CHAID, a decision tree algorithm, on three different model specifications to answer this question based on a dataset that includes all the states in the United States. Workplace travel emerges as the most important predictor; however, the governors’ political affiliation (PA) replaces it in a more conservative feature set that includes economic features and the growth rate of COVID-19 cases. We also employ several alternative algorithms as a robustness check. …
Machine Learning Model Comparison And ARMA Simulation Of Exhaled Breath Signals Classifying COVID-19 Patients, Aaron Christopher Segura
Mathematics & Statistics ETDs
This study compared the performance of machine learning models in classifying COVID-19 patients using exhaled breath signals and simulated datasets. Ground truth classification was determined by the gold standard Polymerase Chain Reaction (PCR) test results. A residual bootstrapped method generated the simulated datasets by fitting signal data to Autoregressive Moving Average (ARMA) models. Classification models included neural networks, k-nearest neighbors, naïve Bayes, random forest, and support vector machines. A Recursive Feature Elimination (RFE) study was performed to determine if reducing signal features would improve the classification models' performance using Gini Importance scoring for the two classes. The top 25% of …
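The residual bootstrap described in the abstract above can be illustrated with a minimal sketch. It assumes an AR(1) model fitted by least squares as a simplified stand-in for the ARMA models used in the thesis; the function names `fit_ar1` and `residual_bootstrap` and the toy signal are hypothetical and do not come from the source.

```python
import random


def fit_ar1(x):
    """Least-squares AR(1) fit on the mean-centered series (assumed simplification of ARMA)."""
    mean = sum(x) / len(x)
    c = [v - mean for v in x]
    num = sum(c[t] * c[t - 1] for t in range(1, len(c)))
    den = sum(c[t - 1] ** 2 for t in range(1, len(c)))  # assumes a non-constant series
    phi = num / den
    residuals = [c[t] - phi * c[t - 1] for t in range(1, len(c))]
    return mean, phi, residuals


def residual_bootstrap(x, rng):
    """Simulate one series of the same length by resampling the fitted residuals."""
    mean, phi, residuals = fit_ar1(x)
    r_mean = sum(residuals) / len(residuals)
    centered = [r - r_mean for r in residuals]  # center residuals before resampling
    sim = [x[0] - mean]  # start from the observed first value
    for _ in range(len(x) - 1):
        sim.append(phi * sim[-1] + rng.choice(centered))
    return [v + mean for v in sim]


# Hypothetical toy signal, for illustration only.
signal = [1.0, 1.4, 0.9, 1.2, 1.6, 1.1, 0.8, 1.3]
simulated = residual_bootstrap(signal, random.Random(0))
```

Passing an explicit `random.Random` instance keeps each simulated dataset reproducible, which is convenient when the simulated series are later fed to several classifiers for comparison.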
Contributions To Random Forest Variable Importance With Applications In R, Kelvyn K. Bladen
All Graduate Theses and Dissertations, Spring 1920 to Summer 2023
A major focus in statistics is building and improving computational algorithms that can use data to predict a response. Two fundamental camps of research arise from such a goal. The first camp is researching ways to get more accurate predictions. Many sophisticated methods, collectively known as machine learning methods, have been developed for this very purpose. One such method that is widely used across industry and many other areas of investigation is called Random Forests.
The second camp of research is that of improving the interpretability of machine learning methods. This is worthy of attention when analysts desire to optimize …
Stability And Differential Privacy Of Stochastic Gradient Methods, Zhenhuan Yang
Legacy Theses & Dissertations (2009 - 2024)
Recently, a considerable amount of work has been devoted to the study of algorithmic stability as well as differential privacy (DP) for stochastic gradient methods (SGM). However, most of the existing work focuses on the empirical risk minimization (ERM) and population risk minimization problems. In this paper, we study two types of optimization problems that enjoy wide applications in modern machine learning, namely the minimax problem and the pairwise learning problem.
Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi
Mathematics & Statistics ETDs
The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. Piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of its wide range of device applicability. In this research, using rigorous DFT calculations, high-throughput ab initio simulations for 23 AlN alloys are generated.
This research is the first to report strong enhancements of piezoelectric properties …
A Course In Data Science: R And Prediction Modeling, Adam Kapelner
Open Educational Resources
This is a self-contained course in data science and machine learning using R. It covers philosophy of modeling with data, prediction via linear models, machine learning including support vector machines and random forests, probability estimation and asymmetric costs using logistic regression and probit regression, underfitting vs. overfitting, model validation, handling missingness and much more. There is formal instruction in data manipulation using dplyr and data.table, visualization using ggplot2 and statistical computing.
Generating A Dataset For Comparing Linear Vs. Non-Linear Prediction Methods In Education Research, Jack Mauro, Elena Martinez, Anna Bargagliotti
Honors Thesis
Machine learning is often used to build predictive models by extracting patterns from large data sets. Such techniques are increasingly being utilized to predict outcomes in the social sciences. One such application is predicting student success. Machine learning can be applied to predicting student acceptance and success in academia. Using these tools for education-related data analysis may enable the evaluation of programs, resources and curriculum. Currently, research is needed to examine application, admissions, and retention data in order to address equity in college computer science programs. However, most student-level data sets contain sensitive data that cannot be made public. To …
Dataset Evaluation For Data Trading Using Expected Loss And Homomorphic Encryption, Minsung Joo
Senior Honors Papers / Undergraduate Theses
Supervised machine learning suffers from the “garbage-in, garbage-out” phenomenon where the performance of a model is limited by the quality of the data. While a myriad of data is collected every second, there is no general rigorous method of evaluating the quality of a given dataset. This hinders fair pricing of data in scenarios where a buyer may look to buy data for use with machine learning. In this work, I propose using the expected loss corresponding to a dataset as a measure of its quality, relying on Bayesian methods for uncertainty quantification. Furthermore, I present a secure multi-party computation …
Intraday Algorithmic Trading Using Momentum And Long Short-Term Memory Network Strategies, Andrew R. Whitinger II
Undergraduate Honors Theses
Intraday stock trading is an infamously difficult and risky strategy. Momentum and reversal strategies and long short-term memory (LSTM) neural networks have been shown to be effective for selecting stocks to buy and sell over time periods of multiple days. To explore whether these strategies can be effective for intraday trading, their implementations were simulated using intraday price data for stocks in the S&P 500 index, collected at 1-second intervals between February 11, 2021 and March 9, 2021 inclusive. The study tested 160 variations of momentum and reversal strategies for profitability in long, short, and market-neutral portfolios, totaling 480 portfolios. …
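As a rough illustration of the momentum and reversal strategies named in the abstract above, the sketch below ranks stocks by trailing return and forms a market-neutral portfolio. It is a generic textbook formulation, not the thesis's implementation; the function names, the `lookback` and `k` parameters, and the toy prices are all hypothetical.

```python
def momentum_portfolio(prices, lookback, k):
    """Long the k best and short the k worst tickers by trailing return.

    prices: dict mapping ticker -> list of prices, oldest first.
    lookback: number of price intervals used to compute the trailing return.
    """
    returns = {}
    for ticker, series in prices.items():
        if len(series) <= lookback:
            continue  # not enough history for this ticker
        returns[ticker] = series[-1] / series[-1 - lookback] - 1.0
    ranked = sorted(returns, key=returns.get, reverse=True)
    k = min(k, len(ranked) // 2)  # keep the long and short legs disjoint
    longs = ranked[:k]
    shorts = ranked[len(ranked) - k:]
    return longs, shorts


def reversal_portfolio(prices, lookback, k):
    """Reversal takes the opposite side: long recent losers, short recent winners."""
    longs, shorts = momentum_portfolio(prices, lookback, k)
    return shorts, longs


# Hypothetical toy prices, for illustration only.
toy_prices = {
    "A": [100.0, 110.0],
    "B": [100.0, 90.0],
    "C": [100.0, 100.0],
    "D": [100.0, 105.0],
}
long_leg, short_leg = momentum_portfolio(toy_prices, lookback=1, k=1)
```

A long-only or short-only portfolio, as mentioned in the abstract, would use just one of the two returned legs.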
Early-Warning Alert Systems For Financial-Instability Detection: An HMM-Driven Approach, Xing Gu
Electronic Thesis and Dissertation Repository
Regulators’ early intervention is crucial when the financial system is experiencing difficulties. Financial stability must be preserved to avert banks’ bailouts, which hugely drain the government’s financial resources. Detecting periods of financial crisis in advance entails the development and customisation of accurate and robust quantitative techniques. The goal of this thesis is to construct automated systems via the interplay of various mathematical and statistical methodologies to signal financial instability episodes in the near-term horizon. These signal alerts could provide regulatory bodies with the capacity to initiate an appropriate response that will thwart or at least minimise the occurrence of a financial crisis. …
Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano
Electrical and Computer Engineering ETDs
Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …
Using Fine-Scale Aquatic Habitat Data To Construct Dreissenid SDMs In The Laurentian Great Lakes, Grace C. Henderson
USF Tampa Graduate Theses and Dissertations
The invasion of the Laurentian Great Lakes by aquatic invasive species (AIS) has been the subject of investigation for decades, due to their dramatic alterations to the ecosystem and high economic costs. Two AIS with the largest impacts are dreissenid zebra and quagga mussels, and though these species have been studied extensively, questions remain about what factors control their distributions, and whether lake warming will alter these distributions. Species distribution models (SDMs) offer a powerful tool to examine the relationship between species presences and environmental variables, which are typically bioclimatic data. The creation of the Aquatic Habitat (AqHab) dataset containing …
Volitional Control Of Lower-Limb Prosthesis With Vision-Assisted Environmental Awareness, S M Shafiul Hasan
FIU Electronic Theses and Dissertations
Early and reliable prediction of a user’s intention to change locomotion mode or speed is critical for a smooth and natural lower limb prosthesis. Meanwhile, incorporation of explicit environmental feedback can facilitate a context-aware intelligent prosthesis that allows seamless operation in a variety of gait demands. This dissertation introduces environmental awareness through computer vision and enables early and accurate prediction of intention to start, stop or change speeds while walking. Electromyography (EMG), Electroencephalography (EEG), Inertial Measurement Unit (IMU), and Ground Reaction Force (GRF) sensors were used to predict intention to start, stop or increase walking speed. Furthermore, it was investigated whether …
A Keyword-Enhanced Approach To Handle Class Imbalance In Clinical Text Classification, Andrew E. Blanchard, Shang Gao, Hong Jun Yoon, J. Blair Christian, Eric B. Durbin, Xiao Cheng Wu, Antoinette Stroup, Jennifer Doherty, Stephen M. Schwartz, Charles Wiggins, Linda Coyle, Lynne Penberthy, Georgia D. Tourassi
School of Public Health Faculty Publications
Recent applications of deep learning have shown promising results for classifying unstructured text in the healthcare domain. However, the reliability of models in production settings has been hindered by imbalanced data sets in which a small subset of the classes dominates. In the absence of adequate training data, rare classes necessitate additional model constraints for robust performance. Here, we present a strategy for incorporating short sequences of text (i.e. keywords) into training to boost model accuracy on rare classes. In our approach, we assemble a set of keywords, including short phrases, associated with each class. The keywords are then used as …
A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo
Theses, Dissertations and Capstones
Cyberattack is a never-ending war that has greatly threatened secured information systems. The development of automated and intelligent systems provides more computing power to hackers to steal information and destroy data or system resources, and has raised global security issues. Statistical and data mining tools have received continuous research and improvements. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, the advancement in technology and accessibility of information creates more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …
Framework For The Evaluation Of Perturbations In The Systems Biology Landscape And Inter-Sample Similarity From Transcriptomic Datasets — A Digital Twin Perspective, Mariah Marie Hoffman
Dissertations and Theses
One approach to interrogating the complexities of human systems in their well-regulated and dysregulated states is through the use of digital twins. Digital twins are virtual representations of physical systems that are descriptive of an individual's state of health, an object fundamentally related to precision medicine. A key element for building a functional digital twin type for a disease or predicting the therapeutic efficacy of a potential treatment is harmonized, machine-parsable domain knowledge. Hypothesis-driven investigations are the gold standard for representing subsystems, but their results encompass a limited knowledge of the full biosystem. Multi-omics data is one rich source of …
Finding The Best Predictors For Foot Traffic In US Seafood Restaurants, Isabel Paige Beaulieu
Honors Theses and Capstones
COVID-19 caused state and nation-wide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …
Exploring Cyberterrorism, Topic Models And Social Networks Of Jihadists Dark Web Forums: A Computational Social Science Approach, Vivian Fiona Guetler
Graduate Theses, Dissertations, and Problem Reports
This three-article dissertation focuses on cyber-related topics on terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists' Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in …
A Non-Deterministic Deep Learning Based Surrogate For Ice Sheet Modeling, Hannah Jordan
Graduate Student Theses, Dissertations, & Professional Papers
Surrogate modeling is a new and expanding field in the world of deep learning, providing a computationally inexpensive way to approximate results from computationally demanding high-fidelity simulations. Ice sheet models are among these computationally expensive simulations; the model used in this study currently requires between 10 and 20 minutes to complete one simulation. While this run time is adequate for certain applications, using sampling approaches to perform statistical inference becomes infeasible. This issue can be overcome by using a surrogate model to approximate the ice sheet model, bringing the time to produce output down to a tenth …