Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 7 of 7

Full-Text Articles in Physical Sciences and Mathematics

Machine Learning With Topological Data Analysis, Ephraim Robert Love May 2021

Doctoral Dissertations

Topological Data Analysis (TDA) is a relatively new focus in the fields of statistics and machine learning. Methods of exploiting the geometry of data, such as clustering, have proven theoretically and empirically invaluable. TDA provides a general framework within which to study topological invariants (shapes) of data, which are more robust to noise and can recover information on higher-dimensional features than is immediately apparent in the data. A common tool for conducting TDA is persistent homology, which measures the significance of these invariants. Persistent homology has prominent realizations in methods of data visualization, statistics, and machine learning. Extending ML with …


The Effect Of Initial Conditions On The Weather Research And Forecasting Model, Aaron D. Baker May 2021

Electronic Theses and Dissertations

Modeling our atmosphere and determining forecasts using numerical methods has been a challenge since the early 20th century. Most models use a complex dynamical system of equations that proves difficult to solve by hand because it is chaotic by nature. As computer systems became more widely adopted and available and computational power increased, approximating the solutions of these equations numerically became easier. This advancement in computing has led to numerous weather models being created and implemented across the world. However, approximating these solutions accurately remains a challenge, as each model has a varying set of equations and variables to …


Clustering Web Users By Mouse Movement To Detect Bots And Botnet Attacks, Justin L. Morgan Mar 2021

Master's Theses

Efficiently and accurately detecting the presence of web bots has proven to be a challenging problem for website administrators. As the sophistication of modern web bots increases, specifically their ability to more closely mimic the behavior of humans, web bot detection schemes become obsolete more quickly because they fail to maintain effectiveness. Though machine learning-based detection schemes have been a successful approach in recent implementations, web bots are able to apply similar machine learning tactics to mimic human users, thus bypassing such detection schemes. This work seeks to address the issue of machine learning-based bots bypassing …


Machine Learning Morphisms: A Framework For Designing And Analyzing Machine Learning Workflows, Applied To Separability, Error Bounds, And 30-Day Hospital Readmissions, Eric Zenon Cawi Jan 2021

McKelvey School of Engineering Theses & Dissertations

A machine learning workflow is the sequence of tasks necessary to implement a machine learning application, including data collection, preprocessing, feature engineering, exploratory analysis, and model training/selection. In this dissertation we propose the Machine Learning Morphism (MLM) as a mathematical framework to describe the tasks in a workflow. The MLM is a tuple consisting of: Input Space, Output Space, Learning Morphism, Parameter Prior, Empirical Risk Function. This contains the information necessary to learn the parameters of the learning morphism, which represents a workflow task. In chapter 1, we give a short review of typical tasks present in a workflow, as …
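The five components listed in the abstract above can be pictured as a small record type. The following is only an illustrative sketch, not the dissertation's formalism: the class name `MLM`, the field types, the choice to add the prior as a penalty to the empirical risk, and the grid search are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, float]  # (input, target) pair; assumed for this sketch


@dataclass(frozen=True)
class MLM:
    """Hypothetical container mirroring the five components named in the abstract."""
    input_space: str                                   # description of the input space
    output_space: str                                  # description of the output space
    learning_morphism: Callable[[float, float], float]  # (theta, x) -> prediction
    parameter_prior: Callable[[float], float]          # theta -> penalty (assumed: negative log prior)
    empirical_risk: Callable[[Callable[[float], float], Sequence[Sample]], float]


def mean_squared_error(predict: Callable[[float], float], data: Sequence[Sample]) -> float:
    """Average squared difference between predictions and targets."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def objective(mlm: MLM, theta: float, data: Sequence[Sample]) -> float:
    """Assumed objective: empirical risk plus the prior penalty."""
    def predict(x: float) -> float:
        return mlm.learning_morphism(theta, x)
    return mlm.empirical_risk(predict, data) + mlm.parameter_prior(theta)


def fit_on_grid(mlm: MLM, data: Sequence[Sample], candidates: Sequence[float]) -> float:
    """Pick the candidate parameter with the lowest objective value."""
    return min(candidates, key=lambda theta: objective(mlm, theta, data))


# One workflow task: scale a real input by a learned factor.
scaling_task = MLM(
    input_space="R",
    output_space="R",
    learning_morphism=lambda theta, x: theta * x,
    parameter_prior=lambda theta: 0.0,  # flat prior, so no penalty
    empirical_risk=mean_squared_error,
)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]       # targets are exactly 2 * input
candidates = [k / 10 for k in range(0, 41)]       # 0.0, 0.1, ..., 4.0
best_theta = fit_on_grid(scaling_task, data, candidates)
```

Keeping the prior and the risk as separate fields lets the same task definition be reused with a different regularizer or loss without touching the morphism itself.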


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and the analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


Statistical Modeling Of HPC Performance Variability And Communication, Jered B. Dominguez-Trujillo Jan 2021

Computer Science ETDs

Understanding the performance of parallel and distributed programs remains a focal point in determining how compute systems can be optimized to achieve exascale performance. Lightweight statistical models allow developers to both characterize and predict performance trade-offs, especially as HPC systems become more heterogeneous with many-core CPUs and GPUs. This thesis presents a lightweight statistical approach to modeling performance variation that leverages extreme value theory by focusing on the maximum length of distributed workload intervals. This approach was implemented in MPI and evaluated on several HPC systems and workloads. I then present a performance model of partitioned communication which also uses …
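To see why the maximum interval length is the quantity of interest, consider a synchronized step that finishes only when the slowest rank finishes. The sketch below is not the thesis's model or its MPI implementation; it assumes, purely for illustration, that per-rank work times are independent exponential draws with mean 1, for which the expected maximum over n ranks is the harmonic number H_n. The function names are hypothetical.

```python
import random


def harmonic(n: int) -> float:
    """H_n = 1 + 1/2 + ... + 1/n, the expected maximum of n unit-mean exponentials."""
    return sum(1.0 / k for k in range(1, n + 1))


def mean_interval_length(num_ranks: int, trials: int, rng: random.Random) -> float:
    """Average, over many simulated steps, of the slowest rank's work time."""
    total = 0.0
    for _ in range(trials):
        # A synchronized step lasts as long as its slowest rank.
        total += max(rng.expovariate(1.0) for _ in range(num_ranks))
    return total / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
rank_counts = (1, 4, 16)
simulated = {p: mean_interval_length(p, 20000, rng) for p in rank_counts}
expected = {p: harmonic(p) for p in rank_counts}

for p in rank_counts:
    print(f"ranks={p:2d}  simulated mean max={simulated[p]:.3f}  H_n={expected[p]:.3f}")
```

Under this assumed distribution the mean step length grows with the number of ranks even though each rank's average work is unchanged, which is the kind of effect a model of maxima is meant to capture.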


Computational Simulation And Analysis Of Neuroplasticity, Madison E. Yancey Jan 2021

Browse all Theses and Dissertations

Homeostatic synaptic plasticity is the process by which neurons alter their activity in response to changes in network activity. Neuroscientists attempting to understand homeostatic synaptic plasticity have developed three different mathematical methods to analyze collections of event recordings from neurons acting as a proxy for neuronal activity. These collections of events are from control data and treatment data, referring to the treatment of neuron cultures with pharmacological agents that augment or inhibit network activity. If the distribution of control events can be functionally mapped to the distribution of treatment events, a better understanding of the biological processes underlying homeostatic synaptic …