Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Articles 1 - 30 of 95

Full-Text Articles in Statistical Models

Model Selection Through Cross-Validation For Supervised Learning Tasks With Manifold Data, Derek Brown Jan 2024

The Journal of Purdue Undergraduate Research

No abstract provided.


Machine Learning Approaches For Cyberbullying Detection, Roland Fiagbe Jan 2024

Data Science and Data Mining

Cyberbullying refers to the act of bullying using electronic means and the internet. In recent years, this act has been identified as a major problem among young people and even adults. It can negatively impact one’s emotions and lead to adverse outcomes like depression, anxiety, harassment, and suicide, among others. This has led to the need to employ machine learning techniques to automatically detect cyberbullying and prevent it on various social media platforms. In this study, we want to analyze the combination of some Natural Language Processing (NLP) algorithms (such as Bag-of-Words and TF-IDF) with some popular machine learning …


Multiscale Modelling Of Brain Networks And The Analysis Of Dynamic Processes In Neurodegenerative Disorders, Hina Shaheen Jan 2024

Theses and Dissertations (Comprehensive)

The complex nature of the human brain, with its intricate organic structure and multiscale spatio-temporal characteristics ranging from synapses to the entire brain, presents a major obstacle in brain modelling. Capturing this complexity poses a significant challenge for researchers. The complex interplay of coupled multiphysics and biochemical activities within this intricate system shapes the brain's capacity, functioning within a structure-function relationship that necessitates a specific mathematical framework. Advanced mathematical modelling approaches that incorporate the coupling of brain networks and the analysis of dynamic processes are essential for advancing therapeutic strategies aimed at treating neurodegenerative diseases (NDDs), which afflict millions of …


Reducing Food Scarcity: The Benefits Of Urban Farming, S.A. Claudell, Emilio Mejia Dec 2023

Journal of Nonprofit Innovation

Urban farming can enhance the lives of communities and help reduce food scarcity. This paper presents a conceptual prototype of an efficient urban farming community that can be scaled for a single apartment building or an entire community across all global geoeconomic regions, including densely populated cities and rural, developing towns and communities. When deployed in coordination with smart crop choices, local farm support, and efficient transportation, the result isn’t just sustainability, but also increased fresh produce accessibility, optimized nutritional value, elimination of ‘forever chemicals’, reduced transportation costs, and global environmental benefits.

Imagine Doris, who is …


Langevin Dynamic Models For smFRET Dynamic Shift, David Frost, Keisha Cook Dr, Hugo Sanabria Dr Nov 2023

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Using Geographic Information To Explore Player-Specific Movement And Its Effects On Play Success In The NFL, Hayley Horn, Eric Laigaie, Alexander Lopez, Shravan Reddy Aug 2023

SMU Data Science Review

American Football is a billion-dollar industry in the United States. The analytical aspect of the sport is an ever-growing domain, with open-source competitions like the NFL Big Data Bowl accelerating this growth. With the amount of player movement during each play, tracking data can prove valuable in many areas of football analytics. While concussion detection, catch recognition, and completion percentage prediction are all existing use cases for this data, player-specific movement attributes, such as speed and agility, may be helpful in predicting play success. This research calculates player-specific speed and agility attributes from tracking data and supplements them with descriptive …


Stressor: An R Package For Benchmarking Machine Learning Models, Samuel A. Haycock Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Many discipline-specific researchers need a way to quickly compare the accuracy of their predictive models to other alternatives. However, many of these researchers are not experienced with multiple programming languages. Python has recently been the leader in machine learning functionality, which includes the PyCaret library that allows users to develop high-performing machine learning models with only a few lines of code. The goal of the stressor package is to help users of the R programming language access the advantages of PyCaret without having to learn Python. This allows the user to leverage R’s powerful data analysis workflows, while simultaneously …


Movie Recommender System Using Matrix Factorization, Roland Fiagbe May 2023

Data Science and Data Mining

Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. Our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …


UConn Baseball Batting Order Optimization, Gavin Rublewski May 2023

Honors Scholar Theses

Challenging conventional wisdom is at the very core of baseball analytics. Using data and statistical analysis, the sets of rules by which coaches make decisions can be justified, or possibly refuted. One of those sets of rules relates to the construction of a batting order. Through data collection, data adjustment, the construction of a baseball simulator, and the use of a Monte Carlo Simulation, I have assessed thousands of possible batting orders to determine the roster-specific strategies that lead to optimal run production for the 2023 UConn baseball team. This paper details a repeatable process in which basic player statistics …


Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of IBM Data Scientists, Graham Nash Apr 2023

Symposium of Student Scholars

Employee attrition is a relevant issue that every business employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can depend on a multitude of factors. As a result, employers need to develop methods by which they can measure attrition by evaluating several qualities of their employees. Factors like their age, years with the company, which department they work in, their level of education, their job role, and even their marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …


Using A Distributive Approach To Model Insurance Loss, Kayla Kippes Apr 2023

Student Research Submissions

Insurance loss is an unpredicted event that stands at the forefront of the insurance industry. Loss in insurance represents the costs or expenses incurred due to a claim. An insurance claim is a request for the insurance company to pay for damage caused to an individual’s property. Loss can be measured by how much money (the dollar amount) has been paid out by the insurance company to repair the damage or it can be measured by the number of claims (claim count) made to the insurance company. Insured events include property damage due to fire, theft, flood, a car accident, …


Bridging The Chasm Between Fundamental, Momentum, And Quantitative Investing, Allen Hoskins, Jeff Reed, Robert Slater Apr 2023

SMU Data Science Review

A chasm exists between the active public equity investment management industry's fundamental, momentum, and quantitative styles. In this study, the researchers explore ways to bridge this gap by leveraging domain knowledge, fundamental analysis, momentum, crowdsourcing, and data science methods. This research also seeks to test the developed tools and strategies during the volatile time period of 2020 and 2021.


Stochastic Optimization To Reduce Aircraft Taxi-In Time At IGIA, New Delhi, Rajib Das, Saileswar Ghosh, Rajendra Desai, Pijus Kanti Bhuin, Stuti Agarwal Jan 2023

International Journal of Aviation, Aeronautics, and Aerospace

Because the arrival times of flights are uncertain, pre-scheduled allocation of runways and stands, followed by first-come-first-served treatment, results in a sub-optimal allocation of runways and stands. This is the prime reason for the unusual delays in taxi-in times at IGIA, New Delhi.

We simulated the arrival pattern of aircraft and utilized stochastic optimization to arrive at the best runway-stands allocation for a day. Optimization is done using a GRG Non-Linear algorithm in the Frontline Systems Analytic Solver platform. We applied this model to eight representative scenarios of two different days. Our results show that without …


Functional Data Analysis Of COVID-19, Nichole L. Fluke Nov 2022

Mathematics & Statistics ETDs

This thesis deals with Functional Data Analysis (FDA) on COVID data. The data involves counts for new COVID cases, hospitalized COVID patients, and new COVID deaths. The data used is for all the states and regions in the United States. The data starts on March 1st, 2020 and goes through March 31st, 2021. The FDA smooths the data and looks to see if there are similarities or differences between the states and regions in the data. The data also shows which states and regions stand out from the others and which ones are similar. Also shown …


Applications Of Statistical Physics To Ecology: Ising Models And Two-Cycle Coupled Oscillators, Vahini Reddy Nareddy Oct 2022

Doctoral Dissertations

Many ecological systems exhibit noisy period-2 oscillations and, when they are spatially extended, they undergo a phase transition from synchrony to incoherence in the Ising universality class. Period-2 cycles have two possible phases of oscillations and can be represented as two states in bistable systems. Understanding the dynamics of ecological systems by representing their oscillations as bistable states and developing dynamical models using the tools from statistical physics to predict their future states is the focus of this thesis. As the ecological oscillators with two-cycle behavior undergo phase transitions in the Ising universality class, many features of synchrony and equilibrium …


Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi Jun 2022

Mathematics & Statistics ETDs

The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. Piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of its wide range of device applicability. In this research using rigorous DFT calculations, high throughput ab-initio simulations for 23 AlN alloys are generated.

This research is the first to report strong enhancements of piezoelectric properties …


Data Ethics: An Investigation Of Data, Algorithms, And Practice, Gabrialla S. Cockerell May 2022

Honors Projects

This paper encompasses an examination of defective data collection, algorithms, and practices that continue to be cycled through society under the illusion that all information is processed uniformly, and technological innovation consistently parallels societal betterment. However, vulnerable communities, typically the impoverished and racially discriminated, get ensnared in these harmful cycles due to their disadvantages. Their hindrances are reflected in their information due to the interconnectedness of data, such as race being highly correlated to wealth, education, and location. However, their information continues to be analyzed with the same measures as populations who are not significantly affected by racial bias. Not …


Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano Apr 2022

Electrical and Computer Engineering ETDs

Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …


Period Doubling Cascades From Data, Alexander Berliner Apr 2022

Undergraduate Honors Theses

Orbit diagrams of period doubling cascades represent systems going from periodicity to chaos. Here, we investigate whether a Gaussian process regression can be used to approximate a system from data and recover asymptotic dynamics in the orbit diagrams for period doubling cascades. To compare the orbits of a system to the approximation, we compute the Wasserstein metric between the point clouds of their orbits for varying bifurcation parameter values. Visually comparing the period doubling cascades, we note that the exact bifurcation values may shift, which is confirmed in the plots of the Wasserstein distance. This has implications for studying dynamics …


Session 5: Equipment Finance Credit Risk Modeling - A Case Study In Creative Model Development & Nimble Data Engineering, Edward Krueger, Landon Thompson, Josh Moore Feb 2022

SDSU Data Science Symposium

This presentation will focus first on providing an overview of Channel and the Risk Analytics team that performed this case study. Given that context, we’ll then dive into our approach for building the modeling development data set, techniques and tools used to develop and implement the model into a production environment, and some of the challenges faced upon launch. Then, the presentation will pivot to the data engineering pipeline. During this portion, we will explore the application process and what happens to the data we collect. This will include how we extract & store the data along with how it …


Finding The Best Predictors For Foot Traffic In US Seafood Restaurants, Isabel Paige Beaulieu Jan 2022

Honors Theses and Capstones

COVID-19 caused state and nation-wide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …


Realtime Event Detection In Sports Sensor Data With Machine Learning, Mallory Cashman Jan 2022

Honors Theses and Capstones

Machine learning models can be trained to classify time-series-based sports motion data, without reliance on assumptions about the capabilities of the users or sensors. This can be applied to predict the count of occurrences of an event in a time period. The experiment for this research uses lacrosse data, collected in partnership with SPAITR - a UNH undergraduate startup developing motion tracking devices for lacrosse. Decision Tree and Support Vector Machine (SVM) models are trained and perform with high success rates. These models improve upon previous work in human motion event detection and can be used as a reference …


Estimating The Statistics Of Operational Loss Through The Analyzation Of A Time Series, Maurice L. Brown Jan 2022

Theses and Dissertations

In the world of finance, appropriately understanding risk is key to success or failure because it is a fundamental driver for institutional behavior. Here we focus on risk as it relates to the operations of financial institutions, namely operational risk. Quantifying operational risk begins with data in the form of a time series of realized losses, which can occur for a number of reasons, can vary over different time intervals, and can pose a challenge that is exacerbated by having to account for both frequency and severity of losses. We introduce a stochastic point process model for the frequency distribution …


Role Of Inhibition And Spiking Variability In Ortho- And Retronasal Olfactory Processing, Michelle F. Craft Jan 2022

Theses and Dissertations

Odor perception is the impetus for important animal behaviors, most pertinently for feeding, but also for mating and communication. There are two predominant modes of odor processing: odors pass through the front of the nose (ortho) while inhaling and sniffing, or through the rear (retro) during exhalation and while eating and drinking. Despite the importance of olfaction for an animal’s well-being and specifically that ortho and retro naturally occur, it is unknown whether the modality (ortho versus retro) is transmitted to cortical brain regions, which could significantly instruct how odors are processed. Prior imaging studies show different …


Application Of Randomness In Finance, Jose Sanchez, Daanial Ahmad, Satyanand Singh May 2021

Publications and Research

Brownian motion, which is also considered to be a Wiener process, can be thought of as a random walk. In our project we briefly discussed the fluctuations of financial indices and related them to Brownian motion and the modeling of stock prices.
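
As a rough illustration of the idea in this abstract, the sketch below approximates a Wiener process by summing independent Gaussian increments and then feeds it into geometric Brownian motion, a common textbook model for stock prices. The abstract does not say which price model the authors used, so the choice of geometric Brownian motion, the function names `brownian_path` and `gbm_path`, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def brownian_path(n_steps, dt, rng):
    """Approximate a Wiener process by summing independent N(0, dt) increments."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


def gbm_path(s0, mu, sigma, n_steps, dt, rng):
    """Geometric Brownian motion: S_t = S_0 * exp((mu - sigma^2 / 2) * t + sigma * W_t)."""
    w = brownian_path(n_steps, dt, rng)
    drift = mu - 0.5 * sigma ** 2
    return [s0 * math.exp(drift * (k * dt) + sigma * w[k]) for k in range(n_steps + 1)]


# Hypothetical parameters: 5% annual drift, 20% annual volatility, 252 trading days.
rng = random.Random(0)
prices = gbm_path(s0=100.0, mu=0.05, sigma=0.2, n_steps=252, dt=1 / 252, rng=rng)
```

Because the price is an exponential of the Brownian path, every simulated price stays positive, which is one reason this model is often preferred over a plain random walk on the price itself.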


Markov Chains And Their Applications, Fariha Mahfuz Apr 2021

Math Theses

A Markov chain is a stochastic model that is used to predict future events. A Markov chain is relatively simple since it only requires the information of the present state to predict the future states. In this paper we will go over the basic concepts of Markov chains and several of their applications, including the Google PageRank algorithm, weather prediction, and gambler's ruin.

We examine how the Google PageRank algorithm works efficiently to provide PageRank for a Google search result. We also show how we can use a Markov chain to predict weather by creating a model from real-life data.
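
The weather-prediction idea described above can be sketched with a small transition matrix. This is a minimal example under stated assumptions: the two states and the transition probabilities are hypothetical and are not taken from the thesis, which builds its model from real-life data.

```python
# Hypothetical two-state weather chain; the probabilities are illustrative only.
STATES = ["sunny", "rainy"]
P = [
    [0.8, 0.2],  # sunny -> sunny, sunny -> rainy
    [0.4, 0.6],  # rainy -> sunny, rainy -> rainy
]


def step(dist, matrix):
    """Advance a probability distribution over states by one day."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def forecast(dist, matrix, days):
    """Apply the one-day transition repeatedly; only the current state is needed."""
    for _ in range(days):
        dist = step(dist, matrix)
    return dist


today = [1.0, 0.0]                   # it is sunny today
tomorrow = forecast(today, P, 1)     # distribution for the next day
long_run = forecast(today, P, 200)   # approaches the stationary distribution
```

With these assumed probabilities the long-run distribution settles near two-thirds sunny and one-third rainy regardless of today's weather, which illustrates why only the present state matters for the forecast.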


An Exploratory Analysis Of The BGSU Learning Commons Student Usage Data, Emily Eskuri Apr 2021

Honors Projects

The purpose of this study was to explore past student usage data in individualized tutoring sessions from the Learning Commons from two academic years. The Bowling Green State University (BGSU) Learning Commons is a learning assistance center that offers various services, such as individualized tutoring, math assistance, writing assistance, study hours, and academic coaching. There have been limited research studies into how big data and analytics can have an impact in higher education, especially research utilizing predictive analytics.

This project applied analytics to individualized tutoring data in the Learning Commons to create a better understanding of why those trends happen …


Understanding The Effect Of Adaptive Mutations On The Three-Dimensional Structure Of RNA, Justin Cook Apr 2021

Undergraduate Research and Scholarship Symposium

Single-nucleotide polymorphisms (SNPs) are variations in the genome where one base pair can differ between individuals.1 SNPs occur throughout the genome and can correlate to a disease-state if they occur in a functional region of DNA.1 According to the central dogma of molecular biology, any variation in the DNA sequence will have a direct effect on the RNA sequence and will potentially alter the identity or conformation of a protein product. A single RNA molecule, due to intramolecular base pairing, can acquire a plethora of 3-D conformations that are described by its structural ensemble. One SNP, rs12477830, which …


Lecture 04: Spatial Statistics Applications Of HRL, TRL, And Mixed Precision, David Keyes Apr 2021

Mathematical Sciences Spring Lecture Series

As simulation and analytics enter the exascale era, numerical algorithms, particularly implicit solvers that couple vast numbers of degrees of freedom, must span a widening gap between ambitious applications and austere architectures to support them. We present fifteen universals for researchers in scalable solvers: imperatives from computer architecture that scalable solvers must respect, strategies towards achieving them that are currently well established, and additional strategies currently being developed for an effective and efficient exascale software ecosystem. We consider recent generalizations of what it means to “solve” a computational problem, which suggest that we have often been “oversolving” them at the …


On The Evolution Equation For Modelling The COVID-19 Pandemic, Jonathan Blackledge Jan 2021

Books/Book chapters

The paper introduces and discusses the evolution equation, and, based exclusively on this equation, considers random walk models for the time series available on the daily confirmed Covid-19 cases for different countries. It is shown that a conventional random walk model is not consistent with the current global pandemic time series data, which exhibits non-ergodic properties. A self-affine random walk field model is investigated, derived from the evolutionary equation for a specified memory function which provides the non-ergodic fields evident in the available Covid-19 data. This is based on using a spectral scaling relationship of the type 1/ω^α where ω …