Open Access. Powered by Scholars. Published by Universities.®

Analysis Commons

Statistics and Probability

Articles 1 - 30 of 89

Full-Text Articles in Analysis

Machine Learning Approaches For Cyberbullying Detection, Roland Fiagbe Jan 2024

Data Science and Data Mining

Cyberbullying refers to the act of bullying using electronic means and the internet. In recent years, this act has been identified as a major problem among young people and even adults. It can negatively impact one’s emotions and lead to adverse outcomes like depression, anxiety, harassment, and suicide, among others. This has led to the need to employ machine learning techniques to automatically detect cyberbullying and prevent it on various social media platforms. In this study, we want to analyze the combination of some Natural Language Processing (NLP) algorithms (such as Bag-of-Words and TF-IDF) with some popular machine learning …


Multiscale Modelling Of Brain Networks And The Analysis Of Dynamic Processes In Neurodegenerative Disorders, Hina Shaheen Jan 2024

Theses and Dissertations (Comprehensive)

The complex nature of the human brain, with its intricate organic structure and multiscale spatio-temporal characteristics ranging from synapses to the entire brain, presents a major obstacle in brain modelling. Capturing this complexity poses a significant challenge for researchers. The complex interplay of coupled multiphysics and biochemical activities within this intricate system shapes the brain's capacity, functioning within a structure-function relationship that necessitates a specific mathematical framework. Advanced mathematical modelling approaches that incorporate the coupling of brain networks and the analysis of dynamic processes are essential for advancing therapeutic strategies aimed at treating neurodegenerative diseases (NDDs), which afflict millions of …


Aspects Of Stochastic Geometric Mechanics In Molecular Biophysics, David Frost Dec 2023

All Dissertations

In confocal single-molecule FRET experiments, the joint distribution of FRET efficiency and donor lifetime distribution can reveal underlying molecular conformational dynamics via deviation from their theoretical Förster relationship. This shift is referred to as a dynamic shift. In this study, we investigate the influence of the free energy landscape in protein conformational dynamics on the dynamic shift by simulation of the associated continuum reaction coordinate Langevin dynamics, yielding a deeper understanding of the dynamic and structural information in the joint FRET efficiency and donor lifetime distribution. We develop novel Langevin models for the dye linker dynamics, including rotational dynamics, based …


Using Geographic Information To Explore Player-Specific Movement And Its Effects On Play Success In The NFL, Hayley Horn, Eric Laigaie, Alexander Lopez, Shravan Reddy Aug 2023

SMU Data Science Review

American Football is a billion-dollar industry in the United States. The analytical aspect of the sport is an ever-growing domain, with open-source competitions like the NFL Big Data Bowl accelerating this growth. With the amount of player movement during each play, tracking data can prove valuable in many areas of football analytics. While concussion detection, catch recognition, and completion percentage prediction are all existing use cases for this data, player-specific movement attributes, such as speed and agility, may be helpful in predicting play success. This research calculates player-specific speed and agility attributes from tracking data and supplements them with descriptive …


Movie Recommender System Using Matrix Factorization, Roland Fiagbe May 2023

Data Science and Data Mining

Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. However, our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …
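
As a rough illustration of the matrix factorization idea named in this abstract, the sketch below learns low-rank user and item factors from a handful of ratings. It is not the authors' implementation: the abstract does not say how the factors are fitted, so stochastic gradient descent with L2 regularization is assumed here, and the function names (`factorize`, `predict`, `rmse`), the hyperparameters, and the toy ratings (not MovieLens data) are all hypothetical.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] approximates each rating.

    Assumed approach: plain SGD on squared error with L2 regularization.
    """
    rng = random.Random(seed)
    # Small positive starting values for every latent factor.
    P = [[rng.uniform(0.01, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.01, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error plus the L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of item i by user u (dot product of their factors)."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(P, Q, ratings):
    """Root mean squared error over the observed (user, item, rating) triples."""
    return (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5


# Made-up (user, item, rating) triples; user 0 has not rated item 2.
toy_ratings = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 1, 1), (3, 2, 5), (3, 3, 4),
]

P0, Q0 = factorize(toy_ratings, n_users=4, n_items=4, epochs=0)  # untrained baseline
P, Q = factorize(toy_ratings, n_users=4, n_items=4)
print("RMSE before training:", rmse(P0, Q0, toy_ratings))
print("RMSE after training: ", rmse(P, Q, toy_ratings))
print("Predicted rating, user 0 on unrated item 2:", predict(P, Q, 0, 2))
```

The last line shows the use case the abstract describes: once the factors are learned, a rating can be predicted for a movie the user has not yet rated.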


Formula 101 Using 2022 Formula One Season Data To Understand The Race Results, Christopher Garcia, Oliver Lopez May 2023

Student Scholar Symposium Abstracts and Posters

I am interested in Formula One because my friend showed me what Formula One was all about. It became interesting to see the action of the sport, including the battles the drivers have during the race and how fast they go through a corner. Also, when qualifying comes around, they push their car to the absolute limit to gain a few seconds on their opponents. Only drivers who finish in the top 10 receive points, with the winner getting 25 points, the last driver in the top 10 getting 1 point, and those below the top ten …


Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of IBM Data Scientists, Graham Nash Apr 2023

Symposium of Student Scholars

Employee attrition is a relevant issue that every business employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can depend on a multitude of factors. As a result, employers need to develop methods by which they can measure attrition from several qualities of their employees. Factors like their age, years with the company, which department they work in, their level of education, their job role, and even their marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …


Defining Characteristics That Lead To Cost-Efficient Veteran NBA Free Agent Signings, David Mccain Apr 2023

Honors Projects in Mathematics

Throughout the history of the NBA, decisions regarding the signing of free agents have been riddled with complexity. Franchises are tasked with finding out which players will serve as optimal free agent signings prior to seeing them perform within the framework of their team. This study hypothesizes that the adequacy of an NBA free agent signing can be modeled and predicted through the implementation of a machine learning model. The model will learn the necessary information using training and testing data sets that include various player biometrics, game statistics, and financial information. The application of this machine learning model will …


Using A Distributive Approach To Model Insurance Loss, Kayla Kippes Apr 2023

Student Research Submissions

Insurance loss is an unpredicted event that stands at the forefront of the insurance industry. Loss in insurance represents the costs or expenses incurred due to a claim. An insurance claim is a request for the insurance company to pay for damage caused to an individual’s property. Loss can be measured by how much money (the dollar amount) has been paid out by the insurance company to repair the damage or it can be measured by the number of claims (claim count) made to the insurance company. Insured events include property damage due to fire, theft, flood, a car accident, …


Bridging The Chasm Between Fundamental, Momentum, And Quantitative Investing, Allen Hoskins, Jeff Reed, Robert Slater Apr 2023

SMU Data Science Review

A chasm exists between the active public equity investment management industry's fundamental, momentum, and quantitative styles. In this study, the researchers explore ways to bridge this gap by leveraging domain knowledge, fundamental analysis, momentum, crowdsourcing, and data science methods. This research also seeks to test the developed tools and strategies during the volatile time period of 2020 and 2021.


MLB 2023 Season Attendance Predictions, Sophia Andersen, Anna Tollette, Hannah Clinton Apr 2023

Research and Scholarship Symposium Posters

The goal of this project was to predict home game attendance for all 30 Major League Baseball (MLB) teams in their 2023 season. Researching and understanding the data, as well as identifying influential factors of attendance, were key steps before building a predictive model. Both the given material and data sets from MinneMUDAC, the competition organizer, were used, along with some outside sources. Finally, a predictive model was coded in Python which gave attendance predictions for every MLB game scheduled in 2023. From these results, insights could be offered to Major League Baseball or each team individually, to help …


Changing NFL Playoff Overtime Rules To Create Equal Opportunities To Win A Game, Matthew Silvia Apr 2023

Honors Projects in Mathematics

The NFL has attempted to create fair overtime rules over the course of the past decade; however, this study is interested in determining which playoff overtime rule (or rules) the NFL could implement to result in outcomes where both teams have a relatively equal chance of winning a game. This study aims to find which overtime rules work best at minimizing the differences between teams who possess the ball first versus teams that kick the ball off to start an overtime period. By collecting various NFL statistics from ESPN.com and FantasyOutsiders.com, this study hopes to run multiple simulations of different …


An Adaptive Algorithm For ‘The Secretary Problem’: Alternate Proof Of The Divergence Of A Maximizer Sequence, Andrew Benfante, Xiang Xu Jan 2023

OUR Journal: ODU Undergraduate Research Journal

This paper presents an alternate proof of the divergence of the unique maximizer sequence $\{x_n^*\}$ of a function sequence $\{F_n(x)\}$ that is derived from an adaptive algorithm based on the now classic optimal stopping problem, known by many names but here ‘the secretary problem’. The alternate proof uses a result established by Nguyen, Xu, and Zhao (n.d.) regarding the uniqueness of maximizer points of a generalized function sequence $\{S_n^{\mu,\sigma}\}$ and relies on the strict monotonicity of $F_n(x)$ as $n$ increases in order to show divergence of $\{x_n^*\}$. Towards this, limits of the exponentiated Gaussian CDF are …


Stochastic Optimization To Reduce Aircraft Taxi-In Time At IGIA, New Delhi, Rajib Das, Saileswar Ghosh, Rajendra Desai, Pijus Kanti Bhuin, Stuti Agarwal Jan 2023

International Journal of Aviation, Aeronautics, and Aerospace

Because the arrival times of flights are uncertain, pre-scheduled allocation of runways and stands and the subsequent first-come-first-served treatment result in a sub-optimal allocation of runways and stands. This is the prime reason for the unusual delays in taxi-in times at IGIA, New Delhi.

We simulated the arrival pattern of aircraft and utilized stochastic optimization to arrive at the best runway-stands allocation for a day. Optimization is done using a GRG Non-Linear algorithm in the Frontline Systems Analytic Solver platform. We applied this model to eight representative scenarios of two different days. Our results show that without …


Improving Computation For Hierarchical Bayesian Spatial Gaussian Mixture Models With Application To The Analysis Of THz Image Of Breast Tumor, Jean Remy Habimana Aug 2022

Graduate Theses and Dissertations

In the first chapter of this dissertation we give a brief introduction to Markov chain Monte Carlo (MCMC) methods and their application in Bayesian inference. In particular, we discuss the Metropolis-Hastings and conjugate Gibbs algorithms and explore the computational underpinnings of these methods. The second chapter discusses how to incorporate spatial autocorrelation in a linear regression model with an emphasis on the computational framework for estimating the spatial correlation patterns.

The third chapter starts with an overview of Gaussian mixture models (GMMs). However, because in the GMM framework the observations are assumed to be independent, GMMs are less effective when …


Random Walks In The Quarter Plane: Solvable Models With An Analytical Approach, Harshita Bali, Enrico Au-Yeung Jul 2022

DePaul Discoveries

Initially, an urn contains 3 blue balls and 1 red ball. A ball is randomly chosen from the urn. The ball is returned to the urn, together with one additional ball of the same type (red or blue). When the urn has twenty balls in it, what is the probability that exactly ten balls are blue? This is a model for a random process. This urn model has been extended in various ways and we consider some of these generalizations. Urn models can be formulated as random walks in the quarter plane. Our findings indicate that for a specific type …
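
The opening question of this abstract can be checked by direct computation. The sketch below is not the authors' method (the abstract points to a random-walk formulation); it simply tracks the exact distribution of the blue-ball count one draw at a time. The function name `blue_count_distribution` and its parameters are illustrative assumptions.

```python
from fractions import Fraction


def blue_count_distribution(blue, red, final_total):
    """Exact distribution of the blue-ball count once the urn holds final_total balls.

    Each step draws one ball uniformly at random and returns it together
    with one extra ball of the same colour, as in the urn scheme described above.
    """
    dist = {blue: Fraction(1)}  # maps blue-ball count -> probability
    total = blue + red
    while total < final_total:
        nxt = {}
        for b, p in dist.items():
            p_blue = Fraction(b, total)
            # Blue drawn: one more blue ball is added.
            nxt[b + 1] = nxt.get(b + 1, Fraction(0)) + p * p_blue
            # Red drawn: the blue count is unchanged.
            nxt[b] = nxt.get(b, Fraction(0)) + p * (1 - p_blue)
        dist = nxt
        total += 1
    return dist


dist = blue_count_distribution(blue=3, red=1, final_total=20)
print(dist[10])         # probability of exactly ten blue balls among twenty
print(float(dist[10]))
```

Starting from 3 blue and 1 red, this gives a probability of 12/323 (about 0.037) that exactly ten of the twenty balls are blue. Using `Fraction` keeps the arithmetic exact, so the result can be compared against a closed-form answer without rounding error.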


Impact Of Treatment Length On Individuals With Substance Use Disorders In Allegheny County, Cassie Dibenedetti, Kate Rosello Apr 2022

Undergraduate Research and Scholarship Symposium

Auberle social services is opening the Family Healing Center (FHC), a level 3.5 treatment program in Pittsburgh, PA that provides housing and 24-hour support for families struggling with opioid addiction. We partnered with Auberle to study characteristics of individuals receiving level 3.5 treatment and to determine whether longer treatment lengths correlate with fewer adverse outcomes. We obtained data from the Allegheny County Department of Human Services on 2,016 individuals admitted to level 3.5 treatment in 2019. The data included birth year, race, gender, admittance date, discharge date, and Children Youth and Family (CYF) incidents before and after treatment. We categorized …


Session 5: Equipment Finance Credit Risk Modeling - A Case Study In Creative Model Development & Nimble Data Engineering, Edward Krueger, Landon Thompson, Josh Moore Feb 2022

SDSU Data Science Symposium

This presentation will focus first on providing an overview of Channel and the Risk Analytics team that performed this case study. Given that context, we’ll then dive into our approach for building the modeling development data set, techniques and tools used to develop and implement the model into a production environment, and some of the challenges faced upon launch. Then, the presentation will pivot to the data engineering pipeline. During this portion, we will explore the application process and what happens to the data we collect. This will include how we extract & store the data along with how it …


Finding The Best Predictors For Foot Traffic In Us Seafood Restaurants, Isabel Paige Beaulieu Jan 2022

Honors Theses and Capstones

COVID-19 caused state- and nationwide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …


Estimating The Statistics Of Operational Loss Through The Analyzation Of A Time Series, Maurice L. Brown Jan 2022

Theses and Dissertations

In the world of finance, appropriately understanding risk is key to success or failure because it is a fundamental driver for institutional behavior. Here we focus on risk as it relates to the operations of financial institutions, namely operational risk. Quantifying operational risk begins with data in the form of a time series of realized losses, which can occur for a number of reasons, can vary over different time intervals, and can pose a challenge that is exacerbated by having to account for both frequency and severity of losses. We introduce a stochastic point process model for the frequency distribution …


Role Of Inhibition And Spiking Variability In Ortho- And Retronasal Olfactory Processing, Michelle F. Craft Jan 2022

Theses and Dissertations

Odor perception is the impetus for important animal behaviors, most pertinently for feeding, but also for mating and communication. There are two predominant modes of odor processing: odors pass through the front of the nose (ortho) while inhaling and sniffing, or through the rear (retro) during exhalation and while eating and drinking. Despite the importance of olfaction for an animal’s well-being and specifically that ortho and retro naturally occur, it is unknown whether the modality (ortho versus retro) is transmitted to cortical brain regions, which could significantly instruct how odors are processed. Prior imaging studies show different …


On Extensions And Restrictions Of Τ-Smooth And Τ-Maxitive Idempotent Measures, Muzaffar Eshimbetov Sep 2021

Bulletin of National University of Uzbekistan: Mathematics and Natural Sciences

In this paper we investigate maps between spaces of idempotent measures, τ-maxitive idempotent measures, and their extensions and restrictions. For an idempotent measure we prove that its extension is τ-maxitive if and only if its restriction is τ-maxitive.


Applications Of Nonstandard Analysis In Probability And Measure Theory, Irfan Alam May 2021

LSU Doctoral Dissertations

This dissertation broadly deals with two areas of probability theory and investigates how methods from nonstandard analysis may provide new perspectives in these topics. In particular, we use nonstandard analysis to prove new results in the topics of limiting spherical integrals and of exchangeability.

In the former area, our methods allow us to represent finite dimensional Gaussian measures in terms of marginals of measures on hyperfinite-dimensional spheres in a certain strong sense, thus generalizing some previously known results on Gaussian Radon transforms as limits of spherical integrals. This first area has roots in the kinetic theory of gases, which is …


Applying Emotional Analysis For Automated Content Moderation, John Shelnutt May 2021

Computer Science and Computer Engineering Undergraduate Honors Theses

The purpose of this project is to explore the effectiveness of emotional analysis as a means to automatically moderate content or flag content for manual moderation in order to reduce the workload of human moderators in moderating toxic content online. In this context, toxic content is defined as content that features excessive negativity, rudeness, or malice. This often features offensive language or slurs. The work involved in this project included creating a simple website that imitates a social media or forum with a feed of user submitted text posts, implementing an emotional analysis algorithm from a word emotions dataset, designing …


Zeta Function Regularization And Its Relationship To Number Theory, Stephen Wang May 2021

Electronic Theses and Dissertations

While the "path integral" formulation of quantum mechanics is both highly intuitive and far reaching, the path integrals themselves often fail to converge in the usual sense. Richard Feynman developed regularization as a solution, such that regularized path integrals could be calculated and analyzed within a strictly physics context. Over the past 50 years, mathematicians and physicists have retroactively introduced schemes for achieving mathematical rigor in the study and application of regularized path integrals. One such scheme was introduced in 2007 by the mathematicians Klaus Kirsten and Paul Loya. In this thesis, we reproduce the Kirsten and Loya approach to …


An Exploratory Analysis Of The BGSU Learning Commons Student Usage Data, Emily Eskuri Apr 2021

Honors Projects

The purpose of this study was to explore past student usage data in individualized tutoring sessions from the Learning Commons from two academic years. The Bowling Green State University (BGSU) Learning Commons is a learning assistance center that offers various services, such as individualized tutoring, math assistance, writing assistance, study hours, and academic coaching. There have been limited research studies into how big data and analytics can have an impact in higher education, especially research utilizing predictive analytics.

This project applied analytics to individualized tutoring data in the Learning Commons to create a better understanding of why those trends happen …


Stochastic Navier-Stokes Equations With Markov Switching, Po-Han Hsu Mar 2021

LSU Doctoral Dissertations

This dissertation is devoted to the study of three-dimensional (regularized) stochastic Navier-Stokes equations with Markov switching. A Markov chain is introduced into the noise term to capture the transitions from laminar to turbulent flow, and vice versa. The existence of the weak solution (in the sense of stochastic analysis) is shown by studying the martingale problem posed by it. This together with the pathwise uniqueness yields existence of the unique strong solution (in the sense of stochastic analysis). The existence and uniqueness of a stationary measure is established when the noise terms are additive and autonomous. Certain exit time estimates …


On The Evolution Equation For Modelling The Covid-19 Pandemic, Jonathan Blackledge Jan 2021

Books/Book chapters

The paper introduces and discusses the evolution equation, and, based exclusively on this equation, considers random walk models for the time series available on the daily confirmed Covid-19 cases for different countries. It is shown that a conventional random walk model is not consistent with the current global pandemic time series data, which exhibits non-ergodic properties. A self-affine random walk field model is investigated, derived from the evolutionary equation for a specified memory function which provides the non-ergodic fields evident in the available Covid-19 data. This is based on using a spectral scaling relationship of the type $1/\omega^{\alpha}$ where $\omega$ …


Neither “Post-War” Nor Post-Pregnancy Paranoia: How America’s War On Drugs Continues To Perpetuate Disparate Incarceration Outcomes For Pregnant, Substance-Involved Offenders, Becca S. Zimmerman Jan 2021

Pitzer Senior Theses

This thesis investigates the unique interactions between pregnancy, substance involvement, and race as they relate to the War on Drugs and the hyper-incarceration of women. Using ordinary least squares regression analyses and data from the Bureau of Justice Statistics’ 2016 Survey of Prison Inmates, I examine whether (and how) pregnancy status, drug use, race, and their interactions influence two length of incarceration outcomes: sentence length and amount of time spent in jail between arrest and imprisonment. The results collectively indicate that pregnancy decreases length of incarceration outcomes for those offenders who are not substance-involved but not evenhandedly -- benefitting white …


Analyzing Yankees And Red Sox Sentiment Over The Course Of A Season, Connor Koch Nov 2020

Honors Projects in Data Science

This paper investigates data collected on Twitter that references the Yankees or Red Sox during the 2020 Major League Baseball (MLB) season. The objective is to analyze the sentiment of tweets referencing the Yankees and Red Sox over the course of the season. In addition, an investigation of the networks within the data and the topics that were prevalent will be conducted. The 2020 MLB season was started late because of the COVID-19 pandemic and was a season like no other. The expectation of a dataset revolving around baseball is that the topics discussed would be about baseball. The findings …