Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

3,451 Full-Text Articles 4,821 Authors 2,532,852 Downloads 160 Institutions

All Articles in Applied Statistics

3,451 full-text articles. Page 1 of 103.

Statistical Methods To Generate Artificial Slot Floor Data For The Advancement Of Casino Related Research, Courtney Bonner, Anastasia (Stasi) D. Baran, Jason D. Fiege, Saman Muthukumarana 2023 nQube Data Science Inc.

International Conference on Gambling & Risk Taking

Abstract:

A common difficulty when researching gambling topics is the availability of high-quality data sets for development and testing. Due to the high level of secrecy within the gambling industry, data obtained for research purposes are often prohibitively obfuscated, incomplete, or aggregated. Although these data have allowed for advancement in academic work, they leave both researchers and readers wondering what would be possible if more detailed data sets were available. To mitigate the paucity of data available to researchers, we present a Markov chain-based statistical process for producing artificial event data for a simulated …


Stake Size And Wagering In A Professional Betting Environment – When Data Affects Decision Making, Anthony Bedford, Tristan Barnett 2023 Flinders University

International Conference on Gambling & Risk Taking

In this work, we discuss the structure of a number of professional wagering organisations, and how they attempt to deal with the “Ender’s Game” effect – when knowledge of the true nature of the ‘war being wagered’ may have affected the process and choice of betting. We analyse the responses from professional wagering and betting organisations, which operate predominantly in horse racing and sports betting; they identify the importance of separating decisions about which choices to make from the stakes and size of the wagers linked to those decisions. The proposed model, practically carried out by one company, is an …


Analytical Approach For Monitoring The Behavior Of Patients With Pancreatic Adenocarcinoma At Different Stages As A Function Of Time, Aditya Chakaborty Dr, Chris P. Tsokos Dr 2023 Eastern Virginia Medical School

Biology and Medicine Through Mathematics Conference

No abstract provided.


Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile 2023 Southern Methodist University

Statistical Science Theses and Dissertations

Tumor xenograft experiments are a popular tool of cancer biology research. In a typical experiment of this kind, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the other two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …


Movie Recommender System Using Matrix Factorization, Roland Fiagbe 2023 University of Central Florida

Data Science and Data Mining

Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. Our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …


Formula 101 Using 2022 Formula One Season Data To Understand The Race Results, Christopher Garcia, Oliver Lopez 2023 Chapman University

Student Scholar Symposium Abstracts and Posters

The reason why I am interested in Formula One is that my friend showed me what Formula One was all about. It became interesting to see the action of the sport, including the battles the drivers have during the race and how fast they go through a corner. Also, when qualifying comes around, they push their cars to the absolute limit to gain a few seconds on their opponents. Only drivers in the top 10 receive points, with the winner getting 25 points, the last driver in the top 10 getting 1 point, and those below the top ten …


A Monte Carlo Analysis Of Nonprobability Sampling & Post Hoc Corrections, Julia Hong 2023 Western Kentucky University

Masters Theses & Specialist Projects

Nonprobability samples are often used in place of probability samples because the former are less trouble and less expensive. Unfortunately, it is difficult to determine how well a sample represents population parameters when using nonprobability samples. Researchers attempt to mitigate the disadvantages of nonprobability sampling by performing post hoc corrections, but this adjustment may not successfully undo the effects of nonprobability sampling. To examine these effects, a Monte Carlo simulation was conducted to create a pseudo-population from which samples were drawn. Forty-one conditions were replicated 10,000 times each, with each sample consisting of 100 observations. A post-stratification adjustment was made …


Examining The Effect Of Word Embeddings And Preprocessing Methods On Fake News Detection, Jessica Hauschild 2023 University of Nebraska-Lincoln

Dissertations and Theses in Statistics

The words people choose to use hold a lot of power, whether that be in spreading truth or deception. As listeners and readers, we do our best to understand how words are being used. There are many current methods in computer science literature attempting to embed words into numerical information for statistical analyses. Some of these embedding methods, such as Bag of Words, treat words as independent, while others, such as Word2Vec, attempt to gain information about the context of words. It is of interest to compare how well these various methods of translating text into numerical data work specifically …


Small But Mighty: Examining The Utility Of Microstatistics In Modeling Ice Hockey, Matt Palmer 2023 Liberty University

Senior Honors Theses

As research into hockey analytics continues, an increasing number of metrics are being introduced into the knowledge base of the field, creating a need to determine whether various stats are useful or simply add noise to the discussion. This paper examines microstatistics – manually tracked metrics which go beyond the NHL’s publicly released stats – both through the lens of meta-analytics (which attempt to objectively assess how useful a metric is) and modeling game probabilities. Results show that while there is certainly room for improvement in understanding and use of microstats in modeling, the metrics overall represent an area of …


Jackknife Empirical Likelihood Tests For Equality Of Generalized Lorenz Curves, Anton Butenko 2023 California State University, San Bernardino

Electronic Theses, Projects, and Dissertations

A Lorenz curve is a graphical representation of the distribution of income or wealth within a population. The generalized Lorenz curve can be created by scaling the values on the vertical axis of a Lorenz curve by the average output of the distribution. In this thesis, we propose two nonparametric methods for testing the equality of two generalized Lorenz curves. Both methods are based on empirical likelihood and utilize a U-statistic. We derive the limiting distribution of the likelihood ratio, which is shown to follow a chi-squared distribution with one degree of freedom. We conduct simulations to compare the …
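
The relationship between the two curves described in this abstract can be illustrated with a minimal sketch. It assumes a finite sample of non-negative values with a positive total and uses the usual empirical definition of the Lorenz curve; the function names `lorenz_curve` and `generalized_lorenz_curve` are hypothetical and are not taken from the thesis, and the sketch does not implement the proposed empirical likelihood tests.

```python
from itertools import accumulate


def lorenz_curve(values):
    """Empirical Lorenz curve as a list of (p, L(p)) points.

    p is the cumulative population share and L(p) is the cumulative
    share of the total held by the poorest fraction p.
    Assumes non-negative values with a positive sum (an assumption
    of this sketch, not a statement about the thesis).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sample with a positive total")
    # Cumulative sums, starting from 0 so the curve begins at (0, 0).
    cum = [0.0] + list(accumulate(xs))
    return [(i / n, c / total) for i, c in enumerate(cum)]


def generalized_lorenz_curve(values):
    """Generalized Lorenz curve: the Lorenz ordinates scaled by the mean."""
    mean = sum(values) / len(values)
    return [(p, mean * share) for p, share in lorenz_curve(values)]


if __name__ == "__main__":
    incomes = [4, 1, 3, 2]  # illustrative numbers, not data from the thesis
    for p, gl in generalized_lorenz_curve(incomes):
        print(f"p={p:.2f}  GL(p)={gl:.3f}")
```

In this sketch the Lorenz curve ends at 1 by construction, while the generalized curve ends at the sample mean, which is why two distributions with the same shape but different means give different generalized curves.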


Time Series Analysis Of Longitudinally Collected Standard Autoperimetry Data In Glaucoma Patients, Carlyn Childress 2023 Murray State University

Honors College Theses

Glaucoma is a group of eye diseases in which damage gradually occurs to the optic nerve, which often leads to partial or complete loss of vision. Glaucoma is the second leading cause of blindness, and it has no cure. Early detection and tracking of its progression are key to managing the effects of glaucoma. Ordinary Least Squares Regression (OLSR), the most commonly used methodology for tracking glaucoma progression, is inappropriate because the longitudinally collected perimetry data from glaucoma patients appear to be temporally correlated. Time series models, which account for temporal correlation, are better methods to analyze Mean …


Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of IBM Data Scientists, Graham Nash 2023 Kennesaw State University

Symposium of Student Scholars

Employee attrition is a relevant issue that every business employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can depend on a multitude of factors. As a result, employers need to develop methods for measuring attrition that quantify several qualities of their employees. Factors like age, years with the company, department, level of education, job role, and even marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …


Reducing Restaurant Inventory Costs Through Sales Forecasting, Tyler Mason, Chris Schoen, Trevor Gilbert, Jonathan Enriquez 2023 Kennesaw State University

Senior Design Project For Engineers

Family Restaurant is a local restaurant in the greater Atlanta area that serves a variety of dishes that include an assortment of 19 different proteins. Currently, Family Restaurant places protein orders based on business intuition, and tends to over-stock and sometimes under-stock. To minimize inventory costs by reducing over-stocking and preventing under-stocking of proteins, we applied Facebook Prophet (FB Prophet), ARIMA, and XGBoost machine learning models to predict protein demand and then fed these results into a Fixed Time Period inventory model to make an overall order suggestion based on the specified time period. We trained our models on …


Two Sample Statistical Test For Location Parameters, Narinder Kumar, Arun Kumar 2023 Panjab University, Chandigarh

Journal of Modern Applied Statistical Methods

A class of distribution-free tests for the homogeneity of location parameters is proposed and compared with different competitors in terms of Pitman asymptotic relative efficiency. A numerical example is provided and a simulation study is made to check the performance of the tests.


Interpretable Learning In Multivariate Big Data Analysis For Network Monitoring, José Camacho, Rasmus Bro, David Kotz 2023 University of Granada

Dartmouth Scholarship

There is an increasing interest in the development of new data-driven models useful to assess the performance of communication networks. For many applications, like network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive. The resulting network monitoring approach allows …


Modeling The Probability Of A Successful Stolen Base Attempt In Major League Baseball, Cade Stanley 2023 University of South Carolina - Columbia

Senior Theses

In Major League Baseball (MLB), the outcome of a stolen base attempt has important implications. Success moves the runner closer to scoring, while failure records an out and removes the runner from the basepaths altogether. Therefore, it is important that the decision by a coach or player to steal a base is well-informed. In this thesis, I explore a statistical approach to making this decision. I train logistic regression and random forest models, using data about the game situation and about the runner, pitcher, and catcher involved in the stolen base attempt, to estimate the probability that a stolen base …


Moral Injury To Inform Analysis Of Post-Traumatic Stress Disorder, Amanda Julia Manea 2023 University of South Carolina - Columbia

Senior Theses

Post-traumatic stress disorder (PTSD) is a mental health condition that almost one out of ten veterans struggle with. Although the National Center for PTSD has made extensive progress in characterizing and developing new treatments for PTSD, most veterans still experience symptoms of PTSD following treatment. Novel avenues of investigation, such as developing algorithms to review electronic health record (EHR) data and better understanding moral injury, are being pursued to address the gap that still exists when it comes to treating veterans. Moral injury is the individual evaluation of exposure to a potentially morally injurious event (PMIE) and can lead to …


Self-Learning Algorithms For Intrusion Detection And Prevention Systems (IDPS), Juan E. Nunez, Roger W. Tchegui Donfack, Rohit Rohit, Hayley Horn 2023 Southern Methodist University

SMU Data Science Review

Today, there is an increased risk to data privacy and information security due to cyberattacks that compromise data reliability and accessibility. New machine learning models are needed to detect and prevent these cyberattacks. One application of these models is cybersecurity threat detection and prevention systems that can create a baseline of a network's traffic patterns to detect anomalies without needing pre-labeled data, thus enabling the identification of abnormal network events as threats. This research explored algorithms that can help automate anomaly detection on an enterprise network using Canadian Institute for Cybersecurity data. This study demonstrates that Neural Networks with Bayesian …


A Chairperson's Guide To Managing Time And Stress, Christian K. Hansen 2023 Eastern Washington University

Academic Chairpersons Conference Proceedings

In this interactive workshop, we discuss time and stress management specifically from the perspective of a department chairperson responsible for leading an academic department through numerous internal and external challenges. The focus will be on practical strategies for effective use of time, not only at a personal level, but also at a department-wide level.


Two-Stage Approach For Forensic Handwriting Analysis, Ashlan J. Simpson, Danica M. Ommen 2023 Iowa State University

SDSU Data Science Symposium

Trained experts currently perform the handwriting analysis required in the criminal justice field, but this can create biases, delays, and expenses, leaving room for improvement. Prior research has sought to address this by analyzing handwriting through feature-based and score-based likelihood ratios for assessing evidence within a probabilistic framework. However, error rates are not well defined within this framework, which makes the method difficult to evaluate and can lead to a greater-than-expected number of errors when the approach is applied. This research explores a method for assessing handwriting within the Two-Stage framework, which allows for quantifying error rates as recommended by …

