Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Statistics


Articles 1 - 30 of 541

Full-Text Articles in Physical Sciences and Mathematics

Forecasting Razorback Baseball Game Outcomes, Austin Raabe May 2022


Information Systems Undergraduate Honors Theses

Despite the disappointing end to the 2021 Arkansas Razorback baseball year, the team’s success provided hog fans something to look forward to next season. While they will be without the 2021 Golden Spikes Award winner, Kevin Kopps, and four All-SEC team selections, the 2022 roster has promising new and returning talent. With fifty percent of the players who played significant time last year coming back (minimum ten hits or ten innings pitched), the arrival of several impact transfers from major conferences, and a recruiting class ranked in the top five according to Perfect Game, there is reason to believe ...


Understanding And Improving The System: The Effects Of Weighting On The Accuracy Of Political Polling In Arkansas, Beck Williams May 2022


Political Science Undergraduate Honors Theses

In an effort to increase the accuracy of statewide political polling in Arkansas, we explore the statistical strategy of weighting with a focus on one yearly opinion poll: The Arkansas Poll. We conduct over 70 weighting experiments on the 2016 and 2020 Arkansas Polls using a variety of variables and opinion questions. From these experiments, we find that while some weighted variables tend to create larger changes, weighting typically results in a single-digit percentage change that does not substantially shift or “flip” the majorities. Due to a greater rate of change through weighting in the 2020 Poll compared to the ...


Analytical Study To Determine Significant Causes Of Increased No-Hitters In The 2021 Major League Baseball Season, Joel Robison Apr 2022


Honors Projects

Why were there so many no-hitters in the 2021 MLB season? This project focuses on possible significant causes to the record-breaking number of no-hitters pitched in the 2021 Major League Baseball season. Specifically, this project takes an analytical look at the recent trends in launch angles and spin rates to determine if there are any significant causes to the increased number of no-hitters in baseball. The random nature and unpredictability of the game of baseball make it almost impossible to come to any solid conclusions.


Einstein-Roscoe Regression For The Slag Viscosity Prediction Problem In Steelmaking, Hiroto Saigo, K C. Dukka, Noritaka Saito Apr 2022


Michigan Tech Publications

In classical machine learning, regressors are trained without attempting to gain insight into the mechanism connecting inputs and outputs. Natural sciences, however, are interested in finding a robust, interpretable function for the target phenomenon that can return predictions even outside of the training domains. This paper focuses on the viscosity prediction problem in steelmaking, and proposes Einstein-Roscoe regression (ERR), which learns the coefficients of the Einstein-Roscoe equation and is able to extrapolate to unseen domains. Besides, it is often the case in the natural sciences that some measurements are unavailable or more expensive than others due to physical constraints. To this ...


A Monte Carlo Analysis Of Seven Dichotomous Variable Confidence Interval Equations, Morgan Juanita Dubose Apr 2022


Masters Theses & Specialist Projects

Department of Psychological Sciences, Western Kentucky University. There are two options to estimate a range of likely values for the population mean of a continuous variable: one for when the population standard deviation is known and another for when the population standard deviation is unknown. There are seven proposed equations to calculate the confidence interval for the population mean of a dichotomous variable: normal approximation interval, Wilson interval, Jeffreys interval, Clopper-Pearson, Agresti-Coull, arcsine transformation, and logit transformation. In this study, I compared the percent effectiveness of each equation using a Monte Carlo analysis and the interval range over a range ...
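As a rough illustration of the kind of comparison this abstract describes, the sketch below estimates by Monte Carlo simulation how often two of the listed intervals (normal approximation and Wilson) contain the true proportion. It is not the thesis's own procedure: the function names, the 95% level, the trial count, and the seed are all assumptions chosen for illustration.

```python
import math
import random

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wald_interval(successes, n, z=Z95):
    """Normal approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(successes, n, z=Z95):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def coverage(interval_fn, true_p, n, trials=2000, seed=0):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one binomial sample as n independent Bernoulli trials.
        successes = sum(rng.random() < true_p for _ in range(n))
        lo, hi = interval_fn(successes, n)
        hits += lo <= true_p <= hi
    return hits / trials


if __name__ == "__main__":
    for name, fn in [("normal approx.", wald_interval), ("Wilson", wilson_interval)]:
        print(name, coverage(fn, true_p=0.05, n=20))
```

With a small sample and a proportion near 0, the normal approximation interval collapses to a single point whenever no successes are observed, so its simulated coverage falls well below the nominal level, while the Wilson interval stays much closer to it.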


Mixture Models In Machine Learning, Soumyabrata Pal Mar 2022


Doctoral Dissertations

Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Despite being studied for more than a century, the theoretical guarantees of mixture models remain unknown for several important settings.

In this thesis, we look at three groups of problems. The first part ...


Split Classification Model For Complex Clustered Data, Katherine Gerot Mar 2022


Honors Theses, University of Nebraska-Lincoln

Classification in high-dimensional data has generated tremendous interest in a multitude of fields. Data in higher dimensions often tend to reside in non-Euclidean metric space. This prevents Euclidean-based classification methodologies, such as regression, from reliably modeling the data. Many proposed models rely on computationally-complex embedding to convert the data to a more usable format. Others, namely the Support Vector Machine, rely on kernel manipulation to implicitly describe the "feature space" to arrive at a non-linear decision boundary. The proposed methodology in this paper seeks to classify complex data in a relatively computationally-simple and explainable manner.


So Long My Friend, Bryan Mcnair Jan 2022


Journal of Humanistic Mathematics

No abstract provided.


Finding The Best Predictors For Foot Traffic In US Seafood Restaurants, Isabel Paige Beaulieu Jan 2022


Honors Theses and Capstones

COVID-19 caused state and nation-wide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners decide how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that ...


Exploring Improvements To The Convergence Of Reconstructing Historical Destructive Earthquakes, Kameron Lightheart Nov 2021


Theses and Dissertations

Determining risk to human populations due to natural disasters has been a topic of interest in the STEM fields for centuries. Earthquakes and the tsunamis they cause are of particular interest due to their repetition cycles. These cycles can last hundreds of years, but we have only had modern measuring instruments for the last century or so, which makes analysis difficult. In this document, we explore ways to improve upon an existing method for reconstructing earthquakes from historical accounts of tsunamis. This method was designed and implemented by Jared P Whitehead's research group over the last 5 years. The ...


Trade Bait: Season 3, Ben Bagley Oct 2021


WWU Honors College Senior Projects

A 5-episode podcast series dissecting the use of statistics in the NFL and NFL Media


The Classification Of Basket Neural Cells In The Mammalian Neocortex, Sreya Pudi Oct 2021


Senior Theses

Basket neuronal cells of the mammalian neocortex have been classically categorized into two or more groups. Originally, it was thought that the large and small types are the naturally occurring groups that emerge for reasons that relate to neurobiological function and anatomical position. Later, a study based on anatomical and physiological features of these neurons introduced a third type, the net basket cell, which is intermediate in size compared to the large and small types. In this study, multivariate analysis was used to test the hypothesis that the large and small types are morphologically distinct groups. The results of ...


An Introduction To Calling Bullshit: Learning To Think Outside The Black Box, Jevin D. West, Carl T. Bergstrom Aug 2021


Numeracy

Bergstrom, Carl T. and Jevin D. West. 2020. Calling Bullshit: The Art of Skepticism in a Data-Driven World. (New York: Random House) 336 pp. ISBN 978-0525509202.

While statistical methods receive greater attention, the art of critically evaluating information in everyday life more commonly depends on thinking outside the black box of the algorithm. In this piece we introduce readers to our book and associated online teaching materials—for readers who want to more capably call “bullshit” or to teach their students to do the same.


The Uncertainty Of Confidence, Michael J. Leach Jul 2021


Journal of Humanistic Mathematics

This is a free-verse poem about the estimation of population parameters in statistical models. The spacing of words is intended to reflect uncertainty.


Lab Exercises For Statistics Using Excel, Julia Nebia, Steven Cosares, Milena Cuellar Jul 2021


Open Educational Resources

This document contains the text associated with a series of computer-based lab exercises to help students apply the concepts usually included in a first course in Statistics. A compressed file has been included that contains a separate folder for each lab. In each folder is an excel spreadsheet file and an editable word document providing the instructions for students to complete the exercise. The exercises are not numbered in the folders, so you can select any subset of these exercises to assign to your students. You are free to modify the instructions in any way you see fit, e.g ...


Data Analysis And Visualization To Dismantle Gender Discrimination In The Field Of Technology, Quinn Bolewicki Jun 2021


Dissertations, Theses, and Capstone Projects

In the United States, a significant population is facing an uphill battle trying to thrive in an industry that has seen exponential growth in recent years. Women, who account for approximately 50.8% of the U.S. population, are statistically underpaid and underrepresented in science, technology, engineering, and mathematics (STEM). Despite women-led technology teams establishing a 21% greater return on investment than teams that are not women-led, and young women largely outperforming men in math according to a 2015 study, there are only three Fortune 500 companies led by women, and they comprise only 10% of internet entrepreneurs. Research generates hundreds ...


A Study On Differing Generational Values And Expectations In Corporate America, Abigail Grella May 2021


Honors Program Theses and Projects

This paper examines the most common factors that lead to voluntary employee turnover, and the implications employee turnover has on an organization. Additionally, this paper will consider the varying values and workplace expectations of different demographic groups such as Millennials, Generation X, Generation Y, and Baby Boomers and how such factors could influence voluntary turnover. A study is conducted from survey results gathered across a large span of generations that are currently employed. Using statistical analysis employing t-tests and a Mood’s Median test, the results show that different generations weigh specific organizational offerings differently. The results ...


We’re Here To Get You There: A Statistical Analysis Of Bridgewater State University’s Transit System, Abigail Adams May 2021


Honors Program Theses and Projects

Bridgewater State University first established its on-campus transportation service in January of 1984. While it began only running as an on-campus service for students throughout the day, the service grew to expand by offering an off-campus connection to the neighboring city of Brockton and absorbed the night service system from the campus safety team. As BSU Transit continues to grow, the organization is seeking ways to improve their overall service and better prepare their fleet and driver pool to accommodate this growth. The purpose of this research is to analyze trends among the data collected by BSU Transit and assist ...


The Effect Of Initial Conditions On The Weather Research And Forecasting Model, Aaron D. Baker May 2021


Electronic Theses and Dissertations

Modeling our atmosphere and determining forecasts using numerical methods has been a challenge since the early 20th century. Most models use a complex dynamical system of equations that prove difficult to solve by hand as they are chaotic by nature. When computer systems became more widely adopted and available, approximating the solution of these equations numerically became easier as computational power increased. This advancement in computing has caused numerous weather models to be created and implemented across the world. However, the challenge of approximating these solutions accurately still exists, as each model has a varying set of equations and variables to ...


Machine Learning With Topological Data Analysis, Ephraim Robert Love May 2021


Doctoral Dissertations

Topological Data Analysis (TDA) is a relatively new focus in the fields of statistics and machine learning. Methods of exploiting the geometry of data, such as clustering, have proven theoretically and empirically invaluable. TDA provides a general framework within which to study topological invariants (shapes) of data, which are more robust to noise and can recover information on higher dimensional features than immediately apparent in the data. A common tool for conducting TDA is persistence homology, which measures the significance of these invariants. Persistence homology has prominent realizations in methods of data visualization, statistics and machine learning. Extending ML with ...


Statistical Analysis Of 2017-18 Premier League Match Statistics Using A Regression Analysis In R, Bergen Campbell May 2021


Undergraduate Theses and Capstone Projects

This thesis analyzes the correlation between a team’s statistics and the success of their performances, and develops a predictive model that can be used to forecast final season results for that team. Data from the 2017-2018 Premier League season is to be gathered and broken down within R to highlight what factors and variables are largely contributing to the success or downfall of a team. A multiple linear regression model and stepwise selection process are then used to include any factors that are significant in predicting match results.

The predictions about the 17-18 season results based on the ...


How Risk-Related Statistics, As Reported In News And Social Media, Are Linked To The Use Of The Public Transit System, Prashiddhi Pokhrel Apr 2021


Thinking Matters Symposium

Due to the pandemic, people have started relying more on television, social media, and other news outlets for guidance. Moreover, with the increasing amount of news, data, and information, there is also an increase in the amount of misleading statistics. People’s opinions and decisions significantly depend on the data, statistics, and information that they are exposed to, as well as their sources. For this project, we want to look at how information and its sources are affecting the decisions made by the general public about using the Portland Transit System. It is very important to know ...


Does Defense Actually Win Championships? Using Statistics To Examine One Of The Greatest Stereotypes In Sports, Thomas Burkett Apr 2021


Senior Theses

A common saying in sports is that “defense wins championships.” However, the past decade of play in the modern NBA has seen a rise and focus in offensive efficiency and 3-pointers. This thesis tests whether defense can truly predict a championship winning team in today’s NBA through two-sample hypothesis testing and multiple logistic regression models. The results found that both defensive and offensive statistics were significant predictors of championship teams, meaning that a balanced team, rather than one specialized in defense alone, is a more accurate predictor of championship success.


Clustering Web Users By Mouse Movement To Detect Bots And Botnet Attacks, Justin L. Morgan Mar 2021


Master's Theses

The need for website administrators to efficiently and accurately detect the presence of web bots has proven to be a challenging problem. As the sophistication of modern web bots increases, specifically their ability to more closely mimic the behavior of humans, web bot detection schemes are more quickly becoming obsolete by failing to maintain effectiveness. Though machine learning-based detection schemes have been a successful approach in recent implementations, web bots are able to apply similar machine learning tactics to mimic human users, thus bypassing such detection schemes. This work seeks to address the issue of machine learning based bots bypassing ...


The Wargaming Commodity Course Of Action Automated Analysis Method, William T. Deberry Mar 2021


Theses and Dissertations

This research presents the Wargaming Commodity Course of Action Automated Analysis Method (WCCAAM), a novel approach to assist wargame commanders in developing and analyzing courses of action (COAs) through semi-automation of the Military Decision Making Process (MDMP). MDMP is a seven-step iterative method that commanders and mission partners follow to build an operational course of action to achieve strategic objectives. MDMP requires time, resources, and coordination – all competing items the commander weighs to make the optimal decision. WCCAAM receives the MDMP's Mission Analysis phase as input, converts the wargame into a directed graph, processes a multi-commodity flow algorithm on ...


Adventures In The "Islands" - Enhancing Student Engagement In Teaching Statistics, Leszek Gawarecki Feb 2021


Mathematics Presentations And Conference Materials

The factors for enhancing student engagement frequently identified are active and problem-based learning as well as real-life experience relevant to students' interests. The importance of using real data in teaching statistics has been repeatedly emphasized and its importance is growing. However, data collection, as part of a student project, faces serious practical problems. It is time-consuming, may require access to equipment, or raise ethical issues.


Machine Learning Morphisms: A Framework For Designing And Analyzing Machine Learning Workflows, Applied To Separability, Error Bounds, And 30-Day Hospital Readmissions, Eric Zenon Cawi Jan 2021


McKelvey School of Engineering Theses & Dissertations

A machine learning workflow is the sequence of tasks necessary to implement a machine learning application, including data collection, preprocessing, feature engineering, exploratory analysis, and model training/selection. In this dissertation we propose the Machine Learning Morphism (MLM) as a mathematical framework to describe the tasks in a workflow. The MLM is a tuple consisting of: Input Space, Output Space, Learning Morphism, Parameter Prior, Empirical Risk Function. This contains the information necessary to learn the parameters of the learning morphism, which represents a workflow task. In chapter 1, we give a short review of typical tasks present in a workflow ...


Genetics Of Pediatric Musculoskeletal Disorders, Lilian Antunes Jan 2021


Arts & Sciences Electronic Theses and Dissertations

Pediatric musculoskeletal disorders are an extremely broad category of diseases that are often inherited. While individually rare, collectively these disorders are common, affecting around 3% of live births in the US. Despite the mounting clinical and molecular evidence for a genetic etiology, the cause for many patients with pediatric musculoskeletal disorders remain largely unknown. Major challenges in rare pediatric diseases include recruiting large numbers of patients and determining the significance and functional impacts of variants associated with disease within individuals or families. Whole exome sequencing (WES) is a powerful tool to identify coding variants that are associated with rare pediatric ...


Review Of Social Workers Count: Numbers And Social Issues By Michael Anthony Lewis, Michael T. Catalano Jan 2021


Numeracy

Lewis, Michael Anthony. 2017. Social Workers Count: Numbers and Social Issues. 2019. New York: Oxford University Press. 223 pp. ISBN 978-019046713-5

The numeracy movement, although largely birthed within the mathematics community, is an outside-the-box endeavor which has always sought to break down or at least transgress traditional disciplinary boundaries. Michael Anthony Lewis’s book is a testament that this effort is succeeding. Lewis is a social worker and sociologist with an impressive resume, author of Economics for Social Workers, co-editor of The Ethics and Economics of the Basic Income Guarantee, and member of the faculty at the Silberman School of ...


Indispensable Statistics For The Behavioral Sciences ~With SPSS 26, Howard Reid Ph.D. Jan 2021


Open Educational Resources (OER)

While there are many fine introductory statistics books, undergraduate students often continue to view statistics courses negatively. And many fear they will be unable to master the basic level of understanding that is essential to progress in their majors. The present text is an attempt to rethink what students majoring in the behavioral sciences absolutely must learn in an introductory statistics course and how best to organize the presentation of this material so they can succeed in their chosen field of study.

Every book is written from some perspective. The perspective of this book is that a first course in ...