Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Articles 1 - 25 of 25

Full-Text Articles in Physical Sciences and Mathematics

Trade Bait: Season 3, Ben Bagley Oct 2021

WWU Honors College Senior Projects

A 5-episode podcast series dissecting the use of statistics in the NFL and NFL Media


The Classification Of Basket Neural Cells In The Mammalian Neocortex, Sreya Pudi Oct 2021

Senior Theses

Basket neuronal cells of the mammalian neocortex have been classically categorized into two or more groups. Originally, it was thought that the large and small types are the naturally occurring groups that emerge from reasons that relate to neurobiological function and anatomical position. Later, a study based on anatomical and physiological features of these neurons introduced a third type, the net basket cell, which is intermediate in size compared to the large and small types. In this study, multivariate analysis was used to test the hypothesis that the large and small types are morphologically distinct groups. The results of …


An Introduction To Calling Bullshit: Learning To Think Outside The Black Box, Jevin D. West, Carl T. Bergstrom Aug 2021

Numeracy

Bergstrom, Carl T. and Jevin D. West. 2020. Calling Bullshit: The Art of Skepticism in a Data-Driven World. (New York: Random House) 336 pp. ISBN 978-0525509202.

While statistical methods receive greater attention, the art of critically evaluating information in everyday life more commonly depends on thinking outside the black box of the algorithm. In this piece we introduce readers to our book and associated online teaching materials—for readers who want to more capably call “bullshit” or to teach their students to do the same.


The Uncertainty Of Confidence, Michael J. Leach Jul 2021

Journal of Humanistic Mathematics

This is a free-verse poem about the estimation of population parameters in statistical models. The spacing of words is intended to reflect uncertainty.


Lab Exercises For Statistics Using Excel, Julia Nebia, Steven Cosares, Milena Cuellar Jul 2021

Open Educational Resources

This document contains the text associated with a series of computer-based lab exercises to help students apply the concepts usually included in a first course in Statistics. A compressed file has been included that contains a separate folder for each lab. In each folder is an excel spreadsheet file and an editable word document providing the instructions for students to complete the exercise. The exercises are not numbered in the folders, so you can select any subset of these exercises to assign to your students. You are free to modify the instructions in any way you see fit, e.g., to …


A Review Of Logistic Regression And Its Application, Sultana Mubarika Rahman Chowdhury Jun 2021

FIU Electronic Theses and Dissertations

The purpose of this thesis is to do an in-depth review of logistic regression and its application. Additionally, four different methods of coefficient standardization were compared using the Heart Disease Dataset. These methods were compared based on testing accuracy, training accuracy, area under the curve, sensitivity, and specificity. Furthermore, logistic regression analysis was applied to the National Longitudinal Study of Adolescence Health Survey (Add Health) dataset to examine the relationship between anxiety or panic disorder and history of childhood maltreatment, medical conditions such as ADHD, PTSD, some socio-economic conditions and addiction. Results indicated that a history of abuse has a significant effect …


Data Analysis And Visualization To Dismantle Gender Discrimination In The Field Of Technology, Quinn Bolewicki Jun 2021

Dissertations, Theses, and Capstone Projects

In the United States, a significant population is facing an uphill battle trying to thrive in an industry that has seen exponential growth in recent years. Women, who account for approximately 50.8% of the U.S. population, are statistically underpaid and underrepresented in science, technology, engineering, and mathematics (STEM). Despite women-led technology teams establishing a 21% greater return on investment than teams that are not women-led, and young women largely outperforming men in math according to a 2015 study, there are only three Fortune 500 companies led by women, and women comprise only 10% of internet entrepreneurs. Research generates hundreds of articles, infographics, …


Compare And Contrast Maximum Likelihood Method And Inverse Probability Weighting Method In Missing Data Analysis, Scott Sun May 2021

Mathematical Sciences Technical Reports (MSTR)

Data can be lost for different reasons, but sometimes the missingness is a part of the data collection process. Unbiased and efficient estimation of the parameters governing the response mean model requires the missing data to be appropriately addressed. This paper compares and contrasts the Maximum Likelihood and Inverse Probability Weighting estimators in an Outcome-Dependent Sampling design that deliberately generates incomplete observations. We demonstrate the comparison through numerical simulations under varied conditions: different coefficients of determination, and whether or not the mean model is misspecified.


We’re Here To Get You There: A Statistical Analysis Of Bridgewater State University’s Transit System, Abigail Adams May 2021

Honors Program Theses and Projects

Bridgewater State University first established its on-campus transportation service in January of 1984. While it began only running as an on-campus service for students throughout the day, the service grew to expand by offering an off-campus connection to the neighboring city of Brockton and absorbed the night service system from the campus safety team. As BSU Transit continues to grow, the organization is seeking ways to improve their overall service and better prepare their fleet and driver pool to accommodate this growth. The purpose of this research is to analyze trends among the data collected by BSU Transit and assist …


Guidelines For Regression Analysis In SAS And R: A Case Study, Sarah Milligan May 2021

Honors Program Theses and Projects

When a player is a free agent, an individual who is able to sign to any team, one wonders what their best option is. Will signing with Team A or Team B provide them with the largest salary? What factors will affect their salary the most? Do last year’s statistics have a strong impact on next year’s salary? These questions can be answered by performing a regression analysis on previous years’ data. The primary focus of this project is to determine the most important variables related to an NBA salary. Likewise, the statistical programs SAS and R will be compared …


A Study On Differing Generational Values And Expectations In Corporate America, Abigail Grella May 2021

Honors Program Theses and Projects

This paper examines the most common factors that lead to voluntary employee turnover, and the implications employee turnover has on an organization. Additionally, this paper will consider the varying values and workplace expectations of different demographic groups such as Millennials, Generation X, Generation Y, and Baby Boomers and how such factors could influence voluntary turnover. A study is conducted from survey results gathered across a large span of generations that are currently employed. Using statistical analysis employing t-tests and a Mood’s Median test, the results show that different generations have differently weighing values for specific organizational offerings. The results show …


Statistical Analysis Of 2017-18 Premier League Match Statistics Using A Regression Analysis In R, Bergen Campbell May 2021

Undergraduate Theses and Capstone Projects

This thesis analyzes the correlation between a team’s statistics and the success of their performances, and develops a predictive model that can be used to forecast final season results for that team. Data from the 2017-2018 Premier League season is to be gathered and broken down within R to highlight what factors and variables are largely contributing to the success or downfall of a team. A multiple linear regression model and stepwise selection process are then used to include any factors that are significant in predicting match results.

The predictions about the 17-18 season results based on the model …


Does Defense Actually Win Championships? Using Statistics To Examine One Of The Greatest Stereotypes In Sports, Thomas Burkett Apr 2021

Senior Theses

A common saying in sports is that “defense wins championships.” However, the past decade of play in the modern NBA has seen a rise and focus in offensive efficiency and 3-pointers. This thesis tests whether defense can truly predict a championship winning team in today’s NBA through two-sample hypothesis testing and multiple logistic regression models. The results found that both defensive and offensive statistics were significant predictors of championship teams, meaning that a balanced team, rather than one specialized in defense alone, is a more accurate predictor of championship success.


Clustering Web Users By Mouse Movement To Detect Bots And Botnet Attacks, Justin L. Morgan Mar 2021

Master's Theses

The need for website administrators to efficiently and accurately detect the presence of web bots has proven to be a challenging problem. As the sophistication of modern web bots increases, specifically their ability to more closely mimic the behavior of humans, web bot detection schemes are more quickly becoming obsolete by failing to maintain effectiveness. Though machine learning-based detection schemes have been a successful approach to recent implementations, web bots are able to apply similar machine learning tactics to mimic human users, thus bypassing such detection schemes. This work seeks to address the issue of machine learning based bots bypassing …


The Wargaming Commodity Course Of Action Automated Analysis Method, William T. Deberry Mar 2021

Theses and Dissertations

This research presents the Wargaming Commodity Course of Action Automated Analysis Method (WCCAAM), a novel approach to assist wargame commanders in developing and analyzing courses of action (COAs) through semi-automation of the Military Decision Making Process (MDMP). MDMP is a seven-step iterative method that commanders and mission partners follow to build an operational course of action to achieve strategic objectives. MDMP requires time, resources, and coordination – all competing items the commander weighs to make the optimal decision. WCCAAM receives the MDMP's Mission Analysis phase as input, converts the wargame into a directed graph, processes a multi-commodity flow algorithm on …


Machine Learning Morphisms: A Framework For Designing And Analyzing Machine Learning Workflows, Applied To Separability, Error Bounds, And 30-Day Hospital Readmissions, Eric Zenon Cawi Jan 2021

McKelvey School of Engineering Theses & Dissertations

A machine learning workflow is the sequence of tasks necessary to implement a machine learning application, including data collection, preprocessing, feature engineering, exploratory analysis, and model training/selection. In this dissertation we propose the Machine Learning Morphism (MLM) as a mathematical framework to describe the tasks in a workflow. The MLM is a tuple consisting of: Input Space, Output Space, Learning Morphism, Parameter Prior, Empirical Risk Function. This contains the information necessary to learn the parameters of the learning morphism, which represents a workflow task. In chapter 1, we give a short review of typical tasks present in a workflow, as …


Genetics Of Pediatric Musculoskeletal Disorders, Lilian Antunes Jan 2021

Arts & Sciences Electronic Theses and Dissertations

Pediatric musculoskeletal disorders are an extremely broad category of diseases that are often inherited. While individually rare, collectively these disorders are common, affecting around 3% of live births in the US. Despite the mounting clinical and molecular evidence for a genetic etiology, the cause for many patients with pediatric musculoskeletal disorders remains largely unknown. Major challenges in rare pediatric diseases include recruiting large numbers of patients and determining the significance and functional impacts of variants associated with disease within individuals or families. Whole exome sequencing (WES) is a powerful tool to identify coding variants that are associated with rare pediatric …


Review Of Social Workers Count: Numbers And Social Issues By Michael Anthony Lewis, Michael T. Catalano Jan 2021

Numeracy

Lewis, Michael Anthony. 2017. Social Workers Count: Numbers and Social Issues. 2019. New York: Oxford University Press. 223 pp. ISBN 978-019046713-5

The numeracy movement, although largely birthed within the mathematics community, is an outside-the-box endeavor which has always sought to break down or at least transgress traditional disciplinary boundaries. Michael Anthony Lewis’s book is a testament that this effort is succeeding. Lewis is a social worker and sociologist with an impressive resume, author of Economics for Social Workers, co-editor of The Ethics and Economics of the Basic Income Guarantee, and member of the faculty at the Silberman School …


Fourth Down Decision Making: Challenging The Conservative Nature Of NFL Coaches, Will Palmquist, Ryan Elmore, Benjamin Williams Jan 2021

DU Undergraduate Research Journal Archive

This thesis analyzes the hypothesis that coaches in the National Football League are often too conservative in their decision making on fourth downs. I used RStudio and NFL play-by-play data to simulate actual football plays and drives according to different fourth down strategies. By measuring expected points per drive over thousands of simulated drives, we are able to evaluate the effectiveness of different fourth down strategies. This research points to a number of conclusions regarding the nature of NFL coaches on fourth downs as well as the complexity of modeling and simulating decision making in a complex sport such …


Indispensable Statistics For The Behavioral Sciences ~With SPSS 26, Howard Reid Ph.D. Jan 2021

Open Educational Resources (OER)

While there are many fine introductory statistics books, undergraduate students often continue to view statistics courses negatively. And many fear they will be unable to master the basic level of understanding that is essential to progress in their majors. The present text is an attempt to rethink what students majoring in the behavioral sciences absolutely must learn in an introductory statistics course and how best to organize the presentation of this material so they can succeed in their chosen field of study.

Every book is written from some perspective. The perspective of this book is that a first course in …


Power And Statistical Significance In Securities Fraud Litigation, Jill E. Fisch, Jonah B. Gelbach Jan 2021

All Faculty Scholarship

Event studies, a half-century-old approach to measuring the effect of events on stock prices, are now ubiquitous in securities fraud litigation. In determining whether the event study demonstrates a price effect, expert witnesses typically base their conclusion on whether the results are statistically significant at the 95% confidence level, a threshold that is drawn from the academic literature. As a positive matter, this represents a disconnect with legal standards of proof. As a normative matter, it may reduce enforcement of fraud claims because litigation event studies typically involve quite low statistical power even for large-scale frauds.

This paper, written for …


The Combined Impact Of Continuous And Ordinal Auxiliary Variables On Missing Data Imputation In SEM, Salina Wu Whitaker Jan 2021

Electronic Theses and Dissertations

“Modern” methods of addressing missing data using full-information maximum-likelihood (FIML) have become mainstays in SEM analyses. FIML allows the inclusion of auxiliary variables which carry information that is related to missing values and can reduce bias in parameter estimates. Past research has illustrated the benefits of auxiliary variable inclusion under different missingness conditions (MCAR and MNAR; e.g., Enders, 2008), missingness proportions (e.g., Collins et al., 2001), and although limited, missingness patterns (e.g., Yoo, 2009) in FIML analyses. While past studies have focused on the effects of either continuous or ordinal auxiliary variables, no study has included both types in their …


Assessing And Forecasting Chlorophyll Abundances In Minnesota Lake Using Remote Sensing And Statistical Approaches, Ben Von Korff Jan 2021

All Graduate Theses, Dissertations, and Other Capstone Projects

Harmful algae blooms (HABs) can negatively impact water quality and lake aesthetics, and can harm human and animal health. However, monitoring for HABs is rare in Minnesota. Detecting blooms, which can vary spatially and may only be present briefly, is challenging, so expanding monitoring in Minnesota would require the use of new and cost-efficient technologies. Unmanned aerial vehicles (UAVs) were used for bloom mapping using RGB and near-infrared imagery. Real-time monitoring was conducted in Bass Lake, in Faribault County, MN using trail cameras. Time series forecasting was conducted with high-frequency chlorophyll-a data from a water quality sonde. Normalized …


Evaluation Of The Effect Of The Clinical-Decision-Support Systems On Diabetes Management: A Multivariate Meta-Analysis Comparison With Univariate Meta-Analysis, Abdelfattah Elbarsha Jan 2021

Electronic Theses and Dissertations

The advantage of using meta-analysis lies in its ability to provide a quantitative summary of the findings from multiple studies. The aim of this dissertation was first to conduct a simulation study in order to understand what factors (sample size, between-study correlation, and percent of missing data) have a significant effect on meta-analysis estimates and whether using univariate or multivariate meta-analysis would produce different estimates.

The second goal of this study was to evaluate the effect of clinical decision support systems (CDSS) on diabetes care management by conducting three separate univariate meta-analyses and one multivariate meta-analysis. CDSS are health information …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and analysis of complex protein samples. In bottom-up shotgun proteomics experiments the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …