Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics

Articles 1 - 30 of 902

Full-Text Articles in Physical Sciences and Mathematics

Research In Short Term Actuarial Modeling, Elijah Howells Jun 2020

Electronic Theses, Projects, and Dissertations

This paper covers mathematical methods used to conduct actuarial analysis in the short term, such as policy deductible analysis, maximum covered loss analysis, and mixtures of distributions. Assessment of a loss variable's distribution under the effect of a policy deductible, as well as one with an implemented maximum covered loss, and under both a policy deductible and maximum covered loss will also be covered. The derivation, meaning, and use of cost per loss and cost per payment will be discussed, as will those of an aggregate sum distribution, stop loss policy, and maximum likelihood estimation. For each topic, special ...
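As a rough illustration of the quantities named in this abstract, the sketch below computes the expected cost per loss and the expected cost per payment under a policy deductible and a maximum covered loss. It assumes an exponentially distributed loss with mean `theta`; that distribution, the function names, and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math


def limited_expected_value(theta, x):
    """E[min(X, x)] for an exponential loss X with mean theta (assumed model)."""
    return theta * (1.0 - math.exp(-x / theta))


def cost_per_loss(theta, deductible, max_covered_loss):
    """Expected insurer payment per loss event, counting zero payments.

    The payment is min(X, u) - min(X, d), so its mean is the difference
    of two limited expected values.
    """
    return (limited_expected_value(theta, max_covered_loss)
            - limited_expected_value(theta, deductible))


def cost_per_payment(theta, deductible, max_covered_loss):
    """Expected payment given that a payment is made, i.e. given X > d."""
    prob_payment = math.exp(-deductible / theta)  # P(X > d) for the exponential
    return cost_per_loss(theta, deductible, max_covered_loss) / prob_payment


if __name__ == "__main__":
    # Hypothetical policy: mean loss 1000, deductible 500, maximum covered loss 5000.
    print(cost_per_loss(1000.0, 500.0, 5000.0))
    print(cost_per_payment(1000.0, 500.0, 5000.0))
```

The cost per payment is never smaller than the cost per loss, because it conditions on the loss exceeding the deductible.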


Sensitivity Analysis For Incomplete Data And Causal Inference, Heng Chen May 2020

Statistical Science Theses and Dissertations

In this dissertation, we explore sensitivity analyses under three different types of incomplete data problems: missing outcomes, missing outcomes and missing predictors, and potential outcomes in the Rubin causal model (RCM). The first sensitivity analysis is conducted for the missing completely at random (MCAR) assumption in frequentist inference; the second is conducted for the missing at random (MAR) assumption in likelihood inference; the third is conducted for a novel assumption, the "sixth assumption", proposed for the robustness of the instrumental variable estimand in causal inference.


Statistical Models And Analysis Of Univariate And Multivariate Degradation Data, Lochana Palayangoda May 2020

Statistical Science Theses and Dissertations

For degradation data in reliability analysis, estimation of the first-passage time (FPT) distribution to a threshold provides valuable information on reliability characteristics. Recently, Balakrishnan and Qin (2019; Applied Stochastic Models in Business and Industry, 35:571-590) studied a nonparametric method to approximate the FPT distribution of such degradation processes if the underlying process type is unknown. In this thesis, we propose improved techniques based on saddlepoint approximation, which improve upon their suggested methods. Numerical examples and Monte Carlo simulation studies are used to illustrate the advantages of the proposed techniques. Limitations of the improved techniques are discussed and some possible ...


Evaluation Of The Utility Of Informative Priors In Bayesian Structural Equation Modeling With Small Samples, Hao Ma May 2020

Department of Education Policy and Leadership Theses and Dissertations

The estimation of parameters in structural equation modeling (SEM) has been primarily based on the maximum likelihood estimator (MLE) and relies on large sample asymptotic theory. Consequently, the results of the SEM analyses with small samples may not be as satisfactory as expected. In contrast, informative priors typically do not require a large sample, and they may be helpful for improving the quality of estimates in the SEM models with small samples. However, the role of informative priors in the Bayesian SEM has not been thoroughly studied to date. Given the limited body of evidence, specifying effective informative priors remains ...


Using Stability To Select A Shrinkage Method, Dean Dustin May 2020

Dissertations and Theses in Statistics

Shrinkage methods are estimation techniques based on optimizing expressions to find which variables to include in an analysis, typically a linear regression. The general form of these expressions is the sum of an empirical risk plus a complexity penalty based on the number of parameters. Many shrinkage methods are known to satisfy an ‘oracle’ property meaning that asymptotically they select the correct variables and estimate their coefficients efficiently. In Section 1.2, we show oracle properties in two general settings. The first uses a log likelihood in place of the empirical risk and allows a general class of penalties. The ...


The Effects Of Zoledronate And Sleep Deprivation On The Distal Femur Trabecular Thickness Of Ovariectomized Rats: Application Of Different Statistical Methods, Erin Nolte May 2020

Student Scholar Symposium Abstracts and Posters

Osteoporosis is a disease that causes the degradation of bone, leading to an increased risk of fracture. 1 in 3 women over the age of 50 will be affected by Osteoporosis. This study aims to understand how bone is affected by sleep deprivation in estrogen-deficient rats, and how Zoledronate might negate the inimical effects of sleep deprivation on bone. As bone mineral density (BMD) is a crude evaluation of the architectural changes seen in Osteoporosis, trabecular thickness may serve as a better single evaluation of bone health. 31 Wistar female rats were ovariectomized and separated into 4 random groups. The ...


Analysis Of Gas Mileage Of A Car, Joshua Ballard-Myer Apr 2020

Georgia College Student Research Events

The objective of this work is to analyze a data set, Auto, from the R package ISLR: Introduction to Statistical Learning in R. The data set includes information for 392 observations on 9 variables including gas mileage, horsepower, weight in pounds, and engine displacement in cubic inches. The data set was taken from the StatLib library maintained at Carnegie Mellon University. The primary response variable will be gas mileage in miles per gallon, with all other variables serving as predictors, but other relationships with other response variables such as acceleration will be explored. Results were similar to expected; traits desirable ...


Boom Or Bust: Examining The Relationship Between High School Recruiting Rankings And The NFL Draft, Nicholas E. Tice Apr 2020

Senior Theses

The goal of this thesis is to model the probability that a high school football player is drafted, based on information taken from their recruiting profile. The response variable is binary and defined as drafted (1) or undrafted (0). The independent variables were collected by scraping data from the recruiting websites and include height, weight, position, hometown, recruiting grade and other socioeconomic factors based on the player’s high school. 247Sports and ESPN were the two recruiting services used and compared in this study. Because of the binary nature of the dependent variable, logistic regression and decision trees ...


Investigating Major League Baseball Pitchers And Quality Of Contact Through Cluster Analysis, Charlie Marcou Apr 2020

Honors Projects

This paper investigates the quality of contact that a pitcher allows. Not much is currently known about quality of contact, but if factors determining quality of contact could be determined it could assist teams in identifying and developing pitching talent. There are many problems that come with investigating the control pitchers have over contact allowed, but one area to investigate is whether quality of contact is a repeatable skill. Furthermore, if it is a repeatable skill, then it is important to investigate what kind of benefit controlling contact allowed brings a pitcher. Along with this, groundball and flyball tendencies, and ...


Dice Questions Answered, Warren Campbell, William P. Dolan Apr 2020

SEAS Faculty Publications

Superstitious discussion of fair and unfair dice has pervaded the tabletop gaming industry since its inception. Much of it is not based on any quantitative data or studies. Consequently, misconceptions have been spread widely. One dice float test video on YouTube currently has 925,000 views (Fisher, 2015a). To combat the flood of misconceptions we investigated the following questions: 1) Are dice cursed? 2) Are D20s (20-sided dice) less fair than D6s (6-sided dice)? 3) Do float tests tell anything about the fairness of dice? 4) Are some dice systems inherently fairer than others? 5) Are density differences or dimensions ...


The Importance Of Type I Error Rates When Studying Bias In Monte Carlo Studies In Statistics, Michael Harwell Feb 2020

Journal of Modern Applied Statistical Methods

Two common outcomes of Monte Carlo studies in statistics are bias and Type I error rate. Several versions of bias statistics exist but all employ arbitrary cutoffs for deciding when bias is ignorable or non-ignorable. This article argues Type I error rates should be used when assessing bias.
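To make the two outcomes concrete, the sketch below runs a small Monte Carlo study and reports both the bias of an estimator and the empirical Type I error rate of a test. The setup (a two-sided z-test on standard normal data with a true mean of 0), the function name, and all parameter values are assumptions chosen for illustration; the article does not specify them.

```python
import random
from statistics import NormalDist, fmean


def simulate(n=30, reps=2000, alpha=0.05, seed=1):
    """Monte Carlo study under a true null hypothesis (mean 0, known sd 1).

    Returns (bias of the sample mean, empirical Type I error rate).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    estimates = []
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = fmean(sample)
        estimates.append(mean)
        z = mean * n ** 0.5  # z statistic with known sd = 1
        if abs(z) > z_crit:
            rejections += 1  # a rejection here is a Type I error
    bias = fmean(estimates) - 0.0  # true mean is 0 by construction
    return bias, rejections / reps


if __name__ == "__main__":
    bias, type_i_rate = simulate()
    print(f"bias = {bias:.4f}, Type I error rate = {type_i_rate:.4f}")
```

With a correctly specified test, the empirical rejection rate should sit close to the nominal `alpha`, which gives a calibrated reference point that a raw bias figure lacks.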


Power And Statistical Significance In Securities Fraud Litigation, Jill E. Fisch, Jonah B. Gelbach Feb 2020

Faculty Scholarship at Penn Law

Event studies, a half-century-old approach to measuring the effect of events on stock prices, are now ubiquitous in securities fraud litigation. In determining whether the event study demonstrates a price effect, expert witnesses typically base their conclusion on whether the results are statistically significant at the 95% confidence level, a threshold that is drawn from the academic literature. As a positive matter, this represents a disconnect with legal standards of proof. As a normative matter, it may reduce enforcement of fraud claims because litigation event studies typically involve quite low statistical power even for large-scale frauds.

This paper, written for ...


Art, Artfulness, Or Artifice?: A Review Of The Art Of Statistics: How To Learn From Data, By David Spiegelhalter, Jason Makansi Jan 2020

Numeracy

David Spiegelhalter. 2019. The Art of Statistics: How to Learn From Data. (London: The Penguin Group). 444 pp. ISBN 978-1541618510

The author successfully eases the reader away from the rigor of statistical methods and calculations and into the realm of statistical thinking. Despite an engaging style and attention-grabbing examples, the reader of The Art of Statistics will need more than a casual grounding in statistics to get what Spiegelhalter, I believe, intends from his book. It should be viewed as a companion to a more rigorous textbook on statistical methods but not necessarily a book that makes statistics any less ...


The Author’s Reflections On No B.S. (Bad Stats): Black People Need People Who Believe In Black People Enough Not To Believe Every Bad Thing They Hear About Black People, Ivory A. Toldson Jan 2020

Numeracy

Toldson, Ivory. A. 2019. No BS (Bad Stats): Black People Need People Who Believe in Black People Enough Not to Believe Every Bad Thing They Hear About Black People (Boston, MA: Brill-Sense) 194 pp. ISBN 978-9004397026.

This essay provides an introduction to No BS (Bad Stats): Black People Need People Who Believe in Black People Enough Not to Believe Every Bad Thing They Hear About Black People. In the essay, the author discusses how cynical views about the educational potential of Black children motivated him to write a book that challenges negative statistics. The essay also outlines the harmful consequences ...


Gradient Boosting For Survival Analysis With Applications In Oncology, Nam Phuong Nguyen Jan 2020

Graduate Theses and Dissertations

Cancer is one of the deadliest diseases that the world has been fighting for decades. An enormous amount of research has been conducted via a wide range of approaches, from genetic analysis to mathematical modeling. Survival analysis is a well-performing methodology frequently used to estimate the survival probability of a patient. Although there are a large number of methods for survival analysis, efficient exploration of a high-dimensional feature space has been challenging due to its computational cost and complexity. This thesis adapts the component-wise gradient boosting algorithms for cancer survival analysis, and also proposes a new ...


Playfair's Introduction Of Bar And Pie Charts To Represent Data, Diana White, River Bond, Joshua Eastes, Negar Janani Jan 2020

Statistics and Probability

No abstract provided.


Representing And Interpreting Data From Playfair, Diana White, River Bond, Joshua Eastes, Negar Janani Jan 2020

Statistics and Probability

No abstract provided.


Accuracy Of AVS Life Expectancy Reports, Ariya Aghababa Jan 2020

Williams Honors College, Honors Research Projects

Use insurance company data to predict the trends in life insurance life expectancy reports. Also, use the data to predict what impairments could potentially decrease or increase an insured's life expectancy based on reports created by various Actuaries at life settlement companies.


An Examination Of COVID-19 Statistical Modeling, Shane Vaughan Jan 2020

Williams Honors College, Honors Research Projects

The 2019 novel coronavirus, also known as COVID-19, is an infectious disease which was first reported in late 2019 and soon spread to become a global pandemic, prompting major action from world governments. Soon after, many institutions began attempts to analyze and predict the spread and severity of the disease via statistical modeling. Some information is not available for public consumption; however, a number of institutions have published the results of their analyses and some have made public repositories of the code used to build the models. This research paper attempts to use these and other resources to examine the modeling ...


Predicting Diabetes Diagnoses, Sarah Netchert Jan 2020

Student Research Poster Presentations 2020

This study explored the traits and health state of African Americans in central Virginia in order to determine what traits put people at a higher probability of being diagnosed with diabetes. We also want to know which traits will generate the highest probability that a person will be diagnosed with diabetes. Traits that were included and used in this study were cholesterol, stabilized glucose, high density lipoprotein levels, age (years), gender, height (inches), weight (pounds), systolic blood pressure, diastolic blood pressure, waist size (inches), and hip size (inches). There were 403 individuals included in the study since they were the only ones screened ...


Power Analysis On A Pilot Study Of The Caloric Intake Of Children Helping Prepare Meals Versus Children Not, Danielle Clifford Jan 2020

Student Research Poster Presentations 2020

The purpose of this analysis is to determine the sample size needed for a study that will be used to discover if there is a difference in the caloric intake of children who help with meal preparation and children who do not help with meal preparation.
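A sample size calculation of this kind can be sketched with the standard normal-approximation formula for comparing two group means, n per group = 2((z_{1-alpha/2} + z_{power}) / d)^2, where d is the standardized effect size. The effect size, significance level, and power used below are placeholder assumptions, not values from the pilot study, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison.

    Uses the normal approximation; an exact t-test calculation typically
    gives a slightly larger answer.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical medium effect (d = 0.5) at alpha = 0.05 and 80% power.
    print(n_per_group(0.5))  # prints 63
```

Halving the assumed effect size roughly quadruples the required sample, which is why the estimate taken from the pilot data matters so much.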


Outlier Profiles Of Atomic Structures Derived From X-Ray Crystallography And From Cryo-Electron Microscopy, Lin Chen, Jing He, Angelo Facchiano Jan 2020

Computer Science Faculty Publications

Background: As more protein atomic structures are determined from cryo-electron microscopy (cryo-EM) density maps, validation of such structures is an important task. Methods: We applied a histogram-based outlier score (HBOS) to six sets of cryo-EM atomic structures and five sets of X-ray atomic structures, including one derived from X-ray data with better than 1.5 Å resolution. Cryo-EM data sets contain structures released by December 2016 and those released between 2017 and 2019, derived from resolution ranges 0–4 Å and 4–6 Å respectively. Results: The distribution of HBOS values in five sets of X-ray structures shows that HBOS ...


Inference Of Heterogeneity In Meta-Analysis Of Rare Binary Events And RSS-Structured Cluster Randomized Studies, Chiyu Zhang Dec 2019

Statistical Science Theses and Dissertations

This dissertation contains two topics: (1) A Comparative Study of Statistical Methods for Quantifying and Testing Between-study Heterogeneity in Meta-analysis with Focus on Rare Binary Events; (2) Estimation of Variances in Cluster Randomized Designs Using Ranked Set Sampling.

Meta-analysis, the statistical procedure for combining results from multiple studies, has been widely used in medical research to evaluate intervention efficacy and safety. In many practical situations, the variation of treatment effects among the collected studies, often measured by the heterogeneity parameter, may exist and can greatly affect the inference about effect sizes. Comparative studies have been done for only one or ...


Fractional Random Weighted Bootstrapping For Classification On Imbalanced Data With Ensemble Decision Tree Methods, Sean Charles Carter Nov 2019

Graduate Theses and Dissertations

Ensemble methods are commonly used for building predictive models for classification. Models that are unstable to perturbations in the training set, such as the decision tree, often see considerable reductions in error when grouped, using bootstrapped resamples of the training data to train many models. The non-parametric bootstrap, however, has limited efficacy when used on severely imbalanced data, especially when the number of observations of one or more classes is exceptionally small. We explore the fractional random weighted bootstrap, which randomly assigns fractional weights to observations, as an alternative resampling procedure in training machine learning ensembles, particularly decision tree ...
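The idea of assigning fractional weights instead of resampling can be sketched as follows. One common way to generate such weights is to draw from a uniform Dirichlet distribution and rescale so the weights sum to the sample size; whether the thesis uses exactly this scheme is an assumption, and the function names and the toy data are hypothetical.

```python
import random


def fractional_weights(n, rng):
    """Draw n positive weights that sum to n.

    Normalizing independent Exp(1) draws gives a uniform Dirichlet vector;
    multiplying by n puts the weights on the scale of bootstrap counts.
    """
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [n * d / total for d in draws]


def weighted_mean(values, weights):
    """Weighted average, standing in for any learner that accepts case weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [1] * 3 + [0] * 97  # toy data: 3 minority cases out of 100
    for _ in range(3):
        weights = fractional_weights(len(labels), rng)
        # Each replicate reweights every case; none is dropped outright.
        print(round(weighted_mean(labels, weights), 4))
```

Unlike the ordinary bootstrap, where a rare observation can receive a count of zero and vanish from a resample, every observation here keeps a strictly positive weight in each replicate.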


9th Annual Postdoctoral Science Symposium, University Of Texas MD Anderson Cancer Center Postdoctoral Association Sep 2019

MD Anderson Cancer Center Postdoctoral Association Annual Postdoctoral Science Symposium Abstracts

The mission of the Annual Postdoctoral Science Symposium (APSS) is to provide a platform for talented postdoctoral fellows throughout the Texas Medical Center to present their work to a wider audience. The MD Anderson Postdoctoral Association convened its inaugural Annual Postdoctoral Science Symposium (APSS) on August 4, 2011.

The APSS provides a professional venue for postdoctoral scientists to develop, clarify, and refine their research as a result of formal reviews and critiques of faculty and other postdoctoral scientists. Additionally, attendees discuss current research on a broad range of subjects while promoting academic interactions and enrichment and developing new collaborations.


Stability And Application Of The K-Core Dynamical Model To Biological Networks, Francesca Beatrice Arese Lucini Sep 2019

All Dissertations, Theses, and Capstone Projects

The objective of the dissertation is to illustrate the importance of the k-core dynamical model, first by presenting the stability analysis of the nonlinear k-core model and comparing its solution to the most widely used linear model. Second, I show a real-world application of the k-core model to describe properties of neural networks, specifically, the transition from conscious to subliminal perception.


Sample Size Requirements And Considerations For Models To Assess Human-Machine System Performance, Jennifer S. G. Lopez Sep 2019

Theses and Dissertations

Hierarchical Linear Models (HLMs), also known as multi-level models, are an extension of multiple regression analysis and can aid in the understanding of human and machine workloads of a system. These models allow for prediction and testing in systems with hierarchies of two or more levels. The complex interrelated variability of these multi-level models exists in operational settings, such as the Air Force Distributed Common Ground System Full Motion Video (AF DCGS FMV) community which is composed of individuals (Level-1), groups (Level-2), units (Level-3), and organizations (Level-4). Through the development of sample size requirements and considerations for multi-level models, this ...


Who Can Act? Critical Assumptions At The Foundations Of Statistical Analysis, Peter J. Taylor Aug 2019

Working Papers on Science in a Changing World

Thinking about a simple teaching example on the t-test for comparing the average (mean) for some measurement in a group versus the average in another led me to articulate a sequence of thoughts and questions about the foundations of statistical analysis. In particular, my inquiry explores contrasts between: the statistical emphasis on averages or types around which there is variation or noise; variation as a mixture of types; the dynamics (or heterogeneous mix of dynamics) that generated the data analyzed; and participatory restructuring of these dynamics in the future. Two key issues are: Who is assumed to be able to ...


Bayesian And Positive Matrix Factorization Approaches To Pollution Source Apportionment, Jeff William Lingwall Aug 2019

Jeff Lingwall

The use of Positive Matrix Factorization (PMF) in pollution source apportionment (PSA) is examined and illustrated. A study of its settings is conducted in order to optimize them in the context of PSA. The use of a priori information in PMF is examined, in the form of target factor profiles and pulling profile elements to zero. A Bayesian model using lognormal prior distributions for source profiles and source contributions is fit and examined.


A Bayesian Approach To Deriving Ages Of Individual Field White Dwarfs, Erin M. O'Malley, Ted Von Hippel, David A. Van Dyk Aug 2019

Ted von Hippel

We apply a self-consistent and robust Bayesian statistical approach to determine the ages, distances, and zero-age main sequence (ZAMS) masses of 28 field DA white dwarfs (WDs) with ages of approximately 4-8 Gyr. Our technique requires only quality optical and near-infrared photometry to derive ages with <15% uncertainties, generally with little sensitivity to our choice of modern initial-final mass relation. We find that age, distance, and ZAMS mass are correlated in a manner that is too complex to be captured by traditional error propagation techniques. We further find that the posterior distributions of age are often asymmetric, indicating that the standard approach to deriving WD ages can yield misleading results.