
Articles 1 - 9 of 9

Full-Text Articles in Design of Experiments and Sample Surveys

UConn Baseball Batting Order Optimization, Gavin Rublewski May 2023

Honors Scholar Theses

Challenging conventional wisdom is at the very core of baseball analytics. Using data and statistical analysis, the sets of rules by which coaches make decisions can be justified, or possibly refuted. One of those sets of rules relates to the construction of a batting order. Through data collection, data adjustment, the construction of a baseball simulator, and the use of Monte Carlo simulation, I have assessed thousands of possible batting orders to determine the roster-specific strategies that lead to optimal run production for the 2023 UConn baseball team. This paper details a repeatable process in which basic player statistics …
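
The abstract is truncated, but the core loop it describes (simulate many games under a candidate batting order, average the runs scored, and search over orders) can be sketched as follows. Everything here is a hedged stand-in, not the thesis's simulator: the ON_BASE table, the toy run-scoring rule, and the random-permutation search are invented for illustration.

```python
# Minimal sketch of Monte Carlo batting-order evaluation; all player stats
# and game rules below are illustrative assumptions, not the thesis's model.
import random

# Hypothetical per-player probability of reaching base in a plate appearance.
ON_BASE = {"A": 0.42, "B": 0.38, "C": 0.35, "D": 0.33, "E": 0.31,
           "F": 0.30, "G": 0.29, "H": 0.28, "I": 0.26}

def simulate_game(order, innings=9):
    """Crude game model: three outs per inning; every fourth runner scores."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, runners = 0, 0
        while outs < 3:
            player = order[batter % len(order)]
            batter += 1
            if random.random() < ON_BASE[player]:
                runners += 1
                if runners == 4:      # bases were loaded; a run is forced in
                    runs, runners = runs + 1, 3
            else:
                outs += 1
    return runs

def expected_runs(order, n_games=500):
    return sum(simulate_game(order) for _ in range(n_games)) / n_games

# 9! orders is too many to evaluate cheaply, so sample random permutations.
candidates = [random.sample(list(ON_BASE), 9) for _ in range(50)]
best = max(candidates, key=expected_runs)
print("best order found:", best, expected_runs(best))
```

A real simulator would distinguish walks, singles, extra-base hits, and baserunning outcomes; the random search is a cheap stand-in for systematically assessing thousands of orders as the thesis describes.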


The Wargaming Commodity Course Of Action Automated Analysis Method, William T. Deberry Mar 2021

Theses and Dissertations

This research presents the Wargaming Commodity Course of Action Automated Analysis Method (WCCAAM), a novel approach to assist wargame commanders in developing and analyzing courses of action (COAs) through semi-automation of the Military Decision Making Process (MDMP). MDMP is a seven-step iterative method that commanders and mission partners follow to build an operational course of action to achieve strategic objectives. MDMP requires time, resources, and coordination – competing demands that the commander weighs to make the optimal decision. WCCAAM receives the MDMP's Mission Analysis phase as input, converts the wargame into a directed graph, processes a multi-commodity flow algorithm on …
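
The conversion-to-graph step lends itself to a small sketch. The excerpt does not give WCCAAM's actual multi-commodity flow algorithm, so the code below routes each hypothetical commodity independently with an off-the-shelf single-commodity max-flow as a stand-in; the node names, capacities, and commodities are invented.

```python
# Illustrative sketch only: each invented commodity is routed independently
# with a single-commodity max-flow, standing in for WCCAAM's algorithm.
import networkx as nx

# Wargame map as a directed graph; capacities model route throughput.
G = nx.DiGraph()
edges = [("base", "road_a", 10), ("base", "road_b", 6),
         ("road_a", "objective", 7), ("road_b", "objective", 6)]
for u, v, cap in edges:
    G.add_edge(u, v, capacity=cap)

# Hypothetical commodities (e.g., fuel, ammunition) sharing the network.
commodities = {"fuel": ("base", "objective"),
               "ammo": ("base", "objective")}

for name, (src, dst) in commodities.items():
    value, _flow = nx.maximum_flow(G, src, dst)
    print(name, "max deliverable:", value)
```

Routing commodities one at a time ignores the shared edge capacities that a true multi-commodity formulation must respect; the sketch is only meant to show the graph-conversion step.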


Assessing Robustness Of The Rasch Mixture Model To Detect Differential Item Functioning - A Monte Carlo Simulation Study, Jinjin Huang Jan 2020

Electronic Theses and Dissertations

Measurement invariance is crucial for an effective and valid measure of a construct. Invariance holds when the latent trait varies consistently across subgroups; in other words, mean differences among subgroups are due only to true latent ability differences. Differential item functioning (DIF) occurs when measurement invariance is violated. Traditional tools for DIF detection fall into two kinds: non-parametric methods and parametric methods. Mantel-Haenszel (MH), SIBTEST, and standardization are examples of non-parametric DIF detection methods. The majority of parametric DIF detection methods are item response theory (IRT) based. Both non-parametric methods and parametric methods compare differences among subgroups …
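
A Monte Carlo study of DIF detection needs simulated item responses with DIF built in. The sketch below generates Rasch-model data with uniform DIF injected into one item; the sample sizes, ability distributions, and difficulty values are generic assumptions, not the study's actual design.

```python
# Sketch of the data-generation step for a DIF simulation study: Rasch-model
# responses for a reference and a focal group, with DIF on one item.
import numpy as np

rng = np.random.default_rng(1)
n_per_group, n_items = 500, 10
difficulty = np.linspace(-2, 2, n_items)

def simulate(theta, b):
    """Rasch model: P(correct) = logistic(theta - b)."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(int)

theta_ref = rng.normal(0, 1, n_per_group)   # reference-group abilities
theta_foc = rng.normal(0, 1, n_per_group)   # focal group: same distribution

b_foc = difficulty.copy()
b_foc[0] += 0.6                             # uniform DIF on item 1

ref = simulate(theta_ref, difficulty)
foc = simulate(theta_foc, b_foc)
print("item 1 proportion correct:", ref[:, 0].mean(), "vs", foc[:, 0].mean())
```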


Using The R Library rpanel For GUI-Based Simulations In Introductory Statistics Courses, Ryan M. Allison May 2012

Statistics

As a student, I noticed that the statistical package R (http://www.r-project.org) offers several benefits for classroom use. One benefit is its free and open-source nature, a great advantage for instructors and students alike since it costs nothing to use, unlike other statistical packages. Because of this, students can continue using the program after their statistics courses and into their professional careers. Exposing students while they are in school to a tool that professionals use in industry is valuable in itself. R also has powerful …


Sample Size Calculations For Roc Studies: Parametric Robustness And Bayesian Nonparametrics, Dunlei Cheng, Adam J. Branscum, Wesley O. Johnson Jan 2012

Dunlei Cheng

Methods for sample size calculations in ROC studies often assume independent normal distributions for test scores among the diseased and non-diseased populations. We consider sample size requirements under the default two-group normal model when the data distribution for the diseased population is either skewed or multimodal. For these two common scenarios, we investigate the robustness of sample sizes calculated under the misspecified normal model and compare them to sample sizes calculated under a more flexible nonparametric Dirichlet process mixture model. We also highlight the utility of flexible models for ROC data analysis and their importance to study design. …
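
The robustness question can be illustrated with a small simulation: compute the AUC implied by an assumed binormal model, then draw the diseased scores from a skewed distribution instead and see what the empirical AUC looks like at a candidate sample size. All parameter values below are illustrative, and the Dirichlet process mixture analysis is not reproduced.

```python
# Sketch of the robustness check described above: a design is justified under
# a binormal model, but diseased scores actually come from a skewed gamma.
import numpy as np
from scipy import stats

def binormal_auc(mu0, sd0, mu1, sd1):
    """AUC for independent normal scores: Phi((mu1-mu0)/sqrt(sd0^2+sd1^2))."""
    return stats.norm.cdf((mu1 - mu0) / np.hypot(sd0, sd1))

print("AUC assumed by the normal model:", binormal_auc(0, 1, 1.2, 1))

rng = np.random.default_rng(7)
n = 100                                   # candidate per-group sample size
aucs = []
for _ in range(2000):
    healthy = rng.normal(0.0, 1.0, n)
    # Skewed diseased scores with the same mean (1.2) the normal model assumed.
    diseased = rng.gamma(shape=2.0, scale=0.6, size=n)
    # Empirical AUC = P(diseased > healthy), via the Mann-Whitney U statistic.
    u = stats.mannwhitneyu(diseased, healthy, alternative="greater").statistic
    aucs.append(u / (n * n))
print("mean empirical AUC under skewness:", np.mean(aucs))
```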


A Framework For Generating Data To Simulate Application Scoring, Kenneth Kennedy, Sarah Jane Delany, Brian Mac Namee Aug 2011

Conference papers

In this paper we propose a framework to generate artificial data that can be used to simulate credit risk scenarios. Artificial data is useful in the credit scoring domain for two reasons. Firstly, the use of artificial data allows for the introduction and control of variability that can realistically be expected to occur, but has yet to materialise in practice. The ability to control parameters allows for a thorough exploration of the performance of classification models under different conditions. Secondly, due to non-disclosure agreements and commercial sensitivities, obtaining real credit scoring data is a problematic and time-consuming task. By …
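
The general idea (synthetic applicants whose default probabilities follow a known, tunable model) can be sketched briefly. The feature names, distributions, and coefficients below are invented for illustration and are not the paper's framework.

```python
# Minimal sketch of artificial application-scoring data: features are drawn
# from assumed distributions and defaults follow a known logistic model whose
# coefficients control class balance and variability.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated applicant features (hypothetical).
income = rng.lognormal(mean=10.5, sigma=0.4, size=n)
debt_ratio = rng.beta(2, 5, size=n)
age = rng.integers(21, 70, size=n)

# Ground-truth default model; tuning these coefficients changes the
# generated portfolio's default rate and difficulty.
logit = -2.0 + 3.0 * debt_ratio - 0.00002 * income - 0.01 * (age - 40)
p_default = 1.0 / (1.0 + np.exp(-logit))
default = rng.random(n) < p_default

print("simulated default rate:", default.mean())
```

Because the generating model is known, a classifier trained on such data can be scored against the true default probabilities, which is the kind of controlled evaluation the framework targets.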


Accounting For Response Misclassification And Covariate Measurement Error Improves Power And Reduces Bias In Epidemiologic Studies, Dunlei Cheng, Adam J. Branscum, James D. Stamey Jan 2010

Dunlei Cheng

Purpose: To quantify the impact of ignoring misclassification of a response variable and measurement error in a covariate on statistical power, and to develop software for sample size and power analysis that accounts for these flaws in epidemiologic data. Methods: A Monte Carlo simulation-based procedure is developed to illustrate the differences in design requirements and inferences between analytic methods that properly account for misclassification and measurement error and those that do not in regression models for cross-sectional and cohort data. Results: We found that failure to account for these flaws in epidemiologic data can lead to a substantial reduction in …
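
The power contrast described here can be mimicked with a small simulation: generate logistic data, degrade it with response misclassification and covariate measurement error, and compare the rejection rate of a fit on clean versus flawed data. The rates, effect size, and sample size are assumptions, and the naive fit stands in for the corrected analyses the paper develops.

```python
# Sketch of the power comparison: the same logistic effect is tested on clean
# data and on data degraded by misclassification and measurement error.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n, beta, n_sims = 300, 0.5, 500
sens, spec = 0.9, 0.9          # assumed response misclassification rates
sigma_u = 0.8                  # assumed covariate measurement-error SD

def one_sim(flawed):
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(-0.5 + beta * x)))
    y = (rng.random(n) < p).astype(int)
    if flawed:
        flip = np.where(y == 1, rng.random(n) > sens, rng.random(n) > spec)
        y = np.where(flip, 1 - y, y)                 # misclassify response
        x = x + rng.normal(scale=sigma_u, size=n)    # add measurement error
    fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
    return fit.pvalues[1] < 0.05                     # slope detected?

power_clean = np.mean([one_sim(False) for _ in range(n_sims)])
power_flawed = np.mean([one_sim(True) for _ in range(n_sims)])
print(f"power, clean data: {power_clean:.2f}; flawed data: {power_flawed:.2f}")
```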


A Bayesian Approach To Sample Size Determination For Studies Designed To Evaluate Continuous Medical Tests, Dunlei Cheng, Adam J. Branscum, James D. Stamey Jan 2010

Dunlei Cheng

We develop a Bayesian approach to sample size and power calculations for cross-sectional studies that are designed to evaluate and compare continuous medical tests. For studies that involve one test or two conditionally independent or dependent tests, we present methods that are applicable when the true disease status of sampled individuals will be available and when it will not. Within a hypothesis testing framework, we consider the goal of demonstrating that a medical test has area under the receiver operating characteristic (ROC) curve that exceeds a minimum acceptable level or another relevant threshold, and the goals of establishing the superiority …
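
One hedged way to render the sample-size logic: for a candidate n, simulate many future studies, form a posterior for the AUC in each, and count how often the posterior probability that the AUC exceeds the minimum acceptable level clears a decision cutoff. The Bayesian bootstrap posterior and every setting below are stand-ins, not the paper's models.

```python
# Sketch of a Bayesian assurance-style check for a candidate sample size n.
import numpy as np

rng = np.random.default_rng(3)
n, auc_min, n_studies, n_post = 60, 0.70, 200, 400

def posterior_auc_draws(healthy, diseased):
    """Bayesian-bootstrap draws of AUC = P(diseased score > healthy score)."""
    gt = (diseased[:, None] > healthy[None, :]).astype(float)
    draws = np.empty(n_post)
    for k in range(n_post):
        w_d = rng.dirichlet(np.ones(len(diseased)))
        w_h = rng.dirichlet(np.ones(len(healthy)))
        draws[k] = w_d @ gt @ w_h        # weighted P(diseased > healthy)
    return draws

successes = 0
for _ in range(n_studies):
    healthy = rng.normal(0.0, 1.0, n)    # assumed true score distributions;
    diseased = rng.normal(1.0, 1.0, n)   # true AUC here is about 0.76
    post = posterior_auc_draws(healthy, diseased)
    successes += np.mean(post > auc_min) > 0.95
print(f"assurance at n = {n}: {successes / n_studies:.2f}")
```

Raising n until the printed assurance reaches a target (say 0.80) is the sample-size determination step; the paper's framework does this within parametric and hypothesis-testing structures rather than the bootstrap used here.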


Bayesian Approach To Average Power Calculations For Binary Regression Models With Misclassified Outcomes, Dunlei Cheng, James D. Stamey, Adam J. Branscum Dec 2008

Dunlei Cheng

We develop a simulation-based procedure for determining the required sample size in binomial regression risk assessment studies when response data are subject to misclassification. A Bayesian average power criterion is used to determine a sample size that provides high probability, averaged over the distribution of potential future data sets, of correctly establishing the direction of association between predictor variables and the probability of event occurrence. The method is broadly applicable to any parametric binomial regression model including, but not limited to, the popular logistic, probit, and complementary log-log models. We detail a common medical scenario wherein ascertainment of true disease …
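
A simplified rendering of the average-power idea: draw the slope from a prior, generate misclassified responses, fit a misclassification-adjusted logistic likelihood, and average the success indicator over prior draws. The one-sided Wald check is a rough frequentist stand-in for the paper's Bayesian criterion; the prior, misclassification rates, and sample size are all assumptions.

```python
# Simplified sketch of average power for logistic regression with known
# outcome misclassification; the Bayesian machinery is replaced by an
# adjusted maximum-likelihood fit and a Wald-type directional check.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(11)
n, sens, spec, n_sims = 400, 0.85, 0.90, 300

def neg_loglik(params, x, y_obs):
    """Logistic likelihood adjusted for known sensitivity and specificity."""
    p = 1.0 / (1.0 + np.exp(-(params[0] + params[1] * x)))
    q = sens * p + (1.0 - spec) * (1.0 - p)      # P(observed positive)
    q = np.clip(q, 1e-10, 1 - 1e-10)
    return -np.sum(y_obs * np.log(q) + (1 - y_obs) * np.log(1 - q))

successes = 0
for _ in range(n_sims):
    beta = rng.normal(0.6, 0.15)                 # draw slope from its prior
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(-1.0 + beta * x)))
    y_true = rng.random(n) < p
    # Misclassify: events observed with prob sens, non-events flip 1 - spec.
    y_obs = np.where(y_true, rng.random(n) < sens, rng.random(n) > spec)
    res = minimize(neg_loglik, x0=[0.0, 0.0],
                   args=(x, y_obs.astype(float)), method="BFGS")
    se = np.sqrt(res.hess_inv[1, 1])             # approximate standard error
    successes += res.x[1] / se > 1.645           # direction established?
print("estimated average power:", successes / n_sims)
```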