Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

3,418 Full-Text Articles | 4,769 Authors | 2,500,015 Downloads | 158 Institutions

All Articles in Applied Statistics

A Course In Data Science: R And Prediction Modeling, Adam Kapelner 2022 CUNY Queens College

Open Educational Resources

This is a self-contained course in data science and machine learning using R. It covers philosophy of modeling with data, prediction via linear models, machine learning including support vector machines and random forests, probability estimation and asymmetric costs using logistic regression and probit regression, underfitting vs. overfitting, model validation, handling missingness and much more. There is formal instruction of data manipulation using dplyr and data.table, visualization using ggplot2 and statistical computing.


Model Averaging In Agriculture And Natural Resources: What Is It? When Is It Useful? When Is It A Distraction?, Philip M. Dixon 2022 Iowa State University

Conference on Applied Statistics in Agriculture and Natural Resources

I use two examples to illustrate three methods for model averaging: using AIC weights, using BIC weights, and fully Bayesian analyses. The first example is a capture-recapture study that estimates the population size by averaging over 4 models for capture probabilities. The second is an analysis of a study of logging impacts on Curculionid weevils using a before-after-control-impact (BACI) study design. The estimated impact is averaged over 4 ecologically relevant models.

Both examples demonstrate the sensitivity of model weights, or posterior model probabilities, to the choice of prior model probabilities and prior distributions for parameters. The model averaged estimates and …
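The AIC- and BIC-weight methods named in this abstract can be illustrated with a minimal sketch. It uses the standard information-criterion weight formula, w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), where Δ_i is the difference between model i's criterion value and the smallest value. The AIC values and population-size estimates below are hypothetical numbers chosen for illustration; they are not taken from the paper, and the function names are assumptions.

```python
import math


def information_criterion_weights(ic_values):
    """Convert AIC (or BIC) values into model weights that sum to 1."""
    best = min(ic_values)
    # Difference between each model's criterion and the best (smallest) one.
    deltas = [value - best for value in ic_values]
    # Relative support for each model, then normalize.
    relative = [math.exp(-0.5 * delta) for delta in deltas]
    total = sum(relative)
    return [r / total for r in relative]


def model_averaged_estimate(estimates, weights):
    """Weighted average of the per-model point estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AIC values and population-size estimates for 4 models.
aic = [210.3, 211.1, 214.8, 219.6]
n_hat = [152.0, 160.0, 148.0, 171.0]

weights = information_criterion_weights(aic)
averaged = model_averaged_estimate(n_hat, weights)
print([round(w, 3) for w in weights], round(averaged, 1))
```

Passing BIC values instead of AIC values gives BIC weights, which approximate posterior model probabilities when the prior model probabilities are equal. The fully Bayesian analysis is not sketched here.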


A Novel Correction For The Adjusted Box-Pierce Test, Sidy Danioko, Jianwei Zheng, Kyle Anderson, Alexander Barrett, Cyril S. Rakovski 2022 Chapman University

Mathematics, Physics, and Computer Science Faculty Articles and Research

The classical Box-Pierce and Ljung-Box tests for auto-correlation of residuals possess severe deviations from nominal type I error rates. Previous studies have attempted to address this issue by either revising existing tests or designing new techniques. The Adjusted Box-Pierce achieves the best results with respect to attaining type I error rates closer to nominal values. This research paper proposes a further correction to the adjusted Box-Pierce test that possesses near perfect type I error rates. The approach is based on an inflation of the rejection region for all sample sizes and lags calculated via a linear model applied to simulated …
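For orientation, the two classical statistics named in this abstract can be computed as in the sketch below. It uses the textbook definitions Q_BP = n Σ r_k² and Q_LB = n(n + 2) Σ r_k² / (n − k), summed over lags k = 1..m, where r_k is the lag-k sample autocorrelation of the residuals. It does not implement the adjusted Box-Pierce test or the correction the paper proposes, whose details are not given here. The function names and the simulated residuals are assumptions.

```python
import random


def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def box_pierce(x, max_lag):
    """Classical Box-Pierce statistic: n * sum of squared autocorrelations."""
    n = len(x)
    return n * sum(r ** 2 for r in autocorrelations(x, max_lag))


def ljung_box(x, max_lag):
    """Ljung-Box statistic: each squared autocorrelation is scaled by 1/(n - k)."""
    n = len(x)
    r = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))


# Hypothetical residuals: independent Gaussian noise with a fixed seed.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(200)]
print(box_pierce(residuals, 10), ljung_box(residuals, 10))
```

Under the null hypothesis of no autocorrelation, both statistics are compared against a chi-square distribution with degrees of freedom equal to the number of lags minus the number of fitted ARMA parameters.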


A Robust Clustering Method Using Compositional Data Restrictions: Studying Wood Properties In The Reforestation Of Portugal, Pamela M. Chiroque-Solano, Guido A. Moreira 2022 University of Trás-os-Montes e Alto Douro

Conference on Applied Statistics in Agriculture and Natural Resources

Classification of multivariate observations while preserving the data’s natural restriction is a challenge. Special properties such as identifiability, interpretability, and others need to be cared for to build a new approach. To avoid these complications, many transformation algorithms have been developed to use traditional models. In this context, the aim of this work is to propose a robust probabilistic distance algorithm to classify compositional data. Based on the probabilistic distance (PD) clustering approach, the proposal identifies clusters minimizing a joint distance function, JDF, which is part of a dissimilarity measure. This measure combines the PD clustering approach with the density of …


Random Regression For Modeling Semen Fertility In HF Purebred And Crossbred Bulls Using A Bayesian Framework, Vrinda Ambike, R. Venkataramanan, S. M. K. Karthickeyan, K. G. Tirumurugaan, Kaustubh Bhave, M. Swaminathan 2022 Tamil Nadu Veterinary and Animal Sciences University

Conference on Applied Statistics in Agriculture and Natural Resources

Data on insemination records of Holstein Friesian (HF) purebred (n=45,497) and crossbred (n=58,497) collected from the BAIF Research Foundation were utilized. The conception rate was modeled as a binary trait, using linear repeatability models. Random regression models (RRM) were used to obtain the trajectory of variance components across age of the bulls. Legendre Polynomials up to order of fit of 4 were used for the random effects of additive genetic and permanent environmental effects. 200,000 Gibbs samples were generated with a burn-in of 20,000 and thinning interval of 50 using the THRGIBBS1F90 program. Heritability estimates were very low (0.1) in …


Rewriting The Rules For Diagnostics: Implications Of Probability And Measure Theory For SARS-CoV-2 Testing, Paul Patrone, Anthony Kearsley 2022 National Institute of Standards and Technology

Biology and Medicine Through Mathematics Conference

No abstract provided.


Age-Dependent Ventilator-Induced Lung Injury, Quintessa Hay, Christopher Grubb, Rebecca L. Heise, Sarah Minucci, Michael S. Valentine, Jennifer Van Mullekom, Angela M. Reynolds 2022 Virginia Commonwealth University

Biology and Medicine Through Mathematics Conference

No abstract provided.


Principal Response Curve Analysis Of Arthropod Community Abundance Data With Sparse Subsets, Changjian Jiang, C. R. Brown, P. Asiimwe, Chen Meng, Adam W. Schapaugh 2022 Bayer Crop Science

Conference on Applied Statistics in Agriculture and Natural Resources

Principal response curve (PRC) analysis was applied to an assessment of the ecological impact of the genetically-modified (GM), insect-resistant, cotton MON 88702 on predatory Hemiptera communities in the field. The field community was represented by ten taxa collected ten times across the season at six sites, in which individual taxa were not observed in at least 25% of the time (unique site x collection combinations). These complete absences and those nearly so, called sparse subsets of the data in this investigation, were the result of geoclimatic and seasonal variations, which are both independent of the treatment effect for which the …


Handling Non-Detects With Imputation In A Nested Design: A Simulation Study, Rose Adjei, John R. Stevens 2022 Utah State University

Conference on Applied Statistics in Agriculture and Natural Resources

In this paper, a simulation study was conducted to assess whether it is ideal to address the issue of non-detects in data using a traditional substitution approach for non-detects, imputation, or a non-imputation based approach. Simulated data used were simple nested designs motivated by real-life data from a study of bumble bee activity in a commercial cherry orchard by Kuivila et al. (2021). The simulated data were generated at different thresholds or censoring levels and at different effect sizes. For each simulated data set, seven popular existing techniques to handle non-detects were applied: (i) Zero substitution, (ii) Substitution with half …


Overview Of Optimal Experimental Design And A Survey Of Its Expanse In Application To Agricultural Studies, Stephen J. Walsh 2022 Utah State University

Conference on Applied Statistics in Agriculture and Natural Resources

Optimal Design of Experiments is currently recognized as the modern dominant approach to planning experiments in industrial engineering and manufacturing applications. This approach to design has gained traction among practitioners in the last two decades on two fronts: 1) optimal designs are the result of a complicated optimization calculation and recent advances in both computing efficiency and algorithms have enabled this approach in real time for practitioners, and 2) such designs are now popular because they allow the researcher to ‘design for the experiment’ by working constraints, cost, number of experiments, and the model of the intended post-hoc data analysis into …


Evaluating Soil Health Changes Following Cover Crop And No-Till Integration Into A Soybean (Glycine Max) Cropping System In The Mississippi Alluvial Valley, Alexandra Gwin Firth 2022 Mississippi State University

Theses and Dissertations

The transition of natural landscapes to intensive agricultural uses has resulted in severe loss of soil organic carbon (SOC), increased CO₂ emissions, river depletion, and groundwater overdraft. Despite negative documented effects of agricultural land use (i.e., soil erosion, nutrient runoff) on critical natural resources (i.e., water, soil), food production must increase to meet the demands of a rising human population. Given the environmental and agricultural productivity concerns of intensely managed soils, it is critical to implement conservation practices that mitigate the negative effects of crop production and enhance environmental integrity. In the Mississippi Alluvial Valley (MAV) region of Mississippi, USA, …


A Two-Layer Model Explains Higher-Order Feature Selectivity Of V2 Neurons, Timothy D. Oleskiw, Justin D. Lieber, J. Anthony Movshon, Eero P. Simoncelli 2022 New York University

MODVIS Workshop

Neurons in cortical area V2 respond selectively to higher-order visual features, such as the quasi-periodic structure of natural texture. However, a functional account of how V2 neurons build selectivity for complex natural image features from their inputs – V1 neurons locally tuned for orientation and spatial frequency – remains elusive.

We made single-unit recordings in area V2 in two fixating rhesus macaques. We presented stimuli composed of multiple superimposed grating patches that localize contrast energy in space, orientation, and scale. V2 activity is modeled via a two-layer linear-nonlinear network, optimized to use a sparse combination of V1-like outputs to account …


Sensory Comparison Of Beer Carbonated Using Forced Carbonation And The Carbo Rock-It, Michala Smith 2022 University of Arkansas, Fayetteville

Biological and Agricultural Engineering Undergraduate Honors Theses

Craft brewing is a growing market which represents over 12% of beer produced in the United States. Dr. G Scott Osborn, PE invented the Carbo Rock-It™ to improve the carbonation process for craft breweries. The invention allows for shorter carbonation time and uses less CO2, saving companies money and time. Because of the lack of gas losses through bubbling, Osborn theorized that the Carbo Rock-It could also prevent the “stripping of the nose” that can occur in traditional forced carbonation. Existing research supports the mechanism, as beer flavor and aroma volatiles have been detected during the release of …


Sparse Model Selection Using Information Complexity, Yaojin Sun 2022 University of Tennessee, Knoxville

Doctoral Dissertations

This dissertation studies and uses the application of information complexity to statistical model selection through three different projects. Specifically, we design statistical models that incorporate sparsity features to make the models more explanatory and computationally efficient.

In the first project, we propose a Sparse Bridge Regression model for variable selection when the number of variables is much greater than the number of observations if model misspecification occurs. The model is demonstrated to have excellent explanatory power in high-dimensional data analysis through numerical simulations and real-world data analysis.

The second project proposes a novel hybrid modeling method that utilizes a mixture …


Intraday Algorithmic Trading Using Momentum And Long Short-Term Memory Network Strategies, Andrew R. Whitinger II 2022 East Tennessee State University

Undergraduate Honors Theses

Intraday stock trading is an infamously difficult and risky strategy. Momentum and reversal strategies and long short-term memory (LSTM) neural networks have been shown to be effective for selecting stocks to buy and sell over time periods of multiple days. To explore whether these strategies can be effective for intraday trading, their implementations were simulated using intraday price data for stocks in the S&P 500 index, collected at 1-second intervals between February 11, 2021 and March 9, 2021 inclusive. The study tested 160 variations of momentum and reversal strategies for profitability in long, short, and market-neutral portfolios, totaling 480 portfolios. …


Attempting To Predict The Unpredictable: March Madness, Coleton Kanzmeier 2022 University of Nebraska at Omaha

Theses/Capstones/Creative Projects

Each year, millions upon millions of individuals fill out at least one if not hundreds of March Madness brackets. People test their luck every year, whether for fun, with friends or family, or to even win some money. Some people rely on their basketball knowledge whereas others know it is called March Madness for a reason and take a shot in the dark. Others have even tried using statistics to give them an edge. I intend to follow a similar approach, using statistics to my advantage. The end goal is to predict this year’s, 2022, March Madness bracket. To achieve …


Understanding And Improving The System: The Effects Of Weighting On The Accuracy Of Political Polling In Arkansas, Beck Williams 2022 University of Arkansas, Fayetteville

Political Science Undergraduate Honors Theses

In an effort to increase the accuracy of statewide political polling in Arkansas, we explore the statistical strategy of weighting with a focus on one yearly opinion poll: The Arkansas Poll. We conduct over 70 weighting experiments on the 2016 and 2020 Arkansas Polls using a variety of variables and opinion questions. From these experiments, we find that while some weighted variables tend to create larger changes, weighting typically results in a single-digit percentage change that does not substantially shift or “flip” the majorities. Due to a greater rate of change through weighting in the 2020 Poll compared to the …


Bayesian Spatial Model Development Of Soil Core Organic Matter As A Proxy For Blue Carbon Stocks Within The Chesapeake Bay, Christian Longo 2022 William & Mary

Undergraduate Honors Theses

Blue carbon is carbon captured and stored within bodies of water and their ecosystems. Blue carbon stocks are very important due to their ability to store carbon away from the atmosphere. The destruction of these stocks can accelerate climate change. In particular, we wish to assess blue carbon stock within the Chesapeake Bay. Previous studies have only used geographical features to predict blue carbon stock levels. The big picture question this thesis was meant to answer is: What is the best approach for building a statistical model that factors in both spatial parameters and geographical features to predict blue carbon …


Groundwork For The Development Of GPU Enabled Group Testing Regression Models, Paul Cubre 2022 Clemson University

All Dissertations

In this dissertation, we develop novel techniques that allow for the regression analysis of data emerging from group testing processes and set the groundwork for graphics processing unit (GPU) enabled implementations. Group testing primarily occurs in clinical laboratories, where it is used to quickly and cheaply diagnose patients. Typically, group testing tests a pooled specimen (several specimens combined into one sample) instead of testing individual specimens one by one. This method reduces costs by using fewer tests when the disease prevalence is low. Due to recent advances in diagnostic technology, group testing protocols were extended to incorporate multiplex assays, which are diagnostic tests that, …
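The claim that pooling saves tests at low prevalence can be checked with a short calculation. The sketch below assumes the classical two-stage (Dorfman) protocol: each pool of k specimens is tested once, and every member of a positive pool is retested individually. It also assumes a perfect assay and independent infection statuses. None of these assumptions is stated in the abstract, and the dissertation's own protocols, including multiplex assays, may differ. The function names are hypothetical.

```python
def expected_tests_per_person(prevalence, pool_size):
    """Expected number of tests per individual under two-stage pooling."""
    if pool_size == 1:
        # No pooling: exactly one test per individual.
        return 1.0
    # A pool tests positive if at least one member is infected.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    # One pooled test shared by pool_size people, plus an individual
    # retest for everyone whenever the pool is positive.
    return 1.0 / pool_size + p_pool_positive


def best_pool_size(prevalence, max_pool_size=50):
    """Pool size in 1..max_pool_size that minimizes expected tests per person."""
    return min(
        range(1, max_pool_size + 1),
        key=lambda k: expected_tests_per_person(prevalence, k),
    )


for p in (0.01, 0.05, 0.5):
    k = best_pool_size(p)
    print(p, k, round(expected_tests_per_person(p, k), 4))
```

Under these assumptions the expected cost falls well below one test per person when prevalence is small, and pooling stops paying off once prevalence is high enough that most pools test positive.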


Posterior Predictive Model Checking Of The Hierarchical Rater Model, Nnamdi Chika Ezike 2022 University of Arkansas, Fayetteville

Graduate Theses and Dissertations

Fitting wrongly specified models to observed data may lead to invalid inferences about the model parameters of interest. The current study investigated the performance of the posterior predictive model checking (PPMC) approach in detecting model-data misfit of the hierarchical rater model (HRM). The HRM is a rater-mediated model that incorporates components of the polytomous item response theory (IRT) model, such as the partial credit model (PCM) and generalized partial credit model (GPCM), at the second level of the hierarchy, to model examinees’ responses to performance assessments. To date, the HRM has not been rigorously evaluated using PPMC techniques. Monte Carlo …

