Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

3,513 Full-Text Articles; 4,898 Authors; 2,834,925 Downloads; 167 Institutions

All Articles in Applied Statistics

3,513 full-text articles. Page 57 of 107.

Variable Selection Via Penalized Regression And The Genetic Algorithm Using Information Complexity, With Applications For High-Dimensional -Omics Data, Tyler J. Massaro 2016 University of Tennessee, Knoxville

Doctoral Dissertations

This dissertation is a collection of examples, algorithms, and techniques for researchers interested in selecting influential variables from statistical regression models. Chapters 1, 2, and 3 provide background information that will be used throughout the remaining chapters, on topics including but not limited to information complexity, model selection, covariance estimation, stepwise variable selection, penalized regression, and especially the genetic algorithm (GA) approach to variable subsetting.

In Chapter 4, we fully develop the framework for performing GA subset selection in logistic regression models. We present advantages of this approach over stepwise and elastic net regularized regression in selecting variables from a …


Advanced Sequential Monte Carlo Methods And Their Applications To Sparse Sensor Network For Detection And Estimation, Kai Kang 2016 University of Tennessee, Knoxville

Doctoral Dissertations

General state space models provide a flexible framework for modeling dynamic systems and therefore have wide application in many disciplines, such as engineering, economics, and biology. However, optimal estimation problems for non-linear, non-Gaussian state space models are analytically intractable in general. Sequential Monte Carlo (SMC) methods have become a very popular class of simulation-based methods for solving optimal estimation problems. The advantage of SMC methods over classical filtering methods such as the Kalman filter and the extended Kalman filter is that they can handle non-linear, non-Gaussian scenarios without relying on any local linearization techniques. In this …


Teaching The Quandary Of Statistical Jurisprudence: A Review-Essay On Math On Trial By Schneps And Colmez, Noah Giansiracusa 2016 University of Georgia

Journal of Humanistic Mathematics

This review-essay on the mother-and-daughter collaboration Math on Trial stems from my recent experience using this book as the basis for a college freshman seminar on the interactions between math and law. I discuss the strengths and weaknesses of this book as an accessible introduction to this enigmatic yet deeply important topic. For those considering teaching from this text (a highly recommended endeavor) I offer some curricular suggestions.


Joint Analysis Of Zero-Heavy Longitudinal Outcomes: Models And Comparison Of Study Designs, Erin R. Lundy 2016 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Understanding the patterns and mechanisms of the process of desistance from criminal activity is imperative for the development of effective sanctions and legal policy. Methodological challenges in the analysis of longitudinal criminal behaviour data include the need to develop methods for multivariate longitudinal discrete data, incorporating modulating exposure variables and several possible sources of zero-inflation. We develop new tools for zero-heavy joint outcome analysis which address these challenges and provide novel insights on processes related to offending patterns. Comparisons with existing approaches demonstrate the benefits of utilizing modeling frameworks which incorporate distinct sources of zeros. An additional concern in this …


Well I'll Be Damned - Insights Into Predictive Value Of Pedigree Information In Horse Racing, Timothy Baker Mr, Ming-Chien Sung, Johnnie Johnson Professor, Tiejun Ma 2016 University of Southampton

International Conference on Gambling & Risk Taking

Fundamental form characteristics, such as how fast a horse ran at its last start, are widely used to help predict the outcome of horse racing events. The exception is races in which the horses have not previously competed, such as Maiden races, where there is little or no publicly available past performance information. In these types of events, bettors need only consider a simplified suite of factors; however, this is offset by a higher level of uncertainty. This paper examines the inherent information content embedded within a horse’s ancestry and the extent to which this information is discounted in the United Kingdom bookmaker …


Stochastic Processes And Their Applications To Change Point Detection Problems, Heng Yang 2016 Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

This dissertation addresses the change point detection problem when either the post-change distribution has uncertainty or the post-change distribution is time inhomogeneous. In the case of post-change distribution uncertainty, attention is drawn to the construction of a family of composite stopping times. It is shown that the proposed composite stopping time has third-order optimality in the detection problem with Wiener observations and also provides information to distinguish the different values of post-change drift. In the case of post-change distribution uncertainty, a computationally efficient, low-complexity decision rule based on the Cumulative Sum (CUSUM) algorithm is also introduced. In the time …


Metals Additive Manufacturing Powder Aging Characterization, Thomas Russell Lovejoy, Nicholas Karl Muetterties, David Takeo Otsu 2016 California Polytechnic State University, San Luis Obispo

Mechanical Engineering

The metallic additive manufacturing process known as selective laser melting requires highly spherical, normally distributed powder with diameters in the range of 10 to 50 microns. Previous observations have shown a degradation in powder quality over time, resulting in unwanted characteristics in the final printed parts. 21-6-9 stainless steel powder was used to fabricate test parts, with leftover powder recycled back into the machine. Powder samples and test specimens were characterized to observe changes across build cycles. Few changes were observed in the physical and mechanical properties of the specimens; however, there were indications of chemical changes across cycles. Potential …


Splash Dynamics Of Paint On Dry, Wet, And Cooled Surfaces, David Baron, Haiyan Su, Ashuwin Vaidya 2016 Montclair State University

Department of Applied Mathematics and Statistics Faculty Scholarship and Creative Works

In his classic study in 1908, A.M. Worthington gave a thorough account of splashes and their formation through visualization experiments. In more recent times, there has been renewed interest in this subject, and much of the underlying physics behind Worthington's experiments has now been clarified. One specific set of such recent studies, which motivates this paper, concerns the fluid dynamics behind Jackson Pollock's drip paintings. The physical processes and the mathematical structures hidden in his works have received serious attention and made the scientific pursuit of art a compelling area of exploration. Our current work explores the interaction of watercolors …


The Importance Of Prediction Model Validation And Assessment In Obesity And Nutrition Research, Andrada Ivanescu, P. Li, B. George, A. W. Brown, S. W. Keith, D. Raju, D. B. Allison 2016 Montclair State University

Department of Applied Mathematics and Statistics Faculty Scholarship and Creative Works

Deriving statistical models to predict one variable from one or more other variables, or predictive modeling, is an important activity in obesity and nutrition research. To determine the quality of the model, it is necessary to quantify and report the predictive validity of the derived models. Conducting validation of the predictive measures provides essential information to the research community about the model. Unfortunately, many articles fail to account for the nearly inevitable reduction in predictive ability that occurs when a model derived on one data set is applied to a new data set. Under some circumstances, the predictive validity can …


Classification Trees And Rule-Based Modeling Using The C5.0 Algorithm For Self-Image Across Sex And Race In St. Louis, Rohan Shirali 2016 Washington University in St. Louis

Arts & Sciences Electronic Theses and Dissertations

The study population comprised children, adolescents, and adults who were residents of the city of St. Louis at the time of data collection in 2015. The data collected include sex, age, race, measured height and weight, self-reported height and weight, zip code, educational background, exercise and diet habits, and descriptions and strategies of participants' weight (e.g., overweight and trying to lose weight, respectively). I use the C5.0 algorithm to create classification trees and rule-based models to analyze this population. Specifically, I model a binary self-image variable as a function of sex, age, race, zip code, and a ratio of reported …


Spot Volatility Estimation Of Ito Semimartingales Using Delta Sequences, Weixuan Gao 2016 Washington University in St. Louis

Arts & Sciences Electronic Theses and Dissertations

This thesis studies a unifying class of nonparametric spot volatility estimators proposed by Mancini et al. (2013). This method is based on delta sequences and is conceived to include many of the existing estimators in the field as special cases. The thesis first surveys the asymptotic theory of the proposed estimators under an infill asymptotic scheme and fixed time horizon, when the state variable follows a Brownian semimartingale. Then, some extensions to include jumps and financial microstructure noise in the observed price process are also presented. The main goal of the thesis is to assess the suitability of the proposed methods …


Failure Of Surface Color Cues Under Natural Changes In Lighting, David H. Foster, Iván Marín-Franch 2016 University of Manchester

MODVIS Workshop

Color allows us to effortlessly discriminate and identify surfaces and objects by their reflected light. Although the reflected spectrum changes with the illumination spectrum, cone photoreceptor signals can be transformed to give useful cues for surface color. But what happens when both the spectrum and the geometry of the illumination change, as with lighting from the sun and sky? Is it possible, as a matter of principle, to obtain reliable cues by processing cone signals alone? This question was addressed here by estimating the information provided by cone signals from time-lapse hyperspectral radiance images of five outdoor scenes under natural …


Automated Sea State Classification From Parameterization Of Survey Observations And Wave-Generated Displacement Data, Jason A. Teichman 2016 University of New Orleans, New Orleans

University of New Orleans Theses and Dissertations

Sea state is a subjective quantity whose accuracy depends on an observer’s ability to translate local wind waves into numerical scales. It provides an analytical tool for estimating the impact of the sea on data quality and operational safety. Tasks dependent on the characteristics of local sea surface conditions often require accurate and immediate assessment. An attempt to automate sea state classification using eleven years of ship motion and sea state observation data is made using parametric modeling of distribution-based confidence and tolerance intervals and a probabilistic model using sea state frequencies. Models utilizing distribution intervals are not able to …


Population Projection And Habitat Preference Modeling Of The Endangered James Spinymussel (Pleurobema Collina), Marisa Draper 2016 James Madison University

Senior Honors Projects, 2010-2019

The James Spinymussel (Pleurobema collina) is an endangered mussel species at the top of Virginia’s conservation list. The James Spinymussel plays a critical role in the environment by filtering and cleaning stream water while providing shelter and food for macroinvertebrates; however, conservation efforts are complicated by the mussels’ burrowing behavior, camouflage, and complex life cycle. The goals of the research conducted were to estimate detection probabilities that could be used to predict species presence and facilitate field work, and to track individually marked mussels to test for habitat preferences. Using existing literature and mark-recapture field data, these goals were accomplished …


Simulation Comparison Of Statistical Methods Used In Assessing Vaccine Efficacy In Veterinary Biologics, Kenny Wakeland, Brian Fergen 2016 Iowa State University

Conference on Applied Statistics in Agriculture

In veterinary biologics, clinical studies conducted to support the licensure of a vaccine generally include a demonstration of efficacy in the species of interest. Typically, these studies are designed to assess a vaccine’s ability to prevent or mitigate clinical disease. Study designs utilize two or more treatment groups, and often incorporate blocking structure restrictions to accommodate animal housing or litter-related effects. When assessing a vaccine’s ability to prevent clinical disease, the prevented fraction (PF), a function of the group proportions of affected animals, is often utilized. Typically the sample size per treatment group is limited, and each block is represented …
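The prevented fraction mentioned in this abstract is commonly defined as PF = 1 − p_v / p_c, where p_v and p_c are the proportions of affected animals in the vaccinated and control groups. The following is a minimal sketch under that assumption; the function name and the counts are hypothetical, and the truncated abstract does not show which estimators the paper actually compares.

```python
def prevented_fraction(affected_vaccinates, n_vaccinates,
                       affected_controls, n_controls):
    """Point estimate of PF = 1 - p_v / p_c (assumed common definition)."""
    if n_vaccinates <= 0 or n_controls <= 0:
        raise ValueError("group sizes must be positive")
    p_v = affected_vaccinates / n_vaccinates  # proportion affected, vaccinates
    p_c = affected_controls / n_controls      # proportion affected, controls
    if p_c == 0:
        raise ValueError("PF is undefined when no control animals are affected")
    return 1.0 - p_v / p_c


# Hypothetical counts: 2 of 20 vaccinates and 10 of 20 controls affected.
pf = prevented_fraction(2, 20, 10, 20)
print(f"PF = {pf:.2f}")
```

This point estimate ignores the blocking structure the abstract describes; it only illustrates how PF depends on the two group proportions.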


The Effect Of Poultry Litter Application On Agricultural Production: A Meta-Analysis Of Crop Yield, Nutrient Uptake And Soil Fertility, Yaru Lin, Edzard van Santen, Dexter Watts 2016 Auburn University

Conference on Applied Statistics in Agriculture

Meta-analysis is a statistical technique used to analyze large datasets containing results from numerous individual studies. It appears to be a promising approach in agricultural sciences. This study aimed to conduct a meta-analytic assessment to elucidate the influence of poultry litter (PL) application on crop yield, plant nutrient uptake, and soil fertility as compared to inorganic fertilizer (IF). A meta-analysis based on 116 studies (111 refereed articles and five unpublished data sets) with 2293 observations compared agronomic responses to PL and IF application. The natural log of the response ratio was used as effect size (ES) to express differences in …
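The effect size named in this abstract, the natural log of the response ratio, can be sketched as ln(mean under PL / mean under IF). The example below assumes that standard form; the function name and the yield values are hypothetical and are not taken from the study.

```python
import math


def log_response_ratio(mean_treatment, mean_control):
    """Natural log of the response ratio, ln(treatment mean / control mean)."""
    if mean_treatment <= 0 or mean_control <= 0:
        raise ValueError("means must be positive to take a log ratio")
    return math.log(mean_treatment / mean_control)


# Hypothetical mean yields (same units): poultry litter vs. inorganic fertilizer.
es = log_response_ratio(6.0, 5.0)
percent_change = (math.exp(es) - 1.0) * 100.0  # back-transform to a % difference
print(f"ES = {es:.4f}, percent change = {percent_change:.1f}%")
```

A positive value indicates a larger response under the treatment than under the control, and a value of zero indicates no difference.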


Topological Methods For The Quantification And Analysis Of Complex Phenotypes, Patrick S. Medina, Rebecca W. Doerge 2016 Purdue University

Conference on Applied Statistics in Agriculture

Quantitative Trait Locus (QTL) mapping of complex traits, such as leaf venation or root structures, requires the phenotyping and genotyping of large populations. Sufficient genotyping is accomplished with cost-effective high-throughput assays; however, labor costs often make sufficient phenotyping prohibitive. In order to develop efficient high-throughput phenotyping platforms for complex traits, algorithms and methods for quantifying these traits are needed. It is often desirable to study the spatial organization of these phenotypes from the images generated by high-throughput platforms. With the goal of quantifying the traits, many approaches try to identify several core traits useful in describing the phenotypic …


Bayesian Estimation Of Stability Indices Of Sorghum Variety Trials, Siraj Osman Omer, Abdel Wahab Hassan Abdalla, Mohammed Hamza Mohammed, International Center for Agricultural Research in the Dry Areas (ICARDA), Amman, Jordan 2016 Agricultural Research Corporation

Conference on Applied Statistics in Agriculture

Multi-environment trials are routinely conducted by crop improvement programs to develop desired genotypes. Over the long run, these programs gather information on genotypic performance and variability. A Bayesian approach can use this prior information to identify genotypes with high and stable yield. A set of 18 sorghum genotypes was evaluated in randomized complete block designs (RCBD) with four replications during three seasons, 2009-2012, at diverse locations, North-Gedarif and South-Gedarif, in Sudan. Data on grain yield were analyzed. The aim of this paper was to estimate stability indices such as regression coefficient, coefficient of variation (CV %) and coefficient of …


Strategies For Reducing Control Group Size In Experiments Using Live Animals, Matthew Kramer, Enrique Font 2016 USDA, Agricultural Research Service

Conference on Applied Statistics in Agriculture

Reducing the number of animal subjects used in biomedical experiments is desirable for both ethical and practical reasons. Previous suggestions for reducing sample sizes in these experiments have focused on improving experimental designs and methods of statistical analysis; reducing the number of controls (thus, the number of overall animals used) is rarely mentioned. We discuss how the number of current control animals can be reduced, without loss of statistical power, by incorporating information from historical controls, i.e. animals used as controls in similar previous experiments. Using example data from the literature, we describe how to incorporate information from historical controls …


Alternative Estimation Techniques For Correlated Discrete Data, William J. Price Ph.D., Bahman Shafii Ph.D. 2016 University of Idaho

Conference on Applied Statistics in Agriculture

Binary or multinomial data often occur in agricultural and biological research. Advancements in measurement and video technologies now allow such data to be sequentially recorded through time or space. These data sets, however, can exhibit a serial correlation structure, which, in turn, can bias and influence point estimates as well as inferences made regarding the data. Statistical methods using generalized mixed models and probability distributions such as the beta-binomial and correlated binomial have been proposed as potential solutions for estimating the parameters of interest in these cases. In this paper, we will explore the properties of these techniques through simulation …

