Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

3,524 Full-Text Articles; 4,908 Authors; 2,834,925 Downloads; 168 Institutions

All Articles in Applied Statistics

Modeling And Handling Overdispersion Health Science Data With Zero-Inflated Poisson Model, Nur Syabiha binti Zafakali, Wan Muhamad Amir bin W Ahmad 2013 Universiti Malaysia Terengganu, Kuala Terengganu, Malaysia

Journal of Modern Applied Statistical Methods

Health sciences research often involves analyses of repeated-measurement or longitudinal count data that exhibit excess zeros. Overdispersion occurs when count data have greater variability than the assumed model allows. This phenomenon can carry over to zero-inflated count data modeling. The Zero-Inflated Poisson (ZIP) model can be used to model data with excess zeros, a feature referred to as zero-inflation. The Zero-Inflated Negative Binomial (ZINB) model is used to account for overdispersion detected in count data. The ZINB model is considered as an alternative to the Zero-Inflated Generalized Poisson (ZIGP) model for zero-inflated overdispersed count data. Consequently, zero-inflated models have been proposed for the …
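
As a rough illustration of the zero-inflated Poisson model named in this abstract, the sketch below uses the standard ZIP parameterization: a structural-zero probability mixed with an ordinary Poisson component. The function names (`zip_pmf`, `zip_mean_var`) and the parameter values are illustrative assumptions and are not taken from the article.

```python
import math


def zip_pmf(k, lam, pi):
    """Probability of count k under a zero-inflated Poisson model.

    pi  : probability of a structural (excess) zero   (illustrative name)
    lam : rate of the underlying Poisson component    (illustrative name)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from the structural-zero part or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean_var(lam, pi):
    """Closed-form mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    return mean, var


# Hypothetical parameter values chosen only for demonstration.
lam, pi = 3.0, 0.4
mean, var = zip_mean_var(lam, pi)
total = sum(zip_pmf(k, lam, pi) for k in range(100))  # should be very close to 1
```

With these values the variance (3.96) exceeds the mean (1.8), so zero inflation by itself already produces overdispersion relative to a plain Poisson model, where the two would be equal.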


The Probit Link Function In Generalized Linear Models For Data Mining Applications, Mehdi Razzaghi 2013 Bloomsburg University, Bloomsburg, PA

Journal of Modern Applied Statistical Methods

The use of logistic regression for outcome classification of dichotomous variables is well known in data mining applications. The logit transformation of the estimated probability belongs to the class of canonical link functions that follow from particular probability distribution functions. A closely related model is the probit link, which can be used for binary responses. Although the probit link is not canonical, in some cases the overall fit of the model can be improved by using non-canonical link functions. This article reviews the properties of the probit link function and discusses its applications in data mining problems. Contrasts and comparisons …
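
To make the logit and probit links concrete, here is a minimal sketch that maps a linear predictor to a probability under each link. It relies only on the Python standard library; the function names and the example predictor values are assumptions made for illustration, not material from the article.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inv_logit(eta):
    """Inverse of the logit link: logistic CDF of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse of the probit link: standard normal CDF of the linear predictor."""
    return _STD_NORMAL.cdf(eta)


def probit(p):
    """Probit link: standard normal quantile of a probability in (0, 1)."""
    return _STD_NORMAL.inv_cdf(p)


# Illustrative linear-predictor values (not from the article).
for eta in (-2.0, 0.0, 2.0):
    print(f"eta={eta:+.1f}  logit->p={inv_logit(eta):.4f}  probit->p={inv_probit(eta):.4f}")
```

Both links map a predictor of 0 to a probability of 0.5, but the normal CDF approaches 0 and 1 faster than the logistic CDF, which is one reason fitted coefficients differ in scale between the two models.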


Parameter Estimation Of A Class Of Hidden Markov Model With Diagnostics, E. B. Nkemnole, O. Abass, R. A. Kasumu 2013 University of Lagos, Nigeria, Africa

Journal of Modern Applied Statistical Methods

A stochastic volatility (SV) problem is formulated as a state space form of a Hidden Markov model (HMM). The SV model assumes that the distribution of asset returns conditional on the latent volatility is normal. This article analyzes the SV model with the Student-t distribution and the generalized error distribution (GED) and compares these distributions with a mixture of normal distributions from Kim and Stoffer (2008). A Sequential Monte Carlo with Expectation Maximization (SMCEM) algorithm was used to estimate parameters for the extended volatility model; the Akaike Information Criterion (AIC) and forecast statistics were calculated to compare distribution fit. …


A Note On α-Curvature Of The Manifolds Of The Length-Biased Lognormal And Gamma Distributions In View Of Related Applications In Data Analysis, Makarand V. Ratnaparkhi, Uttara V. Naik-Nimbalkar 2013 Wright State University

Journal of Modern Applied Statistical Methods

The α-curvature tensors of the statistical manifolds of the length-biased versions of the lognormal and gamma distributions are derived and discussed. This study was designed to investigate observations related to the parameter estimation for the length-biased lognormal distribution as a model for the length-biased data from oil field exploration.


JMASM 32: Multiple Imputation Of Missing Multilevel, Longitudinal Data: A Case When Practical Considerations Trump Best Practices?, Jennifer E. V. Lloyd, Jelena Obradović, Richard M. Carpiano, Frosso Motti-Stefanidi 2013 University of British Columbia

Journal of Modern Applied Statistical Methods

A pedagogical tool is presented for applied researchers dealing with incomplete multilevel, longitudinal data. It explains why such data pose special challenges regarding missingness. Syntax created to perform a multiply-imputed growth modeling procedure in Stata Version 11 (StataCorp, 2009) is also described.


Bayesian Analysis Of Data On Nest Success For Marsh Birds, Sean Hardy 2013 University of Maine - Main

Honors College

Bayesian methods are an increasingly popular form of statistical analysis that uses informative prior distributions to help calculate posterior distributions of models that represent different hypotheses. Frequentist methods, by contrast, are more commonly used and better known, but have come under recent criticism. I examined, from a Bayesian perspective, data gathered by Ellen Robertson, who used information-theoretic methods for a Master's thesis in Ecology and Environmental Science at the University of Maine to analyze the daily survival probabilities of marsh birds, in order to get a sense of the Bayesian analysis. Results were as expected; when …


Using Functional Data Analysis To Evaluate Effect Of Shade On Body Temperature Of Feedlot Heifers During Environmental Heat Stress, F. Yang, A. M. Parkhurst, C. N. Lee, T. M. Brown-Brandl, P. E. Hillman 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Heat stress can be a serious problem for cattle. Body temperature (Tb) is a good measure of an animal’s thermo-regulatory response to an environmental thermal challenge. Previous studies found that Tb increases in response to increasing ambient temperature in a controlled chamber. However, when animals are in an uncontrolled environment, Tb is subject to many uncontrolled environmental factors, such as sunshade, wind, and humidity, that increase variation in the data. Hence, functional data analysis (FDA) was applied to analyze the data with uncontrolled environmental factors as curves in the whole series of days in this study. Breed (Angus, MARCIII, MARC-I, …


Detecting Factors Associated With Spring Wheat Yield Stability In South Dakota Environments, Jixiang Wu, Karl Glover, William Berzonsky 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Conventional yield stability analyses focus on yield stability itself by using a single linear regression method and/or additive main effect and multiplicative interaction (AMMI) analysis. It is likely that yield stability for a genotype is associated with many factors such as fertilizer level, soil types, weather conditions, and/or yield components. Detection of factors highly associated with yield stability, therefore, will help breeders develop cultivars adapted to diverse environments or to specific environments. In this study, we conducted correlation analysis based on both environments and genotypes for a data set with 22 spring wheat genotypes, which were evaluated in 18 environments …


Estimation Of Dose Requirements For Extreme Levels Of Efficacy, Mark West, Guy Hallman 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

The objective of this paper is to explore the extent to which dose-response models may be used to estimate extreme levels of efficacy for controlling insect pests and possibly other uses. Probit-9 mortality (99.9968% mortality) is a standard for treatment effectiveness in tephritid fruit fly research, and has been adopted by the United States Department of Agriculture for fruit flies and other pests. Data taken from the phytosanitary treatment (PT) literature are analyzed. These data are used to fit dose-response models with logit, probit and complementary log-log links. The effectiveness of these models for predicting extreme levels of efficacy is …
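
The 99.9968% figure quoted for Probit-9 follows from the classical probit scale, in which a probit value is the standard normal quantile plus 5, so probit 9 corresponds to a z-score of 4. The short check below assumes that convention; the function and variable names are illustrative and do not come from the paper.

```python
from statistics import NormalDist

PROBIT_OFFSET = 5.0  # classical probit scale adds 5 to the normal quantile


def probit_to_mortality(probit_value):
    """Convert a classical probit value to the implied mortality proportion."""
    z = probit_value - PROBIT_OFFSET
    return NormalDist().cdf(z)


probit9 = probit_to_mortality(9.0)
survivors_per_million = (1.0 - probit9) * 1_000_000
print(f"Probit-9 mortality: {probit9:.6%}")
print(f"Expected survivors per million treated: {survivors_per_million:.1f}")
```

The implied survival rate is on the order of a few tens of insects per million treated, which is why estimating doses at this level means extrapolating far into the tail of whichever link function is fitted.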


A Simulation Study Of The Small Sample Properties Of Likelihood Based Inference For The Beta Distribution, Kevin Thompson, Edward Gbur 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Researchers often collect proportion data that cannot be interpreted as arising from a set of Bernoulli trials. Analyses based on the beta distribution may be appropriate for such data. The SAS® GLIMMIX procedure provides a tool for these analyses using a likelihood-based approach in the context of generalized linear mixed models. Since the t- and F-distribution-based inference employed in this approach relies on asymptotic properties, it is important to understand the sample sizes required to obtain reasonable approximate answers to inference questions. In addition, the complexity of the likelihood functions can lead to numerical issues for optimization algorithms …


Non-Normal Data In Agricultural Experiments, W. W. Stroup 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Advances in computers and modeling over the past couple of decades have greatly expanded options for analyzing non-normal data. Prior to the 1990s, options were largely limited to analysis of variance (ANOVA), either on untransformed data or after applying a variance-stabilizing transformation. With or without transformations, this approach depends heavily on the Central Limit Theorem and ANOVA’s robustness. The availability of software such as R’s lme4 package and SAS® PROC GLIMMIX changed the conversation with regard to non-normal data. With expanded options come dilemmas. We have software choices: R and SAS among many others. Models have conditional and …


Multivariate Statistical Analysis Of Terrestrial Invertebrate Index Of Biotic Integrity, Bahman Shafii, William J. Price, Norm Merz, Timothy D. Hatten 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

The Index of Biotic Integrity (IBI) is designed to measure the changes in ecological and environmental conditions as affected by human disturbances. In practice, the IBI is used in various ecological applications to detect divergence in biological integrity attributable to human actions. Last year during this conference, methodologies for developing an Avian Index of Biotic Integrity (A-IBI) were presented and discussed. The objective of this paper is to demonstrate the construction and statistical evaluation of a multi-metric terrestrial Invertebrate Index of Biotic Integrity (I-IBI) using the same multivariate statistical techniques. Canonical correlation analyses were utilized to select pertinent invertebrate metrics …


Characterizing Benthic Macroinvertebrate Community Responses To Nutrient Addition Using NMDS And BACI Analyses, Bahman Shafii, William J. Price, G. Wayne Minshall, Charlie Holderman, Paul J. Anders, Gary Lester, Pat Barrett 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Nonmetric multidimensional scaling (NMDS) is an ordination technique which is often used for information visualization and exploring similarities or dissimilarities in ecological data. In principle, NMDS maximizes rank-order correlation between distance measures and distance in the ordination space. Ordination points are adjusted in a manner that minimizes stress, where stress is defined as a measure of the discordance between the two kinds of distances. Before and After Control Impact (BACI) is a classical analysis of variance method for measuring the potential influence of an environmental disturbance. Such effects can be assessed by comparing conditions before and after a planned activity. …
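
The rank-order criterion described in this abstract can be illustrated with a small sketch that computes the Spearman correlation between input dissimilarities and distances in an ordination. This is not an NMDS implementation and it does not compute stress, which is usually defined through a monotone regression step. It only evaluates the agreement measure for a fixed configuration. The site dissimilarities, coordinates, and function names are all made up for illustration.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_agreement(dissimilarities, coords):
    """Spearman correlation between input dissimilarities and ordination distances.

    dissimilarities: dict mapping (i, j) with i < j to a dissimilarity
    coords: list of points in the ordination space
    """
    pairs = list(combinations(range(len(coords)), 2))
    d_in = [dissimilarities[p] for p in pairs]
    d_out = [math.dist(coords[i], coords[j]) for i, j in pairs]
    return pearson(average_ranks(d_in), average_ranks(d_out))


# Made-up example: four sites, hypothetical dissimilarities and a 2-D ordination.
dissim = {(0, 1): 0.2, (0, 2): 0.5, (0, 3): 0.9, (1, 2): 0.4, (1, 3): 0.8, (2, 3): 0.3}
ordination = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (3.8, 0.0)]
rho = rank_agreement(dissim, ordination)
```

In this toy configuration the ordering of the ordination distances matches the ordering of the dissimilarities exactly, so the rank correlation is 1; moving points so that the orderings disagree lowers it.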


Thou Shall Not Brush Your Teeth While Eating Breakfast: A 7-Step Program For Researchers Previously Hurt In Data Analysis, Edzard van Santen 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

After years of providing statistical advice to fellow faculty members and graduate students, I have come to realize that it is not necessarily the big issues, but lack of knowledge of basic data analysis principles that get my clients into trouble. My claim is that if researchers and students internalized two basic definitions they would not have any problems analyzing most of their experiments. The definitions of Experimental Unit (EU) as the smallest physical unit to which a treatment may be applied and Experimental Error (Exp. Err.) as the variation among EUs treated alike are the basis for successful data …


Comparing Functional Data Analysis And Hysteresis Loops When Testing Treatments For Reducing Heat Stress In Dairy Cows, S. Maynes, A. M. Parkhurst, J. B. Gaughan, T. L. Mader 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Various techniques are commonly used to reduce heat stress, including sprayers and misters, shading, and changes in feed. Oftentimes studies are performed where researchers do not control the times when animals use shading or other means available to reduce heat stress, making it hard to test differences between treatments. Two methods are used on data from a study where Holstein cows were given free access to weight activated “cow showers.” Functional data analysis can be used to model body temperature as a function of time and environmental variables such as the Heat Load Index. Differences between treatment groups can be …


Five Things I Wish My Mother Had Told Me, About Statistics That Is, Philip M. Dixon 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

I present five short stories, each describing something I wish I had known and appreciated earlier in my statistical life. The five are: Simpson's paradox is everywhere; numerical optimization algorithms can be deceived; you can't always trust the Satterthwaite approximation; BLUPs are wonderful things; and it's good to know Reverend Bayes.


On The Small Sample Behavior Of Generalized Linear Mixed Models With Complex Experiments, Julie Couton, Walt Stroup 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

Generalized linear mixed models (GLMMs), regardless of the software used to implement them (R, SAS, etc.), can be formulated as conditional or marginal models and can be computed using pseudo-likelihood, penalized quasi-likelihood, or integral approximation methods. While information exists about the small sample behavior of GLMMs for some cases (notably RCBDs with Binomial or count data), little is known about GLMMs for continuous proportions (e.g., Beta) or time-to-event (e.g., Gamma) data or for more complex designs such as the split-plot. In this presentation we review the major model formulation and estimation options and compare their small sample performance for cases …


Editor's Preface And Table Of Contents, Weixing Song 2013 Kansas State University Libraries

Conference on Applied Statistics in Agriculture

These proceedings contain papers presented in the twenty-fifth annual Kansas State University Conference on Applied Statistics in Agriculture, held in Manhattan, Kansas, April 28 - April 30, 2013.


Seasonal Decomposition For Geographical Time Series Using Nonparametric Regression, Hyukjun Gweon 2013 The University of Western Ontario

Electronic Thesis and Dissertation Repository

A time series often contains various systematic effects such as trends and seasonality. These different components can be determined and separated by decomposition methods. In this thesis, we discuss the time series decomposition process using nonparametric regression. A method based on both loess and harmonic regression is suggested and an optimal model selection method is discussed. We then compare the process with seasonal-trend decomposition by loess, STL (Cleveland, 1979). While STL works well when proper parameters are used, the method we introduce is also competitive: it makes parameter choice more automatic and less complex. The decomposition process often requires that …


A New Diagnostic Test For Regression, Yun Shi 2013 The University of Western Ontario

Electronic Thesis and Dissertation Repository

A new diagnostic test for regression and generalized linear models is discussed. The test is based on testing whether residuals that are close together in the linear space of one of the covariates are correlated. This is a generalization of the famous problem of spurious correlation in time series regression. A full model building approach for the case of regression was developed in Mahdi (2011, Ph.D. Thesis, Western University, “Diagnostic Checking, Time Series and Regression”) using an iterative generalized least squares algorithm. Simulation experiments were reported that demonstrate the validity and utility of this approach but no actual applications were …

