Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

2013

Articles 1 - 30 of 156

Full-Text Articles in Applied Statistics

Experimental And Statistical Techniques To Probe Extraordinary Electronic Properties Of Molecules, Byron Hager Smith Dec 2013

Doctoral Dissertations

The existence of an additional electron or hole in the presence of an electric monopole is a well understood physical system, but this ideality is far from the true physical properties of many molecules. Examples of such irregular electronic states include the attachment of an excess charge to a molecule's dipole moment, electronic correlation spanning a molecule, or attachment of multiple excess charges. Current theoretical and experimental interpretations widely vary for these states and further elucidation of the nature of irregular electronic structure may provide solutions to unexplained observations and the impetus for industrial application. For example, in the case …


Optimal Matching Distances Between Categorical Sequences: Distortion And Inferences By Permutation, Juan P. Zuluaga Dec 2013

Culminating Projects in Applied Statistics

Sequence data (an ordered set of categorical states) is a very common type of data in Social Sciences, Genetics and Computational Linguistics.

For exploration and inference of sets of sequences, having a measure of dissimilarities among sequences would allow the data to be analyzed by techniques like clustering, multidimensional scaling analysis and distance-based regression analysis. Sequences can be placed in a map where similar sequences are close together and dissimilar ones are far apart. Such patterns of dispersion and concentration could be related to other covariates. For example, do the employment trajectories of men and women tend to form …


Statistical Models For Predicting College Success, Yelen Nunez Nov 2013

Yelen Nunez

Colleges base their admission decisions on a number of factors to determine which applicants have the potential to succeed. This study utilized data for students that graduated from Florida International University between 2006 and 2012. Two models were developed (one using SAT as the principal explanatory variable and the other using ACT as the principal explanatory variable) to predict college success, measured using the student’s college grade point average at graduation. Some of the other factors that were used to make these predictions were high school performance, socioeconomic status, major, gender, and ethnicity. The model using ACT had a higher …


Polynomially Adjusted Saddlepoint Density Approximations, Susan Zhe Sheng Nov 2013

Electronic Thesis and Dissertation Repository

This thesis aims at obtaining improved bona fide density estimates and approximants by means of adjustments applied to the widely used saddlepoint approximation. Said adjustments are determined by solving systems of equations resulting from a moment-matching argument. A hybrid density approximant that relies on the accuracy of the saddlepoint approximation in the distributional tails is introduced as well. A certain representation of noncentral indefinite quadratic forms leads to an initial approximation whose parameters are evaluated by simultaneously solving four equations involving the cumulants of the target distribution. A saddlepoint approximation to the distribution of quadratic forms is also discussed. By …


Create A Simple Predictive Analytics Classification Model In Java With Weka, James Howard Nov 2013

James Howard

Get an overview of the Weka classification engine and learn how to create a simple classifier for programmatic use. Understand how to store and load models, manipulate them, and use them to evaluate data. Consider applications and implementation strategies suitable for the enterprise environment so you can turn a collection of training data into a functioning model for real-time prediction.


Preliminary Testing For Normality: Is This A Good Practice?, H. J. Keselman, Abdul R. Othman, Rand R. Wilcox Nov 2013

Journal of Modern Applied Statistical Methods

Normality is a distributional requirement of classical test statistics. In order for the test statistic to provide valid results leading to sound and reliable conclusions this requirement must be satisfied. In the not too distant past, it was claimed that violations of normality would not likely jeopardize scientific findings (See Hsu & Feldt, 1969; Lunney, 1970). Recent revelations suggest otherwise (See e.g., Micceri, 1989; Keselman, Huberty, Lix et al., 1998; Erceg-Hurn, Wilcox, & Keselman, 2013; Wilcox and Keselman, 2003; Wilcox, 2012a, b). Unfortunately the data obtained in psychological investigations rarely, if ever, meet the requirement of normally distributed data (Micceri, …


Front Matter, Jmasm Editors Nov 2013

Journal of Modern Applied Statistical Methods

No abstract provided.


The Impact Of Continuity Violation On Anova And Alternative Methods, Björn Lantz Nov 2013

Journal of Modern Applied Statistical Methods

The normality assumption behind ANOVA and other parametric methods implies that response variables are measured on continuous scales. A simulation approach is used to explore the impact of continuity violation on the performance of statistical methods commonly used by applied researchers to compare locations across several groups.


Variables Sampling Plan For Correlated Data, J. R. Singh, R. Sankle, M. Ahmad Khanday Nov 2013

Journal of Modern Applied Statistical Methods

The sampling plan for the mean for correlated data is studied. The Operating Characteristic (OC) of the variables sampling plan for the mean for correlated data is calculated and compared with the OC of the known σ case.


Intrinsically Ties Adjusted Non-Parametric Method For The Analysis Of Two Sampled Data, G. U. Ebuh, I. C. A Oyeka Nov 2013

Journal of Modern Applied Statistical Methods

A non-parametric method for the analysis of two sample data is proposed that intrinsically and structurally adjusts the test statistic for the possible presence of tied observations between the sampled populations, thereby obviating the need to require the populations to be continuous. The populations may be measurements on as low as the ordinal scale, and need not be homogeneous. In cases where the null hypotheses are rejected, the test statistic enables the determination of which of the sampled populations is likely to be responsible for the rejection (a determination which the Wilcoxon Mann Whitney test cannot handle). The proposed method …


Case-Control Studies With Jointly Misclassified Exposure And Confounding Variables, Tze-San Lee Nov 2013

Journal of Modern Applied Statistical Methods

The issue of 2 × 2 × 2 case-control studies is addressed when both exposure and confounding variables are jointly misclassified. Two scenarios are considered: the classification errors of exposure and confounding variables are independent or not independent. The bias-adjusted cell probability estimates which account for the misclassification bias are presented. The effect of misclassification on the measure of crude odds ratio either unstratified or stratified by the confounder, Mantel-Haenszel summary odds ratio, the confounding component in the crude odds ratio, the first and second order multiplicative interaction are assessed through the sensitivity analysis from using the data on the …


How Good Is Best? Multivariate Case Of Ehrenberg-Weisberg Analysis Of Residual Errors In Competing Regressions, Stan Lipovetsky Nov 2013

Journal of Modern Applied Statistical Methods

A.S.C. Ehrenberg first noticed and S. Weisberg then formalized a property of pairwise regression to keep its quality almost at the same level of precision while the coefficients of the model could vary over a wide span of values. This paper generalizes the estimates of the percent change in the residual standard deviation to the case of competing multiple regressions. It shows that in contrast to the simple pairwise model, the coefficients of multiple regression can be changed over a wider range of the values including the opposite by signs coefficients. Consideration of these features facilitates better understanding the properties …


Constructing Confidence Intervals For Effect Sizes In Anova Designs, Li-Ting Chen, Chao-Ying Joanne Peng Nov 2013

Journal of Modern Applied Statistical Methods

A confidence interval for effect sizes provides a range of plausible population effect sizes (ES) that are consistent with data. This article defines an ES as a standardized linear contrast of means. The noncentral method, Bonett’s method, and the bias-corrected and accelerated bootstrap method are illustrated for constructing the confidence interval for such an effect size. Results obtained from the three methods are discussed and interpretations of results are offered.


A Monte Carlo Comparison Of Robust Manova Test Statistics, Holmes Finch, Brian French Nov 2013

Journal of Modern Applied Statistical Methods

Multivariate Analysis of Variance (MANOVA) is a popular statistical tool in the social sciences, allowing for the comparison of mean vectors across groups. MANOVA rests on three primary assumptions regarding the population: (a) multivariate normality, (b) equality of group population covariance matrices and (c) independence of errors. When these assumptions are violated, MANOVA does not perform well with respect to Type I error and power. There are several alternative test statistics that can be considered including robust statistics and the use of the structural equation modeling (SEM) framework. This simulation study focused on comparing the performance of the P test …


Test For Intraclass Correlation Coefficient Under Unequal Family Sizes, Madhusudan Bhandary, Koji Fujiwara Nov 2013

Journal of Modern Applied Statistical Methods

Three tests are proposed based on F-distribution, Likelihood Ratio Test (LRT) and large sample Z-test for intraclass correlation coefficient under unequal family sizes based on a single multinormal sample. It has been found that the test based on F-distribution consistently and reliably produces results superior to those of Likelihood Ratio Test (LRT) and large sample Z-test in terms of size for various combinations of intraclass correlation coefficient values. The power of this test based on F-distribution is competitive with the power of the LRT and the power of Z-test is slightly better than the powers of F-test and LRT when …


Generalized Modified Ratio Estimator For Estimation Of Finite Population Mean, Jambulingam Subramani Nov 2013

Journal of Modern Applied Statistical Methods

A generalized modified ratio estimator is proposed for estimating the population mean using the known population parameters. It is shown that the simple random sampling without replacement sample mean, the usual ratio estimator, the linear regression estimator and all the existing modified ratio estimators are particular cases of the proposed estimator. The bias and the mean squared error of the proposed estimator are derived and compared with those of existing estimators. The conditions under which the proposed estimator performs better than the existing estimators are also derived. The performance of the proposed estimator is assessed with that of …


Discriminating Between Generalized Exponential Distribution And Some Life Test Models Based On Population Quantiles, B. Srinivasa Rao, R. R. L Kantam Nov 2013

Journal of Modern Applied Statistical Methods

A test statistic based on population quantiles using sample order statistics is suggested. The quantiles of the test statistic are evaluated for the generalized exponential distribution. A similar test statistic based on moments of sample order statistics is referenced, and the proposed test formula is compared with it. For the pairs of models considered, it is established that the proposed test formula is more effective and useful than the formula based on the moments of order statistics developed by Sultan (2007).


Comparison Of Parameters Of Lognormal Distribution Based On The Classical And Posterior Estimates, Raja Sultan, S. P. Ahmad Nov 2013

Journal of Modern Applied Statistical Methods

The lognormal distribution is widely used in scientific fields such as agriculture, entomology, and biology. If a variable can be thought of as the multiplicative product of some positive independent random variables, then it could be modelled as lognormal. In this study, maximum likelihood estimates and posterior estimates of the parameters of the lognormal distribution are obtained, and these estimates are used to calculate point estimates of the mean and variance for making comparisons.
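
As a rough illustration of the classical (maximum likelihood) side of this comparison, the sketch below computes the ML estimates of the log-scale parameters and the plug-in point estimates of the lognormal mean and variance. The function name `lognormal_mle` is hypothetical, and the abstract does not show the authors' own computations or their posterior estimates, so this should be read only as a textbook example under those assumptions.

```python
import math


def lognormal_mle(sample):
    """Return (mu_hat, sigma2_hat, mean_hat, var_hat) for a lognormal sample.

    Hypothetical helper for illustration; every observation must be positive.
    """
    logs = [math.log(x) for x in sample]
    n = len(logs)
    # ML estimates of the mean and variance of log(X); the ML variance divides by n.
    mu_hat = sum(logs) / n
    sigma2_hat = sum((v - mu_hat) ** 2 for v in logs) / n
    # Plug-in estimates of E[X] and Var[X] for a lognormal variable.
    mean_hat = math.exp(mu_hat + sigma2_hat / 2)
    var_hat = (math.exp(sigma2_hat) - 1) * math.exp(2 * mu_hat + sigma2_hat)
    return mu_hat, sigma2_hat, mean_hat, var_hat
```

Calling `lognormal_mle([1.0, math.exp(2.0)])` gives log-scale estimates of about 1 for both the mean and the variance, which makes the plug-in formulas easy to check by hand.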


On Bayesian Estimation And Predictions For Two-Component Mixture Of The Gompertz Distribution, Navid Feroze, Muhammad Aslam Nov 2013

Journal of Modern Applied Statistical Methods

Mixture models have received sizeable attention from analysts in recent years. Some work on Bayesian estimation of the parameters of mixture models has appeared. However, these studies were restricted to Bayes point estimation. The methodology for the Bayesian interval estimation of the parameters for said models is still to be explored. This paper proposes the posterior interval estimation (along with point estimation) for the parameters of a two-component mixture of the Gompertz distribution. The posterior predictive intervals are also derived and evaluated. Different informative and non-informative priors are assumed under a couple of loss functions for the posterior analysis. …


A Comparison Between Biased And Unbiased Estimators In Ordinary Least Squares Regression, Ghadban Khalaf Nov 2013

Journal of Modern Applied Statistical Methods

During the past years, different kinds of estimators have been proposed as alternatives to the Ordinary Least Squares (OLS) estimator for the estimation of the regression coefficients in the presence of multicollinearity. In the general linear regression model, Y = Xβ + e, it is known that multicollinearity makes statistical inference difficult and may even seriously distort the inference. Ridge regression, as viewed here, defines a class of estimators of β indexed by a scalar parameter k. Two methods of specifying k are proposed and evaluated in terms of Mean Square Error (MSE) by …
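
For orientation, the class of estimators indexed by k can be written in the standard textbook form of the ridge estimator. This is an assumption about notation: the abstract does not show the paper's exact expression or its two proposed rules for choosing k.

```latex
\hat{\beta}(k) = (X^{\top}X + kI)^{-1} X^{\top} Y, \qquad k \ge 0,
\qquad
\hat{\beta}(0) = (X^{\top}X)^{-1} X^{\top} Y = \hat{\beta}_{\mathrm{OLS}}.
```

Setting k = 0 recovers the unbiased OLS estimator, while k > 0 introduces bias in exchange for a smaller variance when the columns of X are nearly collinear.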


Parameter Estimations Based On Kumaraswamy Progressive Type Ii Censored Data With Random Removals, Navid Feroze, Ibrahim El-Batal Nov 2013

Journal of Modern Applied Statistical Methods

The estimation of two parameters of the Kumaraswamy distribution is considered under Type II progressive censoring with random removals, where the number of units removed at each failure time has a binomial distribution. The MLE was used to obtain the estimators of the unknown parameters, and the asymptotic variance-covariance matrix was also obtained. The formula to compute the expected test time was derived. A numerical study was carried out for different combinations of model parameters. Different censoring schemes were used for the estimation, and performance of these schemes was compared.


The Single-Case Data Analysis Package: Analysing Single-Case Experiments With R Software, Isis Bulté, Patrick Onghena Nov 2013

Journal of Modern Applied Statistical Methods

The RcmdrPlugin.SCDA plug-in package is discussed. It integrates three R packages in the R commander interface: SCVA (for Single-Case Visual Analysis), SCRT (for Single-Case Randomization Tests), and SCMA (for Single-Case Meta-Analysis). This way the plug-in package covers three important steps in the analysis of single-case data.


Innovationspotenzialanalyse Für Die Neuen Technologien Für Das Verwalten Und Analysieren Von Großen Datenmengen (Big Data Management), Volker Markl, Alexander Löser, Thomas Hoeren, Helmut Krcmar, Holmer Hemsen, Michael Schermann, Matthias Gottlieb, Christoph Buchmüller, Philip Uecker, Till Bitter Nov 2013

Faculty Book Gallery

The digitalization of business and society is driving rapid growth in data holdings. In almost all areas of business and science, vast amounts of data are already being generated whose size, capture speed, or heterogeneity exceeds the capabilities of common database software products for management and analysis. This phenomenon, popularized under the label “Big Data”, represents a major opportunity for business, science, and society. However, the new complexity of the data and analyses gives rise to a multitude of technical, economic, and legal challenges. This study analyzes the opportunities and challenges of Big Data, particularly with regard to sustainable competitiveness …


Robust Regression Estimators When There Are Tied Values, Rand R. Wilcox, Florence Clark Nov 2013

Journal of Modern Applied Statistical Methods

It is well known that when using the ordinary least squares regression estimator, outliers among the dependent variable can result in relatively poor power. Many robust regression estimators have been derived that address this problem, but the bulk of the results assume that the dependent variable is continuous. It is demonstrated that when there are tied values, several robust regression estimators can perform poorly in terms of controlling the Type I error probability, even with a large sample size. The presence of tied values does not necessarily mean that they perform poorly, but there is the issue of whether there …


A Generalized Class Of Estimators For Finite Population Variance In Presence Of Measurement Errors, Prayas Sharma, Rajesh Singh Nov 2013

Journal of Modern Applied Statistical Methods

The problem of estimating the population variance is presented using auxiliary information in the presence of measurement errors. The estimators in this article use auxiliary information to improve efficiency and assume that measurement error is present in both the study and auxiliary variables. A numerical study is carried out to compare the performance of the proposed estimator with other estimators and the variance per unit estimator in the presence of measurement errors.


Comparison Of Three Calculation Methods For A Bayesian Inference Of P(Π1 > Π2), Yohei Kawasaki, Asanao Shimokawa, Etsuo Miyaoka Nov 2013

Journal of Modern Applied Statistical Methods

In Bayesian inference, some researchers have examined the difference of binomial proportions using θ = P(π1 > π2 − Δ0|X1,X2), where Xi denotes a binomial random variable with parameter πi. An approximate method and the MCMC method are compared with an exact method for θ, and results of actual clinical trials using θ are presented.
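
To make the quantity θ concrete, the sketch below estimates it by plain Monte Carlo sampling from independent Beta posteriors. It assumes uniform Beta(1, 1) priors on π1 and π2, which the abstract does not state, and it is not any of the three methods the article compares (approximate, MCMC, exact). The function name and its parameters are hypothetical.

```python
import random


def posterior_prob_greater(x1, n1, x2, n2, delta0=0.0, draws=100_000, seed=0):
    """Monte Carlo estimate of theta = P(pi1 > pi2 - delta0 | x1, x2).

    Assumes independent uniform Beta(1, 1) priors, so each posterior is
    Beta(x + 1, n - x + 1) for x successes in n trials.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        p1 = rng.betavariate(x1 + 1, n1 - x1 + 1)
        p2 = rng.betavariate(x2 + 1, n2 - x2 + 1)
        if p1 > p2 - delta0:
            hits += 1
    return hits / draws
```

With identical data in both arms and `delta0=0`, the estimate should sit near 0.5, and it moves toward 1 as the first arm's success rate pulls ahead of the second.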


Testing The Assumption Of Non-Differential Misclassification In Case-Control Studies, Tze-San Lee, Qin Hui Nov 2013

Journal of Modern Applied Statistical Methods

One of the not yet solved issues regarding the misclassification in case-control studies is whether the misclassification rates are the same for both cases and controls. Currently, a common practice is to assume that the rates are the same, that is, the non-differential misclassification assumption. However, it has been suspected that this assumption may not be valid in practical applications. Unfortunately, no test is available so far to test the validity of the non-differential misclassification assumption. A method is presented to test the validity of non-differential misclassification assumption in case-control studies with 2 × 2 tables when validation data are …


Akaike Information Criterion To Select The Parametric Detection Function For Kernel Estimator Using Line Transect Data, Omar Eidous, Samar Al-Salman Nov 2013

Journal of Modern Applied Statistical Methods

The Akaike Information Criterion (AIC) is suggested for selecting, among different candidate parametric detection functions, the one most appropriate for fitting line transect data. Four different detection functions are considered in this paper. Two of them are taken to satisfy the shoulder condition assumption and the other two do not satisfy this condition. Once the appropriate detection function is determined, it can also be used to select the smoothing parameter of the nonparametric kernel estimator. For a wide range of target densities, simulation results show the reasonable and good performances of the …


Bayesian Joinpoint Regression Model For Childhood Brain Cancer Mortality, Ram C. Kafle, Netra Khanal, Chris P. Tsokos Nov 2013

Journal of Modern Applied Statistical Methods

The Bayesian approach to joinpoint regression is widely used to analyze trends in cancer mortality, incidence and survival data. The Bayesian joinpoint regression model was used to study the childhood brain cancer mortality rate and its average percentage change (APC) per year. Annual observed mortality counts of children ages 0-19 from 1969-2009, obtained from the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute (NCI), were analyzed. It was assumed that death counts are probabilistically characterized by the Poisson distribution, and they were modeled using a log link function. Results were compared with the mortality trend obtained using joinpoint software …


Ordered Logit Regression Modeling Of The Self-Rated Health In Hawai‘I, With Comparisons To The Ols Model, Hosik Min Nov 2013

Journal of Modern Applied Statistical Methods

Despite the ordinal nature of the Self-Rated Health (SRH) variable, logistic regression models or regression models have been used without adequate justification for these applications. It is shown that the ordered-logit regression model is the appropriate statistical strategy to estimate SRH, whereas the Ordinary Least Squares model leads to biased conclusions.