Performance Comparison Of Imputation Methods For Mixed Data Missing At Random With Small And Large Sample Data Set With Different Variability, Kyei Afari
Electronic Theses and Dissertations
One of the concerns in the field of statistics is the presence of missing data, which leads to bias in parameter estimation and inaccurate results. However, the multiple imputation procedure is a remedy for handling missing data. This study looked at the best multiple imputation methods used to handle mixed variable datasets with different sample sizes and variability along with different levels of missingness. The study employed the predictive mean matching, classification and regression trees, and the random forest imputation methods. For each dataset, the multiple regression parameter estimates for the complete datasets were compared to the multiple regression parameter …
Performance Comparison Of Multiple Imputation Methods For Quantitative Variables For Small And Large Data With Differing Variability, Vincent Onyame
Electronic Theses and Dissertations
Missing data continues to be one of the main problems in data analysis, as it reduces sample representativeness and consequently causes biased estimates. Multiple imputation has been established as an effective way of handling missing data. In this study, we examined multiple imputation methods for quantitative variables on twelve data sets of varied sizes and variability that were pseudo-generated from an original data set. The multiple imputation methods examined are predictive mean matching, Bayesian linear regression, and non-Bayesian linear regression, as implemented in the MICE (Multivariate Imputation by Chained Equations) package in the statistical software R. The parameter estimates generated from …
Investigation Of Multiple Imputation Methods For Categorical Variables, Samantha Miranda
Electronic Theses and Dissertations
We compare different multiple imputation methods for categorical variables using the MICE package in R. We take a complete data set, remove values to create different levels of missingness, and evaluate the imputation methods at each level. Logistic regression imputation and linear discriminant analysis (LDA) are used for binary variables. Multinomial logit imputation and LDA are used for nominal variables, while ordered logit imputation and LDA are used for ordinal variables. After imputation, the regression coefficients, percent deviation index (PDI) values, and relative frequency tables were found for each imputed data set for each level of missingness and compared to …
Comparison Of Imputation Methods For Mixed Data Missing At Random, Kaitlyn Heidt
Electronic Theses and Dissertations
A statistician's job is to produce statistical models. When these models are precise and unbiased, we can apply them to new data appropriately. However, when data sets have missing values, the assumptions of statistical methods are violated, which produces biased results. The statistician's objective is to implement methods that produce unbiased and accurate results. Research in missing data is becoming popular as modern methods that produce unbiased and accurate results are emerging, such as MICE in R, a statistical software package. Using real data, we compare four common imputation methods in the MICE package in R at different levels of missingness. The …
Performance Comparison Of Imputation Algorithms On Missing At Random Data, Evans Dapaa Addo
Electronic Theses and Dissertations
Missing data continues to be an issue not only in the field of statistics but in any field that deals with data. This is because almost all widely accepted standard statistical software and methods assume complete data for all the variables included in the analysis. As a result, in most studies, statistical power is weakened and parameter estimates are biased, leading to weak conclusions and generalizations.
Many studies have established that multiple imputation methods are effective ways of handling missing data. This paper examines three different imputation methods (predictive mean matching, Bayesian linear regression and …
Using The EM Algorithm To Estimate The Difference In Dependent Proportions In A 2 X 2 Table With Missing Data, Alain Duclaux Talla Souop
Electronic Theses and Dissertations
In this thesis, I am interested in estimating the difference between dependent proportions from a 2 × 2 contingency table when there are missing data. The Expectation-Maximization (EM) algorithm is used to obtain an estimate for the difference between correlated proportions. To obtain the standard error of this difference I employ a resampling technique known as bootstrapping. The performance of the bootstrap standard error is evaluated for different sample sizes and different fractions of missing information. Finally, a 100(1-α)% bootstrap confidence interval is proposed and its coverage is evaluated through simulation.
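As a rough illustration of the approach this abstract describes, the following Python sketch runs EM on a 2 × 2 table of paired binary outcomes where some subjects have only the row variable observed and others only the column variable, then bootstraps the standard error of the estimated difference. It is not the thesis's implementation. The function names (`em_2x2`, `difference_in_proportions`, `bootstrap_se`), the data layout, the example counts, and the multinomial resampling scheme are all assumptions made for illustration, and the sketch assumes the data are missing at random with no empty margins.

```python
import numpy as np

def em_2x2(full, row_only, col_only, tol=1e-10, max_iter=1000):
    """EM estimate of the four cell probabilities of a 2 x 2 table.

    full     : 2 x 2 counts with both variables observed
    row_only : length-2 counts with only the row variable observed
    col_only : length-2 counts with only the column variable observed
    (Hypothetical layout; assumes every margin has positive probability.)
    """
    full = np.asarray(full, dtype=float)
    row_only = np.asarray(row_only, dtype=float)
    col_only = np.asarray(col_only, dtype=float)
    total = full.sum() + row_only.sum() + col_only.sum()
    p = np.full((2, 2), 0.25)  # start from a uniform table
    for _ in range(max_iter):
        row_marg = p.sum(axis=1, keepdims=True)  # p_{i+}
        col_marg = p.sum(axis=0, keepdims=True)  # p_{+j}
        # E-step: allocate partially observed counts to cells in
        # proportion to the current conditional cell probabilities.
        expected = (full
                    + row_only[:, None] * p / row_marg
                    + col_only[None, :] * p / col_marg)
        # M-step: multinomial MLE from the expected complete counts.
        p_new = expected / total
        converged = np.abs(p_new - p).max() < tol
        p = p_new
        if converged:
            break
    return p

def difference_in_proportions(p):
    """Difference between the dependent proportions p_{1+} and p_{+1}."""
    return float(p[0, :].sum() - p[:, 0].sum())

def bootstrap_se(full, row_only, col_only, n_boot=500, seed=0):
    """Bootstrap standard error of the EM-based difference.

    Resamples subjects from the eight observed categories (an assumed
    nonparametric scheme) and re-runs EM on each resample.
    """
    rng = np.random.default_rng(seed)
    counts = np.concatenate(
        [np.ravel(full), np.ravel(row_only), np.ravel(col_only)]
    ).astype(float)
    n = int(counts.sum())
    diffs = []
    for _ in range(n_boot):
        draw = rng.multinomial(n, counts / n)
        p = em_2x2(draw[:4].reshape(2, 2), draw[4:6], draw[6:8])
        diffs.append(difference_in_proportions(p))
    return float(np.std(diffs, ddof=1))

# Made-up example counts, for illustration only.
full = [[40, 10], [20, 30]]
row_only = [8, 12]
col_only = [10, 5]
p_hat = em_2x2(full, row_only, col_only)
diff_hat = difference_in_proportions(p_hat)
se_hat = bootstrap_se(full, row_only, col_only, n_boot=200)
```

With `diff_hat` and `se_hat` in hand, a percentile interval could be read off the sorted bootstrap differences, which is one common way to form a 100(1-α)% bootstrap confidence interval; whether the thesis uses that form is not stated in the abstract.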