Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 1 of 1
Full-Text Articles in Applied Statistics
On Variable Selections In High-Dimensional Incomplete Data, Tao Sun
On Variable Selections In High-Dimensional Incomplete Data, Tao Sun
Major Papers
Modern Statistics has entered the era of Big Data, wherein data sets are too large, high-dimensional, incomplete and complex for most classical statistical methods. This analysis of Big data firstly focuses on missing data. We compare different multiple imputation methods. Combining the characteristics of medical high-throughput experiments, we compared multivariate imputation by chained equations (MICE), missing forest (missForest), as well as self-training selection (STS) methods. A phenotypic data set of common lung disease was assessed. Moreover, in terms of improving the interpretability and predictability of the model, variable selection plays a pivotal role in the following analysis. Taking the Lasso-Poisson …