Articles 1 - 6 of 6
Variable Selection For 1d Regression Models, David J. Olive, Douglas M. Hawkins
Articles and Preprints
Variable selection, the search for j relevant predictor variables from a group of p candidates, is a standard problem in regression analysis. The class of 1D regression models is a broad class that includes generalized linear models. We show that existing variable selection algorithms, originally meant for multiple linear regression and based on ordinary least squares and Mallows’ Cp, can also be used for 1D models. Graphical aids for variable selection are also provided.
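The OLS-plus-Cp idea described above can be sketched for the all-subsets case. This is a generic illustration of Mallows' Cp over candidate predictor subsets, not the authors' software; the function name, the toy data, and the use of the full model's residual variance are all illustrative assumptions:

```python
import itertools
import numpy as np

def mallows_cp(X, y):
    """Return {subset: Cp} for every nonempty subset of predictor columns.

    Cp = SSE_k / sigma2_full - n + 2 * (k + 1), where sigma2_full is the
    residual variance estimate from the full OLS model (intercept included).
    """
    n, p = X.shape
    ones = np.ones((n, 1))

    def sse(cols):
        # OLS fit on the chosen columns plus an intercept
        Z = np.hstack([ones, X[:, cols]])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r = y - Z @ beta
        return float(r @ r)

    sigma2_full = sse(tuple(range(p))) / (n - p - 1)
    return {cols: sse(cols) / sigma2_full - n + 2 * (len(cols) + 1)
            for k in range(1, p + 1)
            for cols in itertools.combinations(range(p), k)}

# Toy data: only the first two of four candidate predictors matter.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
y = 2 * X[:, 0] - X[:, 1] + 0.5 * rng.standard_normal(100)
cp = mallows_cp(X, y)
best = min(cp, key=cp.get)  # subset with the smallest Cp
print(best)
```

For a subset containing all relevant predictors, Cp is roughly k + 1; dropping a relevant predictor inflates SSE and hence Cp, which is why the minimizer here includes both true predictors.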
Robust Regression With High Coverage, David J. Olive, Douglas M. Hawkins
Articles and Preprints
An important parameter for several high breakdown regression algorithm estimators is the number of cases given weight one, called the coverage of the estimator. Increasing the coverage is believed to result in a more stable estimator, but the price paid for this stability is greatly decreased resistance to outliers. A simple modification of the algorithm can greatly increase the coverage, and hence the estimator's statistical performance, while maintaining high outlier resistance.
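One common way to realize the coverage idea is a reweighting step: starting from an initial robust fit, give weight one to the cases with small residuals and refit by OLS on those cases. This is a minimal sketch of that generic scheme, assuming a MAD-based cutoff; it is not the authors' exact modification, and the function name and cutoff constant are illustrative:

```python
import numpy as np

def reweighted_fit(X, y, beta_init, c=2.5):
    """Refit OLS on the weight-one cases of an initial robust fit.

    A case gets weight one when its absolute residual under beta_init is
    at most c times a MAD estimate of the residual scale; the number of
    weight-one cases is the coverage of the resulting estimator.
    """
    Z = np.hstack([np.ones((len(y), 1)), X])
    r = y - Z @ beta_init
    scale = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD scale
    w = np.abs(r) <= c * scale          # weight-one cases
    coverage = int(w.sum())
    beta, *_ = np.linalg.lstsq(Z[w], y[w], rcond=None)
    return beta, coverage

# Toy data: 50 clean cases from y = 1 + 2x plus 5 gross outliers.
rng = np.random.default_rng(1)
x = rng.standard_normal(50)
y = 1 + 2 * x + 0.1 * rng.standard_normal(50)
x = np.concatenate([x, np.zeros(5)])
y = np.concatenate([y, np.full(5, 100.0)])
beta, coverage = reweighted_fit(x[:, None], y, np.array([1.0, 2.0]))
print(coverage)
```

With a reasonable initial fit, the outliers receive weight zero, so the coverage stays close to the number of clean cases and the refit retains high outlier resistance.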
Inconsistency Of Resampling Algorithms For High Breakdown Regression Estimators And A New Algorithm, Douglas M. Hawkins, David J. Olive
Articles and Preprints
Since high breakdown estimators are impractical to compute exactly in large samples, approximate algorithms are used. Such an algorithm generally produces an estimator with a lower consistency rate and breakdown value than the exact theoretical estimator. This discrepancy grows with the sample size, with the implication that huge computations are needed for good approximations in large, high-dimensional samples.
The workhorse for HBE has been the ‘elemental set’, or ‘basic resampling’ algorithm. This turns out to be completely ineffective in high dimensions with high levels of contamination. However, enriching it with a “concentration” step turns it into a method that is able …
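The elemental-set-plus-concentration scheme can be sketched for the least trimmed squares criterion. This is a minimal illustration of the general idea (random exact fits to p + 1 cases, each refined by concentration steps), assuming illustrative counts of starts and steps; it is not the authors' algorithm:

```python
import numpy as np

def lts_concentration(X, y, coverage, n_starts=50, n_csteps=10, seed=0):
    """Approximate least trimmed squares via elemental starts + C-steps.

    coverage: the number h of cases the fit must cover; the criterion is
    the sum of the h smallest squared residuals.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Z = np.hstack([np.ones((n, 1)), X])
    best_beta, best_crit = None, np.inf
    for _ in range(n_starts):
        # Elemental start: exact fit to p + 1 randomly chosen cases.
        idx = rng.choice(n, size=p + 1, replace=False)
        beta, *_ = np.linalg.lstsq(Z[idx], y[idx], rcond=None)
        for _ in range(n_csteps):
            # Concentration step: refit OLS on the `coverage` cases with
            # the smallest squared residuals under the current fit.
            r2 = (y - Z @ beta) ** 2
            keep = np.argsort(r2)[:coverage]
            beta, *_ = np.linalg.lstsq(Z[keep], y[keep], rcond=None)
        crit = np.sort((y - Z @ beta) ** 2)[:coverage].sum()
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta, best_crit

# Toy data: y = 1 + 2x with 20% badly placed outliers.
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = 1 + 2 * x + 0.1 * rng.standard_normal(100)
y[:20] = 20.0
beta, crit = lts_concentration(x[:, None], y, coverage=75)
print(beta)
```

Each concentration step can only decrease the LTS criterion, which is why a handful of steps turns even a poor elemental start into a competitive candidate fit.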
High Breakdown Analogs Of The Trimmed Mean, David J. Olive
Articles and Preprints
Two high breakdown estimators that are asymptotically equivalent to a sequence of trimmed means are introduced. They are easy to compute and their asymptotic variance is easier to estimate than the asymptotic variance of standard high breakdown estimators.
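For reference, this is the classical symmetrically trimmed mean that the estimators above are asymptotically equivalent to; the high breakdown analogs themselves involve a data-driven trimming stage not shown here, and the function name is an assumption:

```python
import numpy as np

def trimmed_mean(x, prop):
    """Mean after discarding the lowest and highest `prop` fraction."""
    x = np.sort(np.asarray(x, dtype=float))
    k = int(prop * len(x))           # number trimmed from each tail
    return x[k:len(x) - k].mean()

data = np.concatenate([np.arange(1, 10), [1000.0]])  # one gross outlier
print(trimmed_mean(data, 0.10))  # → 5.5 (the outlier is trimmed away)
```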
Applications And Algorithms For Least Trimmed Sum Of Absolute Deviations Regression, Douglas M. Hawkins, David Olive
Articles and Preprints
High breakdown estimation (HBE) addresses the problem of getting reliable parameter estimates in the face of outliers that may be numerous and badly placed. In multiple regression, the standard HBEs have been those defined by the least median of squares (LMS) and the least trimmed squares (LTS) criteria. Both criteria lead to a partitioning of the data set's n cases into two “halves” – the covered “half” of cases is accommodated by the fit, while the uncovered “half”, which is intended to include any outliers, is ignored. In LMS, the criterion is the Chebyshev norm of the residuals of the …
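The two coverage-based criteria named above can be written down directly in terms of a residual vector and a coverage h. This is a generic illustration of the standard definitions, with assumed function names:

```python
import numpy as np

def lms_criterion(r, h):
    """LMS criterion: the h-th smallest squared residual, i.e. the squared
    Chebyshev norm of the covered half's residuals."""
    return np.sort(np.asarray(r) ** 2)[h - 1]

def lts_criterion(r, h):
    """LTS criterion: the sum of the h smallest squared residuals; the
    n - h largest (the uncovered half) are ignored."""
    return np.sort(np.asarray(r) ** 2)[:h].sum()

r = [0.1, -0.2, 0.3, 5.0]   # one gross residual, ignored with h = 3
print(lms_criterion(r, 3))
print(lts_criterion(r, 3))
```

Both criteria depend only on the h smallest squared residuals, which is what makes a single badly placed case unable to dominate the fit.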
Improved Feasible Solution Algorithms For High Breakdown Estimation, Douglas M. Hawkins, David J. Olive
Articles and Preprints
High breakdown estimation allows one to get reasonable estimates of the parameters from a sample of data even if that sample is contaminated by large numbers of awkwardly placed outliers. Two particular application areas in which this is of interest are multiple linear regression, and estimation of the location vector and scatter matrix of multivariate data. Standard high breakdown criteria for the regression problem are the least median of squares (LMS) and least trimmed squares (LTS); those for the multivariate location/scatter problem are the minimum volume ellipsoid (MVE) and minimum covariance determinant (MCD). All of these present daunting computational problems. …
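To make one of these criteria concrete, the MCD objective for a given covered subset is just the determinant of that subset's sample covariance matrix; the estimator searches for the h-case subset minimizing it, a search not shown here. The function name below is an assumption:

```python
import numpy as np

def mcd_criterion(X, subset):
    """MCD objective: determinant of the sample covariance matrix of the
    rows of X indexed by `subset`."""
    return float(np.linalg.det(np.cov(X[subset], rowvar=False)))

# A tight clean cluster plus one far-away point.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [100, 100]],
             dtype=float)
clean = [0, 1, 2, 3, 4]
contaminated = [1, 2, 3, 4, 5]
print(mcd_criterion(X, clean) < mcd_criterion(X, contaminated))  # → True
```

Swapping the outlier into the covered subset inflates the covariance determinant by orders of magnitude, so the minimizing subset excludes it; the combinatorial search over subsets is what makes MCD (like LMS, LTS, and MVE) computationally daunting.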