Abstract

This study considers high-dimensional errors-in-variables regression for identifying a small number of important, interpretable factors from corrupted data in applications where measurement errors or missing data cannot be ignored. Motivated by the convex conditioned Lasso (CoCoLasso) and the advantage of a zero-norm regularized least squares (LS) estimator over the Lasso for clean data, we propose a calibrated zero-norm regularized LS (CaZnRLS) estimator. The estimator is built on a calibrated least squares loss constructed from the positive-definite projection of an unbiased surrogate for the covariance matrix of the covariates, and is computed with a multi-stage convex relaxation approach. Under a restricted strong convexity condition on the true covariate matrix, we derive an ℓ2-error bound for the estimator at each iteration, show that the sequence of error bounds is decreasing, and establish the sign consistency of the iterates after finitely many steps. Statistical guarantees are also provided for the CaZnRLS estimator under two types of measurement errors. Numerical comparisons with CoCoLasso and the nonconvex Lasso show that CaZnRLS achieves a smaller relative RMSE and correctly identifies more of the true predictors.
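As an illustrative sketch of the calibrated loss (the notation below is ours, not the paper's): with observed covariates Z = X + A under additive measurement error with error covariance Σ_A, an unbiased surrogate of the covariate covariance and the resulting calibrated LS loss can be written as

\[
\hat{\Sigma} = \frac{1}{n} Z^{\top} Z - \Sigma_A,\qquad
\tilde{\Sigma} = \operatorname*{arg\,min}_{\Sigma \succeq 0} \|\Sigma - \hat{\Sigma}\|,\qquad
L(\beta) = \tfrac{1}{2}\,\beta^{\top} \tilde{\Sigma}\,\beta - \frac{1}{n}\, y^{\top} Z \beta,
\]

where the choice of projection norm and any adjustment ensuring strict positive definiteness follow the paper's construction; the zero-norm penalty is then added to L(β) and the resulting nonconvex problem is handled by the multi-stage convex relaxation scheme.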