In Defense of the Indefensible: A Very Naïve Approach to High-Dimensional Inference
Sen Zhao, Daniela Witten, Ali Shojaie
Statist. Sci. 36(4): 562-577 (November 2021). DOI: 10.1214/20-STS815

Abstract

A great deal of interest has recently focused on conducting inference on the parameters in a high-dimensional linear model. In this paper, we consider a simple and very naïve two-step procedure for this task, in which we (i) fit a lasso model in order to obtain a subset of the variables, and (ii) fit a least squares model on the lasso-selected set. Conventional statistical wisdom tells us that we cannot make use of the standard statistical inference tools for the resulting least squares model (such as confidence intervals and p-values), since we peeked at the data twice: once in running the lasso, and again in fitting the least squares model. However, in this paper, we show that under a certain set of assumptions, with high probability, the set of variables selected by the lasso is identical to the one selected by the noiseless lasso and is hence deterministic. Consequently, the naïve two-step approach can yield asymptotically valid inference. We utilize this finding to develop the naïve confidence interval, which can be used to draw inference on the regression coefficients of the model selected by the lasso, as well as the naïve score test, which can be used to test the hypotheses regarding the full-model regression coefficients.
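The two-step procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `lasso_cd` and `naive_intervals`, the coordinate-descent solver, the fixed penalty `lam`, and the normal-quantile intervals are all assumptions made for the example. It does not check the conditions under which the abstract says the approach is asymptotically valid, and it does not cover the naïve score test.

```python
import numpy as np
from statistics import NormalDist


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # x_j' x_j / n for each column
    resid = y.astype(float)             # residual y - X @ beta (beta = 0); a fresh copy
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]  # remove variable j's current contribution
            rho = X[:, j] @ resid / n
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]  # add the updated contribution back
    return beta


def naive_intervals(X, y, lam, level=0.95):
    """Step (i): lasso selection. Step (ii): least squares on the selected set.

    Assumes at least one variable is selected and fewer than n are selected.
    Returns the selected indices, the least squares coefficients, and the
    lower and upper ends of the usual (naive) confidence intervals.
    """
    selected = np.flatnonzero(lasso_cd(X, y, lam))
    Xs = X[:, selected]
    n, q = Xs.shape
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ coef
    sigma2 = resid @ resid / (n - q)    # usual residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xs.T @ Xs)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return selected, coef, coef - z * se, coef + z * se


# Simulated example: 3 true signals among 50 variables (illustrative values only).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_normal(n)

selected, coef, lower, upper = naive_intervals(X, y, lam=0.3)
for j, b, lo, hi in zip(selected, coef, lower, upper):
    print(f"variable {j}: estimate {b:.3f}, interval [{lo:.3f}, {hi:.3f}]")
```

The intervals here are exactly those a standard least squares fit would report if the selected set had been fixed in advance, which is the sense in which the approach is "naïve". In practice the penalty would be chosen more carefully than the hard-coded `lam=0.3` used in this sketch.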

Funding Statement

This work was partially supported by National Institutes of Health grants R01-HL141989 and R01-GM133848.

Acknowledgements

We thank the Editor, Associate Editor, and four anonymous reviewers for their incredibly insightful comments, which led to substantial improvements of the manuscript. We thank the authors of Javanmard and Montanari (2014a) and Ning and Liu (2017) for providing code for their proposals. We are grateful to Joshua Loftus, Jonathan Taylor, Robert Tibshirani and Ryan Tibshirani for helpful responses to our inquiries.

Citation

Sen Zhao, Daniela Witten, Ali Shojaie. "In Defense of the Indefensible: A Very Naïve Approach to High-Dimensional Inference." Statist. Sci. 36 (4) 562-577, November 2021. https://doi.org/10.1214/20-STS815

Information

Published: November 2021
First available in Project Euclid: 11 October 2021

MathSciNet: MR4323053
zbMATH: 07473936
Digital Object Identifier: 10.1214/20-STS815

Keywords: Confidence interval, Lasso, Post-selection inference, p-value, Significance testing

Rights: Copyright © 2021 Institute of Mathematical Statistics
