Kybernetika 54 no. 4, 778-797, 2018

The LASSO estimator: Distributional properties

Rakshith Jagannath and Neelesh S. Upadhye
DOI: 10.14736/kyb-2018-4-0778


The least absolute shrinkage and selection operator (LASSO) is a popular technique for simultaneous estimation and model selection. Many studies have examined the large-sample asymptotic distributional properties of the LASSO estimator, but it is well known that asymptotic results can give a misleading picture of the estimator's actual finite-sample behaviour. The finite-sample distribution of the LASSO estimator has previously been studied only for the special case of orthogonal models. The aim of this work is to generalize the finite-sample distributional properties of the LASSO estimator to a real, linear measurement model in Gaussian noise. We derive an expression for the finite-sample characteristic function of the LASSO estimator, and then use the Fourier slice theorem to obtain an approximate expression for the marginal probability density functions of the one-dimensional components of a linear transformation of the LASSO estimator.
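For the orthogonal special case mentioned in the abstract, the LASSO solution has a well-known closed form: componentwise soft-thresholding of the ordinary least-squares estimate. The following Python sketch is illustrative only (it is not taken from the paper, which treats the general linear Gaussian measurement model); the function names and the demo data are our own.

```python
import numpy as np

def soft_threshold(z, t):
    """Componentwise soft-thresholding: S_t(z) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_orthogonal(X, y, lam):
    """LASSO estimate argmin_b 0.5*||y - X b||^2 + lam*||b||_1 for an
    orthogonal design (X^T X = I): soft-threshold the OLS estimate X^T y."""
    beta_ols = X.T @ y
    return soft_threshold(beta_ols, lam)

# Demo: sparse signal observed through an identity (hence orthogonal)
# design in Gaussian noise; small coefficients are shrunk exactly to zero.
rng = np.random.default_rng(0)
beta_true = np.array([3.0, 0.0, -2.0, 0.0])
X = np.eye(4)
y = X @ beta_true + 0.1 * rng.standard_normal(4)
print(lasso_orthogonal(X, y, lam=0.5))
```

The exact zeros produced by the thresholding are what make the finite-sample distribution of the LASSO estimator non-standard: it places point mass at zero, which is why the asymptotic normal approximation can be misleading.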


characteristic function, linear regression, LASSO, finite sample probability distribution function, Fourier slice theorem, Cramér-Wold theorem


62E15, 62J05, 62G05, 60E05


  1. C. D. Austin, R. Moses, J. Ash and E. Ertin: On the relation between sparse reconstruction and parameter estimation with model order selection. IEEE J. Selected Topics Signal Process. 4 (2010), 3, 560-570.   DOI:10.1109/jstsp.2009.2038313
  2. S. Babacan, R. Molina and A. Katsaggelos: Bayesian compressive sensing using Laplace priors. IEEE Trans. Image Process. 19 (2010), 1, 53-63.   DOI:10.1109/tip.2009.2032894
  3. R. Baraniuk, E. Candès, R. Nowak and M. Vetterli: Compressive sampling. IEEE Signal Processing Magazine 25 (2008), 2, 12-13.   DOI:10.1109/msp.2008.915557
  4. A. Ben-Tal and A. S. Nemirovski: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. Society for Industrial and Applied Mathematics, Philadelphia 2001.   DOI:10.1137/1.9780898718829
  5. P. Boufounos, M. F. Duarte and R. G. Baraniuk: Sparse signal reconstruction from noisy compressive measurements using cross validation. In: IEEE/SP 14th Workshop on Statistical Signal Processing, 2007, pp. 299-303.   DOI:10.1109/ssp.2007.4301267
  6. E. Candes: The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique 346 (2008), 9-10, 589-592.   DOI:10.1016/j.crma.2008.03.014
  7. S. S. Chen, D. L. Donoho and M. A. Saunders: Atomic decomposition by basis pursuit. SIAM Rev. 43 (2001), 1, 129-159.   DOI:10.1137/s003614450037906x
  8. J. A. Cuesta-Albertos, R. Fraiman and T. Ransford: A sharp form of the Cramér-Wold theorem. J. Theoret. Probab. 20 (2007), 2, 201-209.   DOI:10.1007/s10959-007-0060-7
  9. D. Donoho: Compressed sensing. IEEE Trans. Inform. Theory 52 (2006), 4, 1289-1306.   DOI:10.1109/tit.2006.871582
  10. B. Efron, T. Hastie, I. Johnstone and R. Tibshirani: Least angle regression. Ann. Statist. 32 (2004), 407-499.   DOI:10.1214/009053604000000067
  11. Y. Eldar: Generalized SURE for exponential families: Applications to regularization. IEEE Trans. Signal Process. 57 (2009), 2, 471-481.   DOI:10.1109/tsp.2008.2008212
  12. J. Fan and R. Li: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 (2001), 456, 1348-1360.   DOI:10.1198/016214501753382273
  13. M. Grant and S. Boyd: CVX: Matlab software for disciplined convex programming, version 2.1.
  14. P. Kabaila: The effect of model selection on confidence regions and prediction regions. Econometr. Theory 11 (1995), 537-549.   DOI:10.1017/s0266466600009403
  15. S. M. Kay: Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice-Hall, Inc., Upper Saddle River, NJ 1993.
  16. K. Knight and W. Fu: Asymptotics for lasso-type estimators. Ann. Statist. 28 (2000), 5, 1356-1378.   DOI:10.1214/aos/1015957397
  17. H. Krim and M. Viberg: Two decades of array signal processing research: the parametric approach. IEEE Signal Processing Magazine 13 (1996), 4, 67-94.   DOI:10.1109/79.526899
  18. H. Leeb and B. M. Pötscher: Model selection and inference: Facts and fiction. Econometr. Theory 21 (2005), 21-59.   DOI:10.1017/s0266466605050036
  19. R. Lockhart, J. Taylor, R. J. Tibshirani and R. Tibshirani: A significance test for the lasso. Ann. Statist. 42 (2014), 2, 413-468.   DOI:10.1214/13-aos1175
  20. M. E. Lopes: Estimating unknown sparsity in compressed sensing. CoRR 2012, abs/1204.4227.
  21. R. Ng: Fourier slice photography. ACM Trans. Graph. 24 (2005), 3, 735-744.
  22. A. Panahi and M. Viberg: Fast candidate points selection in the lasso path. IEEE Signal Process. Lett. 19 (2012), 2, 79-82.   DOI:10.1109/lsp.2011.2179534
  23. B. M. Pötscher and H. Leeb: On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding. J. Multivar. Anal. 100 (2009), 9, 2065-2082.   DOI:10.1016/j.jmva.2009.06.010
  24. R. Tibshirani: Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc., Ser. B 58 (1996), 267-288.
  25. H. Zou: The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 (2006), 476, 1418-1429.   DOI:10.1198/016214506000000735