In statistics, the LASSO (Least Absolute Shrinkage and Selection Operator) is a regularized regression method used for variable shrinkage and selection in linear or logistic regression models.
History
The LASSO method was created by Robert Tibshirani[1] after being influenced by Leo Breiman's work on subset selection. The LASSO method has grown in popularity due to the importance of variable selection in big data problems and advances in computational technologies.
The LASSO
Given a linear regression with a response vector $y \in \mathbb{R}^n$ and a matrix $X \in \mathbb{R}^{n \times p}$ of explanatory variables, the LASSO method produces the estimate $\hat{\beta}$ for the unknown parameter $\beta$:

$$\hat{\beta} = \arg\min_{\beta} \|y - X\beta\|_2^2 \quad \text{subject to} \quad \|\beta\|_1 \le t.$$

This is equivalent to minimizing the following loss function:

$$\hat{\beta} = \arg\min_{\beta} \left\{ \|y - X\beta\|_2^2 + \lambda \|\beta\|_1 \right\},$$

where $\|\beta\|_1 = \sum_{j=1}^{p} |\beta_j|$ is an L1 penalty and the tuning parameter $\lambda$ has a one-to-one correspondence with $t$ and governs the strength of this penalty. Like other types of regularization, such as ridge regression, the coefficients shrink as $\lambda$ is increased. However, unlike ridge regression, some of the coefficients are shrunken identically to zero. In this manner, the LASSO performs continuous variable selection.[2]
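The contrast with ridge regression can be made concrete in the special case of an orthonormal design matrix, where both estimators have closed forms in terms of the ordinary least squares (OLS) estimate: the LASSO applies soft-thresholding, while ridge applies proportional shrinkage. The following sketch uses hypothetical coefficient values to show that only the LASSO produces exact zeros:

```python
# Closed-form shrinkage under an orthonormal design matrix (illustrative
# sketch; the coefficient values below are hypothetical).

def lasso_orthonormal(beta_ols, lam):
    """Soft-thresholding: shrinks each coefficient toward zero and
    sets coefficients smaller than lam exactly to zero."""
    return [(1.0 if b > 0 else -1.0) * max(abs(b) - lam, 0.0)
            for b in beta_ols]

def ridge_orthonormal(beta_ols, lam):
    """Proportional shrinkage: coefficients shrink but never reach zero."""
    return [b / (1.0 + lam) for b in beta_ols]

beta_ols = [3.0, -1.5, 0.4]   # hypothetical OLS estimates
print(lasso_orthonormal(beta_ols, 0.5))  # [2.5, -1.0, 0.0] -> exact zero
print(ridge_orthonormal(beta_ols, 0.5))  # all entries remain nonzero
```

The third coefficient (0.4) falls below the threshold 0.5, so the LASSO removes it from the model entirely, whereas ridge merely scales it down.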
Elastic Net Regularization
The LASSO may saturate quickly when the data contain many predictors but only a few observations: with more predictors than observations, it can select at most as many variables as there are observations. Additionally, among a group of highly correlated variables it tends to select only one and ignore the others. Variations such as the elastic net method can overcome these limitations by employing a combination of the penalties from the LASSO and ridge regression.[3]
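A common parameterization of the combined penalty (the one used by the glmnet software discussed below) mixes the two terms with a parameter α, so that α = 1 recovers the pure LASSO and α = 0 recovers ridge. A minimal sketch of that penalty:

```python
# Elastic net penalty in the glmnet-style parameterization:
#   lam * ( alpha * ||beta||_1 + (1 - alpha)/2 * ||beta||_2^2 )
# alpha = 1 gives the LASSO penalty; alpha = 0 gives the ridge penalty.

def elastic_net_penalty(beta, lam, alpha):
    l1 = sum(abs(b) for b in beta)          # L1 norm
    l2 = sum(b * b for b in beta)           # squared L2 norm
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2)

beta = [1.0, -2.0]
print(elastic_net_penalty(beta, 1.0, 1.0))  # 3.0  (pure LASSO: |1| + |-2|)
print(elastic_net_penalty(beta, 1.0, 0.0))  # 2.5  (pure ridge: (1 + 4)/2)
```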
Computation
Because the LASSO's penalty involves an absolute value, the loss function is not differentiable and the estimate in general has no closed-form solution. Convex optimization algorithms are therefore required to find a solution. In the original paper, solutions were found through quadratic programming, but more recent methods such as least-angle regression and coordinate descent have proven more efficient.[4][5]
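The coordinate-descent idea can be sketched in a few lines: cycle through the coefficients, and update each one in turn by soft-thresholding its least-squares fit to the partial residual. This is a bare pure-Python sketch; production solvers such as glmnet add standardization, warm starts along the regularization path, and active-set screening:

```python
# Minimal cyclic coordinate descent for the LASSO objective
#   0.5 * ||y - X beta||^2 + lam * ||beta||_1
# (illustrative sketch, not an optimized implementation).

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return 0 if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: response minus the fit from all other features.
            r = [y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))   # correlation with x_j
            norm = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho, lam) / norm
    return beta
```

On a toy problem with orthogonal columns, `lasso_cd([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.1], 0.5)` shrinks the first coefficient to 2.5 and sets the second (whose signal 0.1 is below the threshold 0.5) exactly to zero.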
Software
"Glmnet: Lasso and elastic-net regularized generalized linear models" is software implemented as an R source package.[6][7] It includes fast algorithms for estimation of generalized linear models with ℓ1 (the lasso), ℓ2 (ridge regression), and mixtures of the two penalties (the elastic net) using cyclical coordinate descent, computed along a regularization path.
Notes
- ^ Tibshirani, Robert (2011). "Regression Shrinkage and Selection via the Lasso: a retrospective" (PDF). Journal of the Royal Statistical Society, Series B. 73 (1): 273–282. doi:10.1111/j.1467-9868.2011.00771.x. Retrieved 2014-03-05.
- ^ Tibshirani, Robert (1996). "Regression Shrinkage and Selection via the Lasso" (PostScript). Journal of the Royal Statistical Society, Series B. 58 (1): 267–288. MR 1379242. Retrieved 2014-03-03.
- ^ Zou, Hui; Hastie, Trevor (2005). "Regularization and Variable Selection via the Elastic Net". Journal of the Royal Statistical Society, Series B. 67 (2): 301–320. CiteSeerX 10.1.1.124.4696. doi:10.1111/j.1467-9868.2005.00503.x.
- ^ Efron, Bradley; Hastie, Trevor; Johnstone, Iain; Tibshirani, Robert (2004). "Least Angle Regression" (PDF). Annals of Statistics. 32 (2): 407–499. doi:10.1214/009053604000000067. MR 2060166.
- ^ Wu, TongTong; Lange, Kenneth (2008). "Coordinate descent algorithms for Lasso penalized regression". The Annals of Applied Statistics. 2 (1): 224–244. doi:10.1214/07-AOAS147.
- ^ Friedman, Jerome; Hastie, Trevor; Tibshirani, Robert (2010). "Regularization Paths for Generalized Linear Models via Coordinate Descent". Journal of Statistical Software: 1–22.
- ^ http://cran.r-project.org/web/packages/glmnet/index.html
References
- Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome H. The Elements of Statistical Learning. Retrieved 3 March 2014.