Lasso Regression

The LASSO (Least Absolute Shrinkage and Selection Operator) is a regression method that involves penalizing the absolute size of the regression coefficients.

By penalizing the coefficients (or, equivalently, constraining the sum of their absolute values), you end up in a situation where some of the parameter estimates may be exactly zero. The larger the penalty applied, the further the estimates are shrunk towards zero.

This is convenient when we want some automatic feature/variable selection, or when dealing with highly correlated predictors, where standard regression will usually have regression coefficients that are ‘too large’.
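To make the shrinkage concrete, here is a minimal sketch (simulated data and variable names of our choosing, using the glmnet package) that fits a lasso and compares the coefficients under a mild and a strong penalty:

#Sketch: effect of the penalty size on lasso estimates (illustrative only)
library(glmnet)
set.seed(1)
n = 100; p = 8
X_sim = matrix(rnorm(n * p), n, p)
beta_true = c(3, -2, 1.5, rep(0, p - 3))   #only the first three predictors matter
y_sim = X_sim %*% beta_true + rnorm(n)
fit = glmnet(X_sim, y_sim, alpha = 1)      #alpha = 1 gives the lasso
coef(fit, s = 0.01)   #mild penalty: estimates close to least squares
coef(fit, s = 1)      #strong penalty: several coefficients are exactly zero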

Lasso makes use of an L1 regularization term, the sum of the absolute values of the coefficients, in the objective function. Thus the objective function in lasso regression becomes:

$$\hat{\beta} = \underset{\beta_0,\ \beta}{\arg\min}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$

λ ≥ 0 is the regularization parameter and the intercept term is not regularized.
We do not assume that the error terms are normally distributed.

There is no closed-form formula for the lasso estimates; they are obtained numerically (for example by coordinate descent) using statistical software.
Note that lasso regression also requires the predictors to be standardized.
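As a quick sketch of what standardization means (Z here is an arbitrary numeric predictor matrix of our choosing), scale() centres each column and divides it by its standard deviation; glmnet's standardize = TRUE performs an equivalent standardization internally and reports coefficients back on the original scale:

#Sketch: standardizing a predictor matrix (illustrative only)
Z = matrix(rnorm(50 * 3), 50, 3)
Z_std = scale(Z)               #centre each column and divide by its standard deviation
round(colMeans(Z_std), 10)     #approximately 0 for every column
apply(Z_std, 2, sd)            #1 for every column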

Advantage of lasso over ridge regression
Lasso regression can perform built-in variable selection as well as parameter shrinkage, whereas ridge regression typically keeps all the variables in the model, only with shrunken parameters.
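A minimal sketch of this difference (simulated data with two highly correlated predictors; names are ours): fit lasso (alpha = 1) and ridge (alpha = 0) with cv.glmnet and inspect the coefficients at lambda.min.

#Sketch: lasso versus ridge on correlated predictors (illustrative only)
library(glmnet)
set.seed(42)
n = 100; p = 10
Z = matrix(rnorm(n * p), n, p)
Z[, 2] = Z[, 1] + rnorm(n, sd = 0.1)        #make two predictors highly correlated
y_sim = 2 * Z[, 1] - 1.5 * Z[, 3] + rnorm(n)
lasso = cv.glmnet(Z, y_sim, alpha = 1)
ridge = cv.glmnet(Z, y_sim, alpha = 0)
coef(lasso, s = "lambda.min")   #entries printed as "." are exactly zero (dropped)
coef(ridge, s = "lambda.min")   #typically all predictors keep small nonzero coefficients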

R code for Lasso Regression
Considering the swiss dataset from the “datasets” package, we have:
#Creating dependent and independent variables
X = swiss[,-1]   #predictors: all columns except Fertility
y = swiss[,1]    #response: Fertility

Using cv.glmnet from the glmnet package we perform cross-validation. For lasso regression we set alpha = 1. By default standardize = TRUE, hence we do not need to standardize the variables separately.

#Loading the glmnet package
library(glmnet)
#Setting the seed for reproducibility
set.seed(123)
model = cv.glmnet(as.matrix(X), y, alpha = 1, lambda = 10^seq(4, -1, -0.1))
#By default standardize = TRUE
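
Before extracting coefficients it can help to inspect the cross-validation results; plot() on a cv.glmnet object shows the cross-validated error across the lambda grid, and the object also stores lambda.1se (the largest lambda within one standard error of the minimum) as a more conservative alternative to lambda.min:

#Optional: inspecting the cross-validation results
plot(model)          #cross-validated mean-squared error versus log(lambda)
model$lambda.min     #lambda giving the minimum cross-validated error
model$lambda.1se     #largest lambda within one standard error of the minimum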

We choose the best value of lambda by extracting lambda.min from the model, and then obtain the coefficients using the predict function.
#Taking the best lambda
best_lambda = model$lambda.min
lasso_coeff = predict(model, s = best_lambda, type = "coefficients")
lasso_coeff

Printing lasso_coeff displays the lasso coefficients at the chosen lambda; any predictor whose coefficient has been shrunk to exactly zero is effectively dropped from the model.
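As a usage sketch, the same fitted object can also be used to predict the response (Fertility) at the chosen lambda; here we simply predict on the training predictors for illustration:

#Sketch: predictions from the cross-validated lasso fit
fitted_vals = predict(model, newx = as.matrix(X), s = best_lambda)
head(fitted_vals)    #fitted Fertility values for the first few observations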