What’s the difference between L2 and L1 regularization?

Background

There are mainly two types of regularization:

  1. L1 Regularization (Lasso regularization) - Adds the sum of absolute values of the coefficients to the cost function.
  2. L2 Regularization (Ridge regularization) - Adds the sum of squares of coefficients to the cost function.

L1 regularization adds a penalty term to our cost function which is equal to the sum of the absolute values of the model's coefficients multiplied by a lambda hyperparameter. For example, a cost function with L1 regularization will look like:

Cost = Loss + λ * Σ |w_i|

Because the absolute-value penalty has a constant gradient, it can push some coefficients exactly to zero, effectively performing feature selection.

L2 regularization adds a penalty term to our cost function which is equal to the sum of the squares of the model's coefficients multiplied by a lambda hyperparameter:

Cost = Loss + λ * Σ w_i²

This technique shrinks the coefficients toward zero (without making them exactly zero) and is widely used when we have many features that may correlate with each other.
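The two penalized cost functions above can be sketched in a few lines of plain Python. This is a minimal illustration; the data, weights, and lambda value are made-up assumptions, and the loss is mean squared error for a linear model with no bias term.

```python
def mse(weights, X, y):
    """Mean squared error of a linear model y_hat = w . x (no bias)."""
    total = 0.0
    for xi, yi in zip(X, y):
        pred = sum(w * x for w, x in zip(weights, xi))
        total += (pred - yi) ** 2
    return total / len(y)

def l1_cost(weights, X, y, lam):
    # Cost = MSE + lambda * sum(|w_i|)  -- Lasso penalty
    return mse(weights, X, y) + lam * sum(abs(w) for w in weights)

def l2_cost(weights, X, y, lam):
    # Cost = MSE + lambda * sum(w_i^2)  -- Ridge penalty
    return mse(weights, X, y) + lam * sum(w * w for w in weights)

# Illustrative toy data: two samples, two features.
X = [[1.0, 2.0], [2.0, 0.5]]
y = [3.0, 2.0]
w = [1.0, -0.5]
print(l1_cost(w, X, y, lam=0.1))  # unpenalized MSE plus 0.1 * (|1.0| + |-0.5|)
print(l2_cost(w, X, y, lam=0.1))  # unpenalized MSE plus 0.1 * (1.0^2 + 0.5^2)
```

Note that for coefficients with magnitude below 1 (like the -0.5 here), the L2 penalty is smaller than the L1 penalty, which is one way to see why L2 shrinks small coefficients gently while L1 keeps pushing them toward exactly zero.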

Difference between L2 and L1 regularization

  • Penalty terms: L1 regularization uses the sum of the absolute values of the weights, while L2 regularization uses the sum of the weights squared.
  • Feature selection: L1 performs feature selection by reducing the coefficients of some predictors to 0, while L2 does not.
  • Computational efficiency: L2 has an analytical solution, while L1 does not.
  • Multicollinearity: L2 addresses multicollinearity by constraining the coefficient norm.
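Two of the bullets above can be demonstrated together: ridge (L2) has the closed-form solution w = (XᵀX + λI)⁻¹ Xᵀy, and the λI term keeps that inverse well-behaved even when features are nearly collinear. Below is a sketch for the two-feature case using hand-rolled 2x2 linear algebra; the data values are illustrative assumptions (two almost identical features with y ≈ 2 * x1).

```python
def ridge_closed_form(X, y, lam):
    """Closed-form ridge solution for exactly 2 features:
    w = (X^T X + lam * I)^(-1) X^T y, solved via Cramer's rule."""
    # Build A = X^T X + lam * I (symmetric 2x2) and b = X^T y.
    a00 = sum(r[0] * r[0] for r in X) + lam
    a01 = sum(r[0] * r[1] for r in X)
    a11 = sum(r[1] * r[1] for r in X) + lam
    b0 = sum(r[0] * yi for r, yi in zip(X, y))
    b1 = sum(r[1] * yi for r, yi in zip(X, y))
    # Cramer's rule for the 2x2 system A w = b; lam > 0 guarantees det > 0.
    det = a00 * a11 - a01 * a01
    w0 = (b0 * a11 - a01 * b1) / det
    w1 = (a00 * b1 - b0 * a01) / det
    return [w0, w1]

# Two highly correlated features: X^T X alone is nearly singular,
# but the lam * I term keeps the system solvable and stable.
X = [[1.0, 1.01], [2.0, 1.99], [3.0, 3.02]]
y = [2.0, 4.0, 6.0]
print(ridge_closed_form(X, y, lam=0.1))
```

With these correlated features, ridge splits the weight roughly evenly between them (each close to 1, summing to about 2). Lasso has no such closed form because the absolute-value penalty is not differentiable at zero, so it is typically fit with coordinate descent instead, and on data like this it would tend to assign most of the weight to just one of the two correlated features.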
