Why does non-differentiable regularization lead to setting coefficients to 0?

Data Science: Asked by asnart on September 3, 2021

L2 regularization shrinks the values in the parameter vector toward zero.
L1 regularization sets some coefficients in the parameter vector exactly to 0.

More generally, I’ve seen that non-differentiable regularization functions lead to setting coefficients to 0 in the parameter vector. Why is that the case?

One Answer

Look at the penalty terms in linear Ridge and Lasso regression:

Ridge (L2):

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2$$

Lasso (L1):

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p}|\beta_j|$$

Note the absolute value (L1 norm) in the Lasso penalty compared to the squared value (L2 norm) in the Ridge penalty.
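
To see why the kink at $0$ matters, here is a sketch of the standard one-dimensional argument (added for illustration; $z$ denotes the unpenalized least-squares solution and $\lambda > 0$ the tuning parameter). For ridge, minimizing

$$\frac{1}{2}(\beta - z)^2 + \lambda\beta^2$$

is a smooth problem; setting the derivative to zero gives $\hat\beta = z/(1 + 2\lambda)$, which is shrunk toward zero but is exactly $0$ only if $z = 0$. For the lasso, the objective

$$\frac{1}{2}(\beta - z)^2 + \lambda|\beta|$$

is not differentiable at $\beta = 0$, so the optimality condition uses the subgradient of $|\beta|$, which at $0$ is the whole interval $[-1, 1]$. The minimizer is the soft-thresholding operator

$$\hat\beta = \operatorname{sign}(z)\,\max(|z| - \lambda,\ 0),$$

so $\hat\beta = 0$ whenever $|z| \le \lambda$: the non-differentiable point can absorb a whole range of data into the exact-zero solution, which a smooth penalty cannot.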

In Introduction to Statistical Learning (Ch. 6.2.2) it reads: "As with ridge regression, the lasso shrinks the coefficient estimates towards zero. However, in the case of the lasso, the L1 penalty has the effect of forcing some of the coefficient estimates to be exactly equal to zero when the tuning parameter λ is sufficiently large. Hence, much like best subset selection, the lasso performs variable selection."

http://www-bcf.usc.edu/~gareth/ISL/
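
As a quick empirical check, here is a minimal scikit-learn sketch with synthetic data (the data, alpha values, and variable names are illustrative, not from the original answer):

```python
# Sketch: compare Ridge and Lasso coefficients on synthetic data where only
# a few features are truly informative. Assumes scikit-learn is installed;
# the alpha values below are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
true_coef = np.zeros(p)
true_coef[:3] = [3.0, -2.0, 1.5]          # only 3 informative features
y = X @ true_coef + rng.normal(scale=0.5, size=n)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print("ridge exact zeros:", np.sum(ridge.coef_ == 0))  # typically 0: small but nonzero
print("lasso exact zeros:", np.sum(lasso.coef_ == 0))  # several coefficients exactly 0
```

The ridge coefficients on the uninformative features come out small but nonzero, while the lasso sets them exactly to zero, matching the quoted passage.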

Answered by Peter on September 3, 2021
