TransWikia.com

Difference between PCA and Regularisation

Data Science Asked by Che Tou on May 19, 2021

I am currently confused about PCA and regularisation.

What is the difference between PCA and regularisation, specifically lasso (L1) regression?

Both seem to be able to do feature selection. I am also not quite familiar with the difference between dimensionality reduction and feature selection.

One Answer

Lasso does feature selection by adding a penalty to the OLS loss function (see figure below). Features with low "impact" have their coefficients "shrunken" by the penalty term (you "regularise" the coefficients). Because of the L1 penalty, a coefficient $\beta_i$ can become exactly zero (which is not the case with Ridge, L2, where coefficients shrink towards zero but generally do not reach it). In the Lasso case you "eliminate" a feature when its coefficient is shrunken to zero, and you can call this feature selection. Lasso can be used in "high dimensions", i.e. when you have many features ("columns") but not so many observations ("rows").
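For reference, the penalised loss described above can be written out as below. This is the usual textbook form of the Lasso objective; the notation ($n$ observations, $p$ features, tuning parameter $\lambda \ge 0$) is assumed here and is not taken from the original figure.

$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
$$

The first term is the OLS loss and the second is the L1 penalty. Ridge uses $\beta_j^2$ in place of $|\beta_j|$, which is why it shrinks coefficients without setting them exactly to zero.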

[Figure: Lasso loss function, i.e. the OLS loss plus an L1 penalty on the coefficients]
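To make the "shrunken to zero" point concrete, here is a minimal sketch using scikit-learn. The data are synthetic and the penalty strength `alpha=0.2` is an arbitrary choice for illustration, not a recommended value; in practice you would tune it, e.g. by cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 100 rows, 10 features,
# but only the first three features actually influence y.
X = rng.normal(size=(100, 10))
true_beta = np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_beta + rng.normal(scale=0.5, size=100)

# Same penalty strength for both models; alpha=0.2 is an arbitrary choice.
lasso = Lasso(alpha=0.2).fit(X, y)
ridge = Ridge(alpha=0.2).fit(X, y)

# Count coefficients that are exactly zero under each penalty.
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

# Indices of the features Lasso keeps (non-zero coefficients).
selected = np.flatnonzero(lasso.coef_)

print("Lasso coefficients:", np.round(lasso.coef_, 2))
print("Ridge coefficients:", np.round(ridge.coef_, 2))
print("Exact zeros (Lasso, Ridge):", n_zero_lasso, n_zero_ridge)
print("Features kept by Lasso:", selected)
```

The features Lasso keeps are still the original columns, just with some of them dropped. Ridge shrinks all coefficients but keeps every feature in the model.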

Principal components work in quite a different way. The first principal component is the normalised linear combination of the original features that has the largest variance. So you "transform" the original features into a principal component (a "new feature" derived from the original ones), trying to capture as much variance as possible in that one component.
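A small sketch of that definition with plain NumPy, on synthetic data. The first loading vector is taken from the SVD of the centred data, and its variance is compared against many random unit-length combinations. Feature scaling is skipped here to keep the example short; in practice features are often standardised before PCA.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): two features driven by the
# same hidden variable z, plus one unrelated feature.
z = rng.normal(size=(200, 1))
X = np.hstack([
    z + 0.1 * rng.normal(size=(200, 1)),
    2 * z + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])
Xc = X - X.mean(axis=0)  # centre each column

# Rows of Vt are the principal directions (loading vectors).
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
w1 = Vt[0]      # first loading vector, unit length ("normalised")
pc1 = Xc @ w1   # scores of the first principal component: the "new feature"

# Compare with 1000 random unit-length linear combinations.
random_w = rng.normal(size=(1000, 3))
random_w /= np.linalg.norm(random_w, axis=1, keepdims=True)
random_var = (Xc @ random_w.T).var(axis=0)

print("Length of w1:", np.linalg.norm(w1))
print("Variance of PC1:", pc1.var())
print("Largest variance among random combinations:", random_var.max())
```

No random unit-length combination can beat the first principal component on variance, which is exactly what "largest variance" means in the definition.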

Principal components are uncorrelated (orthogonal). This can be very helpful when you do linear regression, where (high) correlation between features can be a real problem. I see PCA as a tool for dimensionality reduction (not so much feature selection), since you can express many features in a (smaller) number of principal components.
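The uncorrelatedness can be checked numerically. The sketch below builds four strongly correlated synthetic features, computes all principal component scores via SVD, and compares the two correlation matrices. The data and noise level are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, strongly correlated features (assumed for illustration):
# each column is the same hidden variable z plus a little independent noise.
z = rng.normal(size=(300, 1))
X = np.hstack([z + 0.2 * rng.normal(size=(300, 1)) for _ in range(4)])
Xc = X - X.mean(axis=0)

_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T  # one column of scores per principal component

corr_X = np.corrcoef(Xc, rowvar=False)           # original features
corr_scores = np.corrcoef(scores, rowvar=False)  # principal components

# Share of the total variance captured by each component.
explained = s**2 / np.sum(s**2)

print("Correlation of original features:\n", np.round(corr_X, 2))
print("Correlation of principal components:\n", np.round(corr_scores, 2))
print("Explained variance share:", np.round(explained, 3))
```

With data like this, most of the variance ends up in the first component, so keeping only that one column is the dimensionality reduction step.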

So, as a perhaps too brief summary:

  • Lasso: "shrink" the estimated coefficients of features which are not too useful, possibly all the way to zero (but the features themselves are left as they are)
  • PCA: "combine" several features into one or more orthogonal "new" feature(s) (principal components) and use them in some type of model
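The two bullet points can be put side by side in a short scikit-learn sketch. The data, `alpha=0.1` and `n_components=3` are arbitrary assumptions for illustration. PCA followed by a linear regression is just one possible "some type of model".

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Synthetic data (assumed for illustration): only the first two of six
# features influence y.
X = rng.normal(size=(150, 6))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.3, size=150)

# Lasso: the model works on the six original columns; some coefficients
# are set to exactly zero.
lasso = make_pipeline(StandardScaler(), Lasso(alpha=0.1)).fit(X, y)
lasso_coef = lasso.named_steps["lasso"].coef_

# PCA + regression: the model sees 3 new columns, each one a weighted
# mix of all six original features.
pcr = make_pipeline(
    StandardScaler(), PCA(n_components=3), LinearRegression()
).fit(X, y)
loadings = pcr.named_steps["pca"].components_  # shape (3, 6)

print("Lasso coefficients per original feature:", np.round(lasso_coef, 2))
print("PCA loadings (rows = components, columns = original features):")
print(np.round(loadings, 2))
```

Note that PCA does not look at `y` when building the components, while Lasso does. So the Lasso output tells you which original features matter for the target, whereas the PCA loadings only tell you how the original features were mixed.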

For more details, refer to "An Introduction to Statistical Learning" (available for free online). Ch. 6.2.2 covers the Lasso, Ch. 10.2.1 covers PCA.

Answered by Peter on May 19, 2021
