How to automatically choose the number of components for PCA?

Question

For PCA, we can plot the percentage of variance explained against the number of components, as in the following picture:

And as human practitioners, we're typically instructed to choose the number of components at the inflection point of the curve close to explaining all the variance.

Is there an algorithm that looks at the variance explained and automatically chooses where that inflection point should be?

LambdaPsi · Answer

Parallel Analysis (PA) is a standard way to choose the number of components algorithmically.
It builds a sampling distribution for each eigenvalue from random data of the same size as yours, then runs a series of hypothesis tests comparing each observed eigenvalue against that distribution.

Note that this is not exactly the approach you described, because PA works on the eigenvalues themselves rather than on the proportion of variance explained.
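
To make the idea concrete, here is a minimal sketch in Python with NumPy. It is one common variant, not a reference implementation, and it rests on several assumptions of mine: PCA is done on the correlation matrix, the random data are uncorrelated standard normal draws, the cutoff is the 95th percentile of the simulated eigenvalues, and components are kept until the first one that fails the cutoff. The function name `parallel_analysis` and its parameters are made up for illustration and do not come from any library.

```python
import numpy as np

def parallel_analysis(X, n_iter=200, percentile=95, seed=0):
    """Sketch of parallel analysis; returns (n_keep, observed, threshold)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated normal noise of the same shape (assumption).
    simulated = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]

    # Per-position cutoff taken from the simulated sampling distribution.
    threshold = np.percentile(simulated, percentile, axis=0)

    # Keep leading components until the first one that does not beat its cutoff.
    exceeds = observed > threshold
    n_keep = p if exceeds.all() else int(np.argmin(exceeds))
    return n_keep, observed, threshold

# Toy data: two latent factors, each driving five of the ten variables.
rng = np.random.default_rng(1)
latent = rng.standard_normal((500, 2))
loadings = np.kron(np.eye(2), np.ones((1, 5)))  # shape (2, 10)
X = latent @ loadings + 0.5 * rng.standard_normal((500, 10))

n_keep, observed, threshold = parallel_analysis(X)
print("components to keep:", n_keep)
print("proportion of variance explained:", observed / observed.sum())
```

The last line ties this back to the question: dividing each eigenvalue by the sum of all eigenvalues gives the proportion of variance explained, so the two views differ only by a rescaling.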
