TransWikia.com

Interpreting PCA results of first two components

Cross Validated Asked by paulgr on December 15, 2021

I don’t like the looks of my PCA graph here. PCA coordinates should be uncorrelated, yet the spread of the second-component coordinates increases as the first component decreases. What’s the explanation for this?

Colors represent various groups. I should also specify that I have more than 1,000 variables in my original matrix. I need 26 components to reach 80% of the variance explained, which is probably bad.

[Image: scatter plot of the first two principal components, colored by group]

One Answer

Principal component analysis (PCA) only ensures that the PCs are uncorrelated. In other words, there is no linear relationship between them. That does not ensure independence of the PCs; higher moments of the PCs may still be related. What you have here is a relationship between the first moment of PC1 and the second moment of PC2: the spread of PC2 changes with the level of PC1. (There may also be a relationship between the second moments of PC1 and PC2.)
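To make the distinction concrete, here is a minimal sketch on synthetic data (not the asker's data). Everything in it is an assumption chosen for illustration: the two-variable setup, the noise model, and the 30-degree rotation. It builds a "fan-shaped" cloud, runs PCA via the SVD, and compares the correlation between PC1 and PC2 with the correlation between PC1 and the square of PC2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: the noise scale of the second variable grows with the first.
t = rng.uniform(-3.0, 3.0, size=n)
scale = 0.1 + 0.25 * (t + 3.0)
y = scale * rng.standard_normal(n)

# Rotate the cloud so the principal axes are not the coordinate axes.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = np.column_stack([t, y]) @ R.T

# PCA via SVD of the centered data; rows of Vt are the principal axes.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T
pc1, pc2 = scores[:, 0], scores[:, 1]

# Linear correlation between the scores is zero up to rounding error...
corr_linear = np.corrcoef(pc1, pc2)[0, 1]
# ...but the spread of PC2 (its square) still depends on PC1.
corr_spread = np.corrcoef(pc1, pc2 ** 2)[0, 1]

print(f"corr(PC1, PC2)   = {corr_linear:.2e}")
print(f"corr(PC1, PC2^2) = {corr_spread:.3f}")
```

The sign of `corr_spread` depends on the arbitrary sign of the principal axis, so only its magnitude is meaningful. A value clearly away from zero alongside a near-zero `corr_linear` is the same pattern as the fan shape in the plot.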

First: good for you for looking at the data. You avoided doing something mindless by looking -- and that is always worthy of commendation. That is the mark of the better analyst.

Second: you can explicitly reformulate the PCA optimization to account for second moments. However, that might be more involved than you want, and it will not protect you from other kinds of dependence. Alternatively, you can use independent component analysis (ICA). ICA should be an easy change and it is relatively fast, even for large amounts of data. If you need a reference for ICA, Hyvärinen, Karhunen, and Oja is the go-to text on ICA.
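As a rough idea of how small the change to ICA can be, here is a sketch using scikit-learn's `FastICA` on a synthetic mixture. The choice of scikit-learn, the two non-Gaussian sources, and the mixing matrix are all assumptions made for illustration. It shows the mechanics only; it says nothing about what ICA would do with the data in the question.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 5000

# Two independent, non-Gaussian sources (hypothetical example data).
S_true = np.column_stack([
    rng.uniform(-1.0, 1.0, size=n),
    rng.laplace(0.0, 1.0, size=n),
])

# Observed variables are linear mixtures of the sources.
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])
X = S_true @ A.T

# Swap-in for PCA: estimate components that are as independent as possible.
# Note: the default of the `whiten` argument has changed across scikit-learn
# releases; set it explicitly if the scale of the output matters to you.
ica = FastICA(n_components=2, random_state=0, max_iter=1000)
S_est = ica.fit_transform(X)

# ICA recovers sources only up to order, sign, and scale, so compare
# estimated and true sources by absolute correlation.
cross = np.abs(np.corrcoef(S_est.T, S_true.T)[:2, 2:])
print(np.round(cross, 3))
```

Each column of `cross` should have one entry close to 1, meaning each true source is matched by one estimated component. With over 1,000 variables, a common approach is to reduce the dimension first (for example via `n_components`) rather than estimating one component per variable.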

Answered by kurtosis on December 15, 2021
