Mathematics Asked by Lucozade on December 5, 2020
In principal component analysis (PCA), one can use either the covariance matrix or the correlation matrix to find the components. The two choices give different results because, I suspect, the eigenvectors of the two matrices are not equal. (Mathematically) similar matrices have the same eigenvalues, but not necessarily the same eigenvectors. Several questions:
(1) Why this difference?
(2) Does PCA make sense if you can get two different answers?
(3) Which of the two methods is ‘best’?
(4) Since PCA operates on standardized (not raw) data in both cases, i.e., data scaled by their standard deviation, does it make sense to use the results to draw conclusions about the dominance of variation in the actual, unstandardized data?
The problem with not standardizing, i.e. with not scaling the variables by their standard deviation, is that if, for example, one variable is measured in centimeters and another in dollars, then changing centimeters to meters can actually change the eigenvectors, so an arbitrary choice of units can alter the results. Hence I'd use the correlation matrix.
Answered by Michael Hardy on December 5, 2020
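The unit dependence described in the answer above can be checked numerically. The following is a minimal sketch, not part of the original answer: the data are synthetic, and the names `length_cm`, `price_usd`, `first_pc` and `pca_directions` are made up for illustration. It computes the leading principal direction from the covariance matrix and from the correlation matrix, first with the length in centimeters and then in meters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, hypothetical data: a length in centimeters and a
# correlated amount in dollars (not taken from the original post).
n = 500
length_cm = rng.normal(170.0, 10.0, size=n)
price_usd = 0.5 * length_cm + rng.normal(0.0, 5.0, size=n)


def first_pc(matrix):
    """Unit eigenvector for the largest eigenvalue of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    v = eigvecs[:, -1]
    # Eigenvectors are defined only up to sign; make the
    # largest-magnitude component positive so results are comparable.
    return v * np.sign(v[np.argmax(np.abs(v))])


def pca_directions(x, y):
    """Leading principal direction from the covariance and correlation matrices."""
    data = np.column_stack([x, y])
    cov_pc = first_pc(np.cov(data, rowvar=False))
    corr_pc = first_pc(np.corrcoef(data, rowvar=False))
    return cov_pc, corr_pc


# Same measurements, two choices of length unit.
cov_cm, corr_cm = pca_directions(length_cm, price_usd)
cov_m, corr_m = pca_directions(length_cm / 100.0, price_usd)

print("covariance PC1, length in cm: ", cov_cm)
print("covariance PC1, length in m:  ", cov_m)
print("correlation PC1, length in cm:", corr_cm)
print("correlation PC1, length in m: ", corr_m)
```

In this sketch the covariance-based direction should swing toward whichever variable has the larger numerical variance once the unit changes, while the correlation-based direction should stay the same up to floating-point error, since rescaling a variable by a positive constant leaves correlations unchanged.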