Python sklearn PCA transform function output does not match

Question

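I am computing PCA on some data with 10 components and then projecting the data onto the first 3 of those 10:
import numpy
from sklearn.decomposition import PCA

transformer = PCA(n_components=10)
trained = transformer.fit(train)
# project the raw (uncentered) data onto the first 3 components
one = numpy.matmul(train, numpy.transpose(trained.components_[:3, :]))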

Here trained.components_[:3,:] are:
array([[-1.43311999e-03,  1.65635865e-01,  5.49189565e-01,
         5.26069645e-02,  2.42638594e-01,  1.20957807e-02,
         1.30595572e-01,  1.09279646e-02,  7.21299808e-03,
        -2.79057934e-02, -1.14834589e-02,  5.06289160e-01,
         5.42890317e-01,  8.50422194e-02,  1.80935205e-01,
         2.98473275e-05, -8.04537378e-04],
       [-1.05419313e-02,  3.09442577e-01, -8.15534934e-02,
         4.28621520e-03,  2.93323569e-01,  3.85849115e-02,
        -1.16193185e-01,  4.14964652e-01,  4.16279154e-01,
         2.95264788e-01,  3.28620106e-01, -2.60916490e-01,
        -2.37459426e-02,  1.57567265e-01,  4.02873342e-01,
         5.28389303e-05, -2.07920000e-03],
       [ 8.63072772e-03, -3.26129082e-01,  8.59869400e-02,
         3.04770780e-03, -3.14966419e-01, -2.47151330e-02,
         1.05987767e-01,  3.74235953e-01,  3.75747065e-01,
         2.76035253e-01,  3.18273743e-01,  3.02423861e-01,
         2.76535177e-02, -1.51485057e-01, -4.48558170e-01,
        -8.83328996e-05, -2.25542180e-03]])

I then fit PCA again with only 3 components:
transformer = PCA(n_components=3)
trained = transformer.fit(train)
two = trained.transform(train)

Here the components are:
array([[-1.43311999e-03,  1.65635865e-01,  5.49189565e-01,
         5.26069645e-02,  2.42638594e-01,  1.20957807e-02,
         1.30595572e-01,  1.09279646e-02,  7.21299808e-03,
        -2.79057934e-02, -1.14834589e-02,  5.06289160e-01,
         5.42890317e-01,  8.50422194e-02,  1.80935205e-01,
         2.98473275e-05, -8.04537377e-04],
       [-1.05419314e-02,  3.09442577e-01, -8.15534934e-02,
         4.28621520e-03,  2.93323569e-01,  3.85849115e-02,
        -1.16193185e-01,  4.14964652e-01,  4.16279154e-01,
         2.95264788e-01,  3.28620106e-01, -2.60916490e-01,
        -2.37459426e-02,  1.57567265e-01,  4.02873342e-01,
         5.28389307e-05, -2.07919994e-03],
       [ 8.63072765e-03, -3.26129082e-01,  8.59869400e-02,
         3.04770780e-03, -3.14966419e-01, -2.47151331e-02,
         1.05987767e-01,  3.74235953e-01,  3.75747065e-01,
         2.76035253e-01,  3.18273743e-01,  3.02423861e-01,
         2.76535177e-02, -1.51485057e-01, -4.48558170e-01,
        -8.83328994e-05, -2.25542175e-03]])

However, one is not equal to two, even though the components are the same in both cases. The results differ because the transform function first subtracts the mean vector from the data and then multiplies by the components. But why should the mean be subtracted here, given that it was already subtracted in the first step, when PCA computed the basis?

Carl Rynegardh · Answer

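If you look at the source code, PCA is calculated through an SVD of the mean-centered data, so the components describe directions relative to the data mean. That is why transform subtracts the stored mean (the mean_ attribute) again before projecting: the centering done during fit only affects how the basis is computed, not the data you pass in later. I believe some of the solvers iterate until the result is "good enough."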
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/pca.py
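
To make the difference concrete, here is a minimal sketch that reproduces the mismatch on synthetic data. The array below is only a stand-in for the asker's train (its shape, 200 samples by 17 features, and its values are assumptions), and whitening is left at its default of off. It compares the uncentered projection from the question with transform and with a manually centered projection.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the asker's `train` (assumed: 200 samples, 17 features,
# with a nonzero mean so that the centering step actually matters).
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=[1.0 + i for i in range(17)], size=(200, 17))

pca = PCA(n_components=3, svd_solver="full").fit(train)

# Projection of the raw data, as computed for `one` in the question.
one = train @ pca.components_.T

# What transform() returns, as computed for `two` in the question.
two = pca.transform(train)

# Manual projection that subtracts the fitted mean first.
manual = (train - pca.mean_) @ pca.components_.T

# `one` and `two` differ by a constant row vector: the projected mean.
offset = pca.mean_ @ pca.components_.T

print("manual == two:        ", np.allclose(manual, two))
print("one - offset == two:  ", np.allclose(one - offset, two))
print("one == two:           ", np.allclose(one, two))
```

If the uncentered projection is what you actually want, adding the offset back to the output of transform should recover it; otherwise, subtracting trained.mean_ from train before the matmul should make one agree with two.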
