TransWikia.com

PCA: Dimension Reduction

Cross Validated Asked by Shank on January 5, 2021

I ran PCA. The output is:

[Two images of the PCA output; not available]

I am trying to increase the R-squared value. To do so, I want to consider only the important variables.

So far I have cleaned the data by excluding the outliers and by removing the variables Garage yrblt, 1stflsf, totrmsabvgrd and Garagecars, because they are very highly correlated with other variables.
I recalculated Lot Frontage as Updated Lot Frontage, because Lot Frontage had a lot of missing values.
Now my PCA shows that 12 (out of 26) components explain around 79.5% of the variability. So, am I correct to save the formulas for these 13 components and then exclude the variables used to compute them, i.e. LotFrontage through UpdateLotFrontage?
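
For concreteness, here is a minimal sketch of the step described above: standardize the variables, run PCA, keep the fewest components whose cumulative explained variance reaches a threshold, and compute the component scores. It assumes NumPy and uses synthetic stand-in data (200 rows, 26 columns), since the real data set is not shown; the function name `pca_scores` and the 0.795 threshold are illustrative choices, not part of the original analysis.

```python
import numpy as np

def pca_scores(X, var_threshold=0.795):
    """Standardize X, run PCA via SVD, and keep the fewest components
    whose cumulative explained variance reaches var_threshold.
    Assumes no column of X is constant (its std would be zero)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # standardized variables

    # Rows of Vt are the principal directions; s**2 is proportional
    # to the variance explained by each component.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)

    # First index where the cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, var_threshold)) + 1
    k = min(k, len(s))

    loadings = Vt[:k].T                       # shape (n_features, k)
    scores = Z @ loadings                     # shape (n_samples, k)
    return scores, loadings, cumulative, k

# Synthetic stand-in data: 26 correlated columns driven by 8 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))
X = latent @ rng.normal(size=(8, 26)) + 0.5 * rng.normal(size=(200, 26))

scores, loadings, cumulative, k = pca_scores(X)
print("components kept:", k)
print("cumulative variance explained:", round(float(cumulative[k - 1]), 3))
```

Each column of `scores` is a weighted combination of all 26 standardized variables (the weights are the columns of `loadings`), so the scores replace the original variables as a block rather than corresponding to 13 individual variables.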

So basically, instead of using 26 variables, I will now use 13. After this variable reduction, I plan to run stepwise regression, a neural network, etc. and see which method gives the best R-squared value.
Is this approach correct?
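
The plan above (regress the response on a reduced set of principal components and compare R-squared) can be sketched as follows. This is only an illustration under stated assumptions: it uses NumPy, ordinary least squares in place of stepwise regression or a neural network, and synthetic data, since the real data set is not shown. The helper names `fit_pcr`, `predict_pcr` and `r_squared` are hypothetical. The standardization and the PCA are fitted on the training rows only, so that the held-out R-squared is not contaminated by the test rows.

```python
import numpy as np

def fit_pcr(X_train, y_train, k):
    """Principal component regression: PCA on standardized X, then OLS on
    the first k component scores. Everything is fitted on training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    W = Vt[:k].T                                   # loadings, (n_features, k)
    A = np.column_stack([np.ones(len(Z)), Z @ W])  # intercept + scores
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return mean, std, W, coef

def predict_pcr(model, X):
    """Apply the training-set scaling and loadings to new rows."""
    mean, std, W, coef = model
    Z = (X - mean) / std
    return coef[0] + (Z @ W) @ coef[1:]

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data: 300 rows, 26 correlated predictors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 8))
X = latent @ rng.normal(size=(8, 26)) + 0.5 * rng.normal(size=(300, 26))
y = X @ rng.normal(size=26) + rng.normal(size=300)

X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

for k in (13, 26):
    model = fit_pcr(X_train, y_train, k)
    r2_train = r_squared(y_train, predict_pcr(model, X_train))
    r2_test = r_squared(y_test, predict_pcr(model, X_test))
    print(f"k={k}: train R^2={r2_train:.3f}, held-out R^2={r2_test:.3f}")
```

Note that the components are chosen by how much variance of the predictors they explain, without looking at the response, so the held-out R-squared is the quantity to compare across methods.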
